1. Radrizzani S, Kudla G, Izsvák Z, Hurst LD. Selection on synonymous sites: the unwanted transcript hypothesis. Nat Rev Genet 2024; 25:431-448. [PMID: 38297070 DOI: 10.1038/s41576-023-00686-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/04/2023] [Indexed: 02/02/2024]
Abstract
Although translational selection to favour codons that match the most abundant tRNAs is not readily observed in humans, there is nonetheless selection in humans on synonymous mutations. We hypothesize that much of this synonymous site selection can be explained in terms of protection against unwanted RNAs - spurious transcripts, mis-spliced forms or RNAs derived from transposable elements or viruses. We propose not only that selection on synonymous sites functions to reduce the rate of creation of unwanted transcripts (for example, through selection on exonic splice enhancers and cryptic splice sites) but also that high-GC content (but low-CpG content), together with intron presence and position, is both particular to functional native mRNAs and used to recognize transcripts as native. In support of this hypothesis, transcription, nuclear export, liquid phase condensation and RNA degradation have all recently been shown to promote GC-rich transcripts and suppress AU/CpG-rich ones. With such 'traps' being set against AU/CpG-rich transcripts, the codon usage of native genes has, in turn, evolved to avoid such suppression. That parallel filters against AU/CpG-rich transcripts also affect the endosomal import of RNAs further supports the unwanted transcript hypothesis of synonymous site selection and explains the similar design rules that have enabled the successful use of transgenes and RNA vaccines.
Affiliation(s)
- Sofia Radrizzani
  - Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
  - Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute for Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- Zsuzsanna Izsvák
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
2. Abrahams L, Savisaar R, Mordstein C, Young B, Kudla G, Hurst LD. Evidence in disease and non-disease contexts that nonsense mutations cause altered splicing via motif disruption. Nucleic Acids Res 2021; 49:9665-9685. [PMID: 34469537 PMCID: PMC8464065 DOI: 10.1093/nar/gkab750] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/26/2021] [Revised: 08/17/2021] [Accepted: 08/19/2021] [Indexed: 12/21/2022]
Abstract
Transcripts containing premature termination codons (PTCs) can be subject to nonsense-associated alternative splicing (NAS). Two models have been evoked to explain this, scanning and splice motif disruption. The latter postulates that exonic cis motifs, such as exonic splice enhancers (ESEs), are disrupted by nonsense mutations. We employ genome-wide transcriptomic and k-mer enrichment methods to scrutinize this model. First, we show that ESEs are prone to disruptive nonsense mutations owing to their purine richness and paucity of TGA, TAA and TAG. The motif model correctly predicts that NAS rates should be low (we estimate 5–30%) and approximately in line with estimates for the rate at which random point mutations disrupt splicing (8–20%). Further, we find that, as expected, NAS-associated PTCs are predictable from nucleotide-based machine learning approaches to predict splice disruption and, at least for pathogenic variants, are enriched in ESEs. Finally, we find that both in and out of frame mutations to TAA, TGA or TAG are associated with exon skipping. While a higher relative frequency of such skip-inducing mutations in-frame than out of frame lends some credence to the scanning model, these results reinforce the importance of considering splice motif modulation to understand the etiology of PTC-associated disease.
Affiliation(s)
- Liam Abrahams
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Christine Mordstein
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
  - Aarhus University, Department of Molecular Biology and Genetics, C F Møllers Allé 3, 8000 Aarhus, Denmark
- Bethan Young
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
3. Seoighe C, Kiniry SJ, Peters A, Baranov PV, Yang H. Selection Shapes Synonymous Stop Codon Use in Mammals. J Mol Evol 2020; 88:549-561. [PMID: 32617614 DOI: 10.1007/s00239-020-09957-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/03/2020] [Accepted: 06/19/2020] [Indexed: 12/15/2022]
Abstract
Phylogenetic models of the evolution of protein-coding sequences can provide insights into the selection pressures that have shaped them. In the application of these models synonymous nucleotide substitutions, which do not alter the encoded amino acid, are often assumed to have limited functional consequences and used as a proxy for the neutral rate of evolution. The ratio of nonsynonymous to synonymous substitution rates is then used to categorize the selective regime that applies to the protein (e.g., purifying selection, neutral evolution, diversifying selection). Here, we extend the Muse and Gaut model of codon evolution to explore the extent of purifying selection acting on substitutions between synonymous stop codons. Using a large collection of coding sequence alignments, we estimate that a high proportion (approximately 57%) of mammalian genes are affected by selection acting on stop codon preference. This proportion varies substantially by codon, with UGA stop codons far more likely to be conserved. Genes with evidence of selection acting on synonymous stop codons have distinctive characteristics, compared to unconserved genes with the same stop codon, including longer [Formula: see text] untranslated regions (UTRs) and shorter mRNA half-life. The coding regions of these genes are also much more likely to be under strong purifying selection pressure. Our results suggest that the preference for UGA stop codons found in many multicellular eukaryotes is selective rather than mutational in origin.
Affiliation(s)
- Cathal Seoighe
  - School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Galway, Ireland
- Stephen J Kiniry
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Andrew Peters
  - School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Galway, Ireland
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Haixuan Yang
  - School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Galway, Ireland
4. Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020; 11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023]
Abstract
Exonic splicing enhancers (ESEs) are enriched in exons relative to introns and bind splicing activators. This study considers a fundamental question of co-evolution: How did ESE motifs become enriched in exons prior to the evolution of ESE recognition? We hypothesize that the high exon to intron motif ratios necessary for ESE function were created by mutational bias coupled with purifying selection on the protein code. These two forces retain certain coding motifs in exons while passively depleting them from introns. Through the use of simulations, genomic analyses, and high throughput splicing assays, we confirm the key predictions of this hypothesis, including an overlap between protein and splicing information in ESEs. We discuss the implications of mutational bias as an evolutionary driver in other cis-regulatory systems. Splicing is regulated by cis-acting elements in pre-mRNAs such as exonic or intronic splicing enhancers and silencers. Here the authors show that exonic splicing enhancers are enriched in exons compared to introns due to mutational bias coupled with purifying selection on the protein code.
Affiliation(s)
- Stephen Rong
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Luke Buerer
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
- Christy L Rhine
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Jing Wang
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Kamil J Cygan
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- William G Fairbrother
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
  - Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI, 02912, USA
5. Abrahams L, Hurst LD. A Depletion of Stop Codons in lincRNA is Owing to Transfer of Selective Constraint from Coding Sequences. Mol Biol Evol 2020; 37:1148-1164. [PMID: 31841162 PMCID: PMC7086181 DOI: 10.1093/molbev/msz299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
Abstract
Although the constraints on a gene’s sequence are often assumed to reflect the functioning of that gene, here we propose transfer selection, a constraint operating on one class of genes transferred to another, mediated by shared binding factors. We show that such transfer can explain an otherwise paradoxical depletion of stop codons in long intergenic noncoding RNAs (lincRNAs). Serine/arginine-rich proteins direct the splicing machinery by binding exonic splice enhancers (ESEs) in immature mRNA. As coding exons cannot contain stop codons in one reading frame, stop codons should be rare within ESEs. We confirm that the stop codon density (SCD) in ESE motifs is low, even accounting for nucleotide biases. Given that serine/arginine-rich proteins binding ESEs also facilitate lincRNA splicing, a low SCD could transfer to lincRNAs. As predicted, multiexon lincRNA exons are depleted in stop codons, a result not explained by open reading frame (ORF) contamination. Consistent with transfer selection, stop codon depletion in lincRNAs is most acute in exonic regions with the highest ESE density, disappears when ESEs are masked, is consistent with stop codon usage skews in ESEs, and is diminished in both single-exon lincRNAs and introns. Owing to low SCD, the maximum lengths of pseudo-ORFs frequently exceed null expectations. This has implications for ORF annotation and the evolution of de novo protein-coding genes from lincRNAs. We conclude that not all constraints operating on genes need be explained by the functioning of the gene but may instead be transferred owing to shared binding factors.
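To make the idea of a stop codon density concrete, here is a minimal Python sketch. It assumes one simple definition, namely the fraction of overlapping trinucleotide windows (all three frames) that spell TAA, TAG or TGA; the paper may define SCD differently, and the function name `stop_codon_density` and the toy motifs are hypothetical, not real ESEs from the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_codon_density(motifs):
    """Fraction of overlapping trinucleotide windows that spell a stop codon.

    Assumed definition for illustration only; every frame is counted.
    """
    windows = 0
    stops = 0
    for motif in motifs:
        # Accept RNA or DNA spelling, upper or lower case.
        seq = motif.upper().replace("U", "T")
        for i in range(len(seq) - 2):
            windows += 1
            if seq[i:i + 3] in STOP_CODONS:
                stops += 1
    return stops / windows if windows else 0.0

# Toy hexamers (made up): only the second contains a stop trinucleotide (TGA).
toy_motifs = ["GAAGAA", "TGAAGA"]
print(stop_codon_density(toy_motifs))  # 0.125
```

A comparison against nucleotide-matched shuffled motifs would be needed to address the abstract's point about accounting for nucleotide biases; that step is not shown here.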
Affiliation(s)
- Liam Abrahams
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
6. Accounting for Programmed Ribosomal Frameshifting in the Computation of Codon Usage Bias Indices. G3-GENES GENOMES GENETICS 2018; 8:3173-3183. [PMID: 30111621 PMCID: PMC6169388 DOI: 10.1534/g3.118.200185] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/10/2023]
Abstract
Experimental evidence shows that synonymous mutations can have important consequences on genetic fitness. Many organisms display codon usage bias (CUB), where synonymous codons that are translated into the same amino acid appear with distinct frequency. Within genomes, CUB is thought to arise from selection for translational efficiency and accuracy, termed the translational efficiency hypothesis (TEH). Indeed, CUB indices correlate with protein expression levels, which is widely interpreted as evidence for translational selection. However, these tests neglect -1 programmed ribosomal frameshifting (-1 PRF), an important translational disruption effect found across all organisms of the tree of life. Genes that contain -1 PRF signals should cost more to express than genes without. Thus, CUB indices that do not consider -1 PRF may overestimate genes’ true adaptation to translational efficiency and accuracy constraints. Here, we first investigate whether -1 PRF signals do indeed carry such translational cost. We then propose two corrections for CUB indices for genes containing -1 PRF signals. We retest the TEH in Saccharomyces cerevisiae under these corrections. We find that the correlation between corrected CUB index and protein expression remains intact for most levels of uniform -1 PRF efficiencies, and tends to increase when these efficiencies decline with protein expression. We conclude that the TEH is strengthened and that -1 PRF events constitute a promising and useful tool to examine the relationships between CUB and selection for translation efficiency and accuracy.
7. Savisaar R, Hurst LD. Exonic splice regulation imposes strong selection at synonymous sites. Genome Res 2018; 28:1442-1454. [PMID: 30143596 PMCID: PMC6169883 DOI: 10.1101/gr.233999.117] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/11/2018] [Accepted: 07/31/2018] [Indexed: 01/17/2023]
Abstract
What proportion of coding sequence nucleotides have roles in splicing, and how strong is the selection that maintains them? Despite a large body of research into exonic splice regulatory signals, these questions have not been answered. This is because, to our knowledge, previous investigations have not explicitly disentangled the frequency of splice regulatory elements from the strength of the evolutionary constraint under which they evolve. Current data are consistent both with a scenario of weak and diffuse constraint, enveloping large swaths of sequence, as well as with well-defined pockets of strong purifying selection. In the former case, natural selection on exonic splice enhancers (ESEs) might primarily act as a slight modifier of codon usage bias. In the latter, mutations that disrupt ESEs are likely to have large fitness and, potentially, clinical effects. To distinguish between these scenarios, we used several different methods to determine the distribution of selection coefficients for new mutations within ESEs. The analyses converged to suggest that ∼15%-20% of fourfold degenerate sites are part of functional ESEs. Most of these sites are under strong evolutionary constraint. Therefore, exonic splice regulation does not simply impose a weak bias that gently nudges coding sequence evolution in a particular direction. Rather, the selection to preserve these motifs is a strong force that severely constrains the evolution of a substantial proportion of coding nucleotides. Thus synonymous mutations that disrupt ESEs should be considered as a potentially common cause of single-locus genetic disorders.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
8. Abrahams L, Hurst LD. Adenine Enrichment at the Fourth CDS Residue in Bacterial Genes Is Consistent with Error Proofing for +1 Frameshifts. Mol Biol Evol 2018; 34:3064-3080. [PMID: 28961919 PMCID: PMC5850271 DOI: 10.1093/molbev/msx223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/21/2022]
Abstract
Beyond selection for optimal protein functioning, coding sequences (CDSs) are under selection at the RNA and DNA levels. Here, we identify a possible signature of “dual-coding,” namely extensive adenine (A) enrichment at bacterial CDS fourth sites. In 99.07% of studied bacterial genomes, fourth site A use is greater than expected given genomic A-starting codon use. Arguing for nucleotide level selection, A-starting serine and arginine second codons are heavily utilized when compared with their non-A starting synonyms. Several models have the ability to explain some of this trend. In part, A-enrichment likely reduces 5′ mRNA stability, promoting translation initiation. However T/U, which may also reduce stability, is avoided. Further, +1 frameshifts on the initiating ATG encode a stop codon (TGA) provided A is the fourth residue, acting either as a frameshift “catch and destroy” or a frameshift stop and adjust mechanism and hence implicated in translation initiation. Consistent with both, genomes lacking TGA stop codons exhibit weaker fourth site A-enrichment. Sequences lacking a Shine–Dalgarno sequence and those without upstream leader genes, that may be more error prone during initiation, have greater utilization of A, again suggesting a role in initiation. The frameshift correction model is consistent with the notion that many genomic features are error-mitigation factors and provides the first evidence for site-specific out of frame stop codon selection. We conjecture that the NTG universal start codon may have evolved as a consequence of TGA being a stop codon and the ability of NTGA to rapidly terminate or adjust a ribosome.
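The frameshift argument in this abstract can be checked directly: with an NTG start codon and A as the fourth residue, a ribosome that slips +1 reads TGA. The following is a minimal Python sketch of that check; the function names and the example sequences are illustrative assumptions, not taken from the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def plus_one_codon_after_start(cds: str) -> str:
    """Return the codon read after a +1 slip on the start codon (residues 2-4)."""
    cds = cds.upper()
    if len(cds) < 4:
        raise ValueError("need at least four nucleotides")
    return cds[1:4]

def plus_one_slip_hits_stop(cds: str) -> bool:
    """True if a +1 frameshift on the start codon immediately meets a stop codon."""
    return plus_one_codon_after_start(cds) in STOP_CODONS

# Made-up CDS prefixes: three with A at the fourth site, one with G.
for seq in ("ATGAAA", "ATGGCT", "GTGACC", "TTGATC"):
    print(seq, plus_one_codon_after_start(seq), plus_one_slip_hits_stop(seq))
```

Only the sequence with G at the fourth site (ATGGCT, read as TGG after the slip) avoids the out-of-frame stop; the ATG, GTG and TTG starts followed by A all yield TGA.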
Affiliation(s)
- Liam Abrahams
  - Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
9. Hurst LD, Batada NN. Depletion of somatic mutations in splicing-associated sequences in cancer genomes. Genome Biol 2017; 18:213. [PMID: 29115978 PMCID: PMC5678748 DOI: 10.1186/s13059-017-1337-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 02/08/2017] [Accepted: 10/12/2017] [Indexed: 01/01/2023]
Abstract
Background: An important goal of cancer genomics is to identify systematically cancer-causing mutations. A common approach is to identify sites with high ratios of non-synonymous to synonymous mutations; however, if synonymous mutations are under purifying selection, this methodology leads to identification of false-positive mutations. Here, using synonymous somatic mutations (SSMs) identified in over 4000 tumours across 15 different cancer types, we sought to test this assumption by focusing on coding regions required for splicing. Results: Exon flanks, which are enriched for sequences required for splicing fidelity, have ~17% lower SSM density compared to exonic cores, even after excluding canonical splice sites. While it is impossible to eliminate a mutation bias of unknown cause, multiple lines of evidence support a purifying selection model above a mutational bias explanation. The flank/core difference is not explained by skewed nucleotide content, replication timing, nucleosome occupancy or deficiency in mismatch repair. The depletion is not seen in tumour suppressors, consistent with their role in positive tumour selection, but is otherwise observed in cancer-associated and non-cancer genes, both essential and non-essential. Consistent with a role in splicing modulation, exonic splice enhancers have a lower SSM density before and after controlling for nucleotide composition; moreover, flanks at the 5’ end of the exons have significantly lower SSM density than at the 3’ end. Conclusions: These results suggest that the observable mutational spectrum of cancer genomes is not simply a product of various mutational processes and positive selection, but might also be shaped by negative selection.
Affiliation(s)
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Nizar N Batada
  - Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, UK
10. Savisaar R, Hurst LD. Both Maintenance and Avoidance of RNA-Binding Protein Interactions Constrain Coding Sequence Evolution. Mol Biol Evol 2017; 34:1110-1126. [PMID: 28138077 PMCID: PMC5400389 DOI: 10.1093/molbev/msx061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/07/2023]
Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
11. Sromek M, Czetwertyńska M, Tarasińska M, Janiec-Jankowska A, Zub R, Ćwikła M, Nowakowska D, Chechlińska M. Analysis of Newly Identified and Rare Synonymous Genetic Variants in the RET Gene in Patients with Medullary Thyroid Carcinoma in Polish Population. Endocr Pathol 2017; 28. [PMID: 28647780 PMCID: PMC5552825 DOI: 10.1007/s12022-017-9487-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/26/2022]
Abstract
Gain-of-function germline mutations of the RET proto-oncogene are responsible for initiation of carcinogenesis within the thyroid gland and development of the hereditary form of medullary thyroid carcinoma and MEN2 syndrome. Genotype-phenotype correlations are established for most RET mutations, but the importance of the synonymous changes in this gene remains debatable. We aimed to analyze RET gene variants in the Polish population. Genetic testing for the RET gene variants was performed with standard methods in 585 people aged 1-85, including 448 patients with medullary thyroid carcinoma and 131 of their first- and second-degree relatives, as well as six patients suspected of MTC/MEN2. Besides the most frequent synonymous changes, p.Leu769Leu, p.Ser836Ser, and p.Ser904Ser, four rare changes, c.1827C>T (p.Cys609Cys), c.2364C>T (p.Ile788Ile), c.2418C>T (p.Tyr806Tyr), and c.2673G>A (p.Ser891Ser), were found in the RET gene in the Polish population. Two of the rare changes, p.Cys609Cys and p.Ile788Ile, had not been previously described. The frequency of molecular synonymous variants in the general population was evaluated by testing 400 anonymous blood samples of neonates. Our findings may contribute to a better understanding of the genetic diversity of the RET gene and the involvement of synonymous variants in this diversity.
Affiliation(s)
- Maria Sromek
  - Department of Immunology, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
  - Laboratory of Cellular Immunology, Maria Sklodowska-Curie Institute - Oncology Center, W.K. Roentgen 5, 02-781 Warsaw, Poland
- Małgorzata Czetwertyńska
  - Department of Nuclear Medicine and Endocrine Oncology, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
- Magdalena Tarasińska
  - Department of Oncology, The Children’s Memorial Health Institute, Warsaw, Poland
- Aneta Janiec-Jankowska
  - Department of Diagnostic Laboratory of Genetic Predispositions, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
- Renata Zub
  - Department of Molecular and Translational Oncology, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
- Maria Ćwikła
  - Department of Gastroenterological Oncology, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
- Dorota Nowakowska
  - Genetic Counseling Unit, Cancer Prevention Center, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
- Magdalena Chechlińska
  - Department of Immunology, Maria Sklodowska-Curie Institute - Oncology Center, Warsaw, Poland
12. Livingstone M, Folkman L, Yang Y, Zhang P, Mort M, Cooper DN, Liu Y, Stantic B, Zhou Y. Investigating DNA-, RNA-, and protein-based features as a means to discriminate pathogenic synonymous variants. Hum Mutat 2017. [DOI: 10.1002/humu.23283] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 12/22/2022]
Affiliation(s)
- Mark Livingstone
  - School of Information and Communication Technology; Griffith University; Southport Queensland 4222 Australia
- Lukas Folkman
  - School of Information and Communication Technology; Griffith University; Southport Queensland 4222 Australia
- Yuedong Yang
  - School of Information and Communication Technology; Griffith University; Southport Queensland 4222 Australia
  - Institute for Glycomics; Griffith University; Southport Queensland 4222 Australia
- Ping Zhang
  - Menzies Health Institute; Griffith University; Southport Queensland 4222 Australia
- Matthew Mort
  - Institute of Medical Genetics; Cardiff University; Cardiff CF144XN United Kingdom
- David N. Cooper
  - Institute of Medical Genetics; Cardiff University; Cardiff CF144XN United Kingdom
- Yunlong Liu
  - Department of Medical and Molecular Genetics; Indiana University; Indianapolis Indiana 46202
- Bela Stantic
  - School of Information and Communication Technology; Griffith University; Southport Queensland 4222 Australia
- Yaoqi Zhou
  - School of Information and Communication Technology; Griffith University; Southport Queensland 4222 Australia
  - Institute for Glycomics; Griffith University; Southport Queensland 4222 Australia
13. Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
14
|
DNA sequence diversity and the efficiency of natural selection in animal mitochondrial DNA. Heredity (Edinb) 2016; 118:88-95. [PMID: 27827387 DOI: 10.1038/hdy.2016.108] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/11/2016] [Revised: 09/07/2016] [Accepted: 09/19/2016] [Indexed: 12/21/2022]
Abstract
Selection is expected to be more efficient in species that are more diverse because both the efficiency of natural selection and DNA sequence diversity are expected to depend upon the effective population size. We explore this relationship across a data set of 751 mammal species for which we have mitochondrial polymorphism data. We introduce a method by which we can examine the relationship between our measure of the efficiency of natural selection, the nonsynonymous relative to the synonymous nucleotide site diversity (πN/πS), and synonymous nucleotide diversity (πS), avoiding the statistical non-independence between the two quantities. We show that these two variables are strongly negatively and linearly correlated on a log scale. The slope is such that as πS doubles, πN/πS is reduced by 34%. We show that the slope of this relationship differs between the two phylogenetic groups for which we have the most data, rodents and bats, and that it also differs between species with high and low body mass, and between those with high and low mass-specific metabolic rate.
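The doubling statement in this abstract is enough to recover the implied slope. A minimal sketch, assuming the log-log relationship has the form log(πN/πS) = a + b·log(πS); the function names are illustrative and not taken from the paper. Doubling πS then multiplies πN/πS by 2^b, so a 34% reduction corresponds to 2^b = 0.66.

```python
import math

def slope_from_doubling_reduction(reduction: float) -> float:
    """Log-log slope b implied by a fractional drop in y when x doubles."""
    # y is proportional to x**b, so y(2x) / y(x) = 2**b = 1 - reduction
    return math.log2(1.0 - reduction)

def fold_change(slope: float, x_ratio: float) -> float:
    """Factor by which y changes when x is multiplied by x_ratio."""
    return x_ratio ** slope

b = slope_from_doubling_reduction(0.34)
print(f"implied log-log slope: {b:.2f}")
print(f"fold change in piN/piS for a 10x rise in piS: {fold_change(b, 10.0):.2f}")
```

Under this assumed form the slope works out to about -0.60. The same helper gives the expected change in πN/πS for any other fold difference in πS.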
|
15
|
Gotea V, Gartner JJ, Qutob N, Elnitski L, Samuels Y. The functional relevance of somatic synonymous mutations in melanoma and other cancers. Pigment Cell Melanoma Res 2016; 28:673-84. [PMID: 26300548 DOI: 10.1111/pcmr.12413] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 04/04/2015] [Accepted: 08/19/2015] [Indexed: 01/07/2023]
Abstract
Recent technological advances in sequencing have flooded the field of cancer research with knowledge about somatic mutations for many different cancer types. Most cancer genomics studies focus on mutations that alter the amino acid sequence, ignoring the potential impact of synonymous mutations. However, accumulating experimental evidence has demonstrated clear consequences for gene function, leading to a widespread recognition of the functional role of synonymous mutations and their causal connection to various diseases. Here, we review the evidence supporting the direct impact of synonymous mutations on gene function via gene splicing; mRNA stability, folding, and translation; protein folding; and miRNA-based regulation of expression. These results highlight the functional contribution of synonymous mutations to oncogenesis and the need to further investigate their detection and prioritization for experimental assessment.
Affiliation(s)
- Valer Gotea
- Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
- Jared J Gartner
- Surgery Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Nouar Qutob
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Laura Elnitski
- Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
- Yardena Samuels
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
|
16
|
Wu X, Hurst LD. Determinants of the Usage of Splice-Associated cis-Motifs Predict the Distribution of Human Pathogenic SNPs. Mol Biol Evol 2016; 33:518-29. [PMID: 26545919 PMCID: PMC4866546 DOI: 10.1093/molbev/msv251] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 11/03/2015] [Revised: 10/21/2015] [Accepted: 10/25/2015] [Indexed: 12/11/2022]
Abstract
Where in genes do pathogenic mutations tend to occur and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular, we focus on the intragene distribution of exonic splicing enhancers (ESE) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, five factors established as trends in ESE usage, are considered: relative position in exons, relative position in genes, flanking intron size, splice sites usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends where ESEs generally reside, this being 13% more than expected. Overall from enrichment of pathogenic SNPs at exon ends, we estimate that approximately 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they are found to occur preferentially in the terminal half of genes, in exons flanked by short introns and at the ends of phase (0,0) exons with 3' non-"AGgt" splice site. We suggest the concept of the "fragile" exon, one home to pathogenic SNPs owing to its vulnerability to splice disruption owing to low ESE density.
Affiliation(s)
- XianMing Wu
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Laurence D Hurst
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
|
17
|
Abstract
Exonic splice enhancers (ESEs) are short nucleotide motifs, enriched near exon ends, that enhance the recognition of the splice site and thus promote splicing. Are intronless genes under selection to avoid these motifs so as not to attract the splicing machinery to an mRNA that should not be spliced, thereby preventing the production of an aberrant transcript? Consistent with this possibility, we find that ESEs in putative recent retrocopies are at a higher density and evolving faster than those in other intronless genes, suggesting that they are being lost. Moreover, intronless genes are less dense in putative ESEs than intron-containing ones. However, this latter difference is likely due to the skewed base composition of intronless sequences, a skew that is in line with the general GC richness of few exon genes. Indeed, after controlling for such biases, we find that both intronless and intron-containing genes are denser in ESEs than expected by chance. Importantly, nucleotide-controlled analysis of evolutionary rates at synonymous sites in ESEs indicates that the ESEs in intronless genes are under purifying selection in both human and mouse. We conclude that on the loss of introns, some but not all, ESE motifs are lost, the remainder having functions beyond a role in splice promotion. These results have implications for the design of intronless transgenes and for understanding the causes of selection on synonymous sites.
Affiliation(s)
- Rosina Savisaar
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
|
18
|
Bush SJ, Kover PX, Urrutia AO. Lineage-specific sequence evolution and exon edge conservation partially explain the relationship between evolutionary rate and expression level in A. thaliana. Mol Ecol 2015; 24:3093-106. [PMID: 25930165 PMCID: PMC4480654 DOI: 10.1111/mec.13221] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/04/2014] [Revised: 04/21/2015] [Accepted: 04/28/2015] [Indexed: 02/06/2023]
Abstract
Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the ‘edges’ of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters.
Affiliation(s)
- Stephen J Bush
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Paula X Kover
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Araxi O Urrutia
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
19
|
Anisimova M. Darwin and Fisher meet at biotech: on the potential of computational molecular evolution in industry. BMC Evol Biol 2015; 15:76. [PMID: 25928234 PMCID: PMC4422139 DOI: 10.1186/s12862-015-0352-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/02/2014] [Accepted: 04/15/2015] [Indexed: 12/22/2022]
Abstract
Background: Today computational molecular evolution is a vibrant research field that benefits from the availability of large and complex new-generation sequencing data, ranging from full genomes and proteomes to microbiomes, metabolomes and epigenomes. The grounds for this progress were established long before the discovery of the DNA structure. Specifically, Darwin's theory of evolution by means of natural selection not only remains relevant today, but also provides a solid basis for computational research with a variety of applications. But long-term progress in biology was ensured by the mathematical sciences, as exemplified by Sir R. Fisher in the early 20th century. Now this is true more than ever: the data size and its complexity require biologists to work in close collaboration with experts in computational sciences, modeling and statistics. Results: Natural selection drives function conservation and adaptation to emerging pathogens or new environments; selection plays a key role in immune and resistance systems. Here I focus on computational methods for evaluating selection in molecular sequences, and argue that they have a high potential for applications. Pharma and biotech industries can successfully use this potential, and should take the initiative to enhance their research and development with state-of-the-art bioinformatics approaches. Conclusions: This review provides a quick guide to the current computational approaches that apply the evolutionary principles of natural selection to real-life problems, from drug target validation, vaccine design and protein engineering to applications in agriculture, ecology and conservation.
Affiliation(s)
- Maria Anisimova
- Institute of Applied Simulations, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Einsiedlerstrasse 31a, Wädenswil, 8820, Switzerland; Department of Computer Science, ETH, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
20
|
Wu X, Hurst LD. Why Selection Might Be Stronger When Populations Are Small: Intron Size and Density Predict within and between-Species Usage of Exonic Splice Associated cis-Motifs. Mol Biol Evol 2015; 32:1847-61. [PMID: 25771198 PMCID: PMC4476162 DOI: 10.1093/molbev/msv069] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 01/26/2023]
Abstract
The nearly neutral theory predicts that small effective population size provides the conditions for weakened selection. This is postulated to explain why our genome is more “bloated” than that of, for example, yeast, ours having large introns and large intergene spacer. If a bloated genome is also an error prone genome might it, however, be the case that selection for error-mitigating properties is stronger in our genome? We examine this notion using splicing as an exemplar, not least because large introns can predispose to noisy splicing. We thus ask whether, owing to genomic decay, selection for splice error-control mechanisms is stronger, not weaker, in species with large introns and small populations. In humans much information defining splice sites is in cis-exonic motifs, most notably exonic splice enhancers (ESEs). These act as splice-error control elements. Here then we ask whether within and between-species intron size is a predictor of the commonality of exonic cis-splicing motifs. We show that, as predicted, the proportion of synonymous sites that are ESE-associated and under selection in humans is weakly positively correlated with the size of the flanking intron. In a phylogenetically controlled framework, we observe, also as expected, that mean intron size is both predicted by Ne.μ and is a good predictor of cis-motif usage across species, this usage coevolving with splice site definition. Unexpectedly, however, across taxa intron density is a better predictor of cis-motif usage than intron size. We propose that selection for splice-related motifs is driven by a need to avoid decoy splice sites that will be more common in genes with many and large introns. That intron number and density predict ESE usage within human genes is consistent with this, as is the finding of intragenic heterogeneity in ESE density. 
As intronic content and splice site usage across species is also well predicted by Ne.μ, the result also suggests an unusual circumstance in which selection (for cis-modifiers of splicing) might be stronger when population sizes are smaller, as here splicing is noisier, resulting in a greater need to control error-prone splicing.
Affiliation(s)
- XianMing Wu
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
|
21
|
Zhang X, Joehanes R, Chen BH, Huan T, Ying S, Munson PJ, Johnson AD, Levy D, O'Donnell CJ. Identification of common genetic variants controlling transcript isoform variation in human whole blood. Nat Genet 2015; 47:345-52. [PMID: 25685889 DOI: 10.1038/ng.3220] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 06/06/2014] [Accepted: 01/20/2015] [Indexed: 12/17/2022]
Abstract
An understanding of the genetic variation underlying transcript splicing is essential to dissect the molecular mechanisms of common disease. The available evidence from splicing quantitative trait locus (sQTL) studies has been limited to small samples. We performed genome-wide screening to identify SNPs that might control mRNA splicing in whole blood collected from 5,257 Framingham Heart Study participants. We identified 572,333 cis sQTLs involving 2,650 unique genes. Many sQTL-associated genes (40%) undergo alternative splicing. Using the National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) catalog, we determined that 528 unique sQTLs were significantly enriched for 8,845 SNPs associated with traits in previous GWAS. In particular, we found 395 (4.5%) GWAS SNPs with evidence of cis sQTLs but not gene-level cis expression quantitative trait loci (eQTLs), suggesting that sQTL analysis could provide additional insights into the functional mechanism underlying GWAS results. Our findings provide an informative sQTL resource for further characterizing the potential functional roles of SNPs that control transcript isoforms relevant to common diseases.
Affiliation(s)
- Xiaoling Zhang
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA
- Roby Joehanes
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA. [3] Mathematical and Statistical Computing Laboratory, Center for Information Technology, US National Institutes of Health, Bethesda, Maryland, USA
- Brian H Chen
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA
- Tianxiao Huan
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA
- Saixia Ying
- Mathematical and Statistical Computing Laboratory, Center for Information Technology, US National Institutes of Health, Bethesda, Maryland, USA
- Peter J Munson
- Mathematical and Statistical Computing Laboratory, Center for Information Technology, US National Institutes of Health, Bethesda, Maryland, USA
- Andrew D Johnson
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA
- Daniel Levy
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA
- Christopher J O'Donnell
- [1] Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA. [2] National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA. [3] Division of Cardiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
22
|
Schüler A, Ghanbarian AT, Hurst LD. Purifying selection on splice-related motifs, not expression level nor RNA folding, explains nearly all constraint on human lincRNAs. Mol Biol Evol 2014; 31:3164-83. [PMID: 25158797 PMCID: PMC4245815 DOI: 10.1093/molbev/msu249] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/28/2022]
Abstract
There are two strong and equally important predictors of rates of human protein evolution: The amount the gene is expressed and the proportion of exonic sequence devoted to control splicing, mediated largely by selection on exonic splice enhancer (ESE) motifs. Is the same true for noncoding RNAs, known to be under very weak purifying selection? Prior evidence suggests that selection at splice sites in long intergenic noncoding RNAs (lincRNAs) is important. We now report multiple lines of evidence indicating that the great majority of purifying selection operating on lincRNAs in humans is splice related. Splice-related parameters explain much of the between-gene variation in evolutionary rate in humans. Expression rate is not a relevant predictor, although expression breadth is weakly so. In contrast to protein-coding RNAs, we observe no relationship between evolutionary rate and lincRNA stability. As in protein-coding genes, ESEs are especially abundant near splice junctions and evolve slower than non-ESE sequence equidistant from boundaries. Nearly all constraint in lincRNAs is at exon ends (N.B. the same is not witnessed in Drosophila). Although we cannot definitely answer the question as to why splice-related selection is so important, we find no evidence that splicing might enable the nonsense-mediated decay pathway to capture transcripts incorrectly processed by ribosomes. We find evidence consistent with the notion that splicing modifies the underlying chromatin through recruitment of splice-coupled chromatin modifiers, such as CHD1, which in turn might modulate neighbor gene activity. We conclude that most selection on human lincRNAs is splice mediated and suggest that the possibility of splice-chromatin coupling is worthy of further scrutiny.
Affiliation(s)
- Andreas Schüler
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Avazeh T Ghanbarian
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
23
|
Du J, Dungan SZ, Sabouhanian A, Chang BSW. Selection on synonymous codons in mammalian rhodopsins: a possible role in optimizing translational processes. BMC Evol Biol 2014; 14:96. [PMID: 24884412 PMCID: PMC4021273 DOI: 10.1186/1471-2148-14-96] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 12/02/2013] [Accepted: 04/11/2014] [Indexed: 01/21/2023]
Abstract
Background: Synonymous codon usage can affect many cellular processes, particularly those associated with translation such as polypeptide elongation and folding, mRNA degradation/stability, and splicing. Highly expressed genes are thought to experience stronger selection pressures on synonymous codons. This should result in codon usage bias even in species with relatively low effective population sizes, like mammals, where synonymous site selection is thought to be weak. Here we use phylogenetic codon-based likelihood models to explore patterns of codon usage bias in a dataset of 18 mammalian rhodopsin sequences, the protein mediating the first step in vision in the eye, and one of the most highly expressed genes in vertebrates. We use these patterns to infer selection pressures on key translational mechanisms including polypeptide elongation, protein folding, mRNA stability, and splicing. Results: Overall, patterns of selection in mammalian rhodopsin appear to be correlated with post-transcriptional and translational processes. We found significant evidence for selection at synonymous sites using phylogenetic mutation-selection likelihood models, with C-ending codons found to have the highest relative fitness, and to be significantly more abundant at conserved sites. In general, these codons corresponded with the most abundant tRNAs in mammals. We found significant differences in codon usage bias between rhodopsin loops versus helices, though there was no significant difference in mean synonymous substitution rate between these motifs. We also found a significantly higher proportion of GC-ending codons at paired sites in rhodopsin mRNA secondary structure, and significantly lower synonymous mutation rates in putative exonic splicing enhancer (ESE) regions than in non-ESE regions. Conclusions: By focusing on a single highly expressed gene we both distinguish synonymous codon selection from mutational effects and analytically explore underlying functional mechanisms.
Our results suggest that codon bias in mammalian rhodopsin arises from selection to optimally balance high overall translational speed, accuracy, and proper protein folding, especially in structurally complicated regions. Selection at synonymous sites may also be contributing to mRNA stability and splicing efficiency at exonic-splicing-enhancer (ESE) regions. Our results highlight the importance of investigating highly expressed genes in a broader phylogenetic context in order to better understand the evolution of synonymous substitutions.
Affiliation(s)
- Belinda S W Chang
- Department of Ecology & Evolutionary Biology, University of Toronto, 25 Harbord Street, Toronto, ON M5S 3G5, Canada
|
24
|
Falanga A, Stojanović O, Kiffer-Moreira T, Pinto S, Millán JL, Vlahoviček K, Baralle M. Exonic splicing signals impose constraints upon the evolution of enzymatic activity. Nucleic Acids Res 2014; 42:5790-8. [PMID: 24692663 PMCID: PMC4027185 DOI: 10.1093/nar/gku240] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023]
Abstract
Exon splicing enhancers (ESEs) overlap with amino acid coding sequences implying a dual evolutionary selective pressure. In this study, we map ESEs in the placental alkaline phosphatase gene (ALPP), absent in the corresponding exon of the ancestral tissue-non-specific alkaline phosphatase gene (ALPL). The ESEs are associated with amino acid differences between the transcripts in an area otherwise conserved. We switched out the ALPP ESEs sequences with the sequence from the related ALPL, introducing the associated amino acid changes. The resulting enzymes, produced by cDNA expression, showed different kinetic characteristics than ALPL and ALPP. In the organism, this enzyme will never be subjected to selection because gene splicing analysis shows exon skipping due to loss of the ESE. Our data prove that ESEs restrict the evolution of enzymatic activity. Thus, suboptimal proteins may exist in scenarios when coding nucleotide changes and consequent amino acid variation cannot be reconciled with the splicing function.
Affiliation(s)
- Alessia Falanga
- Molecular Pathology Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), Padriciano 99, 34149 Trieste, Italy
- Ozren Stojanović
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia
- Tina Kiffer-Moreira
- Sanford Children's Health Research Center, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Sofia Pinto
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia
- José Luis Millán
- Sanford Children's Health Research Center, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Kristian Vlahoviček
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia; Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316 Oslo, Norway
- Marco Baralle
- Molecular Pathology Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), Padriciano 99, 34149 Trieste, Italy
|
25
|
Cáceres EF, Hurst LD. The evolution, impact and properties of exonic splice enhancers. Genome Biol 2013; 14:R143. [PMID: 24359918 PMCID: PMC4054783 DOI: 10.1186/gb-2013-14-12-r143] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 08/05/2013] [Accepted: 12/20/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In humans, much of the information specifying splice sites is not at the splice site. Exonic splice enhancers are one of the principal non-splice site motifs. Four high-throughput studies have provided a compendium of motifs that function as exonic splice enhancers, but only one, RESCUE-ESE, has been generally employed to examine the properties of enhancers. Here we consider these four datasets to ask whether there is any consensus on the properties and impacts of exonic splice enhancers. RESULTS: While only about 1% of all the identified hexamer motifs are common to all analyses, we can define reasonably sized sets that are found in most datasets. These consensus intersection datasets we presume reflect the true properties of exonic splice enhancers. Given prior evidence for the properties of enhancers and splice-associated mutations, we ask for all datasets whether the exonic splice enhancers considered are purine enriched; enriched near exon boundaries; able to predict trends in relative codon usage; slow evolving at synonymous sites; rare in SNPs; associated with weak splice sites; and enriched near longer introns. While the intersect datasets match expectations, only one original dataset, RESCUE-ESE, does. Unexpectedly, a fully experimental dataset identifies motifs that commonly behave opposite to the consensus, for example, being enriched in exon cores where splice-associated mutations are rare. CONCLUSIONS: Prior analyses that used the RESCUE-ESE set of hexamers captured the properties of consensus exonic splice enhancers. We estimate that at least 4% of synonymous mutations are deleterious owing to an effect on enhancer functioning.
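The idea of a consensus set drawn from several motif compendia can be illustrated in a few lines. This is a minimal sketch, assuming each dataset is simply a set of hexamer strings. The toy hexamers, the dataset labels and the `min_datasets` threshold are hypothetical placeholders, not the motifs or cutoffs used in the paper.

```python
from collections import Counter

def consensus_motifs(datasets: dict[str, set[str]], min_datasets: int) -> set[str]:
    """Return motifs present in at least `min_datasets` of the given sets."""
    # Each motif is counted once per dataset because every dataset is a set.
    counts = Counter(motif for motifs in datasets.values() for motif in motifs)
    return {motif for motif, n in counts.items() if n >= min_datasets}

# Hypothetical toy hexamer sets; real ESE compendia are far larger.
toy = {
    "set_A": {"GAAGAA", "AAGAAG", "TGAAGA"},
    "set_B": {"GAAGAA", "AAGAAG", "CTGAAG"},
    "set_C": {"GAAGAA", "GATGAA", "TGAAGA"},
    "set_D": {"GAAGAA", "AAGAAG", "ACAGAA"},
}

in_all = consensus_motifs(toy, min_datasets=4)   # strict intersection
in_most = consensus_motifs(toy, min_datasets=3)  # relaxed "most datasets" set
print(sorted(in_all))
print(sorted(in_most))
```

Relaxing the threshold from all datasets to most of them trades stringency for set size, which is the trade-off the abstract describes when the strict intersection turns out to be very small.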
|
26
|
Bloom AJ, Martinez M, Chen LS, Bierut LJ, Murphy SE, Goate A. CYP2B6 non-coding variation associated with smoking cessation is also associated with differences in allelic expression, splicing, and nicotine metabolism independent of common amino-acid changes. PLoS One 2013; 8:e79700. [PMID: 24260284 PMCID: PMC3829832 DOI: 10.1371/journal.pone.0079700] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 06/18/2013] [Accepted: 10/04/2013] [Indexed: 11/23/2022]
Abstract
The Cytochrome P450 2B6 (CYP2B6) enzyme makes a small contribution to hepatic nicotine metabolism relative to CYP2A6, but CYP2B6 is the primary enzyme responsible for metabolism of the smoking cessation drug bupropion. Using CYP2A6 genotype as a covariate, we find that a non-coding polymorphism in CYP2B6 previously associated with smoking cessation (rs8109525) is also significantly associated with nicotine metabolism. The association is independent of the well-studied non-synonymous variants rs3211371, rs3745274, and rs2279343 (CYP2B6*5 and *6). Expression studies demonstrate that rs8109525 is also associated with differences in CYP2B6 mRNA expression in liver biopsy samples. Splicing assays demonstrate that specific splice forms of CYP2B6 are associated with haplotypes defined by variants including rs3745274 and rs8109525. These results indicate differences in mRNA expression and splicing as potential molecular mechanisms by which non-coding variation in CYP2B6 may affect enzymatic activity leading to differences in metabolism and smoking cessation.
Affiliation(s)
- A. Joseph Bloom
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Maribel Martinez
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Li-Shiun Chen
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Laura J. Bierut
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Sharon E. Murphy
- Department of Biochemistry Molecular Biology and BioPhysics, University of Minnesota, Minneapolis, Minnesota, United States of America
- Alison Goate
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri, United States of America
|
27
|
A compensatory effect upon splicing results in normal function of the CYP2A6*14 allele. Pharmacogenet Genomics 2013; 23:107-16. [PMID: 23292114 DOI: 10.1097/fpc.0b013e32835caf7d] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/26/2022]
Abstract
A synonymous variant in the first exon of CYP2A6, rs1137115 (51G>A), defines the common reference allele CYP2A6*1A, and is associated with lower mRNA expression and slower in-vivo nicotine metabolism. Another common allele, CYP2A6*14, differs from CYP2A6*1A by a single variant, rs28399435 (86G>A, S29N). However, CYP2A6*14 shows in-vivo activity comparable with that of full-function alleles, and significantly higher than CYP2A6*1A. rs1137115A is predicted to create an exonic splicing suppressor site overlapping an exonic splicing enhancer (ESE) site in the first exon of CYP2A6, whereas rs28399435A is predicted to strengthen another adjacent ESE, potentially compensating for rs1137115A. Using an allelic expression assay to assess cDNAs produced from rs1137115 heterozygous liver biopsy samples, lower expression of the CYP2A6*1A allele is confirmed while CYP2A6*14 expression is found to be indistinguishable from that of rs1137115G alleles. Quantitative PCR assays to determine the relative abundance of spliced and unspliced or partially spliced CYP2A6 mRNAs in liver biopsy samples show that *1A/*1A homozygotes have a significantly lower ratio, due to both a reduction in spliced forms and an increase in unspliced or partially spliced CYP2A6. These results show the importance of common genetic variants that affect exonic splicing suppressors and ESEs to explain human variation regarding clinically-relevant phenotypes.
|
28
|
Testing for natural selection in human exonic splicing regulators associated with evolutionary rate shifts. J Mol Evol 2013; 76:228-39. [PMID: 23529588 DOI: 10.1007/s00239-013-9555-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/13/2012] [Accepted: 03/09/2013] [Indexed: 12/21/2022]
Abstract
Despite evidence that at the interspecific scale, exonic splicing silencers (ESSs) are under negative selection in constitutive exons, little is known about the effects of slightly deleterious polymorphisms on these splicing regulators. Through the application of a modified version of the McDonald-Kreitman test, we compared the normalized proportions of human polymorphisms and human/rhesus substitutions affecting exonic splicing regulators (ESRs) on sequences of constitutive and alternative exons. Our results show a depletion of substitutions and an enrichment of SNPs associated with ESS gain in constitutive exons. Moreover, we show that this evolutionary pattern is also present in a set of ESRs previously involved in the transition from constitutive to skipped exons in the mammalian lineage. The similarity between these two sets of ESRs suggests that the transition from constitutive to skipped exons in mammals is more frequently associated with the inhibition than with the promotion of splicing signals. This is in accordance with the hypothesis of a constitutive origin of exon skipping and corroborates previous findings about the antagonistic role of certain exonic splicing enhancers.
|
29
|
Wu X, Tronholm A, Cáceres EF, Tovar-Corona JM, Chen L, Urrutia AO, Hurst LD. Evidence for deep phylogenetic conservation of exonic splice-related constraints: splice-related skews at exonic ends in the brown alga Ectocarpus are common and resemble those seen in humans. Genome Biol Evol 2013; 5:1731-45. [PMID: 23902749 PMCID: PMC3787667 DOI: 10.1093/gbe/evt115] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Accepted: 07/25/2013] [Indexed: 12/22/2022] Open
Abstract
The control of RNA splicing is often modulated by exonic motifs near splice sites. Chief among these are exonic splice enhancers (ESEs). Well-described ESEs in mammals are purine rich and cause predictable skews in codon and amino acid usage toward exonic ends. Looking across species, those with relatively abundant intronic sequence are those with the more profound end of exon skews, indicative of exonization of splice site recognition. To date, the only intron-rich species that have been analyzed are mammals, precluding any conclusions about the likely ancestral condition. Here, we examine the patterns of codon and amino acid usage in the vicinity of exon-intron junctions in the brown alga Ectocarpus siliculosus, a species with abundant large introns, known SR proteins, and classical splice sites. We find that amino acids and codons preferred/avoided at both 3' and 5' ends in Ectocarpus, of which there are many, tend, on average, to also be preferred/avoided at the same exon ends in humans. Moreover, the preferences observed at the 5' ends of exons are largely the same as those at the 3' ends, a symmetry trend only previously observed in animals. We predict putative hexameric ESEs in Ectocarpus and show that these are purine rich and that there are many more of these identified as functional ESEs in humans than expected by chance. These results are consistent with deep phylogenetic conservation of SR protein binding motifs. Assuming codons preferred near boundaries are "splice optimal" codons, in Ectocarpus, unlike Drosophila, splice optimal and translationally optimal codons are not mutually exclusive. The exclusivity of translationally optimal and splice optimal codon sets is thus not universal.
Affiliation(s)
- XianMing Wu
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Ana Tronholm
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Present address: Department of Biological Sciences, University of Alabama, Mary Harmon Bryant Hall, Tuscaloosa, AL
- Eva Fernández Cáceres
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Jaime M. Tovar-Corona
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Lu Chen
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, United Kingdom
- Araxi O. Urrutia
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Laurence D. Hurst
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
|
30
|
Jakubauskiene E, Janaviciute V, Peciuliene I, Söderkvist P, Kanopka A. G/A polymorphism in intronic sequence affects the processing of MAO-B gene in patients with Parkinson disease. FEBS Lett 2012; 586:3698-704. [PMID: 22974659 DOI: 10.1016/j.febslet.2012.08.028] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 07/09/2012] [Revised: 08/16/2012] [Accepted: 08/21/2012] [Indexed: 11/27/2022]
Abstract
Monoamine oxidase B (MAO-B) plays an important role in the metabolism of neuroactive and vasoactive amines in the central nervous system and peripheral tissues. Increased levels of MAO-B mRNA and enzymatic activity have been reported in platelets from patients with Parkinson's and Alzheimer's diseases; however, the triggers of enhanced mRNA levels are unknown. Our results demonstrate for the first time that a G/A dimorphism in the intron 13 sequence creates a splicing enhancer, thus stimulating intron 13 removal efficiency. The increased MAO-B protein levels might serve as a surrogate marker for Parkinson disease.
Affiliation(s)
- Egle Jakubauskiene
- Department of Immunology and Cell Biology, Vilnius University, Institute of Biotechnology, LT-02241 Vilnius, Lithuania
|
31
|
Cordero P, Ashley EA. Whole-Genome Sequencing in Personalized Therapeutics. Clin Pharmacol Ther 2012; 91:1001-9. [DOI: 10.1038/clpt.2012.51] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 11/10/2022]
|
32
|
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 245] [Impact Index Per Article: 18.8] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022] Open
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg, 38A, Bethesda, MD 20894, USA
|
33
|
Han Y, Fan X, Sun K, Wang X, Wang Y, Chen J, Zhen Y, Zhang W, Hui R. Hypertension associated polymorphisms in WNK1 / WNK4 are not associated with hydrochlorothiazide response. Clin Biochem 2011; 44:1045-1049. [DOI: 10.1016/j.clinbiochem.2011.06.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/25/2011] [Revised: 05/09/2011] [Accepted: 06/03/2011] [Indexed: 02/04/2023]
|
34
|
Gorlov IP, Gorlova OY, Frazier ML, Spitz MR, Amos CI. Evolutionary evidence of the effect of rare variants on disease etiology. Clin Genet 2010; 79:199-206. [PMID: 20831747 DOI: 10.1111/j.1399-0004.2010.01535.x] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Indexed: 11/28/2022]
Abstract
The common disease/common variant hypothesis has been popular for describing the genetic architecture of common human diseases for several years. According to the originally stated hypothesis, one or a few common genetic variants with a large effect size control the risk of common diseases. A growing body of evidence, however, suggests that rare single-nucleotide polymorphisms (SNPs), i.e. those with a minor allele frequency of less than 5%, are also an important component of the genetic architecture of common human diseases. In this study, we analyzed the relevance of rare SNPs to the risk of common diseases from an evolutionary perspective and found that rare SNPs are more likely than common SNPs to be functional and tend to have a stronger effect size than do common SNPs. This observation, and the fact that most of the SNPs in the human genome are rare, suggests that rare SNPs are a crucial element of the genetic architecture of common human diseases. We propose that the next generation of genomic studies should focus on analyzing rare SNPs. Further, targeting patients with a family history of the disease, an extreme phenotype, or early disease onset may facilitate the detection of risk-associated rare SNPs.
Affiliation(s)
- I P Gorlov
- Department of Genitourinary Medical Oncology Department of Epidemiology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030, USA.
|
35
|
Shumay E, Fowler JS, Volkow ND. Genomic features of the human dopamine transporter gene and its potential epigenetic States: implications for phenotypic diversity. PLoS One 2010; 5:e11067. [PMID: 20548783 PMCID: PMC2883569 DOI: 10.1371/journal.pone.0011067] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Received: 01/25/2010] [Accepted: 05/18/2010] [Indexed: 02/06/2023] Open
Abstract
Human dopamine transporter gene (DAT1 or SLC6A3) has been associated with various brain-related diseases and behavioral traits and, as such, has been investigated intensely in experimental and clinical settings. However, the abundance of research data has not clarified the biological mechanism of DAT regulation; similarly, studies of DAT genotype-phenotype associations yielded inconsistent results. Hence, our understanding of the control of the DAT protein product is incomplete; having this knowledge is critical, since DAT plays the major role in the brain's dopaminergic circuitry. Accordingly, we reevaluated the genomic attributes of the SLC6A3 gene that might confer sensitivity to regulation, hypothesizing that its unique genomic characteristics might facilitate highly dynamic, region-specific DAT expression, so enabling multiple regulatory modes. Our comprehensive bioinformatic analyses revealed very distinctive genomic characteristics of SLC6A3, including high inter-individual variability of its sequence (897 SNPs, about 90 repeats and several CNVs) and pronounced sensitivity to regulation by epigenetic mechanisms, as evident from the GC-bias composition (0.55) of SLC6A3, and numerous intragenic CpG islands (27 CGIs). We propose that this unique combination of the genomic features and the regulatory attributes enables the differential expression of the DAT1 gene and fulfills seemingly contradictory demands to its regulation; that is, robustness of region-specific expression and functional dynamics.
Affiliation(s)
- Elena Shumay
- Brookhaven National Laboratory, Medical Department, Upton, New York, United States of America
- Joanna S. Fowler
- Brookhaven National Laboratory, Medical Department, Upton, New York, United States of America
- Nora D. Volkow
- National Institute on Drug Abuse, National Institutes of Health, Bethesda, Maryland, United States of America
|
36
|
Abstract
Splicing is a post-transcriptional modification of RNA during which introns are removed and exons are joined. Most of the mammalian genes undergo constitutive and alternative splicing events. In addition to the strong signals of the splice sites, splicing is influenced at a distance by a range of trans factors that interact with cis regulatory elements and influence the spliceosome. The intention of the present mini-review is to give some insights into the complexity of this interaction and to introduce the consequences of some kinds of detrimental genetic variation on alternative splicing and disease.
|
37
|
Genetic variants altering dopamine D2 receptor expression or function modulate the risk of opiate addiction and the dosage requirements of methadone substitution. Pharmacogenet Genomics 2009; 19:407-14. [PMID: 19373123 DOI: 10.1097/fpc.0b013e328320a3fd] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.2] [Indexed: 01/16/2023]
Abstract
AIM Addictive behavior is importantly mediated by mesolimbic dopaminergic signaling. Here, we comprehensively analyzed the DRD2 gene locus, and in addition, the ANKK1 rs1800497C>T single nucleotide polymorphism (SNP), formerly known as 'dopamine D2 receptor Taq1A C>T polymorphism', for associations with the risk of opiate addiction and the methadone dosage requirements. METHODS Allelic frequencies of DRD2/ANKK1 polymorphisms were compared between 85 methadone-substituted Caucasian patients and a random sample of 99 healthy Caucasian controls. Within patients, the average and maximum daily methadone dose during the first year of treatment and the time when that maximum dose was reached were analyzed for an association with DRD2/ANKK1 genetics. RESULTS Compared with the control group, drug users more frequently carried the minor allele of the DRD2 rs1076560G>T SNP (P=0.022, odds ratio 2.343) or the ATCT haplotype of DRD2 rs1799978A>G, rs1076560G>T, rs6277C>T, ANKK1 rs1800497C>T (P=0.048, odds ratio 2.23), with similar tendencies for ANKK1 rs1800497C>T (P=0.056, odds ratio 2.12) and the TCCTCTT haplotype of DRD2 rs12364283T>C, rs1799732C del, rs4648317C>T, rs1076560G>T, rs6275C>T, rs6277C>T, and ANKK1 rs1800497C>T (P=0.059, odds ratio 2.31). The average and maximum daily methadone doses were significantly associated with the DRD2 rs6275C>T SNP (P=0.016 and 0.005 for average and maximum dose, respectively). Carriers of the variant rs6275T allele needed higher methadone doses than noncarriers. In addition, this variant was associated with a longer time to reach the maximum methadone dose (P=0.025). CONCLUSION On the basis of an analysis spanning the whole gene locus, from the DRD2 promoter to the ANKK1 rs1800497C>T polymorphism, DRD2 genetic polymorphisms modulate both the risk of opiate addiction, leading to the necessity of methadone substitution therapy, and the course of this therapy in terms of dosage requirements.
|
38
|
Warnecke T, Weber CC, Hurst LD. Why there is more to protein evolution than protein function: splicing, nucleosomes and dual-coding sequence. Biochem Soc Trans 2009; 37:756-61. [PMID: 19614589 DOI: 10.1042/bst0370756] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 12/19/2022]
Abstract
There is considerable variation in the rate at which different proteins evolve. Why is this? Classically, it has been considered that the density of functionally important sites must predict rates of protein evolution. Likewise, amino acid choice is usually assumed to reflect optimal protein function. In the present article, we briefly review evidence suggesting that this protein function-centred view is too simplistic. In particular, we concentrate on how selection acting during the protein's production history can also affect protein evolutionary rates and amino acid choice. Exploring the role of selection at the DNA and RNA level, we specifically address how the need (i) to specify exonic splice enhancer motifs in pre-mRNA, and (ii) to ensure nucleosome positioning on DNA have an impact on amino acid choice and rates of evolution. For both, we review evidence that sequence affected by more than one coding demand is particularly constrained. Strikingly, in mammals, splicing-related constraints are quantitatively as important as expression parameters in predicting rates of protein evolution. These results indicate that there is substantially more to protein evolution than protein functional constraints.
Affiliation(s)
- Tobias Warnecke
- Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
|
39
|
Evolution of alternative splicing regulation: changes in predicted exonic splicing regulators are not associated with changes in alternative splicing levels in primates. PLoS One 2009; 4:e5800. [PMID: 19495418 PMCID: PMC2686173 DOI: 10.1371/journal.pone.0005800] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/22/2009] [Accepted: 05/12/2009] [Indexed: 12/12/2022] Open
Abstract
Alternative splicing is tightly regulated in a spatio-temporal and quantitative manner. This regulation is achieved by a complex interplay between spliceosomal (trans) factors that bind to different sequence (cis) elements. cis-elements reside in both introns and exons and may either enhance or silence splicing. Differential combinations of cis-elements allow for a huge diversity of overall splicing signals, together comprising a complex ‘splicing code’. Many cis-elements have been identified, and their effects on exon inclusion levels demonstrated in reporter systems. However, the impact of interspecific differences in these elements on the evolution of alternative splicing levels has not yet been investigated at genomic level. Here we study the effect of interspecific differences in predicted exonic splicing regulators (ESRs) on exon inclusion levels in human and chimpanzee. For this purpose, we compiled and studied comprehensive datasets of predicted ESRs, identified by several computational and experimental approaches, as well as microarray data for changes in alternative splicing levels between human and chimpanzee. Surprisingly, we found no association between changes in predicted ESRs and changes in alternative splicing levels. This observation holds across different ESR exon positions, exon lengths, and 5′ splice site strengths. We suggest that this lack of association is mainly due to the great importance of context for ESR functionality: many ESR-like motifs in primates may have little or no effect on splicing, and thus interspecific changes at short-time scales may primarily occur in these effectively neutral ESRs. These results underscore the difficulties of using current computational ESR prediction algorithms to identify truly functionally important motifs, and provide a cautionary tale for studies of the effect of SNPs on splicing in human disease.
|
40
|
Wang P, Yin S, Zhang Z, Xin D, Hu L, Kong X, Hurst LD. Evidence for common short natural trans sense-antisense pairing between transcripts from protein coding genes. Genome Biol 2008; 9:R169. [PMID: 19055728 PMCID: PMC2646273 DOI: 10.1186/gb-2008-9-12-r169] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 09/10/2008] [Revised: 10/02/2008] [Accepted: 12/02/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There is increasing realization that regulation of genes is done partly at the RNA level by sense-antisense binding. Studies typically concentrate on the role of non-coding RNAs in regulating coding RNA. But the majority of transcripts in a cell are likely to be coding. Is it possible that coding RNA might regulate other coding RNA by short perfect sense-antisense binding? Here we compare all well-described human protein coding mRNAs against all others to identify sites 15-25 bp long that could potentially perfectly match sense-antisense. RESULTS From 24,968 protein coding mRNA RefSeq sequences, none failed to find at least one match in the transcriptome. By randomizations generating artificial transcripts matched for G+C content and length, we found that there are more such trans short sense-antisense pairs than expected. Several further features are consistent with functionality of some of the putative matches. First, transcripts with more potential partners have lower expression levels, and the pair density of tissue specific genes is significantly higher than that of housekeeping genes. Further, the single nucleotide polymorphism density is lower in short pairing regions than it is in flanking regions. We found no evidence that the sense-antisense pairing regions are associated with small RNAs derived from the protein coding genes. CONCLUSIONS Our results are consistent with the possibility of common short perfect sense-antisense pairing between transcripts of protein coding genes.
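The search described in this abstract (short sites in one mRNA that could pair perfectly, in antisense orientation, with a site in another mRNA) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `reverse_complement` and `antisense_matches`, the use of DNA letters, and the toy sequences are assumptions made for illustration, and the sketch handles one fixed window length `k` rather than the full 15-25 bp range.

```python
from collections import defaultdict

# Complement table for sequences written in DNA letters (an assumption;
# mRNA records are often stored with T in place of U).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-letter sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def antisense_matches(a: str, b: str, k: int = 15) -> list[tuple[int, int]]:
    """Find windows of length k in `a` that could pair perfectly with `b`.

    Returns (i, j) pairs such that the reverse complement of a[i:i+k]
    equals b[j:j+k].
    """
    # Index every k-mer of b by its start positions.
    index = defaultdict(list)
    for j in range(len(b) - k + 1):
        index[b[j:j + k]].append(j)

    hits = []
    for i in range(len(a) - k + 1):
        target = reverse_complement(a[i:i + k])
        for j in index.get(target, []):
            hits.append((i, j))
    return hits


# Hypothetical toy transcripts with a short window so the match is visible.
transcript_a = "ATGGCA"
transcript_b = "GGTGCCATGG"
print(antisense_matches(transcript_a, transcript_b, k=6))
```

Indexing the k-mers of one transcript keeps each comparison roughly linear in sequence length. The abstract's comparison against randomized transcripts matched for G+C content and length would be a separate step that this sketch does not attempt.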
Affiliation(s)
- Ping Wang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai Jiao Tong University School of Medicine, 225 South Chong Qing Road, Shanghai 200025, PR China.
|
41
|
Abstract
BACKGROUND The acid-sensing ion channel 3 (ASIC3) is a ligand-gated cation channel activated by extracellular protons, and is associated with an exercise-induced pressor reflex and possibly autonomic imbalance. METHODS To test the statistical association between genetic polymorphisms of the ASIC3 gene and blood pressure (BP) variations in Taiwanese, 551 unrelated individuals (286 men and 265 women) were recruited from a routine health examination. The participants had no prior history of cardiovascular disease or medication use for hypertension. RESULTS Six ASIC3 gene polymorphisms were genotyped; three were polymorphic, and only the rs2288646 polymorphism was associated with variations in BP among participants. Significantly higher systolic, diastolic, and mean BP were observed in participants carrying the rs2288646-A allele (P=0.034, 0.023, and 0.010, respectively). Significantly higher frequencies of the rs2288646-A-containing genotype were observed in normotensive, prehypertensive, and hypertensive subgroups (P for trend=0.026); and in those with higher systolic and diastolic BPs (P for trend=0.005 and P for trend=0.002, respectively). The association between the rs2288646-A allele and BP persisted even after adjustment for age, sex, BMI, and other metabolic factors. When a second independent group of 403 individuals was combined with the first group of 551 (n=954), a significantly higher frequency of the rs2288646-A-containing genotype was observed in participants with hypertension (9.7 vs. 4.0%, P=0.003). CONCLUSION Our data showed an independent association between an ASIC3 genetic polymorphism and BP variations in Taiwanese. These results suggest that the ASIC3 may be involved in BP regulation.
|
42
|
Lin Z, Ma H, Nei M. Ultraconserved coding regions outside the homeobox of mammalian Hox genes. BMC Evol Biol 2008; 8:260. [PMID: 18816392 PMCID: PMC2566984 DOI: 10.1186/1471-2148-8-260] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 05/27/2008] [Accepted: 09/24/2008] [Indexed: 01/03/2023] Open
Abstract
Background All bilaterian animals share a general genetic framework that controls the formation of their body structures, although their forms are highly diversified. The Hox genes that encode transcription factors play a central role in this framework. All Hox proteins contain a highly conserved homeodomain encoded by the homeobox motif, but the other regions are generally assumed to be less conserved. In this study, we used comparative genomic methods to infer possible functional elements in the coding regions of mammalian Hox genes. Results We identified a set of ultraconserved coding regions (UCRs) outside the homeobox of mammalian Hox genes. Here a UCR is defined as a region of at least 120 nucleotides without synonymous and nonsynonymous nucleotide substitutions among different orders of mammals. Further analysis has indicated that these UCRs occur only in placental mammals and they evolved apparently after the split of placental mammals from marsupials. Analysis of human SNP data suggests that these UCRs are maintained by strong purifying selection. Conclusion Although mammalian genomes are known to contain ultraconserved non-coding elements (UNEs), this paper seems to be the first to report the UCRs in protein coding genes. The extremely high degree of sequence conservation in non-homeobox regions suggests that they might have important roles for the functions of Hox genes. We speculate that UCRs have some gene regulatory functions possibly in relation to the development of the intra-uterus child-bearing system.
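The UCR definition given in this abstract (a region of at least 120 nucleotides with no substitutions among the compared sequences) maps onto a simple scan of an alignment. The Python sketch below is only an illustration under stated assumptions: the function name `ultraconserved_runs`, the treatment of gap characters (`-`) as breaking a run, and the toy alignment are all hypothetical and are not taken from the paper.

```python
def ultraconserved_runs(aligned: list[str], min_len: int = 120) -> list[tuple[int, int]]:
    """Return half-open (start, end) column ranges of at least `min_len`
    columns in which every aligned sequence carries the same nucleotide.

    A column with any difference, or any gap character, ends the current run.
    Requiring identical columns covers both synonymous and nonsynonymous
    substitutions, since neither kind is allowed inside a run.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    runs = []
    start = None  # start column of the run currently being extended
    for col in range(length):
        column = {seq[col] for seq in aligned}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Hypothetical toy alignment; min_len is lowered so the runs are visible.
alignment = ["ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAC"]
print(ultraconserved_runs(alignment, min_len=4))
```

With the default `min_len=120` the function applies the threshold stated in the abstract. Choosing which species to include in the alignment, which the abstract describes as different orders of mammals, is left to the caller.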
Affiliation(s)
- Zhenguo Lin
- Department of Biology and Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park, PA 16802, USA.
|
43
|
Ramensky VE, Nurtdinov RN, Neverov AD, Mironov AA, Gelfand MS. Positive selection in alternatively spliced exons of human genes. Am J Hum Genet 2008; 83:94-8. [PMID: 18571144 DOI: 10.1016/j.ajhg.2008.05.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 01/15/2008] [Revised: 04/08/2008] [Accepted: 05/30/2008] [Indexed: 10/21/2022] Open
Abstract
Alternative splicing is a well-recognized mechanism of accelerated genome evolution. We have studied single-nucleotide polymorphisms and human-chimpanzee divergence in the exons of 6672 alternatively spliced human genes, with the aim of understanding the forces driving the evolution of alternatively spliced sequences. Here, we show that alternatively spliced exons and exon fragments (alternative exons) from minor isoforms experience lower selective pressure at the amino acid level, accompanied by selection against synonymous sequence variation. The results of the McDonald-Kreitman test suggest that alternatively spliced exons, unlike exons constitutively included in the mRNA, are also subject to positive selection, with up to 27% of amino acids fixed by positive selection.
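For readers unfamiliar with how a McDonald-Kreitman comparison yields a "proportion fixed by positive selection", the Python sketch below shows the textbook calculation. It is an assumption that this standard form is close to what the authors used; the abstract does not give their exact procedure. The function names and the counts in the example are hypothetical and are not data from the paper.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimate the proportion of nonsynonymous substitutions fixed by
    positive selection from a McDonald-Kreitman table.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    Uses the commonly cited estimator alpha = 1 - (ds * pn) / (dn * ps).
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for the estimator to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """Neutrality index, (pn / ps) / (dn / ds); values below 1 point to an
    excess of nonsynonymous divergence."""
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for the index to be defined")
    return (pn * ds) / (ps * dn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
counts = dict(dn=40, ds=20, pn=10, ps=10)
print(mk_alpha(**counts), neutrality_index(**counts))
```

The two quantities are related by alpha = 1 - NI, so a positive alpha corresponds to a neutrality index below 1. Slightly deleterious polymorphisms can bias this estimator, which is one reason modified versions of the test exist.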
|
44
|
Ke S, Zhang XHF, Chasin LA. Positive selection acting on splicing motifs reflects compensatory evolution. Genome Res 2008; 18:533-43. [PMID: 18204002 DOI: 10.1101/gr.070268.107] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/25/2022]
Abstract
We have used comparative genomics to characterize the evolutionary behavior of predicted splicing regulatory motifs. Using base substitution rates in intronic regions as a calibrator for neutral change, we found a strong avoidance of synonymous substitutions that disrupt predicted exonic splicing enhancers or create predicted exonic splicing silencers. These results attest to the functionality of the hexameric motif set used and suggest that they are subject to purifying selection. We also found that synonymous substitutions in constitutive exons tend to create exonic splicing enhancers and to disrupt exonic splicing silencers, implying positive selection for these splicing promoting events. We present evidence that this positive selection is the result of splicing-positive events compensating for splicing-negative events as well as for mutations that weaken splice-site sequences. Such compensatory events include nonsynonymous mutations, synonymous mutations, and mutations at splice sites. Compensation was also seen from the fact that orthologous exons tend to maintain the same number of predicted splicing motifs. Our data fit a splicing compensation model of exon evolution, in which selection for splicing-positive mutations takes place to counter the effect of an ongoing splicing-negative mutational process, with the exon as a whole being conserved as a unit of splicing. In the course of this analysis, we observed that synonymous positions in general are conserved relative to intronic sequences, suggesting that messenger RNA molecules are rich in sequence information for functions beyond protein coding and splicing.
Affiliation(s)
- Shengdong Ke
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA
|
45
|
Shifting paradigm of association studies: value of rare single-nucleotide polymorphisms. Am J Hum Genet 2008; 82:100-12. [PMID: 18179889 DOI: 10.1016/j.ajhg.2007.09.006] [Citation(s) in RCA: 260] [Impact Index Per Article: 15.3] [Received: 07/02/2007] [Revised: 08/20/2007] [Accepted: 09/20/2007] [Indexed: 12/22/2022]
Abstract
Currently, single-nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) of >5% are preferentially used in case-control association studies of common human diseases. Recent technological developments enable inexpensive and accurate genotyping of a large number of SNPs in thousands of cases and controls, which can provide adequate statistical power to analyze SNPs with MAF <5%. Our purpose was to determine whether evaluating rare SNPs in case-control association studies could help identify causal SNPs for common diseases. We suggest that slightly deleterious SNPs (sdSNPs) subjected to weak purifying selection are major players in genetic control of susceptibility to common diseases. We compared the distribution of MAFs of synonymous SNPs with that of nonsynonymous SNPs (1) predicted to be benign, (2) predicted to be possibly damaging, and (3) predicted to be probably damaging by PolyPhen. Our sources of data were the International HapMap Project, ENCODE, and the SeattleSNPs project. We found that the MAF distribution of possibly and probably damaging SNPs was shifted toward rare SNPs compared with the MAF distribution of benign and synonymous SNPs that are not likely to be functional. We also found an inverse relationship between MAF and the proportion of nsSNPs predicted to be protein disturbing. On the basis of this relationship, we estimated the joint probability that a SNP is functional and would be detected as significant in a case-control study. Our analysis suggests that including rare SNPs in genotyping platforms will advance identification of causal SNPs in case-control association studies, particularly as sample sizes increase.
|
46
|
Newton-Cheh C, Guo CY, Larson MG, Musone SL, Surti A, Camargo AL, Drake JA, Benjamin EJ, Levy D, D'Agostino RB, Hirschhorn JN, O'Donnell CJ. Common Genetic Variation in KCNH2 Is Associated With QT Interval Duration. Circulation 2007; 116:1128-36. [PMID: 17709632 DOI: 10.1161/circulationaha.107.710780] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Indexed: 11/16/2022]
Abstract
Background: QT prolongation is associated with increased risk of sudden cardiac death in the general population and in people exposed to QT-prolonging drugs. Mutations in the KCNH2 gene encoding the HERG potassium channel cause 30% of long-QT syndrome, and binding to this channel leads to drug-induced QT prolongation. We tested common KCNH2 variants for association with continuous QT interval duration.
Methods and Results: We selected 17 single nucleotide polymorphisms and rs1805123, a previously associated missense single nucleotide polymorphism, for genotyping in 1730 unrelated men and women from the Framingham Heart Study. rs3807375 genotypes were associated with continuous QT interval duration in men and women (2-df P=0.002), with a dominant model suggested (P=0.0004). An independent sample of 871 Framingham Heart Study men and women replicated the association (1-sided dominant P=0.02). On combined analysis of 2123 subjects, individuals with AA or AG genotypes had a 0.14-SD (SE, 0.04) or 3.9-ms higher age-, sex- and RR-adjusted QT interval compared with GG individuals (P=0.00006). The previously reported association of rs1805123 (K897T) replicated under a dominant (AA/AC, 0.12 SD [SE, 0.07] or 3.1 ms higher versus CC; 1-sided P=0.04) or additive model (0.06 SD [SE, 0.03] or 1.6 ms higher per A allele; 1-sided P=0.01).
Conclusions: Two common genetic variants at the KCNH2 locus are associated with continuous QT interval duration in an unselected community-based sample. Studies to determine the influence of these variants on risk of sudden cardiac death and drug-induced arrhythmias should be considered.
|
47
|
Artamonova II, Gelfand MS. Comparative Genomics and Evolution of Alternative Splicing: The Pessimists' Science. Chem Rev 2007; 107:3407-30. [PMID: 17645315 DOI: 10.1021/cr068304c] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Affiliation(s)
- Irena I Artamonova
- Group of Bioinformatics, Vavilov Institute of General Genetics, RAS, Gubkina 3, Moscow 119991, Russia
|
48
|
Parmley JL, Hurst LD. Exonic splicing regulatory elements skew synonymous codon usage near intron-exon boundaries in mammals. Mol Biol Evol 2007; 24:1600-3. [PMID: 17525472 DOI: 10.1093/molbev/msm104] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Indexed: 12/19/2022]
Abstract
In mammals there is a bias in amino acid usage near splice sites that is explained, in large part, by the high density of exonic splicing enhancers (ESEs) in these regions. Is there a similar bias for the relative use of synonymous codons, and can any such bias be predicted by their abundance in ESEs? Prior reports suggested that such trends may exist. From analysis of human exons, we find that 47 of the 59 codons with at least one synonym show differential usage in the proximity of exon ends, of which 42 remain significant after correction for multiple testing. Within sets of synonymous codons those more preferred near splice sites are generally those that are relatively more abundant within the ESEs. However, the examples given previously appear exceptionally good fits and there exist many exceptions, the usage of lysine's codons being a case in point. Similar results are observed in mouse exons. We conclude that splice regulation impacts on the choice of synonymous codons in mammals, but the magnitude of this effect is less than might at first have been supposed.
Affiliation(s)
- Joanna L Parmley
- Department of Biology and Biochemistry, University of Bath, Bath, UK.
|
49
|
Parmley JL, Hurst LD. How common are intragene windows with KA > KS owing to purifying selection on synonymous mutations? J Mol Evol 2007; 64:646-55. [PMID: 17557167 DOI: 10.1007/s00239-006-0207-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 09/11/2006] [Accepted: 03/07/2007] [Indexed: 12/14/2022]
Abstract
One method for diagnosing the mode of sequence evolution considers the ratio of nonsynonymous substitutions per nonsynonymous site (KA) to the corresponding figure for synonymous substitutions (KS). A ratio (KA/KS) greater than unity is taken as evidence for positive selection. This, however, need not necessarily be the case. Notably, there is one instance of a high intragenic KA/KS peak, revealed by sliding window analysis and observed in two pairwise comparisons, better accounted for by localised purifying selection on synonymous mutations that affect splicing. Is this example exceptional? To address this we isolate intragenic domains with KA/KS > 1 from more than 1000 long mouse-rat orthologues. Approximately one KA/KS > 1 peak is found per 12-15 kb of coding sequence. Surprisingly, low synonymous substitution rates underpin more incidences than do high nonsynonymous rates. Several reasons, however, prevent us from supposing that the low synonymous rates reflect purifying selection on synonymous mutations. First, for many peaks, the null that the peak is no higher than expected given the underlying rates of evolution, cannot be rejected. Second, of 18 statistically significant incidences with unusually low KS values, only 3 are repeatable across independent comparisons. At least two of these are within alternatively spliced exons. We conclude that repeatable statistically significant intragenic domains of low intragenic KS are rare. As so few KA/KS peaks reflect increased rates of protein evolution and so few hold statistical support, we additionally conclude that sliding window analysis to infer domains of positive selection is highly error-prone.
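The point that a KA/KS peak can arise from a low KS rather than a high KA can be seen in a small Python sketch. It assumes that per-window counts of differences and sites are already available (for example from a Nei-Gojobori style count) and applies a Jukes-Cantor correction. The function names and the numbers are hypothetical, and this is not the procedure used in the paper.

```python
import math


def jc_distance(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (KA, KS, KA/KS) for one window.

    Assumption: site and difference counts are supplied by the caller.
    """
    ka = jc_distance(nonsyn_diffs / nonsyn_sites)
    ks = jc_distance(syn_diffs / syn_sites)
    if ks == 0:
        # ratio is undefined or unbounded when no synonymous change is seen
        ratio = math.inf if ka > 0 else math.nan
    else:
        ratio = ka / ks
    return ka, ks, ratio


# Same nonsynonymous rate in both windows; only the synonymous rate differs.
typical = ka_ks(nonsyn_diffs=10, nonsyn_sites=300, syn_diffs=5, syn_sites=100)
low_ks = ka_ks(nonsyn_diffs=10, nonsyn_sites=300, syn_diffs=2, syn_sites=100)
print(typical[2] < 1 < low_ks[2])  # True
```

In the second window the ratio exceeds unity purely because KS is low, which is the scenario the abstract cautions against reading as positive selection.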
Affiliation(s)
- Joanna L Parmley
- Department of Biology and Biochemistry, University of Bath, Bath, UK
|
50
|
Abstract
While it has often been assumed that, in humans, synonymous mutations would have no effect on fitness, let alone cause disease, this position has been questioned over the last decade. There is now considerable evidence that such mutations can, for example, disrupt splicing and interfere with miRNA binding. Two recent publications suggest involvement of additional mechanisms: modification of protein abundance most probably mediated by alteration in mRNA stability and modification of protein structure and activity, probably mediated by induction of translational pausing. These case histories put a further nail into the coffin of the assumption that synonymous mutations must be neutral.
Affiliation(s)
- Joanna L Parmley
- Department of Biology and Biochemistry, University of Bath, Bath, UK
|