1
Kokate PP, Techtmann SM, Werner T. Codon usage bias and dinucleotide preference in 29 Drosophila species. G3 GENES|GENOMES|GENETICS 2021; 11:6291245. [PMID: 34849812 PMCID: PMC8496323 DOI: 10.1093/g3journal/jkab191] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/28/2021] [Accepted: 05/13/2021] [Indexed: 12/30/2022]
Abstract
Codon usage bias, where certain codons are used more frequently than their synonymous counterparts, is an interesting phenomenon influenced by three evolutionary forces: mutation, selection, and genetic drift. To better understand how these evolutionary forces affect codon usage bias, an extensive study to detect how codon usage patterns change across species is required. This study investigated 668 single-copy orthologous genes independently in 29 Drosophila species to determine how the codon usage patterns change with phylogenetic distance. We found a strong correlation between phylogenetic distance and codon usage bias and observed striking differences in codon preferences between the two subgenera Drosophila and Sophophora. As compared to the subgenus Sophophora, species of the subgenus Drosophila showed reduced codon usage bias and a reduced preference specifically for codons ending with C, except for codons with G in the second position. We found that codon usage patterns in all species were influenced by the nucleotides in the codon’s 2nd and 3rd positions rather than the biochemical properties of the amino acids encoded. We detected a concordance between preferred codons and preferred dinucleotides (at positions 2 and 3 of codons). Furthermore, we observed an association between speciation, codon preferences, and dinucleotide preferences. Our study provides the foundation to understand how selection acts on dinucleotides to influence codon usage bias.
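As a rough illustration of what "codon usage bias" means in practice, the sketch below computes relative synonymous codon usage (RSCU) for a single in-frame coding sequence. RSCU is a common summary statistic and is swapped in here for illustration; it is not necessarily the measure used in the study, and the names `CODON_TABLE` and `rscu` are hypothetical. The standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence.

    RSCU = observed count of a codon divided by the count expected if all
    synonymous codons for that amino acid were used equally. Values above 1
    mean the codon is used more often than its synonyms.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one family per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp), and
        # amino acids that do not occur in this sequence.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result
```

For example, `rscu("GAAGAAGAAGAG")` gives 1.5 for GAA and 0.5 for GAG, since glutamate is encoded three times by GAA and once by GAG.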
Affiliation(s)
- Prajakta P Kokate
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Stephen M Techtmann
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Thomas Werner
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
2
Abstract
Understanding phylogenetic relationships among taxa is key to designing and implementing comparative analyses. The genus Drosophila, which contains over 1600 species, is one of the most important model systems in the biological sciences. For over a century, one species in this group, Drosophila melanogaster, has been key to studies of animal development and genetics, genome organization and evolution, and human disease. As whole-genome sequencing becomes more cost-effective, there is increasing interest in other members of this morphologically, ecologically, and behaviorally diverse genus. Phylogenetic relationships within Drosophila are complicated, and the goal of this paper is to provide a review of the recent taxonomic changes and phylogenetic relationships in this genus to aid in further comparative studies.
3
Zhao F, Yu CH, Liu Y. Codon usage regulates protein structure and function by affecting translation elongation speed in Drosophila cells. Nucleic Acids Res 2017; 45:8484-8492. [PMID: 28582582 PMCID: PMC5737824 DOI: 10.1093/nar/gkx501] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 02/24/2017] [Accepted: 05/26/2017] [Indexed: 11/14/2022]
Abstract
Codon usage biases are found in all eukaryotic and prokaryotic genomes and have been proposed to regulate different aspects of translation process. Codon optimality has been shown to regulate translation elongation speed in fungal systems, but its effect on translation elongation speed in animal systems is not clear. In this study, we used a Drosophila cell-free translation system to directly compare the velocity of mRNA translation elongation. Our results demonstrate that optimal synonymous codons speed up translation elongation while non-optimal codons slow down translation. In addition, codon usage regulates ribosome movement and stalling on mRNA during translation. Finally, we show that codon usage affects protein structure and function in vitro and in Drosophila cells. Together, these results suggest that the effect of codon usage on translation elongation speed is a conserved mechanism from fungi to animals that can affect protein folding in eukaryotic organisms.
Affiliation(s)
- Fangzhou Zhao
  - Department of Physiology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
- Chien-Hung Yu
  - Department of Physiology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
- Yi Liu
  - Department of Physiology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
4
Fu J, Murphy KA, Zhou M, Li YH, Lam VH, Tabuloc CA, Chiu JC, Liu Y. Codon usage affects the structure and function of the Drosophila circadian clock protein PERIOD. Genes Dev 2016; 30:1761-75. [PMID: 27542830 PMCID: PMC5002980 DOI: 10.1101/gad.281030.116] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 03/16/2016] [Accepted: 07/15/2016] [Indexed: 11/25/2022]
Abstract
Codon usage bias is a universal feature of all genomes, but its in vivo biological functions in animal systems are not clear. To investigate the in vivo role of codon usage in animals, we took advantage of the sensitivity and robustness of the Drosophila circadian system. By codon-optimizing parts of Drosophila period (dper), a core clock gene that encodes a critical component of the circadian oscillator, we showed that dper codon usage is important for circadian clock function. Codon optimization of dper resulted in conformational changes of the dPER protein, altered dPER phosphorylation profile and stability, and impaired dPER function in the circadian negative feedback loop, which manifests into changes in molecular rhythmicity and abnormal circadian behavioral output. This study provides an in vivo example that demonstrates the role of codon usage in determining protein structure and function in an animal system. These results suggest a universal mechanism in eukaryotes that uses a codon usage “code” within genetic codons to regulate cotranslational protein folding.
Affiliation(s)
- Jingjing Fu
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
- Katherine A Murphy
  - Department of Entomology and Nematology, University of California at Davis, Davis, California 95616, USA
- Mian Zhou
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA; School of Biotechnology, East China University of Science and Technology, Shanghai 200237, China
- Ying H Li
  - Department of Entomology and Nematology, University of California at Davis, Davis, California 95616, USA
- Vu H Lam
  - Department of Entomology and Nematology, University of California at Davis, Davis, California 95616, USA
- Christine A Tabuloc
  - Department of Entomology and Nematology, University of California at Davis, Davis, California 95616, USA
- Joanna C Chiu
  - Department of Entomology and Nematology, University of California at Davis, Davis, California 95616, USA
- Yi Liu
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
5
Choi JY, Aquadro CF. Recent and Long-Term Selection Across Synonymous Sites in Drosophila ananassae. J Mol Evol 2016; 83:50-60. [PMID: 27481397 DOI: 10.1007/s00239-016-9753-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 09/20/2015] [Accepted: 07/23/2016] [Indexed: 11/28/2022]
Abstract
In Drosophila, many studies have examined the short- or long-term evolution occurring across synonymous sites. Few, however, have examined both the recent and long-term evolution to gain a complete view of this selection. Here we have analyzed Drosophila ananassae DNA polymorphism and divergence data using several different methods, and have identified evidence of positive selection favoring preferred codons on both recent and long-term evolutionary time scales. Further, in D. ananassae, selection for preferred codons was stronger on the X chromosome than on the autosomes. We show that this stronger selection is not due to higher gene expression of X-linked genes. Analysis of the selectively neutral introns indicated that the X chromosome also had a preference for GC over AT nucleotides, potentially from GC-biased gene conversions (gcBGCs) that can also affect the base composition of synonymous sites. Thus selection for preferred codons and gcBGC both seem to be partially responsible for shaping the D. ananassae synonymous site evolution.
Affiliation(s)
- Jae Young Choi
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
- Charles F Aquadro
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
6
Nakashima Y, Higashiyama A, Ushimaru A, Nagoda N, Matsuo Y. Evolution of GC content in the histone gene repeating units from Drosophila lutescens, D. takahashii and D. pseudoobscura. Genes Genet Syst 2016; 91:27-36. [PMID: 27021916 DOI: 10.1266/ggs.15-00018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
A subset of histone genes (H1, H2A, H2B and H4), which are encoded along with H3 within repeating units, were analyzed in Drosophila lutescens, D. takahashii and D. pseudoobscura to investigate the evolutionary mechanisms influencing this multigene family and its GC content. Nucleotide divergence among species was more marked in the less functional regions. A strong inverse relationship was observed between the extent of evolutionary divergence and GC content within the repeating units; this finding indicated that the functional constraint on a region must be associated with both divergence and GC content. The GC content at 3rd codon positions in the histone genes from D. lutescens and D. takahashii was higher than that from D. melanogaster, while that from D. pseudoobscura was similar. These evolutionary patterns were similar to those of H3 gene regions. Based on these findings, we propose that the evolutionary mechanisms governing nucleotide content at 3rd codon positions tend to eliminate A and T nucleotides more frequently than G and C nucleotides. These changes might be the consequence of negative selection and would result in GC-rich 3rd codon positions. In addition, interspecific differences in GC content, which exhibited the same pattern for all histone genes, could be explained by different selection efficiencies that result from changes in population size.
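The quantity compared across species in this abstract, GC content at 3rd codon positions (often written GC3), can be computed in a few lines. The following is a minimal sketch under the assumption that the input is a single in-frame coding sequence; the function name `gc3` is hypothetical and not taken from the paper.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_positions = cds[2:usable:3]  # every third base, starting at index 2
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)
```

For instance, `gc3("ATGGCCAAGTTT")` returns 0.75: the third positions of ATG, GCC, AAG and TTT are G, C, G and T.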
Affiliation(s)
- Yuko Nakashima
  - Laboratory of Adaptive Evolution, Institute of Socio-Arts and Sciences, Tokushima University
7
Drosophila Muller F elements maintain a distinct set of genomic properties over 40 million years of evolution. G3-GENES GENOMES GENETICS 2015; 5:719-40. [PMID: 25740935 PMCID: PMC4426361 DOI: 10.1534/g3.114.015966] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Indexed: 01/18/2023]
Abstract
The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25–50%) than euchromatic reference regions (3–11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11–27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4–3.6 vs. 8.4–8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu.
8
Zaborske JM, Bauer DuMont VL, Wallace EWJ, Pan T, Aquadro CF, Drummond DA. A nutrient-driven tRNA modification alters translational fidelity and genome-wide protein coding across an animal genus. PLoS Biol 2014; 12:e1002015. [PMID: 25489848 PMCID: PMC4260829 DOI: 10.1371/journal.pbio.1002015] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Received: 09/10/2014] [Accepted: 10/22/2014] [Indexed: 11/19/2022]
Abstract
Use of the nutrient queuine to modify tRNA anticodons can change the accuracy of certain codons during protein synthesis, resulting in evolutionary recoding of fruit fly genomes.
Natural selection favors efficient expression of encoded proteins, but the causes, mechanisms, and fitness consequences of evolved coding changes remain an area of aggressive inquiry. We report a large-scale reversal in the relative translational accuracy of codons across 12 fly species in the Drosophila/Sophophora genus. Because the reversal involves pairs of codons that are read by the same genomically encoded tRNAs, we hypothesize, and show by direct measurement, that a tRNA anticodon modification from guanosine to queuosine has coevolved with these genomic changes. Queuosine modification is present in most organisms but its function remains unclear. Modification levels vary across developmental stages in D. melanogaster, and, consistent with a causal effect, genes maximally expressed at each stage display selection for codons that are most accurate given stage-specific queuosine modification levels. In a kinetic model, the known increased affinity of queuosine-modified tRNA for ribosomes increases the accuracy of cognate codons while reducing the accuracy of near-cognate codons. Levels of queuosine modification in D. melanogaster reflect bioavailability of the precursor queuine, which eukaryotes scavenge from the tRNAs of bacteria and absorb in the gut. These results reveal a strikingly direct mechanism by which recoding of entire genomes results from changes in utilization of a nutrient.
Ribosomes translate mRNA into protein using tRNAs, and these tRNAs often translate multiple synonymous codons. Although synonymous codons specify the same amino acid, tRNAs read codons with differing speed and accuracy, and so some codons may be more accurately translated than their synonyms. Such variation in the efficiency of translation between synonymous codons can result in costs to cellular fitness. By favoring certain coding choices over evolutionary timescales, natural selection leaves signs of pressure for translational fidelity on evolved genomes. We have found that the way in which proteins are encoded has changed systematically across several closely related fruit fly species. Surprisingly, several of these changes involve two codons both read by the same tRNA. Here we confirm experimentally that the anticodons of these tRNAs are chemically modified, from guanine to queuosine, in vivo, and that the levels of this modification in different species track the differences in protein coding. Furthermore, queuosine modification levels are known to change during fruit fly development, and we find that genes expressed maximally during a given developmental stage have codings reflecting levels of modification at that stage. Remarkably, queuosine modification depends upon acquisition of its precursor, queuine, as a nutrient that eukaryotes must obtain from bacteria through the gut. We have thus elucidated a mechanism by which availability of a nutrient can shape the coding patterns of whole genomes.
Affiliation(s)
- John M. Zaborske
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Vanessa L. Bauer DuMont
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Edward W. J. Wallace
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Tao Pan
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Charles F. Aquadro
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- D. Allan Drummond
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
9
Abstract
Evolutionary conservation has been an accurate predictor of functional elements across the first decade of metazoan genomics. More recently, there has been a move to define functional elements instead from biochemical annotations. Evolutionary methods are, however, more comprehensive than biochemical approaches can be and can assess quantitatively, especially for subtle effects, how biologically important--how injurious after mutation--different types of elements are. Evolutionary methods are thus critical for understanding the large fraction (up to 10%) of the human genome that does not encode proteins and yet might convey function. These methods can also capture the ephemeral nature of much noncoding functional sequence, with large numbers of functional elements having been gained and lost rapidly along each mammalian lineage. Here, we review how different strengths of purifying selection have impacted on protein-coding and non-protein-coding loci and on transcription factor binding sites in mammalian and fruit fly genomes.
Affiliation(s)
- Wilfried Haerty
  - MRC Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3PT, United Kingdom
10
Rouault H, Santolini M, Schweisguth F, Hakim V. Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation. Nucleic Acids Res 2014; 42:6128-45. [PMID: 24682824 PMCID: PMC4041412 DOI: 10.1093/nar/gku209] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
Abstract
Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in publicly available software. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need for large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages.
Affiliation(s)
- Hervé Rouault
  - Developmental and Stem Cell Biology Department, Institut Pasteur, F-75015 Paris, France; CNRS, URA2578, F-75015 Paris, France
- Marc Santolini
  - Laboratoire de Physique Statistique, CNRS, École Normale Supérieure, Université P. et M. Curie, Université Paris-Diderot
- François Schweisguth
  - Developmental and Stem Cell Biology Department, Institut Pasteur, F-75015 Paris, France; CNRS, URA2578, F-75015 Paris, France
- Vincent Hakim
  - Laboratoire de Physique Statistique, CNRS, École Normale Supérieure, Université P. et M. Curie, Université Paris-Diderot
11
Balakirev ES, Chechetkin VR, Lobzin VV, Ayala FJ. Computational methods of identification of pseudogenes based on functionality: entropy and GC content. Methods Mol Biol 2014; 1167:41-62. [PMID: 24823770 DOI: 10.1007/978-1-4939-0835-6_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Spectral entropy and GC content analyses reveal comprehensive structural features of DNA sequences. To illustrate the significance of these features, we analyze the β-esterase gene cluster, including the Est-6 gene and the ψEst-6 putative pseudogene, in seven species of the Drosophila melanogaster subgroup. The spectral entropies show distinctly lower structural ordering for ψEst-6 than for Est-6 in all species studied. However, entropy accumulation is not a completely random process for either gene and appears to be nucleotide dependent. Furthermore, GC content in synonymous positions is uniformly higher in Est-6 than in ψEst-6, in agreement with the reduced GC content generally observed in pseudogenes and nonfunctional sequences. The observed differences in entropy and GC content reflect an evolutionary shift associated with the process of pseudogenization and subsequent functional divergence of ψEst-6 and Est-6 after the duplication event. The data obtained show the relevance and significance of entropy and GC content analyses for pseudogene identification and for the comparative study of gene-pseudogene evolution.
Affiliation(s)
- Evgeniy S Balakirev
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, USA
12
Poh YP, Ting CT, Fu HW, Langley CH, Begun DJ. Population genomic analysis of base composition evolution in Drosophila melanogaster. Genome Biol Evol 2013; 4:1245-55. [PMID: 23160062 PMCID: PMC3542573 DOI: 10.1093/gbe/evs097] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 12/19/2022]
Abstract
The relative importance of mutation, selection, and biased gene conversion to patterns of base composition variation in Drosophila melanogaster, and to a lesser extent, D. simulans, has been investigated for many years. However, genomic data from sufficiently large samples to thoroughly characterize patterns of base composition polymorphism within species have been lacking. Here, we report a genome-wide analysis of coding and noncoding polymorphism in a large sample of inbred D. melanogaster strains from Raleigh, North Carolina. Consistent with previous results, we observed that AT mutations fix more frequently than GC mutations in D. melanogaster. Contrary to predictions of previous models of codon usage in D. melanogaster, we found that synonymous sites segregating for derived AT polymorphisms were less skewed toward low frequencies compared with sites segregating a derived GC polymorphism. However, no such pattern was observed for comparable base composition polymorphisms in noncoding DNA. These results suggest that AT-ending codons could currently be favored by natural selection in the D. melanogaster lineage.
Affiliation(s)
- Yu-Ping Poh
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Taiwan, Republic of China
13
Marques AC, Tan J, Lee S, Kong L, Heger A, Ponting CP. Evidence for conserved post-transcriptional roles of unitary pseudogenes and for frequent bifunctionality of mRNAs. Genome Biol 2012; 13:R102. [PMID: 23153069 PMCID: PMC3580494 DOI: 10.1186/gb-2012-13-11-r102] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Received: 05/25/2012] [Accepted: 11/15/2012] [Indexed: 01/15/2023]
Abstract
Background: Recent reports have highlighted instances of mRNAs that, in addition to coding for protein, regulate the abundance of related transcripts by altering microRNA availability. These two mRNA roles - one mediated by RNA and the other by protein - are inter-dependent and hence cannot easily be separated. Whether the RNA-mediated role of transcripts is important, per se, or whether it is a relatively innocuous consequence of competition by different transcripts for microRNA binding remains unknown.
Results: Here we took advantage of 48 loci that encoded proteins in the earliest eutherian ancestor, but whose protein-coding capability has since been lost specifically during rodent evolution. Sixty-five percent of such loci, which we term 'unitary pseudogenes', have retained their expression in mouse and their transcripts exhibit conserved tissue expression profiles. The maintenance of these unitary pseudogenes' spatial expression profiles is associated with conservation of their microRNA response elements and these appear to preserve the post-transcriptional roles of their protein-coding ancestor. We used mouse Pbcas4, an exemplar of these transcribed unitary pseudogenes, to experimentally test our genome-wide predictions. We demonstrate that the role of Pbcas4 as a competitive endogenous RNA has been conserved and has outlived its ancestral gene's loss of protein-coding potential.
Conclusions: These results show that post-transcriptional regulation by bifunctional mRNAs can persist over long evolutionary time periods even after their protein coding ability has been lost.
14
Behura SK, Severson DW. Comparative analysis of codon usage bias and codon context patterns between dipteran and hymenopteran sequenced genomes. PLoS One 2012; 7:e43111. [PMID: 22912801 PMCID: PMC3422295 DOI: 10.1371/journal.pone.0043111] [Citation(s) in RCA: 109] [Impact Index Per Article: 8.4] [Received: 01/25/2012] [Accepted: 07/16/2012] [Indexed: 11/21/2022]
Abstract
Background: Codon bias is a phenomenon of non-uniform usage of codons, whereas codon context generally refers to sequential pairs of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias.
Methods and Principal Findings: Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. Data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3′- and 5′-context of start and stop codons, respectively.
Conclusions: Codon bias pattern is distinct between dipteran and hymenopteran insects. While codon bias is favored by high GC content of dipteran genomes, high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.
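Since this abstract defines codon context as sequential pairs of codons in a gene, a simple way to make the idea concrete is to tally adjacent codon pairs along one reading frame. This is only an illustrative sketch, not the authors' pipeline; the function name `codon_pairs` is hypothetical, and the input is assumed to be a single in-frame coding sequence.

```python
from collections import Counter


def codon_pairs(cds):
    """Count adjacent (5' codon, 3' codon) pairs in an in-frame sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # zip pairs each codon with its immediate downstream neighbour.
    return Counter(zip(codons, codons[1:]))
```

For `"ATGGAAGAAGAATAA"` the result contains `("ATG", "GAA")` once, `("GAA", "GAA")` twice and `("GAA", "TAA")` once, so the 3′ context of the start codon and the 5′ context of the stop codon can be read directly from the counts.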
Affiliation(s)
- Susanta K Behura
  - Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
15
Behura SK, Severson DW. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes. Biol Rev Camb Philos Soc 2012; 88:49-61. [PMID: 22889422 DOI: 10.1111/j.1469-185x.2012.00242.x] [Citation(s) in RCA: 134] [Impact Index Per Article: 10.3] [Indexed: 01/27/2023]
Abstract
Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance.
Affiliation(s)
- Susanta K Behura
- Department of Biological Sciences, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA.
|
16
|
Rodriguez O, Singh BK, Severson DW, Behura SK. Translational selection of genes coding for perfectly conserved proteins among three mosquito vectors. INFECTION GENETICS AND EVOLUTION 2012; 12:1535-42. [PMID: 22705463 DOI: 10.1016/j.meegid.2012.06.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 10/06/2011] [Revised: 05/10/2012] [Accepted: 06/07/2012] [Indexed: 02/03/2023]
Abstract
The biased usage of synonymous codons affects translational efficiency of genes. We studied codon usage patterns of genes that are perfectly conserved at the amino acid level among three important mosquito vector species: Aedes aegypti (vector of dengue virus), Anopheles gambiae (vector of malaria) and Culex quinquefasciatus (vector of lymphatic filariasis and West Nile Virus). Although these proteins have the same amino acid sequences, non-random usage of synonymous codons is evident among the orthologous genes. The coding sequences of these genes were simulated to generate random mutation sites to be further investigated for patterns of codon bias. It was found that codon usage bias is significantly higher in genes that represented perfectly conserved proteins than genes where variation was apparent at the amino acid sequence. Our results suggest that genes coding for perfectly conserved proteins are highly biased with optimized codons and may be under stringent translational selection in these vector species.
Affiliation(s)
- Olaf Rodriguez
- Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, IN, USA
|
17
|
Dass JFP, Sudandiradoss C. Insight into pattern of codon biasness and nucleotide base usage in serotonin receptor gene family from different mammalian species. Gene 2012; 503:92-100. [PMID: 22480817 DOI: 10.1016/j.gene.2012.03.057] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 06/08/2011] [Revised: 03/14/2012] [Accepted: 03/17/2012] [Indexed: 11/16/2022]
Abstract
5-HT (5-hydroxytryptamine), or serotonin, receptors are found in both the central and peripheral nervous systems as well as in non-neuronal tissues. In the animal and human nervous system, serotonin produces various functional effects through a variety of membrane-bound receptors. In this study, we focus on the 5-HT receptor family from different mammals and examine the factors that account for codon and nucleotide usage variation. A total of 110 homologous coding sequences from 11 different mammalian species were analyzed using relative synonymous codon usage (RSCU), correspondence analysis (COA) and hierarchical cluster analysis together with nucleotide base usage frequency of chemically similar amino acid codons. The mean effective number of codons (ENc) value of 37.06 for 5-HT(6) shows very high codon bias within the family and may be due to high selective translational efficiency. The COA and Spearman's rank correlation reveal that nucleotide compositional mutation bias is the major factor influencing codon usage in serotonin receptor genes. The hierarchical cluster analysis suggests that gene function is another dominant factor that affects the codon usage bias, while species is a minor factor. Nucleotide base usage, reported using the Goldman, Engelman, Steitz (GES) scale, reveals the presence of high uracil (>45%) content at functionally important hydrophobic regions. Our in silico approach will help further investigations into the evolution, structure, function and gene expression of the 5-HT receptor family, whose members are potential antipsychotic drug targets.
Affiliation(s)
- J Febin Prabhu Dass
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu State, India
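The abstract above relies on relative synonymous codon usage (RSCU). As a minimal sketch, not the authors' pipeline, the snippet below computes RSCU for a single coding sequence as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the use of the standard genetic code, and the choice to skip amino acids that do not occur are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one family per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence: RSCU undefined
        for c in family:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(family) / total
    return values


example = rscu("CTGCTGCTGCTAAAAAAG")
print(example["CTG"], example["CTA"], example["AAA"])
```

An RSCU of 1 indicates no bias within the family; values above 1 mark codons used more often than expected under equal usage.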
|
18
|
Luo XL, Xu JG, Ye CY. Analysis of synonymous codon usage in Shigella flexneri 2a strain 301 and other Shigella and Escherichia coli strains. Can J Microbiol 2011; 57:1016-23. [DOI: 10.1139/w11-095] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
Abstract
In this study, we analysed synonymous codon usage in Shigella flexneri 2a strain 301 (Sf301) and performed a comparative analysis of synonymous codon usage patterns in Sf301 and other strains of Shigella and Escherichia coli. Although there was a significant variety in codon usage bias among different Sf301 genes, there was a slight but observable codon usage bias that could primarily be attributable to mutational pressure and translational selection. In addition, the relative abundance of dinucleotides in Sf301 was observed to be independent of the overall base composition but was still caused by differential mutational pressure; this also shaped codon usage. By comparing the relative synonymous codon usage values across different Shigella and E. coli strains, we suggested that the synonymous codon usage pattern in the Shigella genomes was strain specific. This study represents a comprehensive analysis of Shigella codon usage patterns and provides a basic understanding of the mechanisms underlying codon usage bias.
Affiliation(s)
- Xue Lian Luo
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
- Jian Guo Xu
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
- Chang Yun Ye
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
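The abstract above refers to the relative abundance of dinucleotides without stating how it was measured. One common measure is the odds ratio ρ(xy) = f(xy) / (f(x) f(y)), which compares the observed dinucleotide frequency with the value expected from the base frequencies alone. The sketch below is an illustration under that assumption, not the study's method; the function name `dinucleotide_odds_ratios` is hypothetical, and the input is assumed to be a single strand containing only A, C, G and T.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f(xy) / (f(x) * f(y))} for one DNA strand (illustrative)."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return {}  # no dinucleotides to count

    # Single-base frequencies over the whole sequence.
    base_freq = {base: count / n for base, count in Counter(seq).items()}

    # Overlapping dinucleotide counts: positions (0,1), (1,2), ...
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1

    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }


ratios = dinucleotide_odds_ratios("ACAC")
print(ratios)
```

Ratios near 1 mean a dinucleotide occurs about as often as base composition predicts; ratios well above or below 1 indicate over- or under-representation.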
|
19
|
de Procé SM, Zeng K, Betancourt AJ, Charlesworth B. Selection on codon usage and base composition in Drosophila americana. Biol Lett 2011; 8:82-5. [PMID: 21849309 DOI: 10.1098/rsbl.2011.0601] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023] Open
Abstract
We have used a polymorphism dataset on introns and coding sequences of X-linked loci in Drosophila americana to estimate the strength of selection on codon usage and/or biased gene conversion (BGC), taking into account a recent population expansion detected by a maximum-likelihood method. Drosophila americana was previously thought to have a stable demographic history, so that this evidence for a recent population expansion means that previous estimates of selection need revision. There was evidence for natural selection or BGC favouring GC over AT variants in introns, which is stronger for GC-rich than GC-poor introns. By comparing introns and coding sequences, we found evidence for selection on codon usage bias, which is much stronger than the forces acting on GC versus AT basepairs in introns.
Affiliation(s)
- Sophie Marion de Procé
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK.
|
20
|
Determinants of translation efficiency and accuracy. Mol Syst Biol 2011; 7:481. [PMID: 21487400 PMCID: PMC3101949 DOI: 10.1038/msb.2011.14] [Citation(s) in RCA: 338] [Impact Index Per Article: 24.1] [Received: 10/29/2010] [Accepted: 02/15/2011] [Indexed: 12/17/2022] Open
Abstract
A given protein sequence can be encoded by an astronomical number of alternative nucleotide sequences. Recent research has revealed that this flexibility provides evolution with multiple ways to tune the efficiency and fidelity of protein translation and folding. Proper functioning of biological cells requires that the process of protein expression be carried out with high efficiency and fidelity. Given an amino-acid sequence of a protein, multiple degrees of freedom still remain that may allow evolution to tune efficiency and fidelity for each gene under various conditions and cell types. Particularly, the redundancy of the genetic code allows the choice between alternative codons for the same amino acid, which, although ‘synonymous,’ may exert dramatic effects on the process of translation. Here we review modern developments in genomics and systems biology that have revolutionized our understanding of the multiple means by which translation is regulated. We suggest new means to model the process of translation in a richer framework that will incorporate information about gene sequences, the tRNA pool of the organism and the thermodynamic stability of the mRNA transcripts. A practical demonstration of a better understanding of the process would be a more accurate prediction of the proteome, given the transcriptome at a diversity of biological conditions.
|
21
|
Behura SK, Severson DW. Coadaptation of isoacceptor tRNA genes and codon usage bias for translation efficiency in Aedes aegypti and Anopheles gambiae. INSECT MOLECULAR BIOLOGY 2011; 20:177-87. [PMID: 21040044 PMCID: PMC3057532 DOI: 10.1111/j.1365-2583.2010.01055.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 05/22/2023]
Abstract
The transfer RNAs (tRNAs) are essential components of translational machinery. We determined that tRNA isoacceptors (tRNAs with different anticodons but incorporating the same amino acid in protein synthesis) show differential copy number abundance, genomic distribution patterns and sequence evolution between Aedes aegypti and Anopheles gambiae mosquitoes. The tRNA-Ala genes are present in unusually high copy number in the Ae. aegypti genome but not in An. gambiae. Many of the tRNA-Ala genes of Ae. aegypti are flanked by a highly conserved sequence that is not observed in An. gambiae. The relative abundance of tRNA isoacceptor genes is correlated with preferred (or optimal) and nonpreferred (or rare) codons for ∼2-4% of the predicted protein coding genes in both species. The majority (∼74-85%) of these genes are related to pathways involved with translation, energy metabolism and carbohydrate metabolism. Our results suggest that these genes and the related pathways may be under translational selection in these mosquitoes.
Affiliation(s)
- David W. Severson
- Correspondence: David W. Severson, Phone: 574-631-3826, FAX: 574-631-7413,
|
22
|
Lan L, Lin S, Zhang S, Cohen RS. Evidence for a transport-trap mode of Drosophila melanogaster gurken mRNA localization. PLoS One 2010; 5:e15448. [PMID: 21103393 PMCID: PMC2980492 DOI: 10.1371/journal.pone.0015448] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 07/30/2010] [Accepted: 09/22/2010] [Indexed: 11/24/2022] Open
Abstract
The Drosophila melanogaster gurken gene encodes a TGF alpha-like signaling molecule that is secreted from the oocyte during two distinct stages of oogenesis to define the coordinate axes of the follicle cell epithelium that surrounds the oocyte and its 15 anterior nurse cells. Because the gurken receptor is expressed throughout the epithelium, axial patterning requires region-specific secretion of Gurken protein, which in turn requires subcellular localization of gurken transcripts. The first stage of Gurken signaling induces anteroposterior pattern in the epithelium and requires the transport of gurken transcripts from nurse cells into the oocyte. The second stage of Gurken signaling induces dorsoventral polarity in the epithelium and requires localization of gurken transcripts to the oocyte's anterodorsal corner. Previous studies, relying predominantly on real-time imaging of injected transcripts, indicated that anterodorsal localization involves transport of gurken transcripts to the oocyte's anterior cortex followed by transport to the anterodorsal corner, and anchoring. Such studies further indicated that a single RNA sequence element, the GLS, mediates both transport steps by facilitating association of gurken transcripts with a cytoplasmic dynein motor complex. Finally, it was proposed that the GLS somehow steers the motor complex toward that subset of microtubules that are nucleated around the oocyte nucleus, permitting directed transport to the anterodorsal corner. Here, we re-investigate the role of the GLS using a transgenic fly assay system that includes use of the endogenous gurken promoter and biological rescue as well as RNA localization assays. In contrast to previous reports, our studies indicate that the GLS is sufficient for anterior localization only. Our data support a model in which anterodorsal localization is brought about by repeated rounds of anterior transport, accompanied by specific trapping at the anterodorsal cortex. Our data further indicate that trapping at the anterodorsal corner requires at least one as-yet-unidentified gurken RLE.
Affiliation(s)
- Lan Lan
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Shengyin Lin
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Sui Zhang
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Robert S. Cohen
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- * E-mail:
|
23
|
Qiu S, Bergero R, Zeng K, Charlesworth D. Patterns of codon usage bias in Silene latifolia. Mol Biol Evol 2010; 28:771-80. [PMID: 20855431 DOI: 10.1093/molbev/msq251] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Indexed: 12/25/2022] Open
Abstract
Patterns of codon usage bias (CUB) convey useful information about the selection on synonymous codons induced by gene expression and contribute to an understanding of substitution patterns observed at synonymous sites. They can also be informative about the distinctive evolutionary properties of sex chromosomes such as genetic degeneration of the Y chromosome, dosage compensation, and hemizygosity of the X chromosome in males, which can affect the selection on codon usage. Here, we study CUB in Silene latifolia, a species of interest for studying the early stages of sex chromosome evolution. We have obtained a large expressed sequence tag data set containing more than 1,608 sequence fragments by 454 sequencing. Using three different methods, we conservatively define 21 preferred codons. Interestingly, the preferred codons in S. latifolia are almost identical to those in Arabidopsis thaliana, despite their long divergence time (we estimate average nonsynonymous site divergence to be 0.216, and synonymous sites are saturated). The agreement suggests that the nature of selection on codon usage has not changed significantly during the long evolutionary time separating the two species. As in many other organisms, the frequency of preferred codons is negatively correlated with protein length. For the 43 genes with both exon and intron sequences, we find a positive correlation between gene expression levels and GC content at third codon positions, but a strong negative correlation between expression and intron GC content, suggesting that the CUB we detect in S. latifolia is more likely to be due to natural selection than to mutational bias. Using polymorphism data, we detect evidence of ongoing natural selection on CUB, but we find little support for effects of biased gene conversion. 
An analysis of ten sex-linked genes reveals that the X chromosome has experienced significantly more unpreferred to preferred than preferred to unpreferred substitutions, suggesting that it may be evolving higher CUB. In contrast, numbers of substitutions between preferred and unpreferred codons are similar in both directions in the Y-linked genes, contrary to the expectation of genetic degeneration.
Affiliation(s)
- Suo Qiu
- State Key Laboratory of Biocontrol and Key Laboratory of Gene Engineering of the Ministry of Education, Sun Yat-Sen University, Guangzhou 510275, China.
|
24
|
Arguello JR, Zhang Y, Kado T, Fan C, Zhao R, Innan H, Wang W, Long M. Recombination yet inefficient selection along the Drosophila melanogaster subgroup's fourth chromosome. Mol Biol Evol 2010; 27:848-61. [PMID: 20008457 PMCID: PMC2877538 DOI: 10.1093/molbev/msp291] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Indexed: 11/12/2022] Open
Abstract
A central goal of evolutionary genetics is an understanding of the forces responsible for the observed variation, both within and between species. Theoretical and empirical work have demonstrated that genetic recombination contributes to this variation by breaking down linkage between nucleotide sites, thus allowing them to behave independently and for selective forces to act efficiently on them. The Drosophila fourth chromosome, which is believed to experience no, or very low, rates of recombination, has been an important model for investigating these effects. Despite previous efforts, central questions regarding the extent of recombination and the predominant modes of selection acting on it remain open. In order to more comprehensively test hypotheses regarding recombination and its potential influence on selection along the fourth chromosome, we have resequenced regions from most of its genes from Drosophila melanogaster, D. simulans, and D. yakuba. These data, along with available outgroup sequence, demonstrate that recombination is low but significantly greater than zero for the three species. Despite there being recombination, there is strong evidence that its frequency is low enough to have rendered selection relatively inefficient. The signatures of relaxed constraint can be detected at both the level of polymorphism and divergence.
Affiliation(s)
- J. Roman Arguello
- Committee on Evolutionary Biology, University of Chicago
- Department of Ecology and Evolution, University of Chicago
- Yue Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Tomoyuki Kado
- Hayama Center for Advanced Studies, The Graduate University for Advanced Studies, Hayama, Kanagawa, Japan
- Chuanzhu Fan
- Department of Ecology and Evolution, University of Chicago
- Ruoping Zhao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Hideki Innan
- Hayama Center for Advanced Studies, The Graduate University for Advanced Studies, Hayama, Kanagawa, Japan
- Wen Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Manyuan Long
- Committee on Evolutionary Biology, University of Chicago
- Department of Ecology and Evolution, University of Chicago
|
25
|
McCracken KG, Barger CP, Bulgarella M, Johnson KP, Sonsthagen SA, Trucco J, Valqui TH, Wilson RE, Winker K, Sorenson MD. Parallel evolution in the major haemoglobin genes of eight species of Andean waterfowl. Mol Ecol 2009; 18:3992-4005. [DOI: 10.1111/j.1365-294x.2009.04352.x] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 11/28/2022]
|
26
|
Takahashi A. Effect of exonic splicing regulation on synonymous codon usage in alternatively spliced exons of Dscam. BMC Evol Biol 2009; 9:214. [PMID: 19709440 PMCID: PMC2741454 DOI: 10.1186/1471-2148-9-214] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/22/2008] [Accepted: 08/27/2009] [Indexed: 12/31/2022] Open
Abstract
Background: Synonymous codon usage is typically biased towards translationally superior codons in many organisms. In Drosophila, genomic data indicates that translationally optimal codons and splice optimal codons are mostly mutually exclusive, and adaptation to translational efficiency is reduced in the intron-exon boundary regions where potential exonic splicing enhancers (ESEs) reside. In contrast to genomic scale analyses on large datasets, a refined study on a well-controlled set of samples can be effective in demonstrating the effects of particular splice-related factors. Down syndrome cell adhesion molecule (Dscam) has the largest number of alternatively spliced exons (ASEs) known to date, and the splicing frequency of each ASE is accessible from the relative abundance of the transcript. Thus, these ASEs comprise a unique model system for studying the effect of splicing regulation on synonymous codon usage.
Results: Codon Bias Indices (CBI) in the 3' boundary regions were reduced compared to the rest of the exonic regions among 48 and 33 ASEs of exon 6 and 9 clusters, respectively. These regional differences in CBI were affected by splicing frequency and distance from adjacent exons. Synonymous divergence levels between the 3' boundary region and the remaining exonic region of exon 6 ASEs were similar. Additionally, another sensitive comparison of paralogous exonic regions in recently retrotransposed processed genes and their parental genes revealed that, in the former, the differences in CBI between what were formerly the central regions and the boundary regions gradually became smaller over time.
Conclusion: Analyses of the multiple ASEs of Dscam allowed direct tests of the effect of splice-related factors on synonymous codon usage and provided clear evidence that synonymous codon usage bias is restricted by exonic splicing signals near the intron-exon boundary. A similar synonymous divergence level between the different exonic regions suggests that the intensity of splice-related selection is generally weak and comparable to that of translational selection. Finally, the leveling off of differences in codon bias over time in retrotransposed genes meets the direct prediction of the tradeoff model that invokes conflict between translational superiority and splicing regulation, and strengthens the conclusions obtained from Dscam.
Affiliation(s)
- Aya Takahashi
- Division of Population Genetics, National Institute of Genetics, Mishima 411-8540, Japan.
|
27
|
Singh ND, Arndt PF, Clark AG, Aquadro CF. Strong evidence for lineage and sequence specificity of substitution rates and patterns in Drosophila. Mol Biol Evol 2009; 26:1591-605. [PMID: 19351792 DOI: 10.1093/molbev/msp071] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Indexed: 12/29/2022] Open
Abstract
Rates of single nucleotide substitution in Drosophila are highly variable within the genome, and several examples illustrate that evolutionary rates differ among Drosophila species as well. Here, we use a maximum likelihood method to quantify lineage-specific substitutional patterns and apply this method to 4-fold degenerate synonymous sites and introns from more than 8,000 genes aligned in the Drosophila melanogaster group. We find that within species, different classes of sequence evolve at different rates, with long introns evolving most slowly and short introns evolving most rapidly. Relative rates of individual single nucleotide substitutions vary approximately 3-fold among lineages, yielding patterns of substitution that are comparatively less GC-biased in the melanogaster species complex relative to Drosophila yakuba and Drosophila erecta. These results are consistent with a model coupling a mutational shift toward reduced GC content, or a shift in mutation-selection balance, in the D. melanogaster species complex, with variation in selective constraint among different classes of DNA sequence. Finally, base composition of coding and intronic sequences is not at equilibrium with respect to substitutional patterns, which primarily reflects the slow rate of the substitutional process. These results thus support the view that mutational and/or selective processes are labile on an evolutionary timescale and that if the process is indeed selection driven, then the distribution of selective constraint is variable across the genome.
Affiliation(s)
- Nadia D Singh
- Department of Molecular Biology and Genetics, Cornell University.
|
28
|
Betancourt AJ, Welch JJ, Charlesworth B. Reduced effectiveness of selection caused by a lack of recombination. Curr Biol 2009; 19:655-60. [PMID: 19285399 DOI: 10.1016/j.cub.2009.02.039] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Received: 11/09/2008] [Revised: 02/13/2009] [Accepted: 02/13/2009] [Indexed: 11/27/2022]
Abstract
Genetic recombination associated with sexual reproduction is expected to have important consequences for the effectiveness of natural selection. These effects may be evident within genomes, in the form of contrasting patterns of molecular variation and evolution in regions with different levels of recombination. Previous work reveals patterns that are consistent with a benefit of recombination for adaptation at the level of protein sequence: both positive selection for adaptive variants and purifying selection against deleterious ones appear to be compromised in regions of low recombination [1-11]. Here, we re-examine these patterns by using polymorphism and divergence data from the Drosophila dot chromosome, which has a long history of reduced recombination. To avoid confounding selection and demographic effects, we collected these data from a species with an apparently stable demographic history, Drosophila americana. We find that D. americana dot loci show several signatures of ineffective purifying and positive selection, including an increase in the rate of protein evolution, an increase in protein polymorphism, and a reduction in the proportion of amino acid substitutions attributable to positive selection.
Affiliation(s)
- Andrea J Betancourt
- Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, Edinburgh, UK.
|
29
|
Petit N, Barbadilla A. Selection efficiency and effective population size in Drosophila species. J Evol Biol 2008; 22:515-26. [PMID: 19170822 DOI: 10.1111/j.1420-9101.2008.01672.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Indexed: 12/22/2022]
Abstract
A corollary of the nearly neutral theory of molecular evolution is that the efficiency of natural selection depends on effective population size. In this study, we evaluated the differences in levels of synonymous polymorphism among Drosophila species and showed that these differences can be explained by differences in effective population size. The differences can have implications for the molecular evolution of the Drosophila species, as is suggested by our results showing that the levels of codon bias and the proportion of adaptive substitutions are both higher in species with higher levels of synonymous polymorphism. Moreover, species with lower synonymous polymorphism have higher levels of nonsynonymous polymorphism and larger content of repetitive sequences in their genomes, suggesting a diminished efficiency of selection in species with smaller effective population size.
Affiliation(s)
- N Petit
- Group of Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Facultat de Biociències, Universitat Autònoma de Barcelona, Bellaterra, Spain.
|
30
|
Holloway AK, Begun DJ, Siepel A, Pollard KS. Accelerated sequence divergence of conserved genomic elements in Drosophila melanogaster. Genome Res 2008; 18:1592-601. [PMID: 18583644 DOI: 10.1101/gr.077131.108] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023]
Abstract
Recent genomic sequencing of 10 additional Drosophila genomes provides a rich resource for comparative genomics analyses aimed at understanding the similarities and differences between species and between Drosophila and mammals. Using a phylogenetic approach, we identified 64 genomic elements that have been highly conserved over most of the Drosophila tree, but that have experienced a recent burst of evolution along the Drosophila melanogaster lineage. Compared to similarly defined elements in humans, these regions of rapid lineage-specific evolution in Drosophila differ dramatically in location, mechanism of evolution, and functional properties of associated genes. Notably, the majority reside in protein-coding regions and primarily result from rapid adaptive synonymous site evolution. In fact, adaptive evolution appears to be driving substitutions to unpreferred codons. Our analysis also highlights interesting noncoding genomic regions, such as regulatory regions in the gene gooseberry-neuro and a putative novel miRNA.
Affiliation(s)
- Alisha K Holloway
- Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, California 95691, USA.
|
31
|
Multilocus analysis of introgression between two sand fly vectors of leishmaniasis. BMC Evol Biol 2008; 8:141. [PMID: 18474115 PMCID: PMC2413237 DOI: 10.1186/1471-2148-8-141] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 11/26/2007] [Accepted: 05/12/2008] [Indexed: 11/24/2022] Open
Abstract
Background: The phlebotomine sand flies (Diptera: Psychodidae) Lutzomyia (Nyssomyia) intermedia Lutz & Neiva 1912 and Lutzomyia (Nyssomyia) whitmani Antunes & Coutinho 1932 are two very closely related species and important vectors of American cutaneous leishmaniasis. Two single-locus studies have revealed evidence for introgression between the two species in both mitochondrial and nuclear genomes. These findings have prompted the development of a multilocus approach to investigate in more detail the genetic exchanges between the two species.
Results: We analyzed ten nuclear loci using the "isolation with migration" model implemented in the IM program, finding evidence for introgression from L. intermedia towards L. whitmani in three loci. These results confirm that introgression is occurring between the two species and suggest variation in the effects of gene flow among the different regions of the genome.
Conclusion: The demonstration that these two vectors are not fully reproductively isolated might have important epidemiological consequences as these species could be exchanging genes controlling aspects of their vectorial capacity.
|
32
|
Heger A, Ponting CP. Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes. Genome Res 2007; 17:1837-49. [PMID: 17989258 DOI: 10.1101/gr.6249707] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.9] [Indexed: 01/01/2023]
Abstract
The newly sequenced genome sequences of 11 Drosophila species provide the first opportunity to investigate variations in evolutionary rates across a clade of closely related species. Protein-coding genes were predicted using established Drosophila melanogaster genes as templates, with recovery rates ranging from 81%-97% depending on species divergence and on genome assembly quality. Orthology and paralogy assignments were shown to be self-consistent among the different Drosophila species and to be consistent with regions of conserved gene order (synteny blocks). Next, we investigated the rates of diversification among these species' gene repertoires with respect to amino acid substitutions and to gene duplications. Constraints on amino acid sequences appear to have been most pronounced on D. ananassae and least pronounced on D. simulans and D. erecta terminal lineages. Codons predicted to have been subject to positive selection were found to be significantly over-represented among genes with roles in immune response and RNA metabolism, with the latter category including each subunit of the Dicer-2/r2d2 heterodimer. The vast majority of gene duplications (96.5%) and synteny rearrangements were found to occur, as expected, within single Müller elements. We show that the rate of ancient gene duplications was relatively uniform. However, gene duplications in terminal lineages are strongly skewed toward very recent events, consistent with either a rapid-birth and rapid-death model or the presence of large proportions of copy number variable genes in these Drosophila populations. Duplications were significantly more frequent among trypsin-like proteases and DM8 putative lipid-binding domain proteins.
Affiliation(s)
- Andreas Heger
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom.
|