1
Ali F. Patterns of Change in Nucleotide Diversity Over Gene Length. Genome Biol Evol 2024; 16:evae078. [PMID: 38608148 DOI: 10.1093/gbe/evae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 03/26/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to estimate effective population size based on the diversity of synonymous sites demand a better understanding of their selective constraints. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site of a gene. The degree of reduction in diversity at the translation initiation site and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of translation-associated traits such as the avoidance of mRNA secondary structure around the translation initiation site, the number of rRNAs, and relative codon usage of ribosomal genes. Evolutionary simulations under purifying selection reproduce the observed patterns and diversity-length correlation and highlight that selective constraints on the 5'-region of a gene may be more extensive than previously believed. These results have implications for the estimation of effective population size and relative mutation rates, and for genome scans of genes under positive selection based on "silent-site" diversity.
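To make the "Effect Size" and "Effect Length" idea concrete, here is a minimal sketch of one common asymptotic regression form. The abstract does not give the exact parametrization, so everything in the block is an assumption made for illustration: the function names, the parameters `plateau`, `drop`, and `effect_length`, and the choice to express effect size as a fraction of the plateau.

```python
from math import exp


def expected_diversity(position, plateau, drop, effect_length):
    """Diversity at `position` nucleotides from the translation initiation site.

    Assumed (hypothetical) form: diversity starts at `plateau - drop` at the
    start of the gene and rises exponentially towards `plateau`, with
    `effect_length` as the length scale of the recovery.
    """
    return plateau - drop * exp(-position / effect_length)


def effect_size(plateau, drop):
    """Relative reduction in diversity at the initiation site (assumed definition)."""
    return drop / plateau


# Illustrative values only; they are not estimates from the paper.
for pos in (0, 150, 600, 3000):
    print(pos, round(expected_diversity(pos, plateau=0.02, drop=0.01, effect_length=150), 5))
```

Under this form, the reduction in diversity shrinks to 1/e of its initial value at a distance of one `effect_length` from the start of the gene, which is why that parameter can be read as the length of the region of reduced diversity.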
Affiliation(s)
- Farhan Ali
- Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85281, USA
2
Picard MAL, Leblay F, Cassan C, Willemsen A, Daron J, Bauffe F, Decourcelle M, Demange A, Bravo IG. Transcriptomic, proteomic, and functional consequences of codon usage bias in human cells during heterologous gene expression. Protein Sci 2023; 32:e4576. [PMID: 36692287 PMCID: PMC9926478 DOI: 10.1002/pro.4576] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 01/12/2023] [Accepted: 01/14/2023] [Indexed: 01/25/2023]
Abstract
Differences in codon frequency between genomes, genes, or positions along a gene, modulate transcription and translation efficiency, leading to phenotypic and functional differences. Here, we present a multiscale analysis of the effects of synonymous codon recoding during heterologous gene expression in human cells, quantifying the phenotypic consequences of codon usage bias at different molecular and cellular levels, with an emphasis on translation elongation. Six synonymous versions of an antibiotic resistance gene were generated, fused to a fluorescent reporter, and independently expressed in HEK293 cells. Multiscale phenotype was analyzed by means of quantitative transcriptome and proteome assessment, as proxies for gene expression; cellular fluorescence, as a proxy for single-cell level expression; and real-time cell proliferation in absence or presence of antibiotic, as a proxy for the cell fitness. We show that differences in codon usage bias strongly impact the molecular and cellular phenotype: (i) they result in large differences in mRNA levels and protein levels, leading to differences of over 15 times in translation efficiency; (ii) they introduce unpredicted splicing events; (iii) they lead to reproducible phenotypic heterogeneity; and (iv) they lead to a trade-off between the benefit of antibiotic resistance and the burden of heterologous expression. In human cells in culture, codon usage bias modulates gene expression by modifying mRNA availability and suitability for translation, leading to differences in protein levels and eventually eliciting functional phenotypic changes.
Affiliation(s)
- Marion A. L. Picard
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Fiona Leblay
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Cécile Cassan
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Anouk Willemsen
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Josquin Daron
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Frédérique Bauffe
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Mathilde Decourcelle
- BioCampus Montpellier (University of Montpellier, CNRS, INSERM), Montpellier, France
- Antonin Demange
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Ignacio G. Bravo
- French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
3
Li C, Zhou L, Nie J, Wu S, Li W, Liu Y, Liu Y. Codon usage bias and genetic diversity in chloroplast genomes of Elaeagnus species (Myrtiflorae: Elaeagnaceae). Physiol Mol Biol Plants 2023; 29:239-251. [PMID: 36875724 PMCID: PMC9981860 DOI: 10.1007/s12298-023-01289-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/05/2022] [Revised: 01/20/2023] [Accepted: 01/27/2023] [Indexed: 06/18/2023]
Abstract
Codon usage bias (CUB) reveals the characteristics of species and can be utilized to understand their evolutionary relationships, increase the expression of target genes in heterologous receptor plants, and provide theoretical support for related studies in molecular biology and genetic breeding. The chief aim of this work was to analyze the CUB in chloroplast (cp.) genes in nine Elaeagnus species to provide references for subsequent studies. The codons of Elaeagnus cp. genes preferred to end with A/T bases rather than with G/C bases. Most of the cp. genes were prone to mutation, while the rps7 genes were identical in sequence. Natural selection was inferred to have a powerful impact on the CUB in Elaeagnus cp. genomes, and their CUB was extremely strong. In addition, the optimal codons were identified in the nine cp. genomes based on the relative synonymous codon usage (RSCU) values, and the optimal codon numbers were between 15 and 19. The clustering analyses based on RSCU were contrasted with the maximum likelihood (ML)-based phylogenetic tree derived from coding sequences, suggesting that the t-distributed Stochastic Neighbor Embedding clustering method was more appropriate for evolutionary relationship analysis than the complete linkage method. Moreover, the ML-based phylogenetic trees based on the conserved matK genes and on the whole cp. genomes had visible differences, indicating that the sequences of specific cp. genes were profoundly affected by their surroundings. Following the clustering analysis, Arabidopsis thaliana was considered the optimal heterologous expression receptor plant for the Elaeagnus cp. genes. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-023-01289-6.
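The RSCU values mentioned in this abstract follow a standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is an illustration of that definition only, not the authors' pipeline; the function name `rscu` and the use of the standard genetic code (rather than a plastid-specific table) are assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(seq):
    """Return RSCU for every sense codon whose amino acid occurs in `seq`."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families, skipping stop codons.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined for this family
        expected = total / len(family)  # equal-usage expectation per codon
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu("GCTGCTGCCGCA").items()):
        print(codon, value)
```

An RSCU above 1 means a codon is used more often than expected under equal usage, which is why high-RSCU codons are candidates when optimal codons are chosen.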
Affiliation(s)
- Changle Li
- College of Forestry, Northwest A&F University, Yangling, 712100 China
- Ling Zhou
- College of Forestry, Northwest A&F University, Yangling, 712100 China
- Jiangbo Nie
- College of Forestry, Northwest A&F University, Yangling, 712100 China
- Songping Wu
- College of Forestry, Northwest A&F University, Yangling, 712100 China
- Wei Li
- Academy of Agriculture and Forestry Science, Qinghai University, Xining, 810016 China
- Yonghong Liu
- College of Forestry, Northwest A&F University, Yangling, 712100 China
- Yulin Liu
- College of Forestry, Northwest A&F University, Yangling, 712100 China
4
Khandia R, Saeed M, Alharbi AM, Ashraf GM, Greig NH, Kamal MA. Codon Usage Bias Correlates With Gene Length in Neurodegeneration Associated Genes. Front Neurosci 2022; 16:895607. [PMID: 35860292 PMCID: PMC9289476 DOI: 10.3389/fnins.2022.895607] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/14/2022] [Accepted: 06/08/2022] [Indexed: 11/13/2022] Open
Abstract
Codon usage analysis is a crucial part of molecular characterization and is used to determine the factors affecting the evolution of a gene. The length of a gene is an important parameter that affects the characteristics of the gene, such as codon usage, compositional parameters, and sometimes, its functions. In the present study, we investigated the association of various parameters related to codon usage with the length of genes. Gene expression is affected by nucleotide disproportion. In sixty genes related to neurodegenerative disorders, the G nucleotide was the most abundant and the T nucleotide was the least. The nucleotide T exhibited a significant association with the length of the gene at both the overall compositional level and the first and second codon positions. Codon usage bias (CUB) of these genes was affected by pyrimidine and keto skews. Gene length was found to be significantly correlated with codon bias in neurodegeneration associated genes. In gene segments with lengths below 1,200 bp and above 2,400 bp, CUB was positively associated with length. Relative synonymous CUB, which is another measure of CUB, showed that codons TTA, GTT, GTC, TCA, GGT, and GGA exhibited a positive association with length, whereas codons GTA, AGC, CGT, CGA, and GGG showed a negative association. GC-ending codons were preferred over AT-ending codons. Overall analysis indicated that the association between CUB and length varies depending on the segment size; however, CUB of 1,200–2,000 bp gene segments appeared not to be affected by gene length. In summary, the analysis suggests that the length of the genes correlates with various important molecular signatures, including A/T nucleotide disproportion and codon choices. In the present study we additionally evaluated various molecular features and their correlation with different indices of codon usage, like the Codon Adaptation Index (CAI) and Relative Synonymous Codon Usage (RSCU) of codons.
We also considered the impact of gene fragment size on different molecular features in genes related to neurodegeneration. This analysis will aid our understanding of and in potentially modulating gene expression in cases of defective gene functioning in clinical settings.
Affiliation(s)
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, India
- *Correspondence: Rekha Khandia
- Mohd. Saeed
- Department of Biology, College of Sciences, University of Hail, Hail, Saudi Arabia
- Ahmed M. Alharbi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
- Ghulam Md. Ashraf
- Pre-clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Nigel H. Greig
- Drug Design and Development Section, Translational Gerontology Branch, Intramural Research Program National Institute on Aging, NIH, Baltimore, MD, United States
- Mohammad Amjad Kamal
- Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
- Enzymoics, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
5
Wang X, Li LL, Xiao Y, Chen XY, Chen JH, Hu XS. A complete sequence of mitochondrial genome of Neolamarckia cadamba and its use for systematic analysis. Sci Rep 2021; 11:21452. [PMID: 34728739 PMCID: PMC8564537 DOI: 10.1038/s41598-021-01040-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Accepted: 10/22/2021] [Indexed: 11/09/2022] Open
Abstract
Neolamarckia cadamba is an important tropical and subtropical tree for the timber industry in southern China and is also a medicinal plant because of the secondary product cadambine. N. cadamba belongs to the Rubiaceae family, and its taxonomic relationships with other species have not been fully evaluated based on genome sequences. Here, we report the complete sequence of the mitochondrial genome of N. cadamba, which is 414,980 bp in length and was successfully assembled in two genome circles (109,836 bp and 305,144 bp). The mtDNA harbors 83 genes in total, including 40 protein-coding genes (PCGs), 31 transfer RNA genes, 6 ribosomal RNA genes, and 6 other genes. The base composition of the whole genome is estimated as 27.26% for base A, 22.63% for C, 22.53% for G, and 27.56% for T, with an A + T content of 54.82% (54.45% in the small circle and 54.79% in the large circle). Repetitive sequences account for ~0.14% of the whole genome. A maximum likelihood (ML) tree based on DNA sequences of 24 PCGs supports that N. cadamba belongs to order Gentianales. An ML tree based on the rps3 gene of 60 species in family Rubiaceae shows that N. cadamba is more closely related to Cephalanthus occidentalis and Hymenodictyon parvifolium and belongs to the Cinchonoideae subfamily. The result indicates that N. cadamba is genetically distant from the species and genera of Rubiaceae in systematic position. As the first sequence of the mitochondrial genome of N. cadamba, it will provide a useful resource to investigate genetic variation and develop molecular markers for genetic breeding in the future.
Affiliation(s)
- Xi Wang
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong, 510642, China; Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong, 510642, China
- Ling-Ling Li
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong, 510642, China; Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong, 510642, China
- Yu Xiao
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong, 510642, China; Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong, 510642, China
- Xiao-Yang Chen
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong, 510642, China; Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong, 510642, China
- Jie-Hu Chen
- Science Corporation of Gene (SCGene), Guangzhou, 510000, China
- Xin-Sheng Hu
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong, 510642, China; Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong, 510642, China
6
Bahiri-Elitzur S, Tuller T. Codon-based indices for modeling gene expression and transcript evolution. Comput Struct Biotechnol J 2021; 19:2646-2663. [PMID: 34025951 PMCID: PMC8122159 DOI: 10.1016/j.csbj.2021.04.042] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/14/2021] [Revised: 04/17/2021] [Accepted: 04/18/2021] [Indexed: 11/21/2022] Open
Abstract
Codon usage bias (CUB) refers to the phenomenon that synonymous codons are used at different frequencies in most genes and organisms. The general assumption is that codon biases reflect a balance between mutational biases and natural selection. Today we understand that codon content is related to, and can affect, all gene expression steps. Starting from the 1980s, codon-based indices have been used for answering different questions in all biomedical fields, including systems biology, agriculture, medicine, and biotechnology. In general, codon usage bias indices weigh each codon or a small set of codons to estimate how well a certain coding sequence fits a certain phenomenon (e.g., bias in codons, adaptation to the tRNA pool, frequencies of certain codons, transcription elongation speed, etc.) and are usually easy to implement. Today there are dozens of such indices; thus, this paper aims to review and compare the different codon usage bias indices, their applications, and advantages. In addition, we perform an analysis that demonstrates that most indices tend to correlate even though they aim to capture different aspects. Due to the centrality of codon usage bias in different gene expression steps, it is important to keep developing new indices that can capture additional aspects that are not modeled with the current indices.
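As an illustration of what "weighing each codon" can look like, the sketch below computes a Codon Adaptation Index in its usual form: each codon gets a relative adaptiveness weight (its count in a reference gene set divided by the count of the most used synonymous codon), and the index is the geometric mean of those weights over a coding sequence. The function names, the toy reference counts, and the decision to skip codons that never occur in the reference set are assumptions made for this example and are not taken from the review.

```python
from math import exp, log


def relative_adaptiveness(reference_counts, families):
    """Weight of each codon = its reference count / max count in its synonymous family."""
    weights = {}
    for family in families:
        best = max(reference_counts.get(codon, 0) for codon in family)
        if best == 0:
            continue  # family never seen in the reference set
        for codon in family:
            count = reference_counts.get(codon, 0)
            if count > 0:  # simplification: unseen codons get no weight
                weights[codon] = count / best
    return weights


def codon_adaptation_index(seq, weights):
    """Geometric mean of the weights of the codons in `seq` that have a weight."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    logs = [log(weights[seq[i:i + 3]])
            for i in range(0, usable, 3)
            if seq[i:i + 3] in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Hypothetical reference counts for the alanine family only.
    alanine = ["GCT", "GCC", "GCA", "GCG"]
    w = relative_adaptiveness({"GCT": 10, "GCC": 5}, [alanine])
    print(codon_adaptation_index("GCTGCC", w))
```

A sequence built only from the most used codon of each family scores 1 under this definition, and the score falls as rarer synonymous codons are used.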
Affiliation(s)
- Tamir Tuller
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel
7
Abstract
Darwin's theory of evolution emphasized that positive selection of functional proficiency provides the fitness that ultimately determines the structure of life, a view that has dominated biochemical thinking of enzymes as perfectly optimized for their specific functions. The 20th-century modern synthesis, structural biology, and the central dogma explained the machinery of evolution, and nearly neutral theory explained how selection competes with random fixation dynamics that produce molecular clocks essential e.g. for dating evolutionary histories. However, quantitative proteomics revealed that selection pressures not relating to optimal function play much larger roles than previously thought, acting perhaps most importantly via protein expression levels. This paper first summarizes recent progress in the 21st century toward recovering this universal selection pressure. Then, the paper argues that proteome cost minimization is the dominant, underlying 'non-function' selection pressure controlling most of the evolution of already functionally adapted living systems. A theory of proteome cost minimization is described and argued to have consequences for understanding evolutionary trade-offs, aging, cancer, and neurodegenerative protein-misfolding diseases.
8
Chen Z, Zhao J, Qiao J, Li W, Li J, Xu R, Wang H, Liu Z, Xing B, Wendel JF, Grover CE. Comparative analysis of codon usage between Gossypium hirsutum and G. barbadense mitochondrial genomes. Mitochondrial DNA B Resour 2020; 5:2500-2506. [PMID: 33457843 PMCID: PMC7782173 DOI: 10.1080/23802359.2020.1780969] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022] Open
Abstract
Gossypium hirsutum and G. barbadense mitochondrial genomes were analyzed to understand the factors shaping codon usage. While most analyses of codon usage suggest minimal to no bias, nucleotide composition, specifically GC content, was significantly correlated with codon usage. In general, both mitochondrial genomes favor codons that end in A or U, with a secondary preference for pyrimidine rich codons. These observations are similar to previous reports of codon usage in cotton nuclear genomes, possibly suggestive of a general bias spanning genomic compartment. Although evidence for codon usage bias is weak for most genes, we identified six genes (i.e. atp8, atp9, sdh3, sdh4, mttB and rpl2) with significant nonrandom codon usage. In general, we find multiple factors that influence cotton mitochondrial genome codon usage, which may include selection in a subset of genes.
Affiliation(s)
- Zhiwen Chen
- Institute of Carbon Materials Science, Shanxi Datong University, Datong, China
- Jianguo Zhao
- Institute of Carbon Materials Science, Shanxi Datong University, Datong, China; College of Chemistry and Environment Engineering, Shanxi Datong University, Datong, China
- Jun Qiao
- College of Chemistry and Environment Engineering, Shanxi Datong University, Datong, China
- Weijia Li
- Institute of Carbon Materials Science, Shanxi Datong University, Datong, China
- Jingwei Li
- Institute of Carbon Materials Science, Shanxi Datong University, Datong, China
- Ran Xu
- College of Chemistry and Environment Engineering, Shanxi Datong University, Datong, China
- Haiyan Wang
- College of Chemistry and Environment Engineering, Shanxi Datong University, Datong, China
- Zehui Liu
- College of Chemistry and Environment Engineering, Shanxi Datong University, Datong, China
- Baoyan Xing
- Institute of Carbon Materials Science, Shanxi Datong University, Datong, China
- Jonathan F Wendel
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, USA
- Corrinne E Grover
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, USA
9
Jitobaom K, Phakaratsakul S, Sirihongthong T, Chotewutmontri S, Suriyaphol P, Suptawiwat O, Auewarakul P. Codon usage similarity between viral and some host genes suggests a codon-specific translational regulation. Heliyon 2020; 6:e03915. [PMID: 32395662 PMCID: PMC7205639 DOI: 10.1016/j.heliyon.2020.e03915] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/03/2020] [Revised: 03/02/2020] [Accepted: 04/30/2020] [Indexed: 02/03/2023] Open
Abstract
The codon usage pattern is a specific characteristic of each species; however, the codon usage of all of the genes in a genome is not uniform. Intriguingly, most viruses have codon usage patterns that are vastly different from the optimal codon usage of their hosts. How viral genes with different codon usage patterns are efficiently expressed during a viral infection is unclear. An analysis of the similarity between viral codon usage and the codon usage of the individual genes of a host genome has never been performed. In this study, we demonstrated that the codon usage of human RNA viruses is similar to that of some human genes, especially those involved in the cell cycle. This finding was substantiated by its concordance with previous reports of an upregulation at the protein level of some of these biological processes. It therefore suggests that some suboptimal viral codon usage patterns may actually be compatible with cellular translational machineries in infected conditions.
Affiliation(s)
- Kunlakanya Jitobaom
- Department of Microbiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand
- Supinya Phakaratsakul
- Department of Microbiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand
- Sasithorn Chotewutmontri
- Faculty of Medicine and Public Health, HRH Princess Chulabhorn College of Medical Science, Chulabhorn Royal Academy, Bangkok, Thailand
- Prapat Suriyaphol
- Division of Bioinformatics and Data Management for Research, Department of Research and Development, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok, Thailand; Center of Excellence in Bioinformatics and Clinical Data Management, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Ornpreya Suptawiwat
- Faculty of Medicine and Public Health, HRH Princess Chulabhorn College of Medical Science, Chulabhorn Royal Academy, Bangkok, Thailand
- Prasert Auewarakul
- Department of Microbiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand
10
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023] Open
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution have been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets.
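Because the abstract relies on partial correlation to separate factors that are correlated with one another, a small sketch of the first-order partial correlation may help. It uses the textbook formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)); the function names and the toy data are hypothetical and have nothing to do with the paper's data sets.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # Toy data: x and y are each built from the shared driver z plus
    # noise terms that are uncorrelated with z and with each other.
    z = [1, 2, 3, 4]
    x = [2, 1, 2, 5]
    y = [0, 5, 0, 5]
    print("raw correlation:", pearson(x, y))
    print("partial correlation given z:", partial_correlation(x, y, z))
```

In the toy data the raw correlation between `x` and `y` is positive only because both depend on `z`; controlling for `z` removes it. An analysis of evolutionary rates faces the same problem when, for instance, expression level and centrality are themselves correlated.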
11
Pal S, Sarkar I, Roy A, Mohapatra PKD, Mondal KC, Sen A. Comparative evolutionary genomics of Corynebacterium with special reference to codon and amino acid usage diversities. Genetica 2017; 146:13-27. [DOI: 10.1007/s10709-017-9986-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/19/2017] [Accepted: 09/11/2017] [Indexed: 11/28/2022]
12
Abstract
Mistranslation errors compromise fitness by wasting resources on nonfunctional proteins. In order to reduce the cost of mistranslations, natural selection chooses the most accurately translated codons at sites that are particularly important for protein structure and function. We investigated the determinants underlying selection for translational accuracy in several species of plants belonging to three clades: Brassicaceae, Fabidae, and Poaceae. Although signatures of translational selection were found in genes from a wide range of species, the underlying factors varied in nature and intensity. Indeed, the degree of synonymous codon bias at evolutionarily conserved sites varied among plant clades while remaining uniform within each clade. This is unlikely to solely reflect the diversity of tRNA pools because there is little correlation between synonymous codon bias and tRNA abundance, so other factors must affect codon choice and translational accuracy in plant genes. Accordingly, synonymous codon choice at a given site was affected not only by the selection pressure at that site, but also its participation in protein domains or mRNA secondary structures. Although these effects were detected in all the species we analyzed, their impact on translation accuracy was distinct in evolutionarily distant plant clades. The domain effect was found to enhance translational accuracy in dicot and monocot genes with a high GC content, but to oppose the selection of more accurate codons in monocot genes with a low GC content.
13
Rogers DW, Böttcher MA, Traulsen A, Greig D. Ribosome reinitiation can explain length-dependent translation of messenger RNA. PLoS Comput Biol 2017; 13:e1005592. [PMID: 28598992 PMCID: PMC5482490 DOI: 10.1371/journal.pcbi.1005592] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 03/17/2017] [Revised: 06/23/2017] [Accepted: 05/25/2017] [Indexed: 12/21/2022] Open
Abstract
Models of mRNA translation usually presume that transcripts are linear; upon reaching the end of a transcript each terminating ribosome returns to the cytoplasmic pool before initiating anew on a different transcript. A consequence of linear models is that faster translation of a given mRNA is unlikely to generate more of the encoded protein, particularly at low ribosome availability. Recent evidence indicates that eukaryotic mRNAs are circularized, potentially allowing terminating ribosomes to preferentially reinitiate on the same transcript. Here we model the effect of ribosome reinitiation on translation and show that, at high levels of reinitiation, protein synthesis rates are dominated by the time required to translate a given transcript. Our model provides a simple mechanistic explanation for many previously enigmatic features of eukaryotic translation, including the negative correlation of both ribosome densities and protein abundance with transcript length, the importance of codon usage in determining protein synthesis rates, and the negative correlation between transcript length and both codon adaptation and 5' mRNA folding energies. In contrast to linear models where translation is largely limited by initiation rates, our model reveals that all three stages of translation (initiation, elongation, and termination/reinitiation) determine protein synthesis rates even at low ribosome availability.
Affiliation(s)
- David W. Rogers
- Experimental Evolution Research Group, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Marvin A. Böttcher
- Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Arne Traulsen
- Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Duncan Greig
- Experimental Evolution Research Group, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
14
Badet T, Peyraud R, Mbengue M, Navaud O, Derbyshire M, Oliver RP, Barbacci A, Raffaele S. Codon optimization underpins generalist parasitism in fungi. eLife 2017; 6:e22472. [PMID: 28157073 PMCID: PMC5315462 DOI: 10.7554/elife.22472] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 10/18/2016] [Accepted: 01/28/2017] [Indexed: 01/04/2023] Open
Abstract
The range of hosts that parasites can infect is a key determinant of the emergence and spread of disease. Yet, the impact of host range variation on the evolution of parasite genomes remains unknown. Here, we show that codon optimization underlies genome adaptation in broad host range parasites. We found that the longer proteins encoded by broad host range fungi likely increase natural selection on codon optimization in these species. Accordingly, codon optimization correlates with host range across the fungal kingdom. At the species level, biased patterns of synonymous substitutions underpin increased codon optimization in a generalist but not a specialist fungal pathogen. Virulence genes were consistently enriched in highly codon-optimized genes of generalist but not specialist species. We conclude that codon optimization is related to the capacity of parasites to colonize multiple hosts. Our results link genome evolution and translational regulation to the long-term persistence of generalist parasitism.
Affiliation(s)
- Thomas Badet
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Remi Peyraud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Malick Mbengue
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Olivier Navaud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Mark Derbyshire
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Richard P Oliver
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Adelin Barbacci
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Sylvain Raffaele
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
|
15
|
Alvarez-Ponce D, Sabater-Muñoz B, Toft C, Ruiz-González MX, Fares MA. Essentiality Is a Strong Determinant of Protein Rates of Evolution during Mutation Accumulation Experiments in Escherichia coli. Genome Biol Evol 2016; 8:2914-2927. [PMID: 27566759 PMCID: PMC5630975 DOI: 10.1093/gbe/evw205] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022] Open
Abstract
The Neutral Theory of Molecular Evolution is considered the most powerful theory to understand the evolutionary behavior of proteins. One of the main predictions of this theory is that essential proteins should evolve slower than dispensable ones owing to increased selective constraints. Comparison of genomes of different species, however, has revealed only small differences between the rates of evolution of essential and nonessential proteins. In some analyses, these differences vanish once confounding factors are controlled for, whereas in other cases essentiality seems to have an independent, albeit small, effect. It has been argued that comparing relatively distant genomes may entail a number of limitations. For instance, many of the genes that are dispensable in controlled lab conditions may be essential in some of the conditions faced in nature. Moreover, essentiality can change during evolution, and rates of protein evolution are simultaneously shaped by a variety of factors, whose individual effects are difficult to isolate. Here, we conducted two parallel mutation accumulation experiments in Escherichia coli, during 5,500–5,750 generations, and compared the genomes at different points of the experiments. Our approach (a short-term experiment, under highly controlled conditions) enabled us to overcome many of the limitations of previous studies. We observed that essential proteins evolved substantially slower than nonessential ones during our experiments. Strikingly, rates of protein evolution were only moderately affected by expression level and protein length.
Affiliation(s)
- Beatriz Sabater-Muñoz
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
- Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
- Christina Toft
- Department of Genetics, University of Valencia, Valencia, Spain
- Departamento de Biotecnología, Instituto de Agroquímica y Tecnología de los Alimentos (CSIC), Valencia, Spain
- Mario X Ruiz-González
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
- Current Address: Secretaría de Educación Superior, Ciencia, Tecnología e Innovación, Proyecto Prometeo; Departamento de Ciencias Biológicas, Universidad Técnica Particular de Loja, Loja, Ecuador
- Mario A Fares
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
- Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
|
16
|
Aklilu BB, Culligan KM. Molecular Evolution and Functional Diversification of Replication Protein A1 in Plants. Front Plant Sci 2016; 7:33. [PMID: 26858742 PMCID: PMC4731521 DOI: 10.3389/fpls.2016.00033] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 08/21/2015] [Accepted: 01/10/2016] [Indexed: 05/23/2023]
Abstract
Replication protein A (RPA) is a heterotrimeric, single-stranded DNA binding complex required for eukaryotic DNA replication, repair, and recombination. RPA is composed of three subunits, RPA1, RPA2, and RPA3. In contrast to single RPA subunit genes generally found in animals and yeast, plants encode multiple paralogs of RPA subunits, suggesting subfunctionalization. Genetic analysis demonstrates that five Arabidopsis thaliana RPA1 paralogs (RPA1A to RPA1E) have unique and overlapping functions in DNA replication, repair, and meiosis. We hypothesize here that RPA1 subfunctionalities will be reflected in major structural and sequence differences among the paralogs. To address this, we analyzed amino acid and nucleotide sequences of RPA1 paralogs from 25 complete genomes representing a wide spectrum of plants and unicellular green algae. We find here that the plant RPA1 gene family is divided into three general groups termed RPA1A, RPA1B, and RPA1C, which likely arose from two progenitor groups in unicellular green algae. In the family Brassicaceae the RPA1B and RPA1C groups have further expanded to include two unique sub-functional paralogs RPA1D and RPA1E, respectively. In addition, RPA1 groups have unique domains, motifs, cis-elements, gene expression profiles, and pattern of conservation that are consistent with proposed functions in monocot and dicot species, including a novel C-terminal zinc-finger domain found only in plant RPA1C-like sequences. These results allow for improved prediction of RPA1 subunit functions in newly sequenced plant genomes, and potentially provide a unique molecular tool to improve classification of Brassicaceae species.
Affiliation(s)
- Behailu B. Aklilu
- Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
- Program in Genetics, University of New Hampshire, Durham, NH, USA
- Kevin M. Culligan
- Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
- Program in Genetics, University of New Hampshire, Durham, NH, USA
|
17
|
Gerdol M, De Moro G, Venier P, Pallavicini A. Analysis of synonymous codon usage patterns in sixty-four different bivalve species. PeerJ 2015; 3:e1520. [PMID: 26713259 PMCID: PMC4690358 DOI: 10.7717/peerj.1520] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/06/2015] [Accepted: 11/28/2015] [Indexed: 12/21/2022] Open
Abstract
Synonymous codon usage bias (CUB) is defined as the non-random usage of codons encoding the same amino acid across different genomes. This phenomenon is common to all organisms and the real weight of the many factors involved in its shaping still remains to be fully determined. So far, relatively little attention has been put in the analysis of CUB in bivalve mollusks due to the limited genomic data available. Taking advantage of the massive sequence data generated from next generation sequencing projects, we explored codon preferences in 64 different species pertaining to the six major evolutionary lineages in Bivalvia. We detected remarkable differences across species, which are only partially dependent on phylogeny. While the intensity of CUB is mild in most organisms, a heterogeneous group of species (including Arcida and Mytilida, among the others) display higher bias and a strong preference for AT-ending codons. We show that the relative strength and direction of mutational bias, selection for translational efficiency and for translational accuracy contribute to the establishment of synonymous codon usage in bivalves. Although many aspects underlying bivalve CUB still remain obscure, we provide for the first time an overview of this phenomenon in this large, commercially and environmentally important, class of marine invertebrates.
Affiliation(s)
- Marco Gerdol
- Department of Life Sciences, University of Trieste, Trieste, Italy
- Gianluca De Moro
- Department of Life Sciences, University of Trieste, Trieste, Italy
- Paola Venier
- Department of Biology, University of Padova, Padova, Italy
|
18
|
Mukherjee D, Mukherjee A, Ghosh TC. Evolutionary Rate Heterogeneity of Primary and Secondary Metabolic Pathway Genes in Arabidopsis thaliana. Genome Biol Evol 2015; 8:17-28. [PMID: 26556590 PMCID: PMC4758233 DOI: 10.1093/gbe/evv217] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
Primary metabolism is essential to plants for growth and development, and secondary metabolism helps plants to interact with the environment. Many plant metabolites are industrially important. These metabolites are produced by plants through complex metabolic pathways. Lack of knowledge about these pathways is hindering the successful breeding practices for these metabolites. For a better knowledge of the metabolism in plants as a whole, evolutionary rate variation of primary and secondary metabolic pathway genes is a prerequisite. In this study, evolutionary rate variation of primary and secondary metabolic pathway genes has been analyzed in the model plant Arabidopsis thaliana. Primary metabolic pathway genes were found to be more conserved than secondary metabolic pathway genes. Several factors such as gene structure, expression level, tissue specificity, multifunctionality, and domain number are the key factors behind this evolutionary rate variation. This study will help to better understand the evolutionary dynamics of plant metabolism.
Affiliation(s)
- Dola Mukherjee
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Ashutosh Mukherjee
- Department of Botany, Vivekananda College, Thakurpukur, Kolkata, West Bengal, India
|
19
|
Roy A, Mukhopadhyay S, Sarkar I, Sen A. Comparative investigation of the various determinants that influence the codon and amino acid usage patterns in the genus Bifidobacterium. World J Microbiol Biotechnol 2015; 31:959-81. [DOI: 10.1007/s11274-015-1850-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 01/31/2015] [Accepted: 03/31/2015] [Indexed: 12/31/2022]
|
20
|
Mukherjee S, Panda A, Ghosh TC. Elucidating evolutionary features and functional implications of orphan genes in Leishmania major. Infect Genet Evol 2015; 32:330-7. [PMID: 25843649 DOI: 10.1016/j.meegid.2015.03.031] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/13/2015] [Revised: 03/25/2015] [Accepted: 03/26/2015] [Indexed: 11/28/2022]
Abstract
Orphan genes are protein coding genes that lack recognizable homologs in other organisms. These genes were reported to comprise a considerable fraction of coding regions in all sequenced genomes and thought to be allied with organism's lineage-specific traits. However, their evolutionary persistence and functional significance still remain elusive. Due to lack of homologs with the host genome and for their probable lineage-specific functional roles, orphan gene product of pathogenic protozoan might be considered as the possible therapeutic targets. Leishmania major is an important parasitic protozoan of the genus Leishmania that is associated with the disease cutaneous leishmaniasis. Therefore, evolutionary and functional characterization of orphan genes in this organism may help in understanding the factors prevailing pathogen evolution and parasitic adaptation. In this study, we systematically identified orphan genes of L. major and employed several in silico analyses for understanding their evolutionary and functional attributes. To trace the signatures of molecular evolution, we compared their evolutionary rate with non-orphan genes. In agreement with prior observations, here we noticed that orphan genes evolve at a higher rate as compared to non-orphan genes. Lower sequence conservation of orphan genes was previously attributed solely due to their younger gene age. However, here we observed that together with gene age, a number of genomic (like expression level, GC content, variation in codon usage) and proteomic factors (like protein length, intrinsic disorder content, hydropathicity) could independently modulate their evolutionary rate. We considered the interplay of all these factors and analyzed their relative contribution on protein evolutionary rate by regression analysis. On the functional level, we observed that orphan genes are associated with regulatory, growth factor and transport related processes. Moreover, these genes were found to be enriched with various types of interaction and trafficking motifs, implying their possible involvement in host-parasite interactions. Thus, our comprehensive analysis of L. major orphan genes provided evidence for their extensive roles in host-pathogen interactions and virulence.
Affiliation(s)
- Sumit Mukherjee
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, West Bengal, India; Department of Physical Sciences, Indian Institute of Science Education and Research-Kolkata, Mohanpur 741246, Nadia, West Bengal, India
- Arup Panda
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, West Bengal, India
- Tapash Chandra Ghosh
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, West Bengal, India.
|
21
|
Limitations of the ‘ambush hypothesis’ at the single-gene scale: what codon biases are to blame? Mol Genet Genomics 2014; 290:493-504. [DOI: 10.1007/s00438-014-0937-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/18/2014] [Accepted: 10/01/2014] [Indexed: 10/24/2022]
|
22
|
Porceddu A, Zenoni S, Camiolo S. The signatures of selection for translational accuracy in plant genes. Genome Biol Evol 2013; 5:1117-26. [PMID: 23695187 PMCID: PMC3698923 DOI: 10.1093/gbe/evt078] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023] Open
Abstract
Little is known about the natural selection of synonymous codons within the coding sequences of plant genes. We analyzed the distribution of synonymous codons within plant coding sequences and found that preferred codons tend to encode the more conserved and functionally important residues of plant proteins. This was consistent among several synonymous codon families and applied to genes with different expression profiles and functions. Most of the randomly chosen alternative sets of codons scored weaker associations than the actual sets of preferred codons, suggesting that codon position within plant genes and codon usage bias have coevolved to maximize translational accuracy. All these findings are consistent with the mistranslation-induced protein misfolding theory, which predicts the natural selection of highly preferred codons more frequently at sites where translation errors could compromise protein folding or functionality. Our results will provide an important insight in future studies of protein folding, molecular evolution, and transgene design for optimal expression.
Affiliation(s)
- Andrea Porceddu
- Dipartimento di Agraria, Sezione di Agronomia e Coltivazione Erbacee Genetica-SACEG, Università degli studi di Sassari, Italy.
|
23
|
A comparison of synonymous codon usage bias patterns in DNA and RNA virus genomes: quantifying the relative importance of mutational pressure and natural selection. Biomed Res Int 2013; 2013:406342. [PMID: 24199191 PMCID: PMC3808105 DOI: 10.1155/2013/406342] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/15/2013] [Revised: 06/30/2013] [Accepted: 08/04/2013] [Indexed: 11/17/2022]
Abstract
Codon usage bias patterns have been broadly explored for many viruses. However, the relative importance of mutation pressure and natural selection is still under debate. In the present study, I tried to resolve controversial issues on determining the principal factors of codon usage patterns for DNA and RNA viruses, respectively, by examining over 38000 ORFs. By utilizing variation partitioning technique, the results showed that 27% and 21% of total variation could be attributed to mutational pressure, while 5% and 6% of total variation could be explained by natural selection for DNA and RNA viruses, respectively, in codon usage patterns. Furthermore, the combined effect of mutational pressure and natural selection on influencing codon usage patterns of viruses is substantial (explaining 10% and 8% of total variation of codon usage patterns). With respect to GC variation, GC content is always negatively and significantly correlated with aromaticity. Interestingly, the signs for the significant correlations between GC, gene lengths, and hydrophobicity are completely opposite between DNA and RNA viruses, being positive for DNA viruses while being negative for RNA viruses. At last, GC12 versus G3s plot suggests that natural selection is more important than mutational pressure on influencing the GC content in the first and second codon positions.
|
24
|
Javier Zea D, Miguel Monzon A, Fornasari MS, Marino-Buslje C, Parisi G. Protein Conformational Diversity Correlates with Evolutionary Rate. Mol Biol Evol 2013; 30:1500-3. [DOI: 10.1093/molbev/mst065] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 11/12/2022] Open
|
25
|
Nayak KC. Comparative genome sequence analysis of Sulfolobus acidocaldarius and 9 other isolates of its genus for factors influencing codon and amino acid usage. Gene 2013; 513:163-73. [DOI: 10.1016/j.gene.2012.10.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/02/2012] [Revised: 10/08/2012] [Accepted: 10/21/2012] [Indexed: 11/17/2022]
|
26
|
Hershberg R, Petrov DA. On the limitations of using ribosomal genes as references for the study of codon usage: a rebuttal. PLoS One 2012; 7:e49060. [PMID: 23284622 PMCID: PMC3527481 DOI: 10.1371/journal.pone.0049060] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 11/29/2011] [Accepted: 10/05/2012] [Indexed: 01/08/2023] Open
Abstract
In a recent paper published in PLOS ONE, Wang et al. challenge our finding that the identity of optimal codons in different genomes follows a set of clear rules. Here we provide a rebuttal of their paper and demonstrate that the results of our original PLOS Genetics paper stand. This provides us with an opportunity to bring up an aspect of how codon usage has been studied that should be of general interest. The Wang et al. study, as well as many other studies, used ribosomal genes as a reference set for the study of patterns of codon usage. We discuss here the assumptions that are made in order to justify using ribosomal genes to study codon bias, suggest that this practice can at times be problematic, and discuss its limitations.
Affiliation(s)
- Ruth Hershberg
- Rachel & Menachem Mendelovitch Evolutionary Processes of Mutation & Natural Selection Research Laboratory, Department of Genetics, Technion-Israel Institute of Technology, Haifa, Israel.
|
27
|
Panda A, Begum T, Ghosh TC. Insights into the evolutionary features of human neurodegenerative diseases. PLoS One 2012; 7:e48336. [PMID: 23118989 PMCID: PMC3484049 DOI: 10.1371/journal.pone.0048336] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/04/2012] [Accepted: 09/24/2012] [Indexed: 02/06/2023] Open
Abstract
Comparative analyses between human disease and non-disease genes are of great interest in understanding human disease gene evolution. However, the progression of neurodegenerative diseases (NDD) involving amyloid formation in specific brain regions is still unknown. Therefore, in this study, we mainly focused our analysis on the evolutionary features of human NDD genes with respect to non-disease genes. Here, we observed that human NDD genes are evolutionarily conserved relative to non-disease genes. To elucidate the conserved nature of NDD genes, we incorporated the evolutionary attributes like gene expression level, number of regulatory miRNAs, protein connectivity, intrinsic disorder content and relative aggregation propensity in our analysis. Our studies demonstrate that NDD genes have higher gene expression levels in favor of their lower evolutionary rates. Additionally, we observed that NDD genes have higher number of different regulatory miRNAs target sites and also have higher interaction partners than the non-disease genes. Moreover, miRNA targeted genes are known to have higher disorder content. In contrast, our analysis exclusively established that NDD genes have lower disorder content. In favor of our analysis, we found that NDD gene encoded proteins are enriched with multi interface hubs (party hubs) with lower disorder contents. Since, proteins with higher disorder content need to adapt special structure to reduce their aggregation propensity, NDD proteins found to have elevated relative aggregation propensity (RAP) in support of their lower disorder content. Finally, our categorical regression analysis confirmed the underlined relative dominance of protein connectivity, 3'UTR length, RAP, nature of hubs (singlish/multi interface) and disorder content for such evolutionary rates variation between human NDD genes and non-disease genes.
Affiliation(s)
- Arup Panda
- Bioinformatics Centre, Bose Institute, Kolkata, India
- Tina Begum
- Bioinformatics Centre, Bose Institute, Kolkata, India
|
28
|
Clemente F, Vogl C. Evidence for complex selection on four-fold degenerate sites in Drosophila melanogaster. J Evol Biol 2012; 25:2582-95. [PMID: 23020078 DOI: 10.1111/jeb.12003] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 07/11/2012] [Revised: 08/31/2012] [Accepted: 08/31/2012] [Indexed: 01/04/2023]
Abstract
We considered genome-wide four-fold degenerate sites from an African Drosophila melanogaster population and compared them to short introns. To include divergence and to polarize the data, we used its close relatives Drosophila simulans, Drosophila sechellia, Drosophila erecta and Drosophila yakuba as outgroups. In D. melanogaster, the GC content at four-fold degenerate sites is higher than in short introns; compared to its relatives, more AT than GC is fixed. The former has been explained by codon usage bias (CUB) favouring GC; the latter by decreased intensity of directional selection or by increased mutation bias towards AT. With a biallelic equilibrium model, evidence for directional selection comes mostly from the GC-rich ancestral base composition. Together with a slight mutation bias, it leads to an asymmetry of the unpolarized allele frequency spectrum, from which directional selection is inferred. Using a quasi-equilibrium model and polarized spectra, however, only purifying and no directional selection is detected. Furthermore, polarized spectra are proportional to those of the presumably unselected short introns. As we have no evidence for a decrease in effective population size, relaxed CUB must be due to a reduction in the selection coefficient. Going beyond the biallelic model and considering all four bases, signs of directional selection are stronger. In contrast to short introns, complementary bases show strand specificity and allele frequency spectra depend on mutation directions. Hence, the traditional biallelic model to describe the evolution of four-fold degenerate sites should be replaced by more complex models assuming only quasi-equilibrium and accounting for all four bases.
Affiliation(s)
- F Clemente
- Institute of Population Genetics, Veterinärmedizinische Universität Wien, Vienna, Austria
|
29
|
Toll-Riera M, Bostick D, Albà MM, Plotkin JB. Structure and age jointly influence rates of protein evolution. PLoS Comput Biol 2012; 8:e1002542. [PMID: 22693443 PMCID: PMC3364943 DOI: 10.1371/journal.pcbi.1002542] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 01/27/2012] [Accepted: 04/17/2012] [Indexed: 12/01/2022] Open
Abstract
What factors determine a protein's rate of evolution are actively debated. Especially unclear is the relative role of intrinsic factors of present-day proteins versus historical factors such as protein age. Here we study the interplay of structural properties and evolutionary age, as determinants of protein evolutionary rate. We use a large set of one-to-one orthologs between human and mouse proteins, with mapped PDB structures. We report that previously observed structural correlations also hold within each age group – including relationships between solvent accessibility, designability, and evolutionary rates. However, age also plays a crucial role: age modulates the relationship between solvent accessibility and rate. Additionally, younger proteins, despite being less designable, tend to evolve faster than older proteins. We show that previously reported relationships between age and rate cannot be explained by structural biases among age groups. Finally, we introduce a knowledge-based potential function to study the stability of proteins through large-scale computation. We find that older proteins are more stable for their native structure, and more robust to mutations, than younger ones. Our results underscore that several determinants, both intrinsic and historical, can interact to determine rates of protein evolution.
Rates of protein evolution vary dramatically within and between organisms. But the factors that determine a protein's evolutionary rate are still under debate, despite extensive studies over the past decade. Several determinants have been proposed, for example gene expression, the importance of the gene for the organism, the number of physical or genetic interactions it has, its structural characteristics, or when it originated. Here we study how age and structural characteristics interact with one another to influence evolutionary rates. We use a set of one-to-one orthologs of human and mouse proteins, with known crystal structures. We find that these two determinants interact: for example, the age of protein modulates how its structure correlates with evolutionary rate. Nonetheless, the influence of age on evolutionary rate cannot be explained by its interplay with structure.
Affiliation(s)
- Macarena Toll-Riera
- Evolutionary Genomics Group, Fundació Institut Municipal d'Investigació Mèdica (FIMIM)- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- David Bostick
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- M. Mar Albà
- Evolutionary Genomics Group, Fundació Institut Municipal d'Investigació Mèdica (FIMIM)- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
- * E-mail: (MMA); (JBP)
- Joshua B. Plotkin
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- * E-mail: (MMA); (JBP)
|
30
|
Ortutay C, Vihinen M. Conserved and quickly evolving immunome genes have different evolutionary paths. Hum Mutat 2012; 33:1456-63. [PMID: 22623381 DOI: 10.1002/humu.22125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/25/2012] [Accepted: 05/15/2012] [Indexed: 12/11/2022]
Abstract
Genetic, transcript, and protein level variations have important functional and evolutionary consequences. We performed systematic data collection and analysis of copy-number variations, single-nucleotide polymorphisms, disease-causing variations, messenger RNA splicing variants, and protein posttranslational modifications for the genes and proteins essential for the human immune system. Information about polymorphic and evolutionarily fixed genetic variations was used to group immunome genes into the most conserved and the most quickly changing ones under directed selection during recent immunome evolution. Gene Ontology terms related to adaptive immunity are associated with gene groups subject to recent directing selection. In addition, several other characteristics of the immunome genes and proteins in these two categories have statistically significant differences. The presented findings question the usability of directed mouse genes as models for human diseases and conditions and shed light on the fine tuning of human immunity and its diverse functions.
Affiliation(s)
- Csaba Ortutay
- Institute of Biomedical Technology, University of Tampere, Tampere, Finland
|
31
|
Hilterbrand A, Saelens J, Putonti C. CBDB: the codon bias database. BMC Bioinformatics 2012; 13:62. [PMID: 22536831 PMCID: PMC3463423 DOI: 10.1186/1471-2105-13-62] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/09/2012] [Accepted: 03/26/2012] [Indexed: 02/01/2023] Open
Abstract
Background In many genomes, a clear preference in the usage of particular codons exists. The mechanisms that induce codon biases remain an open question; studies have attributed codon usage to translational selection, mutational bias and drift. Furthermore, correlations between codon usage within host genomes and their viral pathogens have been observed for a myriad of host-virus systems. As such, numerous studies have investigated codon usage and codon bias in an effort to better understand how species evolve. Numerous metrics have been developed to identify biases in codon usage. In addition, a few data repositories of codon bias data are available, differing in the metrics reported as well as the number and taxonomy of strains examined. Description We have created a new web resource called the Codon Bias Database (CBDB) which provides information regarding the codon bias within the set of highly expressed genes for 300+ bacterial genomes. CBDB was developed to provide a resource for researchers investigating codon bias in bacteria, facilitating comparisons between strains and species. Furthermore, the site was created to serve those studying adaptation in phage; the genera selected for this first release of CBDB all have sequenced, annotated bacteriophages. The annotations and sequences for the highly expressed gene set are available for each strain in addition to the strain’s codon bias measurements. Conclusions Comparing species and strains provides a comprehensive look at how codon usage has been shaped over evolutionary time and can elucidate the putative mechanisms behind it. The Codon Bias Database provides a centralized repository of look-up tables and codon usage bias measures for a wide variety of genera, species and strains. 
Through our analysis of the variation in codon usage within the strains presently available, we find that most members of a genus have a codon composition most similar to other members of its genus, although not necessarily other members of its species.
Affiliation(s)
- Adam Hilterbrand
- Department of Biology, Loyola University Chicago, 1032 W Sheridan Road, Chicago, IL 60660, USA
|
32
|
Mahlab S, Tuller T, Linial M. Conservation of the relative tRNA composition in healthy and cancerous tissues. RNA (New York, N.Y.) 2012; 18:640-52. [PMID: 22357911 PMCID: PMC3312552 DOI: 10.1261/rna.030775.111] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 05/03/2023]
Abstract
Elongation in protein translation is strongly dependent on the availability of mature transfer RNAs (tRNAs). The relative concentrations of the tRNA isoacceptors determine the translation efficiency in unicellular organisms. However, the degree of correspondence of codons and the relevant tRNA isoacceptors serves as an estimator for translation efficiency in all organisms. In this study, we focus on the translational capacity of the human proteome. We show that the correspondence between the codon usage and tRNAs can be improved by combining experimental measurements with the genomic copy number of isoacceptor groups. We show that there are technologies of tRNA measurements that are useful for our analysis. However, fragments of tRNAs do not agree with translational capacity. It was shown that there is a significant increase in the absolute levels of tRNA genes in cancerous cells in comparison to healthy cells. However, we find that the relative composition of tRNA isoacceptors in healthy, cancerous, or transformed cells remains almost identical. This result may indicate that maintaining the relative tRNA composition in cancerous cells is advantageous via its stabilizing of the effectiveness of translation.
Affiliation(s)
- Shelly Mahlab
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Tamir Tuller
- Iby and Aladar Fleischman Faculty of Engineering, Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 69978, Israel
- Michal Linial
- Department of Biological Chemistry, Institute of Life Sciences, Sudarsky Center for Computational Biology, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
|
33
|
Level of gene expression is a major determinant of protein evolution in the viral order Mononegavirales. J Virol 2012; 86:5253-63. [PMID: 22345453 DOI: 10.1128/jvi.06050-11] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022] Open
Abstract
Although the rate at which proteins change is a key parameter in molecular evolution, its determinants are poorly understood in viruses. A variety of factors, including gene length, codon usage bias, protein abundance, protein function, and gene expression level, have been shown to affect the rate of protein evolution in a diverse array of organisms. However, the role of these factors in viral evolution has yet to be addressed. The polar 3'-5' stepwise attenuation of transcription in the Mononegavirales, a group of single-strand negative-sense RNA viruses, provides a unique system to explore the determinants of protein evolution in viruses. We analyzed the relative importance of a variety of factors in shaping patterns of sequence variation in full-length genomes from 13 Mononegavirales species. Our analysis suggests that the level of gene expression, and by extension the relative genomic position of each gene, is a key determinant of the protein evolution in these viruses. This appears to be the consequence of selection for translational robustness, but not for translational accuracy, in highly expressed genes. The small genome size and number of proteins encoded by these viruses allowed us to identify other protein-specific factors that may also play a role in virus evolution, such as host-virus interactions and functional constraints. Finally, we explored the evolutionary pressures acting on noncoding regions in Mononegavirales genomes and observed that, despite being less constrained than coding regions, their evolutionary rates are also associated with genomic position.
|
34
|
Zhu E, Sambath S. Characterization of Synonymous Codon Usage in the Newly Identified Duck Plague Virus UL16 Gene. Advances in Intelligent and Soft Computing 2012. [PMCID: PMC7122970 DOI: 10.1007/978-3-642-27537-1_89] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/03/2022]
Abstract
A comparative analysis of the codon usage bias in the newly identified UL16 gene (GenBank accession no. EU195095) of DPV and the UL16 gene of 22 reference herpesviruses was performed. In this study, the synonymous codon usage bias of the UL16 gene in the 23 herpesviruses was analyzed, and the results showed obvious differences in CAI, RSCU, ENC and GC3s. The results revealed that synonymous codons with A and T at the third codon position are widely used in the UL16 gene of DPV. The ENC-GC3s plot revealed that the genetic heterogeneity in the UL16 gene of herpesviruses was constrained by G+C content at the third codon position. The phylogenetic analysis suggested that DPV was evolutionarily closer to herpesviruses which further clustered into Alphaherpesvirinae. Furthermore, the ORF of the DPV UL16 gene has sequential rare codons. There were 21 codons showing distinct usage differences between DPV and Escherichia coli, 19 between DPV and yeast, and 20 between DPV and human. Therefore, the Escherichia coli, yeast and human expression systems are suitable for the expression of the DPV UL16 gene if some codons are optimized.
Affiliation(s)
- Egui Zhu
- South China Normal University, Guangzhou, 510631 China, People's Republic
- Sabo Sambath
- South China Normal University, Guangzhou, 510631 China, People's Republic
|
35
|
Zhu E, Sambath S. Analysis of Codon Usage Bias in Interferon Alpha Gene of the Giant Panda (Ailuropoda melanoleuca). Advances in Intelligent and Soft Computing 2012. [PMCID: PMC7123504 DOI: 10.1007/978-3-642-27537-1_37] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
Abstract
The analysis of codon usage bias in the IFN-a gene of the giant panda (Ailuropoda melanoleuca) may provide a basis for understanding the evolutionary relationships of the giant panda and for selecting appropriate host expression systems to improve the expression of target genes. In this paper, the codon usage bias in the mature IFN-a sequence of the giant panda and 15 reference species was analyzed. The results showed that synonymous codons with G and C at the third codon position were widely used, and the ENC-GC3S plot revealed that the genetic heterogeneity in the IFN-a gene was mainly constrained by mutational bias. Contrastive analysis revealed that there were 40 codons showing distinct usage differences between GpIFN-a and Escherichia coli, 38 between GpIFN-a and yeast, and only 30 between GpIFN-a and Homo sapiens. Therefore, the human expression system may be more suitable for the expression of GpIFN-a genes.
Affiliation(s)
- Egui Zhu
- South China Normal University, Guangzhou, 510631 China, People's Republic
- Sabo Sambath
- South China Normal University, Guangzhou, 510631 China, People's Republic
|
36
|
Luo XL, Xu JG, Ye CY. Analysis of synonymous codon usage in Shigella flexneri 2a strain 301 and other Shigella and Escherichia coli strains. Can J Microbiol 2011; 57:1016-23. [DOI: 10.1139/w11-095] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
In this study, we analysed synonymous codon usage in Shigella flexneri 2a strain 301 (Sf301) and performed a comparative analysis of synonymous codon usage patterns in Sf301 and other strains of Shigella and Escherichia coli. Although there was significant variation in codon usage bias among different Sf301 genes, there was a slight but observable codon usage bias that could primarily be attributed to mutational pressure and translational selection. In addition, the relative abundance of dinucleotides in Sf301 was observed to be independent of the overall base composition but was still caused by differential mutational pressure; this also shaped codon usage. By comparing the relative synonymous codon usage values across different Shigella and E. coli strains, we suggested that the synonymous codon usage pattern in the Shigella genomes was strain specific. This study represents a comprehensive analysis of Shigella codon usage patterns and provides a basic understanding of the mechanisms underlying codon usage bias.
Affiliation(s)
- Xue Lian Luo
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
- Jian Guo Xu
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
- Chang Yun Ye
- State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Changping, Beijing 102206, People’s Republic of China
|
37
|
Wei W, Liu F, Liu L, Li Z, Zhang X, Jiang F, Shi Q, Zhou X, Sheng W, Cai S, Li X, Xu Y, Nan P. Distinct mutations in MLH1 and MSH2 genes in hereditary non-polyposis colorectal cancer (HNPCC) families from China. BMB Rep 2011; 44:317-22. [PMID: 21615986 DOI: 10.5483/bmbrep.2011.44.5.317] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022] Open
Abstract
Hereditary non-polyposis colorectal cancer (HNPCC) is an autosomal dominant inheritance syndrome. HNPCC is the most common hereditary variant of colorectal cancer (CRC), which accounts for 2-5% of CRCs, mainly due to hMLH1 and hMSH2 mutations that impair DNA repair functions. Our study aimed to identify the patterns of hMSH2 and hMLH1 mutations in Chinese HNPCC patients. Ninety-eight unrelated families from China meeting Amsterdam or Bethesda criteria were included in our study. Germline mutations in MLH1 and MSH2 genes, located in the exons and the splice-site junctions, were screened in the 98 probands by direct sequencing. Eleven mutations were found in ten patients (11%), with six in MLH1 (54.5%) and five in MSH2 (45.5%) genes. One patient had mutations in both MLH1 and MSH2 genes. Three novel mutations in the MLH1 gene (c.157_160delGAGG, c.2157dupT and c.-64G>T) were found for the first time, and one suspected hotspot in MSH2 (c.1168C>T) was revealed.
Affiliation(s)
- Wenqian Wei
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, School of Life Sciences, Fudan University, Shanghai, China.
|
38
|
Nayak KC. Comparative study on factors influencing the codon and amino acid usage in Lactobacillus sakei 23K and 13 other lactobacilli. Mol Biol Rep 2011; 39:535-45. [DOI: 10.1007/s11033-011-0768-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/16/2010] [Accepted: 04/27/2011] [Indexed: 11/24/2022]
|
39
|
Cheung SC, Liu LZ, Lan LL, Liu QQ, Sun SS, Chan JC, Tong PC. Glucose lowering effect of transgenic human insulin-like growth factor-I from rice: in vitro and in vivo studies. BMC Biotechnol 2011; 11:37. [PMID: 21486461 PMCID: PMC3098155 DOI: 10.1186/1472-6750-11-37] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/07/2010] [Accepted: 04/12/2011] [Indexed: 12/13/2022] Open
Abstract
Background Human insulin-like growth factor-I (hIGF-I) is a growth factor that closely resembles insulin. It is essential for cell proliferation and has been proposed for treatment of various endocrine-associated diseases including growth hormone insensitivity syndrome and diabetes mellitus. In the present study, an efficient plant expression system was developed to produce biologically active recombinant hIGF-I (rhIGF-I) in transgenic rice grains. Results The plant-codon-optimized hIGF-I was introduced into rice via Agrobacterium-mediated transformation. To enhance the stability and yield of rhIGF-I, the endoplasmic reticulum-retention signal and glutelin signal peptide were used to deliver rhIGF-I to the endoplasmic reticulum for stable accumulation. We found that only the glutelin signal peptide could lead to successful expression of hIGF-I, and one gram of hIGF-I rice grain possessed a maximum activity level equivalent to 3.2 micromolar of commercial rhIGF-I. In vitro functional analysis showed that the rice-derived rhIGF-I was effective in inducing membrane ruffling and glucose uptake in rat skeletal muscle cells. An oral meal test with rice containing rhIGF-I acutely reduced blood glucose levels in streptozotocin-induced and Zucker diabetic rats, whereas it had no effect in normal rats. Conclusion Our findings provided an alternative expression system to produce large quantities of biologically active rhIGF-I. The provision of large quantities of recombinant proteins will promote further research on the therapeutic potential of rhIGF-I.
Affiliation(s)
- Stanley Ck Cheung
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
|
40
|
Stoletzki N. The surprising negative correlation of gene length and optimal codon use--disentangling translational selection from GC-biased gene conversion in yeast. BMC Evol Biol 2011; 11:93. [PMID: 21481245 PMCID: PMC3096941 DOI: 10.1186/1471-2148-11-93] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 05/14/2010] [Accepted: 04/11/2011] [Indexed: 02/06/2023] Open
Abstract
Background Surprisingly, in several multi-cellular eukaryotes optimal codon use correlates negatively with gene length. This contrasts with the expectation under selection for translational accuracy. While suggested explanations focus on variation in strength and efficiency of translational selection, it has rarely been noticed that the negative correlation is reported only in organisms whose optimal codons are biased towards codons that end with G or C (-GC). This raises the question whether forces that affect base composition - such as GC-biased gene conversion - contribute to the negative correlation between optimal codon use and gene length. Results Yeast is a good organism to study this as equal numbers of optimal codons end in -GC and -AT and one may hence compare frequencies of optimal GC- with optimal AT-ending codons to disentangle the forces. Results of this study demonstrate in yeast frequencies of GC-ending (optimal AND non-optimal) codons decrease with gene length and increase with recombination. A decrease of GC-ending codons along genes contributes to the negative correlation with gene length. Correlations with recombination and gene expression differentiate between GC-ending and optimal codons, and also substitution patterns support effects of GC-biased gene conversion. Conclusion While the general effect of GC-biased gene conversion is well known, the negative correlation of optimal codon use with gene length has not been considered in this context before. Initiation of gene conversion events in promoter regions and the presence of a gene conversion gradient most likely explain the observed decrease of GC-ending codons with gene length and gene position.
Affiliation(s)
- Nina Stoletzki
- Ludwig-Maximilan Universität, Biocenter, Grosshadernerstr, 2, D-82152 Planegg-Martinsried, Germany.
|
41
|
Seo KW, Kim DH, Kim AH, Yoo HS, Lee KY, Jang YS. Characterization of Antigenic Determinants in ApxIIA Exotoxin Capable of Inducing Protective Immunity to Actinobacillus pleuropneumoniae Challenge. Immunol Invest 2011; 40:465-80. [DOI: 10.3109/08820139.2011.558151] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/29/2022]
|
42
|
Yang L, Gaut BS. Factors that contribute to variation in evolutionary rate among Arabidopsis genes. Mol Biol Evol 2011; 28:2359-69. [PMID: 21389272 DOI: 10.1093/molbev/msr058] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Indexed: 12/23/2022] Open
Abstract
Surprisingly, few studies have described evolutionary rate variation among plant nuclear genes, with little investigation of the causes of rate variation. Here, we describe evolutionary rates for 11,492 ortholog pairs between Arabidopsis thaliana and A. lyrata and investigate possible contributors to rate variation among these genes. Rates of evolution at synonymous sites vary along chromosomes, suggesting that mutation rates vary on genomic scales, perhaps as a function of recombination rate. Rates of evolution at nonsynonymous sites correlate most strongly with expression patterns, but they also vary as to whether a gene is duplicated and retained after a whole-genome duplication (WGD) event. WGD genes evolve more slowly, on average, than nonduplicated genes and non-WGD duplicates. We hypothesize that levels and patterns of expression are not only the major determinants that explain nonsynonymous rate variation among genes but also a critical determinant of gene retention after duplication.
Affiliation(s)
- Liang Yang
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, USA
|
43
|
Misawa K, Kikuno RF. Relationship between amino acid composition and gene expression in the mouse genome. BMC Res Notes 2011; 4:20. [PMID: 21272306 PMCID: PMC3038927 DOI: 10.1186/1756-0500-4-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/25/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, which is based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. FINDINGS We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of variation of expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. CONCLUSION These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan.
|
44
|
Abstract
Despite their name, synonymous mutations have significant consequences for cellular processes in all taxa. As a result, an understanding of codon bias is central to fields as diverse as molecular evolution and biotechnology. Although recent advances in sequencing and synthetic biology have helped to resolve longstanding questions about codon bias, they have also uncovered striking patterns that suggest new hypotheses about protein synthesis. Ongoing work to quantify the dynamics of initiation and elongation is as important for understanding natural synonymous variation as it is for designing transgenes in applied contexts.
Affiliation(s)
- Joshua B Plotkin
- Department of Biology and Program in Applied Mathematics and Computational Science, University of Pennsylvania, 433 South University Avenue, Philadelphia, Pennsylvania 19104, USA.
|
45
|
Li ZP, Ying DQ, Li P, Li F, Bo XC, Wang SQ. Analysis of synonymous codon usage bias in 09H1N1. Virol Sin 2010; 25:329-40. [PMID: 20960179 DOI: 10.1007/s12250-010-3123-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 01/05/2010] [Accepted: 04/30/2010] [Indexed: 11/29/2022] Open
Abstract
A novel subtype of influenza A virus 09H1N1 has rapidly spread across the world. Evolutionary analyses of this virus have revealed that 09H1N1 is a triple reassortant of segments from swine, avian and human influenza viruses. In this study, we investigated factors shaping the codon usage bias of 09H1N1 and carried out cluster analysis of 60 strains of influenza A virus from different subtypes based on their codon usage bias. We discovered that more preferentially used codons of 09H1N1 are A-ended or U-ended, and the intra-genomic codon usage bias of 09H1N1 is quite low. Base composition constraint, dinucleotide biases and translational selection are the main factors influencing the codon usage bias of 09H1N1. At the genome level, we find that the codon usage bias of 09H1N1 is similar to H1N1 (A/swine/Kansas/77778/2007H1N1), H9N2 from Asia, H1N2 from Asia and North America and H3N2 from North America. Our results provide insight for understanding the processes governing evolution, regulation of gene expression, and revealing the evolution of 09H1N1.
Affiliation(s)
- Zhen-Peng Li
- Beijing Institute of Radiation Medicine, Beijing, 100850, China
|
46
|
Plata G, Gottesman ME, Vitkup D. The rate of the molecular clock and the cost of gratuitous protein synthesis. Genome Biol 2010; 11:R98. [PMID: 20920270 PMCID: PMC2965390 DOI: 10.1186/gb-2010-11-9-r98] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 06/08/2010] [Revised: 09/03/2010] [Accepted: 09/29/2010] [Indexed: 01/05/2023] Open
Abstract
Background The nature of the protein molecular clock, the protein-specific rate of amino acid substitutions, is among the central questions of molecular evolution. Protein expression level is the dominant determinant of the clock rate in a number of organisms. It has been suggested that highly expressed proteins evolve slowly in all species mainly to maintain robustness to translation errors that generate toxic misfolded proteins. Here we investigate this hypothesis experimentally by comparing the growth rate of Escherichia coli expressing wild type and misfolding-prone variants of the LacZ protein. Results We show that the cost of toxic protein misfolding is small compared to other costs associated with protein synthesis. Complementary computational analyses demonstrate that there is also a relatively weaker, but statistically significant, selection for increasing solubility and polarity in highly expressed E. coli proteins. Conclusions Although we cannot rule out the possibility that selection against misfolding toxicity significantly affects the protein clock in species other than E. coli, our results suggest that it is unlikely to be the dominant and universal factor determining the clock rate in all organisms. We find that in this bacterium other costs associated with protein synthesis are likely to play an important role. Interestingly, our experiments also suggest significant costs associated with volume effects, such as jamming of the cellular environment with unnecessary proteins.
Affiliation(s)
- Germán Plata
- Center for Computational Biology and Bioinformatics, Columbia University, 1130 St Nicholas Ave, New York City, NY 10032, USA.
|
47
|
Panchin AY, Gelfand MS, Ramensky VE, Artamonova II. Asymmetric and non-uniform evolution of recently duplicated human genes. Biol Direct 2010; 5:54. [PMID: 20825637 PMCID: PMC2942815 DOI: 10.1186/1745-6150-5-54] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 08/27/2010] [Accepted: 09/08/2010] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Gene duplications are a source of new genes and protein functions. The innovative role of duplication events makes families of paralogous genes an interesting target for studies in evolutionary biology. Here we study global trends in the evolution of human genes that resulted from recent duplications. RESULTS The pressure of negative selection is weaker during a short time immediately after a duplication event. Roughly one fifth of genes in paralogous gene families are evolving asymmetrically: one of the proteins encoded by two closest paralogs accumulates amino acid substitutions significantly faster than its partner. This asymmetry cannot be explained by differences in gene expression levels. In asymmetric gene pairs the number of deleterious mutations is increased in one copy, while decreased in the other copy as compared to genes constituting non-asymmetrically evolving pairs. The asymmetry in the rate of synonymous substitutions is much weaker and not significant. CONCLUSIONS The increase of negative selection pressure over time after a duplication event seems to be a major trend in the evolution of human paralogous gene families. The observed asymmetry in the evolution of paralogous genes shows that in many cases one of two gene copies remains practically unchanged, while the other accumulates functional mutations. This supports the hypothesis that slowly evolving gene copies preserve their original functions, while fast evolving copies obtain new specificities or functions.
Affiliation(s)
- Alexander Y Panchin
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- Mikhail S Gelfand
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- Vasily E Ramensky
- V.A. Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, Moscow, 119991, Russia
- Irena I Artamonova
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- N.I. Vavilov Institute of General Genetics, Russian Academy of Science, Gubkina 3, Moscow, 119991, Russia
|
48
|
Begum T, Ghosh TC. Understanding the Effect of Secondary Structures and Aggregation on Human Protein Folding Class Evolution. J Mol Evol 2010; 71:60-9. [DOI: 10.1007/s00239-010-9364-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/18/2009] [Accepted: 06/23/2010] [Indexed: 12/01/2022]
|
49
|
Codon Usage Patterns in Corynebacterium glutamicum: Mutational Bias, Natural Selection and Amino Acid Conservation. Comp Funct Genomics 2010; 2010:343569. [PMID: 20445740 PMCID: PMC2860111 DOI: 10.1155/2010/343569] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 10/12/2009] [Revised: 01/29/2010] [Accepted: 02/04/2010] [Indexed: 11/17/2022]
Abstract
The alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, have been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C- and/or G-ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference between the two sets of genes was that C-ending codons are predominant in putatively highly expressed genes, suggesting that C-ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes tend to be located on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes in terms of synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors, such as gene length and hydrophobicity, that might influence codon usage and found their contributions to be weak.
|
50
|
Prat Y, Fromer M, Linial N, Linial M. Codon usage is associated with the evolutionary age of genes in metazoan genomes. BMC Evol Biol 2009; 9:285. [PMID: 19995431 PMCID: PMC2799417 DOI: 10.1186/1471-2148-9-285] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 08/21/2009] [Accepted: 12/08/2009] [Indexed: 11/21/2022]
Abstract
Background Codon usage may vary significantly between different organisms and between genes within the same organism. Several evolutionary processes have been postulated to be the predominant determinants of codon usage: selection, mutation, and genetic drift. However, the relative contribution of each of these factors in different species remains debatable. The availability of complete genomes for tens of multicellular organisms provides an opportunity to inspect the relationship between codon usage and the evolutionary age of genes. Results We assign an evolutionary age to a gene based on the relative positions of its identified homologues in a standard phylogenetic tree. This yields a classification of all genes in a genome to several evolutionary age classes. The present study starts from the observation that each age class of genes has a unique codon usage and proceeds to provide a quantitative analysis of the codon usage in these classes. This observation is made for the genomes of Homo sapiens, Mus musculus, and Drosophila melanogaster. It is even more remarkable that the differences between codon usages in different age groups exhibit similar and consistent behavior in various organisms. While we find that GC content and gene length are also associated with the evolutionary age of genes, they can provide only a partial explanation for the observed codon usage. Conclusion While factors such as GC content, mutational bias, and selection shape the codon usage in a genome, the evolutionary history of an organism over hundreds of millions of years is an overlooked property that is strongly linked to GC content, protein length, and, even more significantly, to the codon usage of metazoan genomes.
Affiliation(s)
- Yosef Prat
- Sudarsky Center for Computational Biology, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel.
|