1
Sefrji FO, Abulfaraj AA, Alshehrei FM, Al-Andal A, Alnahari AA, Tashkandi M, Baz L, Barqawi AA, Almutrafy AM, Alshareef SA, Alkhatib SN, Abuauf HW, Jalal RS, Aloufi AS. Comprehensive analysis of orthologous genes reveals functional dynamics and energy metabolism in the rhizospheric microbiome of Moringa oleifera. Funct Integr Genomics 2025; 25:82. [PMID: 40195156 PMCID: PMC11976380 DOI: 10.1007/s10142-025-01580-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2025] [Revised: 03/07/2025] [Accepted: 03/13/2025] [Indexed: 04/09/2025]
Abstract
Moringa oleifera, known for its nutritional and therapeutic properties, exhibits a complex relationship with its rhizospheric soil microbiome. This study aimed to elucidate the microbiome's structural composition, molecular functions, and its role in plant growth by integrating Clusters of Orthologous Genes (COG) analysis with enzymatic functions previously identified through KEGG, CAZy, and CARD databases. Metagenomic sequencing and bioinformatics analysis were performed from the rhizospheric soil microbiome of M. oleifera collected from the Mecca district in Saudi Arabia. The analysis revealed a role for the rhizospheric microbiome in energy production, storage, and regulation, with glucose serving as a crucial precursor for NADH synthesis and subsequent ATP production via oxidative phosphorylation. Key orthologous genes (OGs) implicated in this process include NuoD, NuoH, NuoM, NuoN, NuoL, atpA, QcrB/PetB, and AccC. Additionally, OGs involved in ATP hydrolysis, such as ClpP, EntF, YopO, and AtoC, were identified. Taxonomic analysis highlighted Actinobacteria and Proteobacteria as the predominant phyla, with enriched genera including Blastococcus, Nocardioides, Streptomyces, Microvirga, Sphingomonas, and Massilia, correlating with specific OGs involved in ATP hydrolysis. This study provides insights into the molecular mechanisms underpinning plant-microbe interactions and highlights the multifaceted roles of ATP-dependent processes in the rhizosphere. Further research is recommended to explore the potential applications of these findings in sustainable agriculture and ecosystem management.
Affiliation(s)
- Fatmah O Sefrji
- Department of Biology, College of Science, Taibah University, Madinah, 42353, Saudi Arabia
- Aala A Abulfaraj
- Biological Sciences Department, College of Science & Arts, King Abdulaziz University, Rabigh, 21911, Saudi Arabia
- Fatimah M Alshehrei
- Department of Biology, Jumum College University, Umm Al-Qura University, P.O. Box 7388, Makkah, 21955, Saudi Arabia
- Abeer Al-Andal
- Department of Biology, College of Science, King Khalid University, Abha, 61413, Saudi Arabia
- Alaa A Alnahari
- Department of Biological Sciences, College of Science, University of Jeddah, Jeddah, 21493, Saudi Arabia
- Manal Tashkandi
- Department of Biological Sciences, College of Science, University of Jeddah, Jeddah, 21493, Saudi Arabia
- Lina Baz
- Department of Biochemistry, Faculty of Science, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
- Aminah A Barqawi
- Department of Chemistry, Al-Leith University College, Umm Al Qura University, Makkah, Saudi Arabia
- Abeer M Almutrafy
- Department of Biology, College of Science, Taibah University, Madinah, 42353, Saudi Arabia
- Sahar A Alshareef
- Department of Biological Sciences, College of Science, University of Jeddah, Jeddah, 21493, Saudi Arabia
- Shaza N Alkhatib
- Department of Biological Sciences, College of Science, University of Jeddah, Jeddah, 21493, Saudi Arabia
- Haneen W Abuauf
- Department of Biology, Faculty of Applied Science, Umm Al-Qura University, Makkah, 24381, Saudi Arabia
- Rewaa S Jalal
- Department of Biological Sciences, College of Science, University of Jeddah, Jeddah, 21493, Saudi Arabia
- Abeer S Aloufi
- Department of Biology, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia.
2
Kumar U, Singhal S, Khan AA, Alanazi AM, Gurjar P, Khandia R. Insights into genetic architecture and disease associations of genes associated with different human blood group systems using codon usage bias. J Biomol Struct Dyn 2025:1-21. [PMID: 39988946 DOI: 10.1080/07391102.2025.2466710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Accepted: 11/13/2024] [Indexed: 02/25/2025]
Abstract
The differential use of synonymous codons of an amino acid is an imperative evolutionary phenomenon, termed codon usage bias, that functions across various levels of organisms. It is accustomed to providing an understanding of a gene's differential architecture driven by functional regulation of gene expression. Numerous synonymous mutations are linked to various diseases, demonstrating that silent mutations can be deleterious. We employed bioinformatics methods to examine codon usage trends in 263 coding sequences of 44 blood group systems. The blood group systems were categorized into two groups based on association with a sort of neurodegenerative disorder. We performed a CUB study to investigate how multiple components, such as selection, mutation and biased nucleotide composition are accountable for the evolution of the transcripts of the blood group antigens. The compositional analysis implicated blood group genes were GC-rich. RSCU analysis showed G/C-ending codon choice among synonymous codons. Also, a distinct codon choice was found in both blood groups for serine and proline. Moreover, the leucine-coding CTG codon was found the most overrepresented in all the genes, indicating selectional pressure substantially impacts overall codon usage. This was also supported by biplot analysis. Additionally, CpC and GpG overrepresentation is in concordance with the results concerning neurodegenerative disorders where CpC has been attributed to non-CpG methylation and linked to several neurodegenerative ailments. Both the Z-test analysis and rare codon choice showed a substantial difference in codon usage by the genes of both groups.
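The RSCU analysis mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the standard definition of relative synonymous codon usage (a codon's observed count divided by the mean count over that amino acid's synonymous codons) and the standard genetic code. The function name `rscu` and the toy sequence are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequences):
    """Return {codon: RSCU} pooled over in-frame coding sequences.

    Hypothetical helper: stop codons and single-codon amino acids (Met, Trp)
    are skipped, as are amino acids that never occur in the input.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # ignores codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families, one family per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        mean = total / len(codons)  # expected count under uniform usage
        for c in codons:
            values[c] = counts[c] / mean
    return values


if __name__ == "__main__":
    toy = ["CTGCTGCTGCTA"]  # four leucine codons: 3x CTG, 1x CTA
    for codon, value in sorted(rscu(toy).items()):
        print(codon, round(value, 2))
```

In this toy input leucine has six synonymous codons and four observations, so CTG scores 4.5 and CTA 1.5, while the unused leucine codons score 0. Values above 1 indicate a codon used more often than expected under uniform synonymous usage.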
Affiliation(s)
- Utsang Kumar
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, India
- Shailja Singhal
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, India
- Azmat Ali Khan
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Amer M Alanazi
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Pankaj Gurjar
- Centre for Global Health Research, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
- Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, Australia
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, India
3
Zaytsev K, Bogatyreva N, Fedorov A. Link Between Individual Codon Frequencies and Protein Expression: Going Beyond Codon Adaptation Index. Int J Mol Sci 2024; 25:11622. [PMID: 39519173 PMCID: PMC11546221 DOI: 10.3390/ijms252111622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 10/21/2024] [Accepted: 10/26/2024] [Indexed: 11/16/2024]
Abstract
An important role of a particular synonymous codon composition of a gene in its expression level is well known. There are a number of algorithms optimizing codon usage of recombinant genes to maximize their expression in host cells. Nevertheless, the underlying mechanism remains unsolved and is of significant relevance. In the realm of modern biotechnology, directing protein production to a specific level is crucial for metabolic engineering, genome rewriting and a growing number of other applications. In this study, we propose two new simple statistical and empirical methods for predicting the protein expression level from the nucleotide sequence of the corresponding gene: Codon Expression Index Score (CEIS) and Codon Productivity Score (CPS). Both of these methods are based on the influence of each individual codon in the gene on the overall expression level of the encoded protein and the frequencies of isoacceptors in the species. Our predictions achieve a correlation level of up to r = 0.7 with experimentally measured quantitative proteome data of Escherichia coli, which is superior to any previously proposed methods. Our work helps understand how codons determine protein abundances. Based on these methods, it is possible to design proteins optimized for expression in a particular organism.
Affiliation(s)
- Alexey Fedorov
- Bach Institute of Biochemistry, Federal Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
4
Fan K, Li Y, Chen Z, Fan L. GenRCA: a user-friendly rare codon analysis tool for comprehensive evaluation of codon usage preferences based on coding sequences in genomes. BMC Bioinformatics 2024; 25:309. [PMID: 39333857 PMCID: PMC11438159 DOI: 10.1186/s12859-024-05934-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 09/17/2024] [Indexed: 09/30/2024]
Abstract
BACKGROUND The study of codon usage bias is important for understanding gene expression, evolution and gene design, providing critical insights into the molecular processes that govern the function and regulation of genes. Codon Usage Bias (CUB) indices are valuable metrics for understanding codon usage patterns across different organisms without extensive experiments. Considering that there is no one-fits-all index for all species, a comprehensive platform supporting the calculation and analysis of multiple CUB indices for codon optimization is greatly needed. RESULTS Here, we release GenRCA, an updated version of our previous Rare Codon Analysis Tool, as a free and user-friendly website for all-inclusive evaluation of codon usage preferences of coding sequences. In this study, we manually reviewed and implemented up to 31 codon preference indices, with 65 expression host organisms covered and batch processing of multiple gene sequences supported, aiming to improve the user experience and provide more comprehensive and efficient analysis. CONCLUSIONS Our website fills a gap in the availability of comprehensive tools for species-specific CUB calculations, enabling researchers to thoroughly assess the protein expression level based on a comprehensive list of 31 indices and further guide the codon optimization.
Affiliation(s)
- Kunjie Fan
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai, China
- Yuanyuan Li
- Production and R&D Center I of LSS, GenScript Biotech Corporation, Nanjing, China
- Zhiwei Chen
- Production and R&D Center I of LSS, GenScript Biotech Corporation, Nanjing, China
- Long Fan
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai, China.
5
Chen H, Jiang S, Xu K, Ding Z, Wang J, Cao M, Yuan J. Design of Thermoresponsive Genetic Controls with Minimal Heat-Shock Response. ACS Synth Biol 2024; 13:3032-3040. [PMID: 39150992 DOI: 10.1021/acssynbio.4c00236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2024]
Abstract
As temperature serves as a versatile input signal, thermoresponsive genetic controls have gained significant interest for recombinant protein production and metabolic engineering applications. The conventional thermoresponsive systems normally require the continuous exposure of heat stimuli to trigger the prolonged expression of targeted genes, and the accompanied heat-shock response is detrimental to the bioproduction process. In this study, we present the design of thermoresponsive quorum-sensing (ThermoQS) circuits to make Escherichia coli record transient heat stimuli. By conversion of the heat input into the accumulation of quorum-sensing molecules such as acyl-homoserine lactone derived from Pseudomonas aeruginosa, sustained gene expressions were achieved by a minimal heat stimulus. Moreover, we also demonstrated that we reprogrammed the E. coli Lac operon to make it respond to heat stimuli with an impressive signal-to-noise ratio (S/N) of 15.3. Taken together, we envision that the ThermoQS systems reported in this study are expected to remarkably diminish both design and experimental expenditures for future metabolic engineering applications.
Affiliation(s)
- Haofeng Chen
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Shan Jiang
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Kaixuan Xu
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Ziyu Ding
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Jiangkai Wang
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Mingfeng Cao
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Key Laboratory for Synthetic Biotechnology of Xiamen City, Xiamen University, Xiamen 361005, China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, China
- Jifeng Yuan
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Shenzhen Research Institute of Xiamen University, Shenzhen 518057, China
6
Bergman S, Tuller T. Codon usage and expression-based features significantly improve prediction of CRISPR efficiency. NPJ Syst Biol Appl 2024; 10:100. [PMID: 39227603 PMCID: PMC11372048 DOI: 10.1038/s41540-024-00431-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 08/27/2024] [Indexed: 09/05/2024]
Abstract
CRISPR is a precise and effective genome editing technology; but despite several advancements during the last decade, our ability to computationally design gRNAs remains limited. Most predictive models have relatively low predictive power and utilize only the sequence of the target site as input. Here we suggest a new category of features, which incorporate the target site genomic position and the presence of genes close to it. We calculate four features based on gene expression and codon usage bias indices. We show, on CRISPR datasets taken from 3 different cell types, that such features perform comparably with 425 state-of-the-art predictive features, ranking in the top 2-12% of features. We trained new predictive models, showing that adding expression features to them significantly improves their r2 by up to 0.04 (relative increase of 39%), achieving average correlations of up to 0.38 on their validation sets; and that these features are deemed important by different feature importance metrics. We believe that incorporating the target site's position, in addition to its sequence, in features such as we have generated here will improve our ability to predict, design and understand CRISPR experiments going forward.
Affiliation(s)
- Shaked Bergman
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- Tamir Tuller
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel.
- The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel.
7
Paremskaia AI, Kogan AA, Murashkina A, Naumova DA, Satish A, Abramov IS, Feoktistova SG, Mityaeva ON, Deviatkin AA, Volchkov PY. Codon-optimization in gene therapy: promises, prospects and challenges. Front Bioeng Biotechnol 2024; 12:1371596. [PMID: 38605988 PMCID: PMC11007035 DOI: 10.3389/fbioe.2024.1371596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 03/19/2024] [Indexed: 04/13/2024]
Abstract
Codon optimization has evolved to enhance protein expression efficiency by exploiting the genetic code's redundancy, allowing for multiple codon options for a single amino acid. Initially observed in E. coli, optimal codon usage correlates with high gene expression, which has propelled applications expanding from basic research to biopharmaceuticals and vaccine development. The method is especially valuable for adjusting immune responses in gene therapies and has the potential to create tissue-specific therapies. However, challenges persist, such as the risk of unintended effects on protein function and the complexity of evaluating optimization effectiveness. Despite these issues, codon optimization is crucial in advancing gene therapeutics. This study provides a comprehensive review of the current metrics for codon-optimization, and its practical usage in research and clinical applications, in the context of gene therapy.
Affiliation(s)
- Anastasiia Iu Paremskaia
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Anna A. Kogan
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Anastasiia Murashkina
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Daria A. Naumova
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Anakha Satish
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Ivan S. Abramov
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- The MCSC named after A. S. Loginov, Moscow, Russia
- Sofya G. Feoktistova
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Olga N. Mityaeva
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Andrei A. Deviatkin
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- Pavel Yu Volchkov
- Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia
- The MCSC named after A. S. Loginov, Moscow, Russia
8
Alonso AM, Diambra L. Dicodon-based measures for modeling gene expression. Bioinformatics 2023; 39:btad380. [PMID: 37307098 PMCID: PMC10287933 DOI: 10.1093/bioinformatics/btad380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 05/20/2023] [Accepted: 06/09/2023] [Indexed: 06/14/2023]
Abstract
MOTIVATION Codon usage preference patterns have been associated with modulation of translation efficiency, protein folding, and mRNA decay. However, new studies support that codon pair usage has also a remarkable effect at the gene expression level. Here, we expand the concept of CAI to answer if codon pair usage patterns can be understood in terms of codon usage bias, or if they offer new information regarding coding translation efficiency. RESULTS Through the implementation of a weighting strategy to consider the dicodon contributions, we observe that the dicodon-based measure has greater correlations with gene expression level than CAI. Interestingly, we have noted that dicodons associated with a low value of adaptiveness are related to dicodons which mediate strong translational inhibition in yeast. We have also noticed that some codon-pairs have a smaller dicodon contribution than estimated by the product of the respective codon contributions. AVAILABILITY AND IMPLEMENTATION Scripts, implemented in Python, are freely available for download at https://zenodo.org/record/7738276#.ZBIDBtLMIdU.
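For orientation, the classic codon-level CAI that this abstract takes as its starting point can be sketched as follows. This is not the dicodon-based measure from the paper (whose weighting strategy is not described here), only a hedged illustration of the baseline: each codon's relative adaptiveness is its count in a reference gene set divided by the count of the most frequent synonymous codon, and CAI is the geometric mean of those weights over a gene. The function names, the toy reference set, and the 0.5 pseudocount for unseen codons are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Split an in-frame sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Weight each codon by its reference count relative to the best synonym.

    Assumption: codons absent from the reference set are given `pseudocount`
    so that their weight stays non-zero and the logarithm is defined.
    """
    counts = Counter(
        c for s in reference_seqs for c in split_codons(s) if c in CODON_TABLE
    )
    weights = {}
    # Met, Trp, and stop codons carry no synonymous-choice information.
    for aa in set(CODON_TABLE.values()) - {"*", "M", "W"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        adjusted = {c: max(counts[c], pseudocount) for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of codon weights over the scorable codons of `seq`."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no scorable codons")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    w = relative_adaptiveness(["CTGCTGCTGCTA"])  # toy reference set
    print(round(cai("CTGCTG", w), 3))  # gene using only the preferred codon
    print(round(cai("CTGCTA", w), 3))  # gene mixing preferred and rarer codons
```

A dicodon variant would, under the same assumptions, count adjacent codon pairs instead of single codons. The abstract's observation that some pairs contribute less than the product of their single-codon contributions is what such a comparison would probe.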
Affiliation(s)
- Andres M Alonso
- Instituto Tecnológico Chascomús (INTECH), CONICET-UNSAM, Intendente Marino km 8.2, Chascomús, 7130 Provincia de Buenos Aires, Argentina
- CCT-La Plata, CONICET, Calle 8 Nº 1467, La Plata, B1904CMC Provincia de Buenos Aires, Argentina
- Luis Diambra
- CCT-La Plata, CONICET, Calle 8 Nº 1467, La Plata, B1904CMC Provincia de Buenos Aires, Argentina
- Centro Regional de Estudios Genómicos, FCE-UNLP, Blvd 120 Nº 1461, La Plata, 1900 Provincia de Buenos Aires, Argentina
9
Michel CJ, Sereni JS. Reading Frame Retrieval of Genes: A New Parameter of Codon Usage Based on the Circular Code Theory. Bull Math Biol 2023; 85:24. [PMID: 36826719 PMCID: PMC9950712 DOI: 10.1007/s11538-023-01129-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2022] [Accepted: 01/26/2023] [Indexed: 02/25/2023]
Abstract
Based on the circular code theory, we define a new function f that quantifies the property of reading frame retrieval (RFR) of genes from their codon usage. This RFR function f is computed on a massive scale in genes of genomes of bacteria, eukaryotes and archaea. By expressing f as a function of the mean number [Formula: see text] of codons per gene, a "universal" property is identified, whatever the kingdom: the reading frame retrieval is enhanced in large genes. By investigating this property according to the theory developed, a Spearman's rank correlation with a strong negative coefficient is observed between the codon usage dispersion d (from the uniform codon distribution [Formula: see text]) and the RFR function f, whatever the kingdom (p-values [Formula: see text] in bacteria, [Formula: see text] in eukaryotes and [Formula: see text] in archaea). Thus, the reading frame retrieval is enhanced with the codon usage dispersion. Furthermore, this approach identifies a "genome centre" from which emerge two distinct "genome arms": an upper arm and a lower arm, respectively, above and below the linear regression. The RFR function by itself or combined with classical methods (alignment, phylogeny) could also be a new approach to classify the genomes in the future.
Affiliation(s)
- Christian J. Michel
- Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France (GRID: grid.11843.3f; ISNI: 0000 0001 2157 9291)
- Jean-Sébastien Sereni
- Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France (GRID: grid.11843.3f; ISNI: 0000 0001 2157 9291)
10
Panda A, Tuller T. Determinants of associations between codon and amino acid usage patterns of microbial communities and the environment inferred based on a cross-biome metagenomic analysis. NPJ Biofilms Microbiomes 2023; 9:5. [PMID: 36693851 PMCID: PMC9873608 DOI: 10.1038/s41522-023-00372-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 01/11/2023] [Indexed: 01/25/2023]
Abstract
Codon and amino acid usage were associated with almost every aspect of microbial life. However, how the environment may impact the codon and amino acid choice of microbial communities at the habitat level is not clearly understood. Therefore, in this study, we analyzed codon and amino acid usage patterns of a large number of environmental samples collected from diverse ecological niches. Our results suggested that samples derived from similar environmental niches, in general, show overall similar codon and amino acid distribution as compared to samples from other habitats. To substantiate the relative impact of the environment, we considered several factors, such as their similarity in GC content, or in functional or taxonomic abundance. Our analysis demonstrated that none of these factors can fully explain the trends that we observed at the codon or amino acid level implying a direct environmental influence on them. Further, our analysis demonstrated different levels of selection on codon bias in different microbial communities with the highest bias in host-associated environments such as the digestive system or oral samples and the lowest level of selection in soil and water samples. Considering a large number of metagenomic samples here we showed that microorganisms collected from similar environmental backgrounds exhibit similar patterns of codon and amino acid usage irrespective of the location or time from where the samples were collected. Thus our study suggested a direct impact of the environment on codon and amino usage of microorganisms that cannot be explained considering the influence of other factors.
Affiliation(s)
- Arup Panda
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel
- Tamir Tuller
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel.
11
Sahoo S, Rakshit R. The pattern of coding sequences in the chloroplast genome of Atropa belladonna and a comparative analysis with other related genomes in the nightshade family. Genomics Inform 2022; 20:e43. [PMID: 36617650 PMCID: PMC9847383 DOI: 10.5808/gi.22045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 12/12/2022] [Indexed: 12/31/2022]
Abstract
Atropa belladonna is a valuable medicinal plant and a commercial source of tropane alkaloids, which are frequently utilized in therapeutic practice. In this study, bioinformatic methodologies were used to examine the pattern of coding sequences and the factors that might influence codon usage bias in the chloroplast genome of Atropa belladonna and other nightshade genomes. Because chloroplast engineering is a promising field in modern biotechnology, the characterization of the chloroplast genome is very important. The results revealed that the chloroplast genomes of Nicotiana tabacum, Solanum lycopersicum, Capsicum frutescens, Datura stramonium, Lycium barbarum, Solanum melongena, and Solanum tuberosum exhibited comparable codon usage patterns. In these chloroplast genomes, we observed a weak codon usage bias. According to the correspondence analysis, the genesis of the codon use bias in these chloroplast genes might be explained by natural selection, directed mutational pressure, and other factors. GC12 and GC3S were shown to have no meaningful relationship. Further research revealed that natural selection primarily shaped the codon usage in A. belladonna and other nightshade genomes for translational efficiency. The sequence properties of these chloroplast genomes were also investigated by examining the occurrences of palindromes and inverted repeats, which would be useful for future research on medicinal plants.
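The GC12 versus GC3S comparison mentioned above rests on position-specific GC content. The sketch below is a simplified illustration, assuming an in-frame coding sequence: it computes the GC fraction at each codon position and averages the first two as GC12. It reports plain GC3 over all codons, whereas GC3S as used in the abstract is restricted to synonymous third positions, so this is an approximation. The function name `gc_by_position` is hypothetical.

```python
def gc_by_position(seq):
    """Return GC fractions at codon positions 1, 2, 3 and their GC12 average.

    Assumes `seq` is an in-frame coding sequence; a trailing partial codon
    is ignored. GC3 here covers every codon, not only synonymous sites.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")

    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this codon position
        fractions.append(sum(b in "GC" for b in bases) / len(bases))

    gc1, gc2, gc3 = fractions
    return {"GC1": gc1, "GC2": gc2, "GC3": gc3, "GC12": (gc1 + gc2) / 2}


if __name__ == "__main__":
    # Toy sequence with codons ATG, GCC, TTA.
    for name, value in gc_by_position("ATGGCCTTA").items():
        print(name, round(value, 3))
```

Computing these values per gene and regressing GC12 on GC3 gives the usual neutrality plot. A weak relationship, as the abstract reports, is conventionally read as selection outweighing mutational pressure.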
Affiliation(s)
- Satyabrata Sahoo
- Department of Physics, Dhruba Chand Halder College, Dakshin Barasat 743372, India (corresponding author)
- Ria Rakshit
- Department of Botany, Baruipur College, Baruipur 743610, India
12
Almutairi MM, Almotairy HM. Analysis of Heat Shock Proteins Based on Amino Acids for the Tomato Genome. Genes (Basel) 2022; 13:2014. [PMID: 36360251 PMCID: PMC9690137 DOI: 10.3390/genes13112014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2022] [Revised: 10/30/2022] [Accepted: 10/31/2022] [Indexed: 10/28/2023]
Abstract
This research aimed to investigate heat shock proteins in the tomato genome through the analysis of amino acids. The highest length among sequences was found in seq19 with 3534 base pairs. This seq19 was reported and contained a family of proteins known as HsfA that have a domain of transcriptional activation for tolerance to heat and other abiotic stresses. The values of the codon adaptation index (CAI) ranged from 0.80 in Seq19 to 0.65 in Seq10, based on the mRNA of heat shock proteins for tomatoes. Asparagine (AAT, AAC), aspartic acid (GAT, GAC), phenylalanine (TTT, TTC), and tyrosine (TAT, TAC) have relative synonymous codon usage (RSCU) values bigger than 0.5. In modified relative codon bias (MRCBS), the high gene expressions of the amino acids under heat stress were histidine, tryptophan, asparagine, aspartic acid, lysine, phenylalanine, isoleucine, cysteine, and threonine. RSCU values that were less than 0.5 were considered rare codons that affected the rate of translation, and thus selection could be effective by reducing the frequency of expressed genes under heat stress. The normal distribution of RSCU shows about 68% of the values drawn from the standard normal distribution were within 0.22 and -0.22 standard deviations that tend to cluster around the mean. The most critical component based on principal component analysis (PCA) was the RSCU. These findings would help plant breeders in the development of growth habits for tomatoes during breeding programs.
Affiliation(s)
- Meshal M. Almutairi
- National Center of Agricultural Technology, Sustainability and Environment, King Abdulaziz City for Science and Technology KACST, Box 6086, Riyadh 11442, Saudi Arabia
13
Hugaboom M, Hatmaker EA, LaBella AL, Rokas A. Evolution and codon usage bias of mitochondrial and nuclear genomes in Aspergillus section Flavi. G3 (Bethesda, Md.) 2022; 13:6777267. [PMID: 36305682 PMCID: PMC9836360 DOI: 10.1093/g3journal/jkac285] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/23/2022] [Accepted: 10/24/2022] [Indexed: 11/06/2022]
Abstract
The fungal genus Aspergillus contains a diversity of species divided into taxonomic sections of closely related species. Section Flavi contains 33 species, many of industrial, agricultural, or medical relevance. Here, we analyze the mitochondrial genomes (mitogenomes) of 20 Flavi species (including 18 newly assembled mitogenomes) and compare their evolutionary history and codon usage bias patterns to their nuclear counterparts. Codon usage bias refers to variable frequencies of synonymous codons in coding DNA and is shaped by a balance of neutral processes and natural selection. All mitogenomes were circular DNA molecules with highly conserved gene content and order. As expected, genomic content, including GC content, and genome size differed greatly between mitochondrial and nuclear genomes. Phylogenetic analysis based on 14 concatenated mitochondrial genes predicted evolutionary relationships largely consistent with those predicted by a phylogeny constructed from 2,422 nuclear genes. Comparing similarities in interspecies patterns of codon usage bias between mitochondrial and nuclear genomes showed that species grouped differently by patterns of codon usage bias depending on whether analyses were performed using mitochondrial or nuclear relative synonymous usage values. We found that patterns of codon usage bias at gene level are more similar between mitogenomes of different species than the mitogenome and nuclear genome of the same species. Finally, we inferred that, although most genes, both nuclear and mitochondrial, deviated from the neutral expectation for codon usage, mitogenomes were not under translational selection while nuclear genomes were under moderate translational selection. These results contribute to the study of mitochondrial genome evolution in filamentous fungi.
Affiliation(s)
- Miya Hugaboom
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Elizabeth Anne Hatmaker
- Corresponding author: Department of Biological Sciences, Vanderbilt University, VU Station B 35-1364, Nashville, TN 37235, USA. (AH)
- Abigail L LaBella
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Antonis Rokas
- Corresponding author: Department of Biological Sciences, Vanderbilt University, VU Station B 35-1364, Nashville, TN 37235, USA. (AR)
|
14
|
Bansal S, Mallikarjuna MG, Balamurugan A, Nayaka SC, Prakash G. Composition and Codon Usage Pattern Results in Divergence of the Zinc Binuclear Cluster (Zn(II)2Cys6) Sequences among Ascomycetes Plant Pathogenic Fungi. J Fungi (Basel) 2022; 8:1134. [PMID: 36354901 PMCID: PMC9694491 DOI: 10.3390/jof8111134] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/25/2022] [Revised: 10/22/2022] [Accepted: 10/23/2022] [Indexed: 07/29/2023]
Abstract
Zinc binuclear cluster proteins (ZBC; Zn(II)2Cys6) are unique to the fungal kingdom and associated with a series of functions, viz., the utilization of macromolecules, stress tolerance, and most importantly, host-pathogen interactions by imparting virulence to the pathogen. Codon usage bias (CUB) is the phenomenon of using synonymous codons in a non-uniform fashion during translation, which has arisen because of interactions among evolutionary forces. The Zn(II)2Cys6 coding sequences from nine Ascomycetes plant pathogenic species and the model yeast system were analysed for compositional and codon usage bias patterns. The clustering analysis separated the Ascomycetes fungi into two clusters. The nucleotide compositional and relative synonymous codon usage (RSCU) analysis indicated a GC bias in Ascomycetes fungi compared with the model system S. cerevisiae, which tends to be AT-rich. Further, plant pathogenic Ascomycetes fungi belonging to cluster-2 showed a higher number of GC-rich high-frequency codons than cluster-1, whereas high-frequency codons were exclusively AT-rich in S. cerevisiae. The current investigation also showed the mutual effect of the two evolutionary forces, viz. natural selection and compositional constraints, on the CUB of Zn(II)2Cys6 genes. The persistence of GC-rich codons of Zn(II)2Cys6 in Ascomycetes could facilitate the invasion process. The findings of the current investigation show the role of CUB and nucleotide composition in the evolutionary divergence of Ascomycetes plant pathogens and pave the way to target specific codons and sequences to modulate host-pathogen interactions through genome editing and functional genomics tools.
Affiliation(s)
- Shilpi Bansal
- Division of Plant Pathology, ICAR-Indian Agricultural Research Institute, New Delhi 110012, India
- Alexander Balamurugan
- Division of Plant Pathology, ICAR-Indian Agricultural Research Institute, New Delhi 110012, India
- S. Chandra Nayaka
- Department of Studies in Applied Botany and Biotechnology, University of Mysore, Mysore 570005, India
- Ganesan Prakash
- Division of Plant Pathology, ICAR-Indian Agricultural Research Institute, New Delhi 110012, India
|
15
|
Mostafa Anwar A, Khodary SM, Soudy M, Ahmed EA, Osama A, Ezzeldin S, Tanios A, Mahgoub S, Magdeldin S. WITHDRAWN: Robust method for calculating the tRNA adaptation index utilizing the genetic algorithm. Comput Struct Biotechnol J 2021. [DOI: 10.1016/j.csbj.2021.12.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
16
|
Liu C, Yuan J, Zhang X, Jin S, Li F, Xiang J. tRNA copy number and codon usage in the sea cucumber genome provide insights into adaptive translation for saponin biosynthesis. Open Biol 2021; 11:210190. [PMID: 34753322 PMCID: PMC8580430 DOI: 10.1098/rsob.210190] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
Genomic tRNA copy numbers determine cytoplasmic tRNA abundances, which in turn influence translation efficiency, but the underlying mechanism is not well understood. Using the sea cucumber Apostichopus japonicus as a model, we combined genomic sequence, transcriptome expression and ecological food resource data to study its codon usage adaptation. The results showed that, unlike intragenic non-coding RNAs, transfer RNAs (tRNAs) tended to be transcribed independently. This may be attributed to their specific Pol III promoters that lack transcriptional regulation, which may underlie the correlation between genomic copy number and cytoplasmic abundance of tRNAs. Moreover, codon usage optimization was mostly restrained by a gene's amino acid sequence, which might be a compromise between functionality and translation efficiency. Genes for stress responses were highly optimized for most echinoderms, while enzymes for saponin biosynthesis (LAS, CYPs and UGTs) were especially optimized in sea cucumbers, which might promote saponin synthesis as a defence strategy. The genomic tRNA content of A. japonicus was positively correlated with amino acid content in its natural food particles, which should promote its efficiency in protein synthesis. We propose that coevolution between genomic tRNA content and codon usage of sea cucumbers facilitates their saponin synthesis and survival using food resources with low nutrient content.
Affiliation(s)
- Chengzhang Liu
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
- Jianbo Yuan
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
- Xiaojun Zhang
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
- Songjun Jin
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
- Fuhua Li
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
- Jianhai Xiang
- CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, People's Republic of China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, People's Republic of China
|
17
|
Guo H, Xu N, Prell M, Königs H, Hermanns-Sachweh B, Lüscher B, Kappes F. Bacterial Growth Inhibition Screen (BGIS): harnessing recombinant protein toxicity for rapid and unbiased interrogation of protein function. FEBS Lett 2021; 595:1422-1437. [PMID: 33704777 DOI: 10.1002/1873-3468.14072] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/02/2021] [Revised: 02/28/2021] [Accepted: 03/02/2021] [Indexed: 12/13/2022]
Abstract
In two proof-of-concept studies, we established and validated the Bacterial Growth Inhibition Screen (BGIS), which explores recombinant protein toxicity in Escherichia coli as a largely overlooked and alternative means for basic characterization of functional eukaryotic protein domains. By applying BGIS, we identified an unrecognized RNA-interacting domain in the DEK oncoprotein (this study) and successfully combined BGIS with random mutagenesis as a screening tool for loss-of-function mutants of the DNA modulating domain of DEK [1]. Collectively, our findings shed new light on the phenomenon of recombinant protein toxicity in E. coli. Given the easy and rapid implementation and wide applicability, BGIS will extend the repertoire of basic methods for the identification, analysis and unbiased manipulation of proteins.
Affiliation(s)
- Haihong Guo
- Institute for Biochemistry and Molecular Biology, Medical School, RWTH Aachen University, Germany
- Nengwei Xu
- Department of Biological Sciences, Suzhou Dushu Lake Science and Education Innovation District, Suzhou Industrial Park, Xi'an Jiaotong-Liverpool University, Suzhou, China
- Malte Prell
- Institute for Biochemistry and Molecular Biology, Medical School, RWTH Aachen University, Germany
- Hiltrud Königs
- Institute of Pathology, Medical School, RWTH Aachen University, Germany
- Bernhard Lüscher
- Institute for Biochemistry and Molecular Biology, Medical School, RWTH Aachen University, Germany
- Ferdinand Kappes
- Institute for Biochemistry and Molecular Biology, Medical School, RWTH Aachen University, Germany
- Department of Biological Sciences, Suzhou Dushu Lake Science and Education Innovation District, Suzhou Industrial Park, Xi'an Jiaotong-Liverpool University, Suzhou, China
|
18
|
Almutairi MM. Analysis of chromosomes and nucleotides in rice to predict gene expression through codon usage pattern. Saudi J Biol Sci 2021; 28:4569-4574. [PMID: 34354442 PMCID: PMC8325026 DOI: 10.1016/j.sjbs.2021.04.059] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/07/2020] [Revised: 04/07/2021] [Accepted: 04/21/2021] [Indexed: 11/03/2022]
Abstract
Amino acids are essential measures for the potential growth stage because of their connection to protein structures and functions. The objective of this paper was to analyze chromosome features in the plastid region of rice, represented by nucleotide, synonymous codon, and amino acid usage, to predict gene expression through codon usage pattern. The results showed that the values of the codon adaptation index ranged from 0.733 in chromosome 9 to 0.631 in chromosome 8, with full lengths of 3738 and 1635, respectively. The highest guanine and cytosine content was 60% in chromosome 9, while the lowest was 37% in chromosome 11. Eight chromosomes (ch1, ch2, ch3, ch5, ch7, ch8, ch10, and ch12) had modified relative codon bias values above the threshold (0.66), especially for cysteine in ch1, ch2, ch5, ch10, and ch12, while the remaining chromosomes were below the threshold. Relative synonymous codon usage analysis found that the over-represented amino acids were asparagine, aspartate, cysteine, glutamate, and phenylalanine across all 12 chromosomes. These results would establish a platform for further projects concerning rice breeding and genetics and codon optimization in the amino acids for developing varieties. These results will also help breeders to select desirable genes across the genome to improve target traits.
Affiliation(s)
- Meshal M Almutairi
- National Center of Agricultural and Technology, King Abdulaziz City for Science and Technology (KACST), P.O. Box 6086, Riyadh 11442, Saudi Arabia
|
19
|
Bahiri-Elitzur S, Tuller T. Codon-based indices for modeling gene expression and transcript evolution. Comput Struct Biotechnol J 2021; 19:2646-2663. [PMID: 34025951 PMCID: PMC8122159 DOI: 10.1016/j.csbj.2021.04.042] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 02/14/2021] [Revised: 04/17/2021] [Accepted: 04/18/2021] [Indexed: 11/21/2022]
Abstract
Codon usage bias (CUB) refers to the phenomenon that synonymous codons are used at different frequencies in most genes and organisms. The general assumption is that codon biases reflect a balance between mutational biases and natural selection. Today we understand that the codon content is related to, and can affect, all gene expression steps. Starting from the 1980s, codon-based indices have been used for answering different questions in all biomedical fields, including systems biology, agriculture, medicine, and biotechnology. In general, codon usage bias indices weigh each codon or a small set of codons to estimate the fitting of a certain coding sequence to a certain phenomenon (e.g., bias in codons, adaptation to the tRNA pool, frequencies of certain codons, transcription elongation speed, etc.) and are usually easy to implement. Today there are dozens of such indices; thus, this paper aims to review and compare the different codon usage bias indices, their applications, and advantages. In addition, we perform analysis that demonstrates that most indices tend to correlate even though they aim to capture different aspects. Due to the centrality of codon usage bias on different gene expression steps, it is important to keep developing new indices that can capture additional aspects that are not modeled with the current indices.
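As an illustration of how a per-codon index of this kind can be computed, the sketch below calculates relative synonymous codon usage (RSCU), the ratio of a codon's observed count to the count expected if all synonymous codons for that amino acid were used equally. It is a minimal sketch and not code from the cited review: it assumes the standard genetic code, and the function name `rscu` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence.

    RSCU = observed codon count / (amino acid total / number of synonymous codons).
    Stop codons, single-codon amino acids (Met, Trp) and amino acids absent
    from the sequence are omitted.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under uniform synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


if __name__ == "__main__":
    # Toy sequence: four leucine codons (3 x CTG, 1 x CTA).
    result = rscu("CTGCTGCTGCTA")
    print(result["CTG"], result["CTA"])  # CTG -> 4.5, CTA -> 1.5
```

Values above 1 mark codons used more often than expected under uniform synonymous usage, and values below 1 mark under-used codons; the values within one amino acid's family sum to the family size.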
Affiliation(s)
- Tamir Tuller
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel
|
20
|
LaBella AL, Opulente DA, Steenwyk JL, Hittinger CT, Rokas A. Signatures of optimal codon usage in metabolic genes inform budding yeast ecology. PLoS Biol 2021; 19:e3001185. [PMID: 33872297 PMCID: PMC8084343 DOI: 10.1371/journal.pbio.3001185] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/22/2020] [Revised: 04/29/2021] [Accepted: 03/15/2021] [Indexed: 02/06/2023]
Abstract
Reverse ecology is the inference of ecological information from patterns of genomic variation. One rich, heretofore underutilized, source of ecologically relevant genomic information is codon optimality or adaptation. Bias toward codons that match the tRNA pool is robustly associated with high gene expression in diverse organisms, suggesting that codon optimization could be used in a reverse ecology framework to identify highly expressed, ecologically relevant genes. To test this hypothesis, we examined the relationship between optimal codon usage in the classic galactose metabolism (GAL) pathway and known ecological niches for 329 species of budding yeasts, a diverse subphylum of fungi. We find that optimal codon usage in the GAL pathway is positively correlated with quantitative growth on galactose, suggesting that GAL codon optimization reflects increased capacity to grow on galactose. Optimal codon usage in the GAL pathway is also positively correlated with human-associated ecological niches in yeasts of the CUG-Ser1 clade and with dairy-associated ecological niches in the family Saccharomycetaceae. For example, optimal codon usage of GAL genes is greater than 85% of all genes in the genome of the major human pathogen Candida albicans (CUG-Ser1 clade) and greater than 75% of genes in the genome of the dairy yeast Kluyveromyces lactis (family Saccharomycetaceae). We further find a correlation between optimization in the GALactose pathway genes and several genes associated with nutrient sensing and metabolism. This work suggests that codon optimization harbors information about the metabolic ecology of microbial eukaryotes. This information may be particularly useful for studying fungal dark matter: species that have yet to be cultured in the lab or have only been identified by genomic material.
Affiliation(s)
- Abigail Leavitt LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Dana A. Opulente
- Department of Biology, Villanova University, Villanova, Pennsylvania, United States of America
- Jacob L. Steenwyk
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Chris Todd Hittinger
- Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, Center for Genomic Science Innovation, J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
|
21
|
Schmitz A, Zhang F. Massively parallel gene expression variation measurement of a synonymous codon library. BMC Genomics 2021; 22:149. [PMID: 33653272 PMCID: PMC7927243 DOI: 10.1186/s12864-021-07462-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/11/2020] [Accepted: 02/22/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Cell-to-cell variation in gene expression strongly affects population behavior and is key to multiple biological processes. While codon usage is known to affect ensemble gene expression, how codon usage influences variation in gene expression between single cells is not well understood. RESULTS: Here, we used a Sort-seq based massively parallel strategy to quantify gene expression variation from a green fluorescent protein (GFP) library containing synonymous codons in Escherichia coli. We found that sequences containing codons with higher tRNA Adaptation Index (TAI) scores, and higher codon adaptation index (CAI) scores, have higher GFP variance. This trend is not observed for codons with high Normalized Translation Efficiency Index (nTE) scores nor from the free energy of folding of the mRNA secondary structure. GFP noise, or squared coefficient of variance (CV²), scales with mean protein abundance for low-abundance proteins but does not change at high mean protein abundance. CONCLUSIONS: Our results suggest that the main source of noise for high-abundance proteins is likely not originating at translation elongation. Additionally, the drastic change in mean protein abundance with small changes in protein noise seen from our library implies that codon optimization can be performed without concern for gene expression noise in biotechnology applications.
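The noise measure named in this abstract, the squared coefficient of variation, is the variance of expression across cells divided by the square of the mean. The sketch below shows that arithmetic only; it is not the authors' Sort-seq analysis. It assumes population (not sample) variance, and the function name `noise_cv2` and the readings are hypothetical.

```python
from statistics import mean, pvariance


def noise_cv2(values):
    """Squared coefficient of variation: population variance / mean squared."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV^2 is undefined when the mean is zero")
    return pvariance(values, mu) / mu ** 2


if __name__ == "__main__":
    # Hypothetical per-cell fluorescence readings for one library variant.
    readings = [1, 2, 3, 4]
    print(noise_cv2(readings))  # variance 1.25 / mean^2 6.25 = 0.2
```

Because the variance is normalised by the squared mean, the measure is dimensionless, which is what lets noise be compared across variants with very different mean abundance.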
Affiliation(s)
- Alexander Schmitz
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, MO, 63130, USA
- Fuzhong Zhang
- Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, MO, 63130, USA
- Division of Biological & Biomedical Sciences, Washington University in St. Louis, Saint Louis, MO, 63130, USA
- Institute of Materials Science & Engineering, Washington University in St. Louis, Saint Louis, MO, 63130, USA
|
22
|
Priya R, Sneha P, Dass JFP, Doss C GP, Manickavasagam M, Siva R. Exploring the codon patterns between CCD and NCED genes among different plant species. Comput Biol Med 2019; 114:103449. [DOI: 10.1016/j.compbiomed.2019.103449] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/06/2019] [Revised: 09/13/2019] [Accepted: 09/13/2019] [Indexed: 01/16/2023]
|
23
|
Parkes GM, Niranjan M. Uncovering extensive post-translation regulation during human cell cycle progression by integrative multi-'omics analysis. BMC Bioinformatics 2019; 20:536. [PMID: 31664894 PMCID: PMC6820968 DOI: 10.1186/s12859-019-3150-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/26/2019] [Accepted: 10/04/2019] [Indexed: 01/19/2023]
Abstract
BACKGROUND: Analysis of high-throughput multi-'omics interactions across the hierarchy of expression has wide interest in making inferences with regard to biological function and biomarker discovery. Expression levels across different scales are determined by robust synthesis, regulation and degradation processes, and hence transcript (mRNA) measurements made by microarray/RNA-Seq only show modest correlation with corresponding protein levels. RESULTS: In this work we are interested in quantitative modelling of correlation across such gene products. Building on recent work, we develop computational models spanning transcript, translation and protein levels at different stages of the H. sapiens cell cycle. We enhance this analysis by incorporating 25+ sequence-derived features which are likely determinants of cellular protein concentration and quantitatively select for relevant features, producing a vast dataset with thousands of genes. We reveal insights into the complex interplay between expression levels across time, using machine learning methods to highlight outliers with respect to such models as proteins associated with post-translationally regulated modes of action. CONCLUSIONS: We uncover quantitative separation between modified and degraded proteins that have roles in cell cycle regulation, chromatin remodelling and protein catabolism according to Gene Ontology; and highlight the opportunities for providing biological insights in future model systems.
Affiliation(s)
- Gregory M Parkes
- University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Mahesan Niranjan
- University of Southampton, University Road, Southampton, SO17 1BJ, UK
|
24
|
Liu SS, Hockenberry AJ, Jewett MC, Amaral LAN. A novel framework for evaluating the performance of codon usage bias metrics. J R Soc Interface 2019; 15:rsif.2017.0667. [PMID: 29386398 DOI: 10.1098/rsif.2017.0667] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/11/2017] [Accepted: 01/04/2018] [Indexed: 11/12/2022]
Abstract
The unequal utilization of synonymous codons affects numerous cellular processes including translation rates, protein folding and mRNA degradation. In order to understand the biological impact of variable codon usage bias (CUB) between genes and genomes, it is crucial to be able to accurately measure CUB for a given sequence. A large number of metrics have been developed for this purpose, but there is currently no way of systematically testing the accuracy of individual metrics or knowing whether metrics provide consistent results. This lack of standardization can result in false-positive and false-negative findings if underpowered or inaccurate metrics are applied as tools for discovery. Here, we show that the choice of CUB metric impacts both the significance and measured effect sizes in numerous empirical datasets, raising questions about the generality of findings in published research. To bring about standardization, we developed a novel method to create synthetic protein-coding DNA sequences according to different models of codon usage. We use these benchmark sequences to identify the most accurate and robust metrics with regard to sequence length, GC content and amino acid heterogeneity. Finally, we show how our benchmark can aid the development of new metrics by providing feedback on its performance compared to the state of the art.
Affiliation(s)
- Sophia S Liu
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Adam J Hockenberry
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Interdisciplinary Program in Biological Sciences, Northwestern University, Evanston, IL, USA
- Michael C Jewett
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Interdisciplinary Program in Biological Sciences, Northwestern University, Evanston, IL, USA; Center for Synthetic Biology, Northwestern University, Evanston, IL, USA; Simpson Querrey BioNanotechnology Institute, Northwestern University, Evanston, IL, USA; Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA
- Luís A N Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA; Department of Physics and Astronomy, Northwestern University, Evanston, IL, USA
|
25
|
Sahoo S, Das SS, Rakshit R. Codon usage pattern and predicted gene expression in Arabidopsis thaliana. Gene 2019; 721S:100012. [PMID: 32550546 PMCID: PMC7286098 DOI: 10.1016/j.gene.2019.100012] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/30/2018] [Revised: 01/30/2019] [Accepted: 02/21/2019] [Indexed: 01/20/2023]
Abstract
Extensive research on predicting highly expressed genes in plant genome sequences has been going on for decades. The codon usage pattern of genes in the Arabidopsis thaliana genome is a classical topic for plant biologists because of its significance for understanding molecular plant biology. Here we have used a gene expression profiling methodology based on the score of modified relative codon bias (MRCBS) to elucidate the expression pattern of genes in Arabidopsis thaliana. MRCBS relies exclusively on sequence features for identifying the highly expressed genes. In this study, a critical analysis of predicted highly expressed (PHE) genes in Arabidopsis thaliana has been performed using MRCBS as a numerical estimator of gene expression level. Consistent with previous results, our study indicates that codon composition plays an important role in the regulation of gene expression. We found a systematic strong correlation between MRCBS and CAI (codon adaptation index) or other expression measures. Additionally, MRCBS correlates well with experimental gene expression data. Our study highlights the relationship between gene expression and compositional signature in relation to codon usage bias and sets the ground for further investigation of the evolution of the protein-coding genes in the plant genome.
Key Words
- Arabidopsis thaliana
- CAI
- CAI, Codon adaptation index
- CP, Chloroplast Pltd CP
- Codon usage bias
- GC content
- GEO, Gene Expression Omnibus
- Gene expression
- MADS, Minichromosome maintenance1, Agamous, Deficiens and Serum response factor
- MBP, Megabase pair
- MRCBS, Score of Modified relative codon bias
- MT, Mitochondrion
- PHE genes
- PHE, Predicted Highly Expressed
- RCA, Relative Codon Adaptation
- RCB, Relative codon bias
- RCBS, Relative Codon Bias Strength
- RMA, Relative Molecular Abundance
- RP, Ribosomal protein
- SAGE, Serial Analysis of Gene Expression
- TAIR, The Arabidopsis Information Resource
Affiliation(s)
- Satyabrata Sahoo
- Department of Physics, Dhruba Chand Halder College, Dakshin Barasat, South 24 Parganas, W.B., India
- Shib Sankar Das
- Department of Mathematics, Uluberia College, Uluberia, Howrah, W.B., India
- Ria Rakshit
- Department of Botany, Baruipur College, South 24 Parganas, W.B., India
|
26
|
Deshpande S, Shuttleworth J, Yang J, Taramonli S, England M. PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets. Comput Biol Med 2019; 105:169-181. [DOI: 10.1016/j.compbiomed.2018.12.014] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/18/2018] [Revised: 12/27/2018] [Accepted: 12/29/2018] [Indexed: 02/05/2023]
|
27
|
Vasanthi S, Dass JFP. Comparative genome-wide analysis of codon usage of different bacterial species infecting Oryza sativa. J Cell Biochem 2018; 119:9346-9356. [PMID: 30105828 DOI: 10.1002/jcb.27214] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/17/2018] [Accepted: 06/13/2018] [Indexed: 11/05/2022]
Abstract
Oryza sativa is vastly affected by microbial pathogens, causing blight-related diseases, which in turn deplete the growth and productivity of rice. In this study, we analyzed four bacterial rice pathogen genomes and reported on their codon usage, which might have greater implications in mutation-related research. Differential codon usage indices, such as codon adaptation index (CAI), codon bias index (CBI), effective number of codons (ENc), relative synonymous codon usage (RSCU), correspondence analysis (COA), and parity plots, were applied to coding sequences of Pseudomonas fuscovaginae, Pseudomonas syringae, Xanthomonas oryzae, and Pseudomonas avenae species. The RSCU results indicated a high-frequency usage of CUG and CGC, which code for leucine and arginine, in all of the species. The CBI and CAI values between the genomes range from 0.17 to 0.3 and from 0.26 to 0.35, respectively, indicating a direct proportionality between these indexes. The mean ENc value of the P. avenae coding sequences showed high codon bias compared with other genomes. The axis I variation from COA analysis shows a mean value of 42.28% codon variations in these bacterial species. Correlation studies between axis I and ENc-GC3, along with CAI and CBI, suggested the presence of nucleotide bias and mutational pressure as major forces for codon bias within these species. Hence, certain genes with high CAI-CBI have been correlated for better gene expression. Our study highlights the importance of nucleotide bias, mutation pressure, and natural selection in shaping protein-coding genes in these four rice-affecting bacteria. This would further help in investigating the evolution of pathogenic gene families, which may direct research toward synthetic genes that could be suppressed or overrepresented based on their codon usage pattern toward pathogenicity.
Affiliation(s)
- S Vasanthi
- Department of Integrative Biology, School of Biosciences and Technology, VIT, Vellore, Tamil Nadu, India
- J Febin Prabhu Dass
- Department of Integrative Biology, School of Biosciences and Technology, VIT, Vellore, Tamil Nadu, India
|
28
|
Sarkar I, Tisa LS, Gtari M, Sen A. Biosynthetic energy cost of potentially highly expressed proteins vary with niche in selected actinobacteria. J Basic Microbiol 2017; 58:154-161. [PMID: 29144540 DOI: 10.1002/jobm.201700350] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/18/2017] [Revised: 10/27/2017] [Accepted: 11/05/2017] [Indexed: 11/08/2022]
Abstract
Amino acid and protein biosynthesis requires a number of high energy phosphate bonds and includes a dual energy cost for the synthesis of chemical intermediates during the fueling reactions and the conversion of precursor molecules to final products. One popular hypothesis is that the proteins encoded by putative highly expressed genes (hence called PHXPs) generally utilize low energy consuming amino acids to reduce the biosynthetic cost of the essential proteins. In our study, we found that this idea was not supported in the case of actinobacteria. With the actinobacteria, the energy costs of PHXPs varied in relation to their niche. Free-living, including aquatic, soil and extremophilic, and plant-associated actinobacteria were found to use energetically expensive amino acids in their PHXPs. An exception occurred with some animal-host-associated actinobacteria that used energy efficient amino acids. One explanation for these results may be due to the diverse metabolic patterns exhibited by actinobacteria under varied niches influenced by nutritional availability and physical environment.
Affiliation(s)
- Indrani Sarkar
- NBU Bioinformatics Facility, Department of Botany, University of North Bengal, Siliguri, India
- Louis S Tisa
- Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, New Hampshire
- Maher Gtari
- Laboratoire Microorganismes et Biomolécules Actives, Université de Tunis Elmanar (FST), Université de Carthage (INSAT), Tunis, Tunisia
- Arnab Sen
- NBU Bioinformatics Facility, Department of Botany, University of North Bengal, Siliguri, India
|
29
|
Sabi R, Volvovitch Daniel R, Tuller T. stAIcalc: tRNA adaptation index calculator based on species-specific weights. Bioinformatics 2017; 33:589-591. [PMID: 27797757 DOI: 10.1093/bioinformatics/btw647] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 05/05/2016] [Accepted: 10/11/2016] [Indexed: 11/13/2022] Open
Abstract
Summary: The tRNA Adaptation Index (tAI) is a tRNA-centric measure of translation efficiency which includes weights that take into account the efficiencies of the different wobble interactions. To enable the calculation of the index based on a species-specific inference of these weights, we created stAIcalc. The calculator includes optimized tAI weights for 100 species from the three domains of life along with a standalone software package that optimizes the weights for new organisms. The tAI with the optimized weights should enable performing large scale studies in disciplines such as molecular evolution, genomics, systems biology and synthetic biology. Availability and Implementation: The calculator is publicly available at http://www.cs.tau.ac.il/~tamirtul/stAIcalc/stAIcalc.html. Contact: tamirtul@post.tau.ac.il.
Affiliation(s)
- Tamir Tuller
- Department of Biomedical Engineering; The Sagol School of Neuroscience, Tel Aviv University, Ramat Aviv, 69978, Israel
|
30
|
Codon usage and amino acid usage influence genes expression level. Genetica 2017; 146:53-63. [DOI: 10.1007/s10709-017-9996-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/08/2017] [Accepted: 10/09/2017] [Indexed: 11/30/2022]
|
31
|
Predicting synonymous codon usage and optimizing the heterologous gene for expression in E. coli. Sci Rep 2017; 7:9926. [PMID: 28855614 PMCID: PMC5577221 DOI: 10.1038/s41598-017-10546-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/19/2017] [Accepted: 08/11/2017] [Indexed: 11/27/2022] Open
Abstract
Of the 20 common amino acids, 18 are encoded by multiple synonymous codons. These synonymous codons are not redundant; in fact, all codons contribute substantially to protein expression, structure and function. In this study, the codon usage pattern of genes in E. coli was learned from the sequenced genomes of E. coli. A machine learning based method, Presyncodon, was proposed to predict synonymous codon selection in E. coli based on the learned codon usage patterns of the residue in the context of the specific fragment. The prediction results indicate that Presyncodon could be used to predict synonymous codon selection of a gene in E. coli with high accuracy. Two reporter genes (egfp and mApple) were designed with a combination of low- and high-frequency-usage codons by the method. The fluorescence intensity of eGFP and mApple expressed from the genes designed by this method was about 2.3- or 1.7-fold greater than that from the genes with only high-frequency-usage codons in E. coli. Therefore, both low- and high-frequency-usage codons make positive contributions to the functional expression of the heterologous proteins. This method could be used to design synthetic genes for heterologous gene expression in biotechnology.
|
32
|
Qin WY, Gan LN, Xia RW, Sun SY, Zhu GQ, Wu SL, Bao WB. New insights into the codon usage patterns of the bactericidal/permeability-increasing (BPI) gene across nine species. Gene 2017; 616:45-51. [PMID: 28336464 DOI: 10.1016/j.gene.2017.03.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/04/2016] [Revised: 12/31/2016] [Accepted: 03/15/2017] [Indexed: 10/19/2022]
Abstract
Bactericidal/permeability-increasing (BPI) protein is a member of a new generation of proteins known as super-antibiotics that are implicated as endotoxin neutralising agents. Non-uniform usage of synonymous codons for a specific amino acid during translation of a protein is known as codon usage bias (CUB). Analysis of CUB and compositional dynamics of coding sequences could contribute to a better understanding of the molecular mechanism and the evolution of a particular gene. In this study, we performed CUB analysis of the complete coding sequences of the BPI gene from nine different species. The codon usage patterns of BPI across different species were found to be influenced by GC bias, particularly GC3s, with a moderate bias in the codon usage of BPI. We found significant similarities in the codon usage patterns of the BPI gene among closely related species, such as Sus scrofa and Bos taurus. Moreover, we observed evolutionary conservation of the most over-represented codon CUG for the amino acid leucine in the BPI gene across all species. In conclusion, our analysis provides a novel insight into the codon usage patterns of BPI. This information facilitates an improved understanding of the structural, functional and evolutionary significance of the BPI gene among species, and provides a theoretical reference for developing antiseptic drug proteins with high efficiency across species.
Affiliation(s)
- Wei-Yun Qin
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China
- Li-Na Gan
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China
- Ri-Wei Xia
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China
- Shou-Yong Sun
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China
- Guo-Qiang Zhu
- College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu, PR China
- Sheng-Long Wu
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China; Joint International Research Laboratory of Agriculture & Agri-Product Safety, Yangzhou University, Yangzhou, Jiangsu, PR China
- Wen-Bin Bao
- Key Laboratory for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, PR China; Joint International Research Laboratory of Agriculture & Agri-Product Safety, Yangzhou University, Yangzhou, Jiangsu, PR China
|
33
|
Das S, Chottopadhyay B, Sahoo S. Comparative Analysis of Predicted Gene Expression among Crenarchaeal Genomes. Genomics Inform 2017; 15:38-47. [PMID: 28416948 PMCID: PMC5389947 DOI: 10.5808/gi.2017.15.1.38] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/16/2016] [Revised: 11/28/2016] [Accepted: 01/26/2017] [Indexed: 12/13/2022] Open
Abstract
Research into new methods for identifying highly expressed genes in anonymous genome sequences has been going on for more than 15 years. We present here an alternative approach based on a modified score of relative codon usage bias to identify highly expressed genes in crenarchaeal genomes. The proposed algorithm relies exclusively on sequence features for identifying the highly expressed genes. In this study, a comparative analysis of predicted highly expressed genes in five crenarchaeal genomes was performed using the score of Modified Relative Codon Bias Strength (MRCBS) as a numerical estimator of gene expression level. We found a systematic strong correlation between the Codon Adaptation Index and MRCBS. Additionally, MRCBS correlated well with other expression measures. Our study indicates that MRCBS can consistently capture the highly expressed genes.
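The Codon Adaptation Index used as the comparison measure in this abstract can be illustrated with a short sketch. The Python below follows the commonly cited formulation: each codon gets a weight equal to its count divided by the count of the most frequent synonymous codon in a reference gene set, and a gene's CAI is the geometric mean of those weights. It does not implement MRCBS, whose formula is not given here. The function names, the `floor` pseudo-count for unseen codons, and the toy reference set are assumptions of this example.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_of(seq):
    """Split an in-frame sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs, floor=0.5):
    """Codon weights w = count / max synonymous count in the reference set.

    `floor` is an assumed pseudo-count so unseen codons keep a finite log weight.
    """
    counts = Counter(
        c for s in reference_seqs for c in codons_of(s) if c in CODON_TABLE
    )
    weights = {}
    for aa in set(AMINO) - {"*", "M", "W"}:  # skip stops and single-codon amino acids
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            weights[c] = max(counts[c], floor) / best
    return weights


def cai(seq, weights):
    """Geometric mean of codon weights over the scorable codons of `seq`."""
    logs = [math.log(weights[c]) for c in codons_of(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


weights = relative_adaptiveness(["CTGCTGCTGCTT"])  # toy reference set
print(cai("CTGCTT", weights))
```

In practice the reference set would be a curated list of genes presumed to be highly expressed, such as ribosomal protein genes; the single toy sequence here only keeps the example small.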
Affiliation(s)
- Shibsankar Das
- Department of Mathematics, Uluberia College, Uluberia 711315, India
|
34
|
Udaondo Z, Molina L, Segura A, Duque E, Ramos JL. Analysis of the core genome and pangenome of Pseudomonas putida. Environ Microbiol 2015; 18:3268-3283. [DOI: 10.1111/1462-2920.13015] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 05/29/2015] [Revised: 08/04/2015] [Accepted: 08/06/2015] [Indexed: 11/30/2022]
Affiliation(s)
- Zulema Udaondo
- Biotechnology Technological Area; Abengoa Research; Calle Energía Solar 1, Building E, Campus Palmas Altas 41014 Sevilla Spain
- Lázaro Molina
- Department of Environmental Protection; Estación Experimental del Zaidín; Consejo Superior de Investigaciones Científicas. C/ Profesor Albareda 1 18008 Granada Spain
- Ana Segura
- Biotechnology Technological Area; Abengoa Research; Calle Energía Solar 1, Building E, Campus Palmas Altas 41014 Sevilla Spain
- Estrella Duque
- Biotechnology Technological Area; Abengoa Research; Calle Energía Solar 1, Building E, Campus Palmas Altas 41014 Sevilla Spain
- Juan L. Ramos
- Biotechnology Technological Area; Abengoa Research; Calle Energía Solar 1, Building E, Campus Palmas Altas 41014 Sevilla Spain
|
35
|
Chen L, Chu C, Huang T, Kong X, Cai YD. Prediction and analysis of cell-penetrating peptides using pseudo-amino acid composition and random forest models. Amino Acids 2015; 47:1485-93. [PMID: 25894890 DOI: 10.1007/s00726-015-1974-5] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Received: 02/19/2015] [Accepted: 03/27/2015] [Indexed: 12/26/2022]
Abstract
Cell-penetrating peptides, a group of short peptides, can traverse cell membranes to enter cells and thus facilitate the uptake of various molecular cargoes. Thus, they have the potential to become powerful drug delivery systems. The correct identification of peptides as cell-penetrating or non-cell-penetrating would accelerate this application. In this study, we determined which features were important for a peptide to be cell-penetrating or non-cell-penetrating and built a predictive model based on the key features extracted from this analysis. The investigated peptides were retrieved from a previous study, and each was encoded as a numeric vector according to six properties of amino acids (amino acid frequency, codon diversity, electrostatic charge, molecular volume, polarity, and secondary structure) by the pseudo-amino acid composition method. Methods of minimum redundancy maximum relevance and incremental feature selection were then employed to analyze these features, and some were found to be key determinants of cell penetration. In parallel, an optimal random forest prediction model was built. We hope that our findings will provide new resources for the study of cell-penetrating peptides.
Affiliation(s)
- Lei Chen
- College of Life Science, Shanghai University, Shanghai, 200444, People's Republic of China
|
36
|
Mazumder TH, Chakraborty S. Gaining insights into the codon usage patterns of TP53 gene across eight mammalian species. PLoS One 2015; 10:e0121709. [PMID: 25807269 PMCID: PMC4373688 DOI: 10.1371/journal.pone.0121709] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/09/2014] [Accepted: 02/14/2015] [Indexed: 02/06/2023] Open
Abstract
The TP53 gene is known as the “guardian of the genome” as it plays a vital role in regulating cell cycle, cell proliferation, DNA damage repair, initiation of programmed cell death and suppressing tumor growth. Non-uniform usage of synonymous codons for a specific amino acid during translation of a protein, known as codon usage bias (CUB), is a unique property of the genome and shows species-specific deviation. Analysis of codon usage bias with compositional dynamics of coding sequences has contributed to a better understanding of the molecular mechanism and the evolution of a particular gene. In this study, the complete nucleotide coding sequences of the TP53 gene from eight different mammalian species were used for CUB analysis. Our results showed that the codon usage patterns in the TP53 gene across different mammalian species have been influenced by GC bias, particularly GC3, and that a moderate bias exists in the codon usage of the TP53 gene. Moreover, we observed that nature has highly favored the most over-represented codon CTG for the amino acid leucine but selected against the ATA codon for isoleucine in the TP53 gene across all mammalian species during the course of evolution.
Affiliation(s)
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar-788011, Assam, India
|
37
|
Diament A, Pinter RY, Tuller T. Three-dimensional eukaryotic genomic organization is strongly correlated with codon usage expression and function. Nat Commun 2014; 5:5876. [PMID: 25510862 DOI: 10.1038/ncomms6876] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 06/28/2014] [Accepted: 11/17/2014] [Indexed: 01/08/2023] Open
Abstract
It has been shown that the distribution of genes in eukaryotic genomes is not random; however, formerly reported relations between gene function and genomic organization were relatively weak. Previous studies have demonstrated that codon usage bias is related to all stages of gene expression and to protein function. Here we apply a novel tool for assessing functional relatedness, codon usage frequency similarity (CUFS), which measures similarity between genes in terms of codon and amino acid usage. By analyzing chromosome conformation capture data, describing the three-dimensional (3D) conformation of the DNA, we show that the functional similarity between genes captured by CUFS is directly and very strongly correlated with their 3D distance in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, mouse and human. This emphasizes the importance of three-dimensional genomic localization in eukaryotes and indicates that codon usage is tightly linked to genome architecture.
Affiliation(s)
- Alon Diament
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Ron Y Pinter
- Department of Computer Science, Technion-Israel Institute of Technology, Haifa 32000, Israel
- Tamir Tuller
- [1] Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel [2] The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 6997801, Israel
|
38
|
Abstract
The tRNA adaptation index (tAI) is a widely used measure of the efficiency by which a coding sequence is recognized by the intra-cellular tRNA pool. This index includes, among others, weights that represent wobble interactions between codons and tRNA molecules. Currently, these weights are based only on the gene expression in Saccharomyces cerevisiae. However, the efficiencies of the different codon–tRNA interactions are expected to vary among different organisms. In this study, we suggest a new approach for adjusting the tAI weights to any target model organism without the need for gene expression measurements. Our method is based on optimizing the correlation between the tAI and a measure of codon usage bias. Here, we show that in non-fungal organisms the new tAI weights predict protein abundance significantly better than the traditional tAI weights. The unique tRNA–codon adaptation weights computed for 100 different organisms exhibit a significant correlation with evolutionary distance. The reported results demonstrate the usefulness of the new measure in future genomic studies.
Affiliation(s)
- Renana Sabi
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, Israel
- Tamir Tuller
- Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, Israel; The Sagol School of Neuroscience, Tel-Aviv University, Tel-Aviv, Israel
|
39
|
Ling MHT, Poh CL. A predictor for predicting Escherichia coli transcriptome and the effects of gene perturbations. BMC Bioinformatics 2014; 15:140. [PMID: 24884349 PMCID: PMC4038595 DOI: 10.1186/1471-2105-15-140] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/28/2013] [Accepted: 05/09/2014] [Indexed: 11/24/2022] Open
Abstract
Background: A means to predict the effects of gene over-expression, knockouts, and environmental stimuli in silico is useful for system biologists to develop and test hypotheses. Several studies had predicted the expression of all Escherichia coli genes from sequences and reported a correlation of 0.301 between predicted and actual expression. However, these do not allow biologists to study the effects of gene perturbations on the native transcriptome. Results: We developed a predictor to predict transcriptome-scale gene expression from a small number (n = 59) of known gene expressions using a gene co-expression network, which can be used to predict the effects of over-expressions and knockdowns on the E. coli transcriptome. In terms of transcriptome prediction, our results show that the correlation between predicted and actual expression value is 0.467, which is similar to the microarray intra-array variation (p-value = 0.348), suggesting that intra-array variation accounts for a substantial portion of the transcriptome prediction error. In terms of predicting the effects of gene perturbation(s), our results suggest that the expression of 83% of the genes affected by perturbation can be predicted within 40% of error and the correlation between predicted and actual expression values among the affected genes to be 0.698. With the ability to predict the effects of gene perturbations, we demonstrated that our predictor has the potential to estimate the effects of varying gene expression level on the native transcriptome. Conclusion: We present a potential means to predict an entire transcriptome and a tool to estimate the effects of gene perturbations for E. coli, which will aid biologists in hypothesis development. This study forms the baseline for future work in using gene co-expression networks for gene expression prediction.
Affiliation(s)
- Maurice H T Ling
- School of Chemical and Biomedical Engineering, Nanyang Technological University, Nanyang Ave, Singapore, Singapore
|
40
|
Yuan J, Yang M, Ren J, Fu B, Jiang F, Zhang X. Analysis of genomic characters reveals that four distinct gene clusters are correlated with different functions in Burkholderia cenocepacia AU 1054. Appl Microbiol Biotechnol 2013; 98:361-72. [PMID: 24305740 DOI: 10.1007/s00253-013-5415-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/17/2013] [Revised: 11/11/2013] [Accepted: 11/11/2013] [Indexed: 11/30/2022]
Abstract
Possessing three circular chromosomes is a distinct genomic characteristic of Burkholderia cenocepacia AU 1054, a clinically important pathogen in cystic fibrosis. In this study, base composition, codon usage and functional role category were analyzed in the B. cenocepacia AU 1054 genome. Although no bias in the base and codon usage was detected between any two chromosomes, functional differences did exist in the genes of each chromosome. Similar base composition and differential functional role categories indicated that genes on these three chromosomes were relatively stable and that a proper division of labor was established. Based on variations in the base or codon usage, four small gene clusters were observed in all of the genes. Multivariate analysis revealed that protein hydrophobicity played a predominant role in shaping base usage bias, while horizontal gene transfer and the gene expression level were the two most important factors that affected the codon usage bias. Interestingly, we also found that these gene clusters were correlated with different biological functions: (i) 45 pyrimidine-leading-codon preferred genes were predominantly involved in regulatory function; (ii) most drug resistance-related genes were among the 826 genes coding for hydrophobic proteins; (iii) most of the 111 horizontal transfer genes were responsible for genomic plasticity; and (iv) 73 highly expressed genes (predicted by their codon adaptation index values) showed environmental adaptation to cystic fibrosis. Our results showed that genes with base or codon usage bias were affected by mutational pressure and natural selection, and their functions could contribute to drug resistance and transmissible activity in B. cenocepacia.
Affiliation(s)
- Jianbo Yuan
- Institute of Oceanology, Chinese Academy of Sciences, No. 7, Nanhai Road, Qingdao, 266071, China
|
41
|
Guo FB, Ye YN, Zhao HL, Lin D, Wei W. Universal pattern and diverse strengths of successive synonymous codon bias in three domains of life, particularly among prokaryotic genomes. DNA Res 2012; 19:477-85. [PMID: 23132389 PMCID: PMC3514858 DOI: 10.1093/dnares/dss027] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 12/04/2022] Open
Abstract
There has been significant progress in understanding the process of protein translation in recent years. One of the best examples is the discovery of usage bias in successive synonymous codons and its role in eukaryotic translation efficiency. We observed here a similar type of bias in the other two life domains, bacteria and archaea, although the bias strength was much smaller than in eukaryotes. Among 136 prokaryotic genomes, 98 were found to have significant bias from random use of successive synonymous codons with Z scores larger than three. Furthermore, significantly different bias strengths were found between prokaryotes grouped by various genomic or biochemical characteristics. Interestingly, the bias strength measured by a general Z score could be fitted well (R = 0.83, P < 10^-15) by three genomic variables: genome size, G + C content, and tRNA gene number based on multiple linear regression. A different distribution of synonymous codon pairs between protein-coding genes and intergenic sequences suggests that bias is caused by translation selection. The present results indicate that protein translation is tuned by codon (pair) usage, and the intensity of the regulation is associated with genome size, tRNA gene number, and G + C content.
Affiliation(s)
- Feng-Biao Guo
- Center of Bioinformatics and Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
42
|
Prabha R, Singh DP, Gupta SK, Farooqi S, Rai A. Synonymous codon usage in Thermosynechococcus elongatus (cyanobacteria) identifies the factors shaping codon usage variation. Bioinformation 2012; 8:622-8. [PMID: 22829743 PMCID: PMC3400985 DOI: 10.6026/97320630008622] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/28/2012] [Accepted: 06/07/2012] [Indexed: 11/23/2022] Open
Abstract
Analysis of the synonymous codon usage pattern in the genome of a thermophilic cyanobacterium, Thermosynechococcus elongatus BP-1, using multivariate statistical analysis revealed a single major explanatory axis accounting for codon usage variation in the organism. This axis is correlated with the GC content at the third base of synonymous codons (GC3s) in correspondence analysis of T. elongatus genes. A negative correlation was observed between the effective number of codons (Nc) and GC3s. Results suggested a mutational bias as the major factor in shaping codon usage in this cyanobacterium. In comparison to the lowly expressed genes, highly expressed genes of this organism possess a significantly higher proportion of pyrimidine-ending codons, suggesting that besides mutational bias, translational selection also influenced codon usage variation in T. elongatus. Correspondence analysis of relative synonymous codon usage (RSCU) with A, T, G, C at third positions (A3s, T3s, G3s, C3s, respectively) also supported this fact, and expression levels of genes and gene length also influenced codon usage. A role of translational accuracy was identified in dictating the codon usage variation of this genome. Results indicated that although mutational bias is the major factor in shaping codon usage in T. elongatus, factors like translational selection, translational accuracy and gene expression level also influenced codon usage variation.
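The GC3s statistic that this abstract correlates with the major axis can be sketched in a few lines. The Python below assumes the usual definition: the fraction of G or C at the third position, counted only over codons that have synonymous alternatives, so Met (ATG), Trp (TGG) and stop codons are excluded. It is an illustration rather than the authors' code; the function name `gc3s` and the toy sequence are assumptions.

```python
# Codons excluded from GC3s because they have no synonymous alternative
# (standard genetic code assumed): Met, Trp, and the three stop codons.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(seq):
    """Fraction of G/C at third positions of synonymous codons (hypothetical helper)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = [
        seq[i + 2]
        for i in range(0, usable, 3)
        if seq[i:i + 3] not in NON_SYNONYMOUS
    ]
    if not thirds:
        return float("nan")  # nothing to score
    # Ambiguous bases such as N simply count as non-GC in this sketch.
    return sum(base in "GC" for base in thirds) / len(thirds)


# ATG and TAA are skipped; CTG, GCC, AAA contribute G, C, A.
print(gc3s("ATGCTGGCCAAATAA"))
```

Computed per gene, this value can then be compared against Nc or against correspondence-analysis axis scores, which is the kind of correlation the abstract describes.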
Affiliation(s)
- Ratna Prabha
- National Bureau of Agriculturally Important Microorganisms, Indian Council of Agricultural Research, Kushmaur, Maunath Bhanjan 275101, India
- Department of Biotechnology, Faculty of Science and Technology, Mewar University, Gangrar, Chittorgarh, Rajasthan, India
- Dhananjaya P Singh
- National Bureau of Agriculturally Important Microorganisms, Indian Council of Agricultural Research, Kushmaur, Maunath Bhanjan 275101, India
- Shailendra K Gupta
- CSIR-Indian Institute of Toxicology Research, 80, Mahatma Gandhi Marg, Kaisarbagh, Lucknow 226001, India
- Samir Farooqi
- Indian Agricultural Statistical Research Institute, Indian Council of Agricultural Research, Pusa, New Delhi 110 012, India
- Anil Rai
- Indian Agricultural Statistical Research Institute, Indian Council of Agricultural Research, Pusa, New Delhi 110 012, India
|
43
|
Das S, Roymondal U, Chottopadhyay B, Sahoo S. Gene expression profile of the cynobacterium synechocystis genome. Gene 2012; 497:344-52. [PMID: 22310391 DOI: 10.1016/j.gene.2012.01.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 11/29/2011] [Accepted: 01/19/2012] [Indexed: 11/26/2022]
Abstract
The expression of functional proteins plays a crucial role in modern biotechnology. The free-living cyanobacterium Synechocystis PCC 6803 is an interesting model organism to study oxygenic photosynthesis as well as other metabolic processes. Here we analyze a gene expression profiling methodology, RCBS (the score of relative codon usage bias), to elucidate expression patterns of genes in the Synechocystis genome. To assess the predictive performance of the methodology, we propose a simple algorithm to calculate the threshold score to identify the highly expressed genes in a genome. Analysis of differential expression of the genes of this genome reveals that most of the genes in photosynthesis and respiration belong to the highly expressed category. The other genes with higher predicted expression levels include ribosomal proteins, translation processing factors and many hypothetical proteins. Only 9.5% of genes are identified as highly expressed, and we observe that highly expressed genes in the Synechocystis genome often have strong compositional bias in terms of codon usage. An important application concerns the automatic detection of a set of impact codons; genes that are highly expressed tend to use this narrow set of preferred codons and display high codon bias. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favor of shorter genes being expressed at a higher level. The better correlations of RCBS with 2D electrophoresis and microarray data for heat shock proteins, compared to the expression measure based on codon usage difference, E(g), and the codon adaptive index, CAI, indicate that the genomic expression profile available in our method can be applied in a meaningful way to study the mRNA expression patterns, which are by themselves necessary for the quantitative description of the biological states.
Affiliation(s)
- Shibsankar Das
- Department of Mathematics, Uluberia College, Uluberia, Howrah, India
|
44
|
Ruiz ON, Alvarez D, Gonzalez-Ruiz G, Torres C. Characterization of mercury bioremediation by transgenic bacteria expressing metallothionein and polyphosphate kinase. BMC Biotechnol 2011; 11:82. [PMID: 21838857 PMCID: PMC3180271 DOI: 10.1186/1472-6750-11-82] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Received: 08/17/2010] [Accepted: 08/12/2011] [Indexed: 11/17/2022] Open
Abstract
Background The use of transgenic bacteria has been proposed as a suitable alternative for mercury remediation. Ideally, mercury would be sequestered by metal-scavenging agents inside transgenic bacteria for subsequent retrieval. So far, this approach has produced limited protection and accumulation. We report here the development of a transgenic system that effectively expresses metallothionein (mt-1) and polyphosphate kinase (ppk) genes in bacteria in order to provide high mercury resistance and accumulation. Results In this study, bacterial transformation with transcriptionally and translationally enhanced vectors designed for the expression of metallothionein and polyphosphate kinase provided high transgene transcript levels independent of the gene being expressed. Expression of polyphosphate kinase and metallothionein in transgenic bacteria provided high resistance to mercury, up to 80 μM and 120 μM, respectively. Here we show for the first time that metallothionein can be efficiently expressed in bacteria without being fused to a carrier protein to enhance mercury bioremediation. Cold vapor atomic absorption spectrometry analyses revealed that the mt-1 transgenic bacteria accumulated up to 100.2 ± 17.6 μM of mercury from media containing 120 μM Hg. The extent of mercury remediation was such that the contaminated media remediated by the mt-1 transgenic bacteria supported the growth of untransformed bacteria. Cell aggregation, precipitation and color changes were visually observed in mt-1 and ppk transgenic bacteria when these cells were grown in high mercury concentrations. Conclusion The transgenic bacterial system described in this study presents a viable technology for mercury bioremediation from liquid matrices because it provides high mercury resistance and accumulation while inhibiting elemental mercury volatilization. This is the first report that shows that metallothionein expression provides mercury resistance and accumulation in recombinant bacteria. The high accumulation of mercury in the transgenic cells could present the possibility of retrieving the accumulated mercury for further industrial applications.
Affiliation(s)
- Oscar N Ruiz
- Inter American University of Puerto Rico, Department of Natural Sciences and Mathematics, 500 Dr. John Will Harris, Bayamon, Puerto Rico.
|
45
|
Codon optimization of the major antigen encoding genes of diverse strains of influenza A virus. Interdiscip Sci 2011; 3:36-42. [DOI: 10.1007/s12539-011-0055-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/16/2010] [Revised: 05/06/2010] [Accepted: 05/14/2010] [Indexed: 10/18/2022]
|
46
|
Misawa K, Kikuno RF. Relationship between amino acid composition and gene expression in the mouse genome. BMC Res Notes 2011; 4:20. [PMID: 21272306 PMCID: PMC3038927 DOI: 10.1186/1756-0500-4-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/25/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. FINDINGS We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of the variation in expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. CONCLUSION These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan.
|
47
|
von Mandach C, Merkl R. Genes optimized by evolution for accurate and fast translation encode in Archaea and Bacteria a broad and characteristic spectrum of protein functions. BMC Genomics 2010; 11:617. [PMID: 21050470 PMCID: PMC3091758 DOI: 10.1186/1471-2164-11-617] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/10/2010] [Accepted: 11/04/2010] [Indexed: 11/13/2022] Open
Abstract
Background In many microbial genomes, a strong preference for a small number of codons can be observed in genes whose products are needed by the cell in large quantities. This codon usage bias (CUB) improves translational accuracy and speed and is one of several factors optimizing cell growth. Whereas CUB and the overrepresentation of individual proteins have been studied in detail, it is still unclear which high-level metabolic categories are subject to translational optimization in different habitats. Results In a systematic study of 388 microbial species, we have identified for each genome a specific subset of genes characterized by a marked CUB, which we named the effectome. As expected, gene products related to protein synthesis are abundant in both archaeal and bacterial effectomes. In addition, enzymes contributing to energy production and gene products involved in protein folding and stabilization are overrepresented. The comparison of genomes from eleven habitats shows that the environment has only a minor effect on the composition of the effectomes. As a paradigmatic example, we detailed the effectome content of 37 bacterial genomes that are most likely exposed to strongest selective pressure towards translational optimization. These effectomes accommodate a broad range of protein functions like enzymes related to glycolysis/gluconeogenesis and the TCA cycle, ATP synthases, aminoacyl-tRNA synthetases, chaperones, proteases that degrade misfolded proteins, protectants against oxidative damage, as well as cold shock and outer membrane proteins. Conclusions We made clear that effectomes consist of specific subsets of the proteome being involved in several cellular functions. As expected, some functions are related to cell growth and affect speed and quality of protein synthesis. Additionally, the effectomes contain enzymes of central metabolic pathways and cellular functions sustaining microbial life under stress situations. 
These findings indicate that cell growth is an important but not the only factor modulating translational accuracy and speed by means of CUB.
|
48
|
Söllner J, Heinzel A, Summer G, Fechete R, Stipkovits L, Szathmary S, Mayer B. Concept and application of a computational vaccinology workflow. Immunome Res 2010; 6 Suppl 2:S7. [PMID: 21067549 PMCID: PMC2981879 DOI: 10.1186/1745-7580-6-s2-s7] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recent years have seen a renaissance of the vaccine area, driven by clinical needs in infectious diseases but also chronic diseases such as cancer and autoimmune disorders. Equally important are technological improvements involving nano-scale delivery platforms as well as third generation adjuvants. In parallel, immunoinformatics routines have reached essential maturity for supporting central aspects in vaccinology going beyond prediction of antigenic determinants. On this basis, computational vaccinology has emerged as a discipline aimed at ab-initio rational vaccine design. Here we present a computational workflow for implementing computational vaccinology covering aspects from vaccine target identification to functional characterization and epitope selection supported by a Systems Biology assessment of central aspects in host-pathogen interaction. We exemplify the procedures for Epstein Barr Virus (EBV), a clinically relevant pathogen causing chronic infection and suspected of triggering malignancies and autoimmune disorders. RESULTS We introduce pBone/pView as a computational workflow supporting design and execution of immunoinformatics workflow modules, additionally involving aspects of results visualization, knowledge sharing and re-use. Specific elements of the workflow involve identification of vaccine targets in the realm of a Systems Biology assessment of host-pathogen interaction for identifying functionally relevant targets, as well as various methodologies for delineating B- and T-cell epitopes with particular emphasis on broad coverage of viral isolates as well as MHC alleles. Applying the workflow to EBV specifically proposes sequences from the viral proteins LMP2, EBNA2 and BALF4 as vaccine targets holding specific B- and T-cell epitopes promising broad strain and allele coverage. CONCLUSION Based on advancements in the experimental assessment of genomes, transcriptomes and proteomes for both pathogen and (human) host, the foundations for rational design of vaccines have been laid. In parallel, immunoinformatics modules have been designed and successfully applied for supporting specific aspects in vaccine design. Joining these advancements, further complemented by novel vaccine formulation and delivery aspects, has paved the way for implementing computational vaccinology for rational vaccine design tackling presently unmet vaccine challenges.
Affiliation(s)
- Johannes Söllner
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- Andreas Heinzel
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- University of Applied Sciences, Softwarepark 11, 4232 Hagenberg, Austria
- Georg Summer
- University of Applied Sciences, Softwarepark 11, 4232 Hagenberg, Austria
- Raul Fechete
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- Susan Szathmary
- Galenbio Kft., Erdőszél köz 21, 1037 Budapest, Hungary and GalenBio, Inc., 5922 Farnsworth Ct, Carlsbad, CA 92008, USA
- Bernd Mayer
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
|
49
|
Park SG, Choi SS. Expression breadth and expression abundance behave differently in correlations with evolutionary rates. BMC Evol Biol 2010; 10:241. [PMID: 20691101 PMCID: PMC2924872 DOI: 10.1186/1471-2148-10-241] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 04/19/2010] [Accepted: 08/07/2010] [Indexed: 01/12/2023] Open
Abstract
Background One of the main objectives of the molecular evolution and evolutionary systems biology field is to reveal the underlying principles that dictate protein evolutionary rates. Several studies argue that expression abundance is the most critical component in determining the rate of evolution, especially in unicellular organisms. However, the expression breadth also needs to be considered for multicellular organisms. Results In the present paper, we analyzed the relationship between the two expression variables and rates using two different genome-scale expression datasets, microarrays and ESTs. A significant positive correlation between the expression abundance (EA) and expression breadth (EB) was revealed by Kendall's rank correlation tests. A novel random shuffling approach was applied for EA and EB to compare the correlation coefficients obtained from real data sets to those estimated based on random chance. A novel method called a Fixed Group Analysis (FGA) was designed and applied to investigate the correlations between expression variables and rates when one of the two expression variables was evenly fixed. Conclusions In conclusion, all of these analyses and tests consistently showed that the breadth rather than the abundance of gene expression is tightly linked with the evolutionary rate in multicellular organisms.
Affiliation(s)
- Seung Gu Park
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chunchon 200-701, Korea
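The abstract above describes comparing Kendall rank correlations from real data with those expected by random chance after shuffling. A minimal Python sketch of that general idea follows. It is a generic permutation test, not the authors' procedure: the function names, the use of tau-a (no tie correction), the two-sided comparison, and the default number of shuffles are all assumptions made for illustration.

```python
import random
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    score = 0
    pairs = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        score += (product > 0) - (product < 0)
        pairs += 1
    return score / pairs


def shuffle_p_value(x, y, n_shuffles=1000, seed=0):
    """Return (observed tau, fraction of shuffles at least as extreme)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = kendall_tau(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real association between x and y
        if abs(kendall_tau(x, shuffled)) >= abs(observed):
            hits += 1
    # add-one correction keeps the estimate strictly above zero
    return observed, (hits + 1) / (n_shuffles + 1)
```

Calling `shuffle_p_value(abundance, breadth)` on two equal-length lists of hypothetical per-gene values returns the observed tau together with an empirical p-value; a small p-value means few shuffled orderings correlate as strongly as the real data.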
|
50
|
Tatarinova TV, Alexandrov NN, Bouck JB, Feldmann KA. GC3 biology in corn, rice, sorghum and other grasses. BMC Genomics 2010; 11:308. [PMID: 20470436 PMCID: PMC2895627 DOI: 10.1186/1471-2164-11-308] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Received: 09/16/2009] [Accepted: 05/16/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.
Affiliation(s)
- Tatiana V Tatarinova
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.
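GC3, as used in the abstract above, is the share of G or C bases at the third (wobble) codon position. A short Python sketch of that calculation for a single coding sequence is given below. The function name is hypothetical, the input is assumed to be an in-frame coding sequence starting at the first codon, and trailing bases that do not complete a codon are ignored; none of these details come from the cited paper.

```python
def gc3_content(cds: str) -> float:
    """Fraction of complete codons whose third base is G or C."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    third_bases = seq[2:usable:3]     # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence must contain at least one complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For example, `gc3_content("ATGGCCAAA")` looks at the third bases G, C and A of the codons ATG, GCC and AAA. Computing this value per gene and plotting the distribution is one simple way to look for the two gene classes the abstract mentions.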
|