1
Lin Q, Hu S, Wu Z, Huang Y, Wang S, Shi W, Zhu B. Comparative chloroplast genomics provides insights into the phylogenetic relationships and evolutionary history for Actinidia species. Sci Rep 2025; 15:13291. [PMID: 40246989 PMCID: PMC12006428 DOI: 10.1038/s41598-025-95789-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Accepted: 03/24/2025] [Indexed: 04/19/2025] Open
Abstract
Actinidia species are fruit trees with various uses: they provide edible fruit, serve as ornamental plants, and have medicinal benefits. However, the taxonomy of Actinidia species is controversial owing to widespread hybridization, and the history of divergence and polyploid speciation among Actinidia species remains unclear. In this study, we conducted comparative analyses of the chloroplast genomes and ploidy among multiple Actinidia species. The genes clpP, infA, ndhD, ndhK, and rpl20 were absent from these chloroplast genomes. The ycf2 and rpl20 genes in the Actinidia species were under positive selection. Several regions (rps16-trnQ-UUG, trnS-GCU-trnR-UCU, ndhC-trnV-UAC, rbcL-accD, rps12-psbB, trnN-GUU-ndhF, ycf1-trnN-GUU, and trnH-GUG-psbA) and genes (ycf1, ycf2, accD, rpl20) exhibited high variability and could potentially serve as molecular markers in species delineation and other phylogenetic studies. Divergence time estimation indicated that the genus Actinidia originated 23 million years ago (Ma) and experienced a tetraploidization event at ~20 Ma. Subsequently, Actinidia underwent extensive diploidization. Our findings provide valuable information for species identification, breeding programs, and conservation efforts for Actinidia species.
Affiliation(s)
- Qianhui Lin
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Siqi Hu
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Zhenhua Wu
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Yahui Huang
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Shuo Wang
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Wenbo Shi
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
- Bingyue Zhu
  - College of Biological Engineering, Qingdao University of Science and Technology, Qingdao, 266042, China
2
Siegall WB, Lyon RB, Kelman Z. An important consideration when expressing mAbs in Escherichia coli. Protein Expr Purif 2024; 220:106499. [PMID: 38703798 DOI: 10.1016/j.pep.2024.106499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 04/19/2024] [Accepted: 05/02/2024] [Indexed: 05/06/2024]
Abstract
Monoclonal antibodies (mAbs) are a driving force in the biopharmaceutical industry. Therapeutic mAbs are usually produced in mammalian cells, but there has been a push towards the use of alternative production hosts, such as Escherichia coli. When the genes encoding the heavy and light chains of a mAb are codon-optimized for E. coli expression, a truncated form of the heavy chain can form along with the full-length product. In this work, the role of codon optimization in the formation of the truncated product was investigated using the amino acid sequences of several therapeutic mAbs and multiple optimization algorithms. Several algorithms were found to incorporate sequences that lead to a truncated product. Approaches to avoid this truncated form are discussed.
Affiliation(s)
- William B Siegall
  - Institute for Bioscience and Biotechnology Research (IBBR), The University of Maryland (UMD), 9600 Gudelsky Drive, Rockville, MD, 20850, USA
- Rachel B Lyon
  - Institute for Bioscience and Biotechnology Research (IBBR), The University of Maryland (UMD), 9600 Gudelsky Drive, Rockville, MD, 20850, USA; Biomolecular Labeling Laboratory, IBBR, 9600 Gudelsky Drive, Rockville, MD, 20850, USA
- Zvi Kelman
  - Institute for Bioscience and Biotechnology Research (IBBR), The University of Maryland (UMD), 9600 Gudelsky Drive, Rockville, MD, 20850, USA; National Institute of Standards and Technology (NIST), 9600 Gudelsky Drive, Rockville, MD, 20850, USA; Biomolecular Labeling Laboratory, IBBR, 9600 Gudelsky Drive, Rockville, MD, 20850, USA
3
Gao W, Chen X, He J, Sha A, Luo Y, Xiao W, Xiong Z, Li Q. Intraspecific and interspecific variations in the synonymous codon usage in mitochondrial genomes of 8 Pleurotus strains. BMC Genomics 2024; 25:456. [PMID: 38730418 PMCID: PMC11084086 DOI: 10.1186/s12864-024-10374-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 05/03/2024] [Indexed: 05/12/2024] Open
Abstract
In this study, we investigated the codon bias of twelve mitochondrial core protein-coding genes (PCGs) in eight Pleurotus strains, two of which are from the same species. The results revealed that the codons of all Pleurotus strains had a preference for ending in A/T. Furthermore, correlations were detected between codon base composition and the codon adaptation index (CAI), codon bias index (CBI), and frequency of optimal codons (FOP), implying that base composition influences codon bias. The two P. ostreatus strains were found to differ in various base bias indicators. The average effective number of codons (ENC) of the mitochondrial core PCGs of Pleurotus was less than 35, indicating a strong codon preference. The neutrality plot analysis and PR2-bias plot analysis further suggested that natural selection plays an important role in Pleurotus codon bias. Additionally, six to ten optimal codons (ΔRSCU > 0.08 and RSCU > 1) were identified in the eight Pleurotus strains, with UGU and ACU being the most widely used optimal codons in Pleurotus. Finally, based on the combined mitochondrial sequence and RSCU values, the genetic relationships between the Pleurotus strains were deduced, showing large variations between them. This research improves our understanding of the synonymous codon usage characteristics and evolution of this important fungal group.
Affiliation(s)
- Wei Gao
  - Clinical Medical College & Affiliated Hospital of Chengdu University, Chengdu University, Chengdu, Sichuan, China
- Xiaodie Chen
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Jing He
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Ajia Sha
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Yingyong Luo
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Wenqi Xiao
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Zhuang Xiong
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
- Qiang Li
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China
  - School of Food and Biological Engineering, Chengdu University, 2025 # Chengluo Avenue, Longquanyi District, Chengdu, Sichuan, 610106, China
4
Sawyer EB, Cortes T. Ribosome profiling enhances understanding of mycobacterial translation. Front Microbiol 2022; 13:976550. [PMID: 35992675 PMCID: PMC9386245 DOI: 10.3389/fmicb.2022.976550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 07/22/2022] [Indexed: 11/21/2022] Open
Abstract
A recent addition to the -omics toolkit, ribosome profiling, enables researchers to gain insight into the process and regulation of translation by mapping fragments of mRNA protected from nuclease digestion by ribosome binding. In this review, we discuss how ribosome profiling applied to mycobacteria has led to discoveries about translational regulation. Using case studies, we show that the traditional view of “canonical” translation mechanisms needs expanding to encompass features of mycobacterial translation that are more widespread than previously recognized. We also discuss the limitations of the method and potential future developments that could yield further insight into the fundamental biology of this important human pathogen.
Affiliation(s)
- Elizabeth B. Sawyer
  - School of Life Sciences, University of Westminster, London, United Kingdom
  - *Correspondence: Elizabeth B. Sawyer,
- Teresa Cortes
  - Pathogen Gene Regulation Unit, Instituto de Biomedicina de Valencia (IBV), CSIC, Valencia, Spain
  - Teresa Cortes,
5
Comparative Plastome Analysis of Three Amaryllidaceae Subfamilies: Insights into Variation of Genome Characteristics, Phylogeny, and Adaptive Evolution. BioMed Research International 2022; 2022:3909596. [PMID: 35372568 PMCID: PMC8970886 DOI: 10.1155/2022/3909596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Revised: 01/19/2022] [Accepted: 02/05/2022] [Indexed: 11/17/2022]
Abstract
In the latest APG IV classification system, Amaryllidaceae is placed in the order Asparagales and includes three subfamilies, Agapanthoideae, Allioideae, and Amaryllidoideae, which include many economically important crops. With the development of molecular phylogenetics, research on the phylogenetic relationships of Amaryllidaceae has become more convenient. However, a comparative analysis of Amaryllidaceae at the whole chloroplast genome level is still lacking. In this study, we sequenced 18 Allioideae plastomes and combined them with publicly available data (a total of 41 plastomes), including 21 Allioideae species, 1 Agapanthoideae species, 14 Amaryllidoideae species, and 5 Asparagaceae species. Comparative analyses covered basic characteristics of genome structure, codon usage, repeat elements, IR boundaries, and genome divergence. Phylogenetic relationships were inferred using single-copy genes (SCGs) and ribosomal internal transcribed spacer (ITS) sequences, and the branch-site model was employed for the positive selection analysis. The results indicated that all Amaryllidaceae species showed a highly conserved, typical quadripartite structure. The GC content and five codon usage indexes in Allioideae species were lower than those in the other two subfamilies. Comparison of the Bayesian and ML phylogenies based on SCGs strongly supports the monophyly of the three subfamilies and the sister relationships among them. In addition, positively selected genes (PSGs) were detected in each of the three subfamilies. Almost all genes with significant posterior probabilities for codon sites were associated with self-replication and photosynthesis. Our study investigated the three subfamilies of Amaryllidaceae at the whole chloroplast genome level and suggests a key role for selective pressure in the adaptation and evolution of Amaryllidaceae.
6
Chembath A, Wagstaffe BPG, Ashraf M, Amaral MMF, Frigotto L, Hine AV. Nondegenerate Saturation Mutagenesis: Library Construction and Analysis via MAX and ProxiMAX Randomization. Methods Mol Biol 2022; 2461:19-41. [PMID: 35727442 DOI: 10.1007/978-1-0716-2152-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Protein engineering can enhance desirable features and improve performance outside of the natural context. Several strategies have been adopted over the years for gene diversification, and engineering of modular proteins in particular is most effective when a high-throughput, library-based approach is employed. Nondegenerate saturation mutagenesis plays a dynamic role in engineering proteins by targeting multiple codons to generate massively diverse gene libraries. Herein, we describe the nondegenerate saturation mutagenesis techniques that we have developed for contiguous (ProxiMAX) and noncontiguous (MAX) randomized codon generation to create precisely defined, diverse gene libraries, in the context of other fully nondegenerate strategies. ProxiMAX randomization comprises saturation cycling with repeated cycles of blunt-ended ligation, type IIS restriction, and PCR amplification, and is now a commercially automated process predominantly used for antibody library generation. MAX randomization encompasses a manual process of selective hybridisation between individual custom oligonucleotide mixes and a conventionally randomized template and is principally employed in the research laboratory setting, to engineer alpha helical proteins and active sites of enzymes. DNA libraries generated using either technology create high-throughput amino acid substitutions via codon randomization, to generate genetically diverse clones.
Affiliation(s)
- Anupama Chembath
  - College of Health and Life Sciences, Aston University, Aston Triangle, Birmingham, UK
- Mohammed Ashraf
  - College of Health and Life Sciences, Aston University, Aston Triangle, Birmingham, UK
- Marta M Ferreira Amaral
  - College of Health and Life Sciences, Aston University, Aston Triangle, Birmingham, UK
  - Bicycle Therapeutics, Cambridge, UK
- Anna V Hine
  - College of Health and Life Sciences, Aston University, Aston Triangle, Birmingham, UK
7
Li YM, Gao JQ, Pei XZ, Du C, Fan C, Yuan WJ, Bai FW. Production of L-alanyl-L-glutamine by immobilized Pichia pastoris GS115 expressing α-amino acid ester acyltransferase. Microb Cell Fact 2019; 18:27. [PMID: 30711013 PMCID: PMC6359838 DOI: 10.1186/s12934-019-1077-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/20/2018] [Accepted: 01/29/2019] [Indexed: 11/10/2022] Open
Abstract
Background: L-Alanyl-L-glutamine (Ala-Gln) has great potential for clinical application owing to its unique physicochemical properties. A new approach was developed to synthesize Ala-Gln with recombinant Escherichia coli OPA, which overcomes the disadvantages of traditional chemical synthesis. Although satisfactory results were obtained with recombinant E. coli OPA, endotoxin and the use of multiple antibiotics together with a toxic inducer pose a potential biosafety hazard for the clinical application of Ala-Gln.
Results: In this study, the safer host Pichia pastoris was used as an alternative to E. coli. A recombinant P. pastoris strain (named GPA) carrying the original gene of α-amino acid ester acyltransferase (SsAet) from Sphingobacterium siyangensis SY1 was constructed to produce Ala-Gln. To improve the expression efficiency of SsAet in P. pastoris, codon optimization was conducted to obtain the strain GPAp. Ala-Gln production by GPAp was approximately 2.5-fold that of GPA. The optimal induction conditions (cultivation for 3 days at 26 °C with a daily 1.5% methanol supplement), the optimum reaction conditions (28 °C and pH 8.5), and suitable substrate conditions (AlaOMe/Gln = 1.5/1) were also determined for GPAp. Although most metal ions had no effect, the catalytic activity of GPAp decreased slightly in the presence of Fe3+ and increased markedly when cysteine or PMSF was added. Under the optimum conditions, Ala-Gln generation by GPAp reached a maximum molar yield of 63.5%, and the catalytic activity of agar-embedded GPAp remained highly stable after 10 cycles.
Conclusions: Characterized by economy, efficiency, and practicability, production of Ala-Gln by recycling immobilized GPAp (a whole-cell biocatalyst) represents a green and promising approach for industrial application.
Electronic supplementary material: The online version of this article (10.1186/s12934-019-1077-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yi-Min Li
  - School of Life Science and Biotechnology, Dalian University of Technology, Dalian, 116024, China
- Jiao-Qi Gao
  - Division of Biotechnology, Dalian Institute of Chemical Physics, Dalian, 116023, China
- Xu-Ze Pei
  - School of Life Science and Biotechnology, Dalian University of Technology, Dalian, 116024, China
- Cong Du
  - School of Life Science and Biotechnology, Dalian University of Technology, Dalian, 116024, China
- Chao Fan
  - Research and Development Center, Dalian Innobio Corporation Limited, Dalian, 116600, China
- Wen-Jie Yuan
  - School of Life Science and Biotechnology, Dalian University of Technology, Dalian, 116024, China
- Feng-Wu Bai
  - School of Life Science and Biotechnology, Shanghai Jiaotong University, Shanghai, 200240, China
8
Huang F, Niu Y, Liu Z, Liu W, Li X, Tan H, Yang Y. An E3 ubiquitin ligase from Brassica napus induces a typical heat-shock response in its own way in Escherichia coli. Acta Biochim Biophys Sin (Shanghai) 2017; 49:262-269. [PMID: 28399214 DOI: 10.1093/abbs/gmx004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/15/2016] [Indexed: 11/14/2022] Open
Abstract
Previously, we identified a novel E3 ubiquitin ligase, BNTR1, which plays a key role in the heat stress response in Brassica napus. In this study, we unexpectedly found that BNTR1 can also improve thermal tolerance and reduce growth inhibition at 42°C in Escherichia coli, in a manner different from that in plants. We show that BNTR1 activates the E. coli heat-shock response at low concentration in soluble form rather than in inclusion bodies, but BNTR1 does not function as a heat-shock protein (HSP), because deficient temperature-sensitive mutants of HSP genes display inconspicuous thermal tolerance in the presence of BNTR1. Our further studies show that BNTR1 triggers the heat-shock response by competing with σ32 (the heat-shock transcription factor) for its binding proteins DnaJ (HSP40) and DnaK (HSP70), which results in the release and accumulation of σ32, thereby promoting the heat-shock response even under non-heat-shock conditions. Accumulation of the HSPs induced by BNTR1 at 37°C could make cells much more tolerant at 42°C than those without BNTR1. Thus, our results suggest that BNTR1 may be a promising target in the fermentation industry, where E. coli serves as a bioreactor, for reducing the impact of temperature fluctuation.
Affiliation(s)
- Fei Huang
  - Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Yulong Niu
  - Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Zhibin Liu
  - Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Weifeng Liu
  - CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Xufeng Li
  - Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Hong Tan
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610064, China
- Yi Yang
  - Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
9
Szitenberg A, Cha S, Opperman CH, Bird DM, Blaxter ML, Lunt DH. Genetic Drift, Not Life History or RNAi, Determine Long-Term Evolution of Transposable Elements. Genome Biol Evol 2016; 8:2964-2978. [PMID: 27566762 PMCID: PMC5635653 DOI: 10.1093/gbe/evw208] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Accepted: 08/20/2016] [Indexed: 12/11/2022] Open
Abstract
Transposable elements (TEs) are a major source of genome variation across the branches of life. Although TEs may play an adaptive role in their host's genome, they are more often deleterious, and purifying selection is an important factor controlling their genomic loads. In contrast, life history, mating system, GC content, and RNAi pathways have been suggested to account for the disparity of TE loads in different species. Previous studies of fungal, plant, and animal genomes have reported conflicting results regarding the direction in which these genomic features drive TE evolution. Many of these studies have had limited power, however, because they studied taxonomically narrow systems, comparing only a limited number of phylogenetically independent contrasts, and did not address long-term effects on TE evolution. Here, we test the long-term determinants of TE evolution by comparing 42 nematode genomes spanning over 500 million years of diversification. This analysis includes numerous transitions between life history states and RNAi pathways, and evaluates whether these forces are sufficiently persistent to affect the long-term evolution of TE loads in eukaryotic genomes. Although we demonstrate statistical power to detect selection, we find no evidence that variation in these factors influences genomic TE loads across extended periods of time. In contrast, the effects of genetic drift appear to persist and control TE variation among species. We suggest that variation in the tested factors is largely inconsequential to the large differences in TE content observed between genomes, and that only through such large-scale comparisons can we distinguish long-term and persistent effects from transient or random changes.
Affiliation(s)
- Amir Szitenberg
  - Evolutionary Biology Group, School of Environmental Sciences, University of Hull, England, United Kingdom; The Dead Sea and Arava Science Center, Israel
- Soyeon Cha
  - Department of Plant Pathology, North Carolina State University
- David M Bird
  - Department of Plant Pathology, North Carolina State University
- Mark L Blaxter
  - School of Biological Sciences, Institute of Evolutionary Biology, University of Edinburgh, Scotland
- David H Lunt
  - Evolutionary Biology Group, School of Environmental Sciences, University of Hull, England, United Kingdom
10
Yu K, Yu Y, Tang X, Chen H, Xiao J, Su XD. Transcriptome analyses of insect cells to facilitate baculovirus-insect expression. Protein Cell 2016; 7:373-82. [PMID: 27017378 PMCID: PMC4853316 DOI: 10.1007/s13238-016-0260-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/31/2016] [Accepted: 02/27/2016] [Indexed: 12/03/2022] Open
Abstract
The High Five cell line (BTI-TN-5B1-4), isolated from the cabbage looper, Trichoplusia ni, is an insect cell line widely used for baculovirus-mediated recombinant protein expression. Despite its widespread application in industry and academic laboratories, the genomic background of this cell line remains unclear. Here we sequenced the transcriptome of High Five cells and assembled 25,234 transcripts. Codon usage analysis showed that High Five cells have a robust codon usage capacity and are therefore suitable for expressing proteins of both eukaryotic and prokaryotic origin. Genes involved in glycosylation were profiled in our study, providing guidance for engineering glycosylated proteins in insect cells. We also predicted signal peptides for transcripts with high expression abundance in both High Five and Sf21 cell lines, and these results have important implications for optimizing the expression level of some secretory and membrane proteins.
Affiliation(s)
- Kai Yu
  - Biodynamic Optical Imaging Center, School of Life Science, Peking University, Beijing, 100871, China; State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China
- Yang Yu
  - Biodynamic Optical Imaging Center, School of Life Science, Peking University, Beijing, 100871, China; State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China
- Xiaoyan Tang
  - Biodynamic Optical Imaging Center, School of Life Science, Peking University, Beijing, 100871, China; State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China
- Huimin Chen
  - Biodynamic Optical Imaging Center, School of Life Science, Peking University, Beijing, 100871, China; State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China
- Junyu Xiao
  - State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
- Xiao-Dong Su
  - Biodynamic Optical Imaging Center, School of Life Science, Peking University, Beijing, 100871, China; State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, 100871, China
11
Li S, Yang J. System analysis of synonymous codon usage biases in archaeal virus genomes. J Theor Biol 2014; 355:128-39. [PMID: 24685889 PMCID: PMC7094158 DOI: 10.1016/j.jtbi.2014.03.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/29/2013] [Revised: 03/11/2014] [Accepted: 03/12/2014] [Indexed: 12/30/2022]
Abstract
Recent studies of geothermally heated aquatic ecosystems have found widely divergent viruses with unusual morphotypes. Archaeal viruses isolated from these hot habitats usually have double-stranded DNA genomes, linear or circular, and can infect members of the Archaea domain. In this study, the synonymous codon usage bias (SCUB) and dinucleotide composition in the available complete archaeal virus genome sequences were investigated. SCUB was found to vary significantly among archaeal virus species, mainly as a function of base composition. Correspondence analysis (COA) and Spearman's rank correlation analysis show that codon usage of the selected archaeal virus genes depends mainly on the GC richness of the genome, and that gene function also contributes to codon usage in these viruses, albeit with smaller effects. Furthermore, this investigation reveals that the aromaticity of each protein is also critical in affecting the SCUB of these viral genes, although it is less important than mutational bias. In particular, mutational pressure may influence SCUB in the SIRV1, SIRV2, ARV1, AFV1, and PhiCh1 viruses, whereas translational selection could play a leading role in HRPV1's SCUB. These conclusions not only offer insight into the codon usage biases of archaeal viruses, and thereby into the possible relationship between archaeal viruses and their hosts, but may also help in understanding the evolution of archaeal viruses and their gene classification, and in exploring the origin of life and the evolution of biology.
- The SCUB of archaeal virus genes depends mainly on GC richness of genome.
- The mutational pressure is the main factor that influences SCUB.
- The aromaticity of each protein is also critical in affecting SCUB.
- The translational selection could play a leading role in HRPV1's SCUB.
- The mode is helpful to explore the origin of life and the evolution of biology.
Affiliation(s)
- Sen Li
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Science, Nanjing University, Nanjing 210093, China
- Jie Yang
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Science, Nanjing University, Nanjing 210093, China
12
Dalmasso MC, Carmona SJ, Angel SO, Agüero F. Characterization of Toxoplasma gondii subtelomeric-like regions: identification of a long-range compositional bias that is also associated with gene-poor regions. BMC Genomics 2014; 15:21. [PMID: 24417889 PMCID: PMC4008256 DOI: 10.1186/1471-2164-15-21] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/14/2013] [Accepted: 01/02/2014] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: Chromosome ends are composed of telomeric repeats and subtelomeric regions, which are patchworks of genes interspersed with repeated elements. Although chromosome ends display similar arrangements in different species, their sequences are highly divergent. In addition, these regions display a particular nucleosomal composition and bind specific factors, therefore producing a special kind of heterochromatin. Using data from currently available draft genomes, we have characterized these putative Telomeric Associated Sequences in Toxoplasma gondii.
RESULTS: An all-vs-all pairwise comparison of T. gondii assembled chromosomes revealed the presence of conserved regions of ~30 kb located near the ends of 9 of the 14 chromosomes of the genome of the ME49 strain. Sequence similarity among these regions is ~70%, and they are also highly conserved in the GT1 and VEG strains. However, they are unique to Toxoplasma, with no detectable similarity in other Apicomplexan parasites. The internal structure of these sequences consists of 3 repetitive regions separated by high-complexity sequences without annotated genes, except for a gene from the Toxoplasma Specific Family. ChIP-qPCR experiments showed that nucleosomes associated with these sequences are enriched in histone H4 monomethylated at K20 (H4K20me1) and in the histone variant H2A.X, suggesting that they are silenced sequences (heterochromatin). A detailed characterization of the base composition of these sequences led us to identify a strong long-range compositional bias, which was similar to that observed in other silenced genomic fragments, such as those containing centromeric sequences, and was negatively correlated with gene density.
CONCLUSIONS: We identified and characterized a region present in most Toxoplasma assembled chromosomes. Based on their location, sequence features, and nucleosomal markers, we propose that these regions might be part of the subtelomeric regions of T. gondii. The identified regions display a unique trinucleotide compositional bias, which is shared (despite the lack of any detectable sequence similarity) with other silenced sequences, such as those making up the chromosome centromeres. We also identified other genomic regions with this compositional bias (but no detectable sequence similarity) that might be functionally similar.
Affiliation(s)
- Sergio O Angel
- Instituto de Investigaciones Biotecnológicas - Instituto Tecnológico de Chascomús, UNSAM - CONICET, Sede Chascomús, Av, Intendente Marino Km 8, 2 CC 164, B 7130 IWA, Chascomús, Argentina.
13
Bhattacharjee S. Role of genomic and proteomic tools in the study of host-virus interactions and virus evolution. INDIAN JOURNAL OF VIROLOGY : AN OFFICIAL ORGAN OF INDIAN VIROLOGICAL SOCIETY 2013; 24:306-11. [PMID: 24426292 PMCID: PMC3832694 DOI: 10.1007/s13337-013-0150-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/28/2013] [Accepted: 07/24/2013] [Indexed: 01/05/2023]
Abstract
Viruses have short replication cycles and produce genomic variants within a host, a process that seems to adapt them to their specific host and also enables them to infect new hosts. The recent emergence of viral genomic variants from the circulating pool within the host population, and the re-emergence of old ones, pose a serious threat to agriculture, animal husbandry and humanity as a whole. This review assesses the potential role of genomic and proteomic tools that can not only monitor the course of infection and pathogenesis, but also predict the pandemic or zoonotic epidemic potential of a virus in a previously exposed or immunologically naive biological population.
Affiliation(s)
- Soumen Bhattacharjee
- Cell and Molecular Biology Laboratory, Department of Zoology, University of North Bengal, Raja Rammohunpur, P.O. North Bengal University, Siliguri, 734 013 District Darjeeling, West Bengal, India
14
Protection of chickens against reticuloendotheliosis virus infection by DNA vaccination. Vet Microbiol 2013; 166:59-67. [DOI: 10.1016/j.vetmic.2013.04.031] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/30/2013] [Revised: 04/25/2013] [Accepted: 04/30/2013] [Indexed: 11/22/2022]
15
Li K, Gao L, Gao H, Qi X, Gao Y, Qin L, Wang Y, Wang X. Codon optimization and woodchuck hepatitis virus posttranscriptional regulatory element enhance the immune responses of DNA vaccines against infectious bursal disease virus in chickens. Virus Res 2013; 175:120-7. [PMID: 23631937 DOI: 10.1016/j.virusres.2013.04.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/30/2013] [Revised: 04/15/2013] [Accepted: 04/17/2013] [Indexed: 11/18/2022]
Abstract
The present study was undertaken to evaluate the protective efficacy of DNA vaccines against infectious bursal disease virus (IBDV) in chickens and to determine whether codon optimization and the woodchuck hepatitis virus posttranscriptional regulatory element (WPRE) could improve the immunogenicity of the DNA vaccines. The VP2, VP243 and codon-optimized VP243 genes of IBDV were cloned into the pCAGGS vector and designated pCAGVP2, pCAGVP243 and pCAGoptiVP243, respectively. Plasmids pCAGWVP243 and pCAGWoptiVP243, carrying the WPRE element, were also constructed as DNA vaccines. To evaluate vaccine efficacy, 2-week-old chickens were injected intramuscularly with the constructed plasmids twice at 2-week intervals and challenged with very virulent IBDV 2 weeks post-boost. Plasmid pCAGVP243 induced better immune responses than pCAGVP2. Chickens immunized with pCAGoptiVP243 or pCAGWVP243 had higher antibody titers, lymphoproliferative responses and cytokine production than those immunized with pCAGVP243. Furthermore, plasmid pCAGWoptiVP243 induced the highest levels of immune responses among the groups. After challenge, DNA vaccines pCAGVP2, pCAGVP243, pCAGoptiVP243, pCAGWVP243 and pCAGWoptiVP243 conferred protection for 33%, 60%, 80%, 87% and 100% of chickens, respectively, as evidenced by the absence of clinical signs, mortality, and bursal atrophy. These results indicate that codon optimization and WPRE could enhance the protective efficacy of DNA vaccines against IBDV and that these two approaches could work together synergistically in a single DNA vaccine.
MESH Headings
- Animals
- Antibodies, Viral/blood
- Birnaviridae Infections/mortality
- Birnaviridae Infections/pathology
- Birnaviridae Infections/prevention & control
- Cell Proliferation
- Chickens
- Cytokines/metabolism
- Gene Expression
- Hepatitis B Virus, Woodchuck/genetics
- Infectious bursal disease virus/genetics
- Infectious bursal disease virus/immunology
- Injections, Intramuscular
- Leukocytes, Mononuclear/immunology
- Protein Biosynthesis
- Regulatory Elements, Transcriptional
- Survival Analysis
- Vaccination/methods
- Vaccines, DNA/administration & dosage
- Vaccines, DNA/genetics
- Vaccines, DNA/immunology
- Vaccines, Synthetic/administration & dosage
- Vaccines, Synthetic/genetics
- Vaccines, Synthetic/immunology
- Viral Vaccines/administration & dosage
- Viral Vaccines/genetics
- Viral Vaccines/immunology
Affiliation(s)
- Kai Li
- Division of Avian Infectious Diseases, State Key Laboratory of Veterinary Biotechnology, Harbin Veterinary Research Institute, the Chinese Academy of Agricultural Sciences, Harbin 150001, PR China
16
17
Comparative pathogenomics of bacteria causing infectious diseases in fish. INTERNATIONAL JOURNAL OF EVOLUTIONARY BIOLOGY 2012; 2012:457264. [PMID: 22675651 PMCID: PMC3364575 DOI: 10.1155/2012/457264] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Received: 02/06/2012] [Accepted: 03/20/2012] [Indexed: 11/18/2022]
Abstract
Fish living in the wild, as well as those reared in aquaculture facilities, are susceptible to infectious diseases caused by a phylogenetically diverse collection of bacterial pathogens. Control and treatment options using vaccines and drugs are either inadequate, inefficient, or impracticable. The classical approach to studying fish bacterial pathogens has been to examine one or a few virulence factors. Recently, genome sequencing of a number of bacterial fish pathogens has tremendously increased our understanding of the biology, host adaptation, and virulence factors of these important pathogens. This paper attempts to compile the scattered literature on genome sequence information of fish pathogenic bacteria published and available to date. Genome sequencing has uncovered several complex adaptive evolutionary strategies operating in fish pathogens, mediated by horizontal gene transfer, insertion sequence elements, mutations and prophage sequences, and has shown how their genomes evolved from generalist environmental strains to highly virulent obligatory pathogens. In addition, comparative genomics has allowed the identification of unique pathogen-specific gene clusters. The paper focuses on the comparative analysis of the virulogenomes of important fish bacterial pathogens and on the genes involved in their evolutionary adaptation to different ecological niches. The paper also proposes some new directions for finding novel vaccine and chemotherapeutic targets in the genomes of bacterial pathogens of fish.
18
Wald N, Alroy M, Botzman M, Margalit H. Codon usage bias in prokaryotic pyrimidine-ending codons is associated with the degeneracy of the encoded amino acids. Nucleic Acids Res 2012; 40:7074-83. [PMID: 22581775 PMCID: PMC3424539 DOI: 10.1093/nar/gks348] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open
Abstract
Synonymous codons are unevenly distributed among genes, a phenomenon termed codon usage bias. Understanding the patterns of codon bias and the forces shaping them is a major step towards elucidating the adaptive advantage codon choice can confer at the level of individual genes and organisms. Here, we perform a large-scale analysis to assess the codon usage bias pattern of pyrimidine-ending codons in highly expressed genes in prokaryotes. We find a bias pattern linked to the degeneracy of the encoded amino acid. Specifically, we show that codon-pairs that encode two- and three-fold degenerate amino acids are biased towards the C-ending codon, while codons encoding four-fold degenerate amino acids are biased towards the U-ending codon. This codon usage pattern is widespread in prokaryotes, and its strength is correlated with translational selection both within and between organisms. We show that this bias is associated with an improved correspondence with the tRNA pool, avoidance of mis-incorporation errors during translation and moderate stability of codon–anticodon interaction, all consistent with more efficient translation.
Affiliation(s)
- Naama Wald
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 91120, Israel
19
Das S, Roymondal U, Chottopadhyay B, Sahoo S. Gene expression profile of the cynobacterium synechocystis genome. Gene 2012; 497:344-52. [PMID: 22310391 DOI: 10.1016/j.gene.2012.01.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 11/29/2011] [Accepted: 01/19/2012] [Indexed: 11/26/2022]
Abstract
The expression of functional proteins plays a crucial role in modern biotechnology. The free-living cyanobacterium Synechocystis PCC 6803 is an interesting model organism to study oxygenic photosynthesis as well as other metabolic processes. Here we analyze a gene expression profiling methodology, RCBS (the score of relative codon usage bias), to elucidate expression patterns of genes in the Synechocystis genome. To assess the predictive performance of the methodology, we propose a simple algorithm to calculate the threshold score to identify the highly expressed genes in a genome. Analysis of differential expression of the genes of this genome reveals that most of the genes in photosynthesis and respiration belong to the highly expressed category. The other genes with higher predicted expression levels include ribosomal proteins, translation processing factors and many hypothetical proteins. Only 9.5% of genes are identified as highly expressed, and we observe that highly expressed genes in the Synechocystis genome often have strong compositional bias in terms of codon usage. An important application concerns the automatic detection of a set of impact codons: highly expressed genes tend to use this narrow set of preferred codons and display high codon bias. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favor of shorter genes being expressed at a higher level. The better correlations of RCBS with 2D electrophoresis and microarray data for heat shock proteins, compared with the expression measure based on codon usage difference, E(g), and the codon adaptation index, CAI, indicate that the genomic expression profile available in our method can be applied in a meaningful way to study the mRNA expression patterns, which are by themselves necessary for the quantitative description of the biological states.
Affiliation(s)
- Shibsankar Das
- Department of Mathematics, Uluberia College, Uluberia, Howrah, India.
20
Badalucco L, Poudel I, Yamanishi M, Natarajan C, Moriyama H. Crystallization of Chlorella deoxyuridine triphosphatase. Acta Crystallogr Sect F Struct Biol Cryst Commun 2011; 67:1599-602. [PMID: 22139176 PMCID: PMC3232149 DOI: 10.1107/s1744309111038097] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/07/2011] [Accepted: 09/18/2011] [Indexed: 11/10/2022]
Abstract
Deoxyuridine triphosphatase (dUTPase) is a ubiquitous enzyme that has been widely studied owing to its function and evolutionary significance. The gene coding for the dUTPase from the Chlorella alga was codon-optimized and synthesized. The synthetic gene was expressed in Escherichia coli and recombinant core Chlorella dUTPase (chdUTPase) was purified. Crystallization of chdUTPase was performed by the repetitive hanging-drop vapor-diffusion method at 298 K with ammonium sulfate as the precipitant. In the presence of 2'-deoxyuridine-5'-[(α,β)-imido]triphosphate and magnesium, the enzyme produced die-shaped hexagonal R3 crystals with unit-cell parameters a = b = 66.9, c = 93.6 Å, γ = 120°. X-ray diffraction data for chdUTPase were collected to 1.6 Å resolution. The crystallization of chdUTPase with manganese resulted in very fragile clusters of needles.
Affiliation(s)
- Laura Badalucco
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE 68588-0118, USA
21
22
Welch M, Villalobos A, Gustafsson C, Minshull J. Designing genes for successful protein expression. Methods Enzymol 2011; 498:43-66. [PMID: 21601673 DOI: 10.1016/b978-0-12-385120-8.00003-6] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]
Abstract
DNA sequences are now far more readily available in silico than as physical DNA. De novo gene synthesis is an increasingly cost-effective method for building genetic constructs, and effectively removes the constraint of basing constructs on extant sequences. This allows scientists and engineers to experimentally test their hypotheses relating sequence to function. Molecular biologists, and now synthetic biologists, are characterizing and cataloging genetic elements with specific functions, aiming to combine them to perform complex functions. However, the most common purpose of synthetic genes is for the expression of an encoded protein. The huge number of different proteins makes it impossible to characterize and catalog each functional gene. Instead, it is necessary to abstract design principles from experimental data: data that can be generated by making predictions followed by synthesizing sequences to test those predictions. Because of the degeneracy of the genetic code, design of gene sequences to encode proteins is a high-dimensional problem, so there is no single simple formula to guarantee success. Nevertheless, there are several straightforward steps that can be taken to greatly increase the probability that a designed sequence will result in expression of the encoded protein. In this chapter, we discuss gene sequence parameters that are important for protein expression. We also describe algorithms for optimizing these parameters, and troubleshooting procedures that can be helpful when initial attempts fail. Finally, we show how many of these methods can be accomplished using the synthetic biology software tool Gene Designer.
Affiliation(s)
- Mark Welch
- DNA2.0, Inc., Suite A, Menlo Park, California, USA
23
Jiang Y, Zhang H, Wang G, Zhang P, Tian G, Bu Z, Chen H. Protective Efficacy of H7 Subtype Avian Influenza DNA Vaccine. Avian Dis 2010; 54:290-3. [DOI: 10.1637/8723-032409-resnote.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
24
Cohanim AB, Haran TE. The coexistence of the nucleosome positioning code with the genetic code on eukaryotic genomes. Nucleic Acids Res 2009; 37:6466-76. [PMID: 19700771 PMCID: PMC2770662 DOI: 10.1093/nar/gkp689] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 01/17/2023] Open
Abstract
It is known that several codes reside simultaneously on the DNA double helix. The two best-characterized codes are the genetic code (the code for protein production) and the code for DNA packaging into nucleosomes. Since these codes have to coexist on the same DNA region, both must be degenerate to allow this coexistence. A-tracts are homopolymeric stretches of several adjacent deoxyadenosines on one strand of the double helix that have unusual structural properties; they were shown to exclude nucleosomes and as such are instrumental in setting the translational positioning of DNA within nucleosomes. We observe, across kingdoms, a strong codon bias toward the avoidance of long A-tracts in exon regions, which enables the formation of a high density of nucleosomes in these regions. Moreover, long A-tract avoidance is restricted exclusively to nucleosome-occupied exon regions. We show that this bias in codon usage is sufficient for enabling DNA organization within nucleosomes without constraints on the actual code for proteins. Thus, there is inter-dependency of the two major codes within DNA to allow their coexistence. Furthermore, we show that modulation of A-tract occurrences in exon versus non-exon regions may result in a unique alternation of the diameter of the '30-nm' fiber model.
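As a rough illustration of what counting A-tracts in a sequence could look like, the following is a minimal Python sketch. It is not the authors' method: the function names and the length threshold `min_len` are hypothetical, since the abstract speaks only of "long" A-tracts without giving a cutoff. Because an A-tract sits on one strand of the double helix, a run of T on the given strand is treated here as an A-tract on the complementary strand; that choice is also an assumption.

```python
import re


def find_a_tracts(seq, min_len=5, both_strands=True):
    """Return (start, length) for each homopolymeric A run of at least min_len.

    min_len is an illustrative threshold, not a value taken from the paper.
    With both_strands=True, T runs are also reported, since they correspond
    to A-tracts on the complementary strand.
    """
    if both_strands:
        pattern = re.compile("A{%d,}|T{%d,}" % (min_len, min_len))
    else:
        pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


def a_tract_density(seq, min_len=5, both_strands=True):
    """Number of A-tracts per 1000 bases; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return 1000.0 * len(find_a_tracts(seq, min_len, both_strands)) / len(seq)
```

Comparing `a_tract_density` between exon and non-exon sequences would be one simple way to look for the kind of avoidance the abstract describes, under the assumptions stated above.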
Affiliation(s)
- Amir B Cohanim
- Department of Biology, Technion, Technion City, Haifa 32000, Israel
25
Das S, Roymondal U, Sahoo S. Analyzing gene expression from relative codon usage bias in Yeast genome: a statistical significance and biological relevance. Gene 2009; 443:121-31. [PMID: 19410638 DOI: 10.1016/j.gene.2009.04.022] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 11/22/2008] [Revised: 03/08/2009] [Accepted: 04/20/2009] [Indexed: 11/17/2022]
Abstract
Based on the hypothesis that highly expressed genes are often characterized by strong compositional bias in terms of codon usage, a number of measures currently in use quantify codon usage bias in genes and hence provide numerical indices to predict the expression levels of genes. With the recent advent of an expression measure based on the score of relative codon usage bias (RCBS), we have explicitly tested the performance of this numerical measure in predicting gene expression level and illustrate this with an analysis of the yeast genome. In contrast to previous studies, we observe a weak correlation between GC content and RCBS, but a selective pressure on the codon preferences in highly expressed genes. The assertion that the expression of a given gene depends on the score of relative codon usage bias (RCBS) is supported by the data. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at a higher level. We also attempt a statistical analysis to assess the strength of relative codon bias in genes as a guide to their likely expression level, suggesting a decrease of the informational entropy in the highly expressed genes.
Affiliation(s)
- Shibsankar Das
- Department of Mathematics, Uluberia College, Uluberia, Howrah, W.B., India
26
Welch M, Villalobos A, Gustafsson C, Minshull J. You're one in a googol: optimizing genes for protein expression. J R Soc Interface 2009; 6 Suppl 4:S467-76. [PMID: 19324676 DOI: 10.1098/rsif.2008.0520.focus] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.0] [Indexed: 11/12/2022] Open
Abstract
A vast number of different nucleic acid sequences can all be translated by the genetic code into the same amino acid sequence. These sequences are not all equally useful, however; the exact sequence chosen can have profound effects on the expression of the encoded protein. Despite the importance of protein-coding sequences, there has been little systematic study to identify parameters that affect expression. This is probably because protein expression has largely been tackled on an ad hoc basis in many independent projects: once a sequence has been obtained that yields adequate expression for that project, there is little incentive to continue work on the problem. Synthetic biology may now provide the impetus to transform protein expression folklore into design principles, so that DNA sequences may easily be designed to express any protein in any system. In this review, we offer a brief survey of the literature, outline the major challenges in interpreting existing data and constructing robust design algorithms, and propose a way to proceed towards the goal of rational sequence engineering.
Affiliation(s)
- Mark Welch
- DNA 2.0, Inc., 1430 O'Brien Drive, Menlo Park, CA 94025, USA
27
Roymondal U, Das S, Sahoo S. Predicting gene expression level from relative codon usage bias: an application to Escherichia coli genome. DNA Res 2009; 16:13-30. [PMID: 19131380 PMCID: PMC2646356 DOI: 10.1093/dnares/dsn029] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open
Abstract
We present an expression measure of a gene, devised to predict the level of gene expression from relative codon bias (RCB). There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition are strongly correlated, RCB has been defined to provide an intuitively meaningful measure of the extent of the codon preference in a gene. We outline a simple approach to assess the strength of RCB (RCBS) in genes as a guide to their likely expression levels and illustrate this with an analysis of the Escherichia coli (E. coli) genome. Our efforts to quantitatively predict gene expression levels in E. coli met with a high level of success. Surprisingly, we observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at a higher level. The agreement of our results with high protein abundances, microarray data and radioactive data demonstrates that the genomic expression profile available in our method can be applied in a meaningful way to the study of cell physiology and also to more detailed studies of particular genes of interest.
Affiliation(s)
- Uttam Roymondal
- Department of Mathematics, Raidighi College, South 24 Parganas, Raidighi, West Bengal, India
28
A consensus-hemagglutinin-based DNA vaccine that protects mice against divergent H5N1 influenza viruses. Proc Natl Acad Sci U S A 2008; 105:13538-43. [PMID: 18765801 DOI: 10.1073/pnas.0806901105] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.5] [Indexed: 12/28/2022] Open
Abstract
H5N1 influenza viruses have spread extensively among wild birds and domestic poultry. Cross-species transmission of these viruses to humans has been documented in over 380 cases, with a mortality rate of approximately 60%. There is great concern that an H5N1 virus would acquire the ability to spread efficiently between humans, thereby becoming a pandemic threat. An H5N1 influenza vaccine must, therefore, be an integral part of any pandemic preparedness plan. However, traditional methods of making influenza vaccines have yet to produce a candidate that could induce potently neutralizing antibodies against divergent strains of H5N1 influenza viruses. To address this need, we generated a consensus H5N1 hemagglutinin (HA) sequence based on data available in early 2006. This sequence was then optimized for protein expression before being inserted into a DNA plasmid (pCHA5). Immunizing mice with pCHA5, delivered intramuscularly via electroporation, elicited antibodies that neutralized a panel of virions that had been pseudotyped with the HA from various H5N1 viruses (clades 1, 2.1, 2.2, 2.3.2, and 2.3.4). Moreover, immunization with pCHA5 in mice conferred complete (clades 1 and 2.2) or significant (clade 2.1) protection from H5N1 virus challenges. We conclude that this vaccine, based on a consensus HA, could induce broad protection against divergent H5N1 influenza viruses and thus warrants further study.
29
Vladimirov NV, Likhoshvai VA, Matushkin YG. Correlation of codon biases and potential secondary structures with mRNA translation efficiency in unicellular organisms. Mol Biol 2007. [DOI: 10.1134/s0026893307050184] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
30
Zhou M, Tong C, Shi J. Analysis of Codon Usage Between Different Poplar Species. J Genet Genomics 2007; 34:555-61. [PMID: 17601615 DOI: 10.1016/s1673-8527(07)60061-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 07/28/2006] [Accepted: 09/14/2006] [Indexed: 10/23/2022]
Abstract
Codon usage is the selective and nonrandom use of synonymous codons to encode amino acids in protein-coding genes. The analysis of codon usage may improve the understanding of codon preferences between different species and allow the codons of exogenous genes to be redesigned to increase their expression efficiency. Here, coding DNA sequences (CDS) of four poplar species, including Populus tremuloides Michx., P. tomentosa Carr., P. deltoides Marsh., and P. trichocarpa Torr. & Gray., are used to analyze the relative frequency of synonymous codons (RFSC). High-frequency codons are selected by high-frequency (HF) codon analysis. The results indicate that codon usage is common to all four poplar species and that codon preference is quite similar among them. However, CCT coding for Pro and ACT coding for Thr are the preferred codons in P. tremuloides and P. tomentosa, whereas CCA coding for Pro and ACA coding for Thr are preferred in P. deltoides and P. trichocarpa. Codons such as TGC coding for Cys, TTC coding for Phe, and AAG coding for Lys are preferred in the poplar species except P. trichocarpa. GAG coding for Glu is preferred only in P. deltoides, while the other three poplar species prefer to use GAA. The commonness of preferred codons allows an exogenous gene designed with the preferred codons of one poplar species to be used in the other poplar species.
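To make the idea of a relative frequency of synonymous codons concrete, here is a small Python sketch. The abstract does not state the formula, so the definition used below is an assumption: the count of a codon divided by the total count of all codons for the same amino acid within the given CDS. The function names are hypothetical, and the synonymous-codon table is deliberately partial, covering only the amino acids the abstract mentions.

```python
from collections import Counter

# Partial table from the standard genetic code, limited to the amino acids
# named in the abstract; a full analysis would need all codon families.
SYNONYMOUS = {
    "Pro": ["CCT", "CCC", "CCA", "CCG"],
    "Thr": ["ACT", "ACC", "ACA", "ACG"],
    "Cys": ["TGT", "TGC"],
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
}


def rfsc(cds):
    """Assumed RFSC: codon count / total count of its synonymous family."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    result = {}
    for aa, family in SYNONYMOUS.items():
        total = sum(counts[c] for c in family)
        if total:  # skip amino acids absent from this sequence
            result[aa] = {c: counts[c] / total for c in family}
    return result


def preferred_codons(freqs):
    """Pick the most frequent codon per amino acid (first one wins on ties)."""
    return {aa: max(f, key=f.get) for aa, f in freqs.items()}
```

Running `preferred_codons(rfsc(cds))` on the pooled coding sequences of each species would give a per-species table that could be compared in the way the abstract describes, subject to the assumed definition.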
Affiliation(s)
- Meng Zhou
- The Key Laboratory of Forestry Genetics and Engineering of State Forestry Administration and Jiangsu Province, Nanjing Forestry University, Nanjing 210037, China
31
Jiang Y, Yu K, Zhang H, Zhang P, Li C, Tian G, Li Y, Wang X, Ge J, Bu Z, Chen H. Enhanced protective efficacy of H5 subtype avian influenza DNA vaccine with codon optimized HA gene in a pCAGGS plasmid vector. Antiviral Res 2007; 75:234-41. [PMID: 17451817 DOI: 10.1016/j.antiviral.2007.03.009] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 10/12/2006] [Revised: 01/24/2007] [Accepted: 03/16/2007] [Indexed: 11/29/2022]
Abstract
H5N1 influenza viruses have caused significant disease and deaths in various parts of the world in several species, including humans. Vaccination combined with culling can provide an attractive method for outbreak containment. Using synthesized oligos and overlapping extension PCR techniques, we constructed an H5 HA gene, optiHA, containing chicken-biased codons based on the HA amino acid sequence of the highly pathogenic H5N1 virus A/goose/Guangdong/1/96 (GS/GD/96). The optiHA and wild-type HA genes were inserted into plasmids pCI or pCAGGS and designated pCIoptiHA, pCAGGoptiHA, pCIHA and pCAGGHA, respectively. To evaluate vaccine efficacy, groups of 3-week-old specific pathogen free (SPF) chickens were intramuscularly injected with the four plasmids. Sera were collected on a weekly basis post-vaccination (p.v.) for hemagglutination inhibition (HI) assays and neutralization (NT) antibody detection. All chickens receiving pCAGGoptiHA and pCAGGHA developed high levels of HI and NT antibodies at 3 weeks p.v. and were completely protected from lethal H5 virus challenge, while only partial protection was induced by inoculation with the other two plasmids. A second experiment was conducted to evaluate whether a lower dose of the pCAGGoptiHA vaccine could be effective; the results indicated that two doses of 10 µg of pCAGGoptiHA could induce complete protection in chickens against lethal H5 virus challenge. Based on our results, we conclude that construction optimization could dramatically increase the efficacy of the H5 HA gene DNA vaccine in chickens and therefore greatly decrease the dose necessary for inducing complete protection.
MESH Headings
- Animals
- Antibodies, Viral/blood
- Chickens/immunology
- Chickens/virology
- Genetic Vectors
- Hemagglutination Inhibition Tests
- Hemagglutinin Glycoproteins, Influenza Virus/genetics
- Hemagglutinin Glycoproteins, Influenza Virus/immunology
- Influenza A Virus, H5N1 Subtype/genetics
- Influenza A Virus, H5N1 Subtype/immunology
- Influenza A Virus, H5N1 Subtype/pathogenicity
- Influenza A Virus, H5N1 Subtype/physiology
- Influenza Vaccines/administration & dosage
- Influenza Vaccines/genetics
- Influenza Vaccines/immunology
- Influenza in Birds/prevention & control
- Influenza in Birds/virology
- Neutralization Tests
- Plasmids
- Vaccines, DNA/administration & dosage
- Vaccines, DNA/genetics
- Vaccines, DNA/immunology
- Virus Shedding
Affiliation(s)
- Yongping Jiang
- Animal Influenza Laboratory of the Ministry of Agriculture, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, 427 Maduan Street, Harbin 150001, People's Republic of China
32
Slagter-Jäger JG, Puzis L, Gutgsell NS, Belfort M, Jain C. Functional defects in transfer RNAs lead to the accumulation of ribosomal RNA precursors. RNA (NEW YORK, N.Y.) 2007; 13:597-605. [PMID: 17293391 PMCID: PMC1831865 DOI: 10.1261/rna.319407] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 05/13/2023]
Abstract
Normal expression and function of transfer RNA (tRNA) are of paramount importance for translation. In this study, we show that tRNA defects are also associated with increased levels of immature ribosomal RNA (rRNA). This association was first shown in detail for a mutant strain that underproduces tRNA(Arg2) in which unprocessed 16S and 23S rRNA levels were increased several-fold. Ribosome profiles indicated that unprocessed 23S rRNA in the mutant strain accumulates in ribosomal fractions that sediment with altered mobility. Underproduction of tRNA(Arg2) also resulted in growth defects under standard laboratory growth conditions. Interestingly, the growth and rRNA processing defects were attenuated when cells were grown in minimal medium or at low temperatures, indicating that the requirement for tRNA(Arg2) may be reduced under conditions of slower growth. Other tRNA defects were also studied, including a defect in RNase P, an enzyme involved in tRNA processing; a mutation in tRNA(Trp) that results in its degradation at elevated temperatures; and the titration of the tRNA that recognizes rare AGA codons. In all cases, the levels of unprocessed 16S and 23S rRNA were enhanced. Thus, a range of tRNA defects can indirectly influence translation via effects on the biogenesis of the translation apparatus.
|
33
|
Bonomi G. Long stretches of sequential and identical serine or alanine codons are compatible with an efficient full-length protein expression in Escherichia coli. Protein Expr Purif 2006; 48:160-6. [PMID: 16600623 DOI: 10.1016/j.pep.2006.02.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2005] [Revised: 02/15/2006] [Accepted: 02/15/2006] [Indexed: 11/26/2022]
Abstract
The Schistosoma japonicum glutathione S-transferase (GST) recombinant cDNAs, carrying blocks of sequential and identical triplets, consisting of 15-30-45 GCT (Ala) codons or 15-30 and also up to 75 AGC (Ser) codons, are expressed efficiently in an Escherichia coli system in the form of full-length protein chains, as detected by Coomassie-stained SDS-polyacrylamide gels, and soluble fusion proteins are purified by GSH-affinity chromatography. High expression levels and high yields of purified recombinant proteins are achieved. The efficient protein expression is independent of the molecular context and position of the polySer/polyAla string inserted into the GST carrier (near the part of the gene encoding the N- or the C-terminus). These findings suggest that E. coli is a powerful biological system to express foreign genes carrying long stretches coding for Ser- or Ala-rich domains, which are not uncommon in eukaryotic proteins. Moreover, data reported here show that the negative effect of sequential serine codons on protein expression in bacteria, previously reported in the literature, is not a general phenomenon.
Affiliation(s)
- Giovanna Bonomi
- Institute of Genetics and Biophysics Adriano Buzzati - Traverso, CNR, Naples, Italy.
|
34
|
Meinicke P, Brodag T, Fricke WF, Waack S. P-value based visualization of codon usage data. Algorithms Mol Biol 2006; 1:10. [PMID: 16808834 PMCID: PMC1526732 DOI: 10.1186/1748-7188-1-10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/13/2006] [Accepted: 06/29/2006] [Indexed: 11/10/2022] Open
Abstract
Two important unsolved problems in bacterial genome research are the identification of horizontally transferred genes and the prediction of gene expression levels. Both problems can be addressed by multivariate analysis of codon usage data. In particular, dimensionality reduction methods for the visualization of multivariate data have been shown to be effective tools for codon usage analysis. We here propose a multidimensional scaling approach using a novel similarity measure for codon usage tables. Our probabilistic similarity measure is based on P-values derived from the well-known chi-square test for comparison of two distributions. Experimental results on four microbial genomes indicate that the new method is well suited for the analysis of horizontal gene transfer and translational selection. Compared with the widely used correspondence analysis, our method did not suffer from outlier sensitivity and showed a better clustering of putative alien genes in most cases.
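The idea of scoring two codon usage tables by a chi-square P-value can be illustrated with a minimal sketch. It assumes a plain homogeneity test on a 2 x m table of codon counts; the function names `chi2_sf` and `codon_table_pvalue` are hypothetical, and the similarity measure actually used in the paper may differ in its details.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    y = stat / 2.0
    # Start from Q(1/2, y) = erfc(sqrt(y)) for odd df, Q(1, y) = exp(-y) for even df.
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def codon_table_pvalue(counts_a, counts_b):
    """P-value of a chi-square homogeneity test on two codon count tables (dicts)."""
    codons = sorted(c for c in set(counts_a) | set(counts_b)
                    if counts_a.get(c, 0) + counts_b.get(c, 0) > 0)
    total_a = sum(counts_a.get(c, 0) for c in codons)
    total_b = sum(counts_b.get(c, 0) for c in codons)
    if len(codons) < 2 or total_a == 0 or total_b == 0:
        raise ValueError("need two non-empty tables covering at least two codons")
    grand = total_a + total_b
    stat = 0.0
    for c in codons:
        column = counts_a.get(c, 0) + counts_b.get(c, 0)
        for observed, row in ((counts_a.get(c, 0), total_a),
                              (counts_b.get(c, 0), total_b)):
            expected = row * column / grand
            stat += (observed - expected) ** 2 / expected
    return chi2_sf(stat, len(codons) - 1)
```

A P-value near 1 means the two tables are statistically indistinguishable, and a value near 0 means they differ strongly, so a matrix of pairwise P-values could serve as input to a multidimensional scaling step.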
Affiliation(s)
- Peter Meinicke
- Abteilung Bioinformatik, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany
- Thomas Brodag
- Institut für Numerische und Angewandte Mathematik, Universität Göttingen, Lotzestr. 16, 37083 Göttingen, Germany
- Wolfgang Florian Fricke
- Göttingen Genomics Laboratory, Universität Göttingen, Grisebachstr. 8, 37077 Göttingen, Germany
- Stephan Waack
- Institut für Numerische und Angewandte Mathematik, Universität Göttingen, Lotzestr. 16, 37083 Göttingen, Germany
|
35
|
Wu G, Bashir-Bello N, Freeland SJ. The Synthetic Gene Designer: a flexible web platform to explore sequence manipulation for heterologous expression. Protein Expr Purif 2005; 47:441-5. [PMID: 16376569 DOI: 10.1016/j.pep.2005.10.020] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Received: 09/30/2005] [Revised: 10/13/2005] [Accepted: 10/13/2005] [Indexed: 11/15/2022]
Abstract
"Codon optimization" is a general approach to improving heterologous expression where genes are moved from their native genomes into alternatives that exhibit different patterns of codon usage. However, despite reports of successful manipulations and the existence of stand-alone codon optimization software packages or commercial services that offer to redesign genes, the scientific community lacks any systematic understanding of what exactly it means to optimize codon usage. Thus we present a bona fide web application, the "Synthetic Gene Designer," which contrasts with existing software by providing a centralized, free, and transparent platform for the broader scientific community to develop knowledge about synthetic gene design. Consistent with this goal, our software is associated with a moderated e-forum that promotes discussion of synthetic gene design and offers technical support. In addition, the Synthetic Gene Designer presents enhanced functionality over existing software options: for example, it enables users to work with non-standard genetic codes, with user-defined patterns of codon usage and an expanded range of methods for codon optimization. The Synthetic Gene Designer, together with on-line tutorials and the forum, is available at .
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland at Baltimore County, 21250, USA.
|
36
|
Sahu K, Gupta SK, Sau S, Ghosh TC. Comparative Analysis of the Base Composition and Codon Usages in Fourteen Mycobacteriophage Genomes. J Biomol Struct Dyn 2005; 23:63-71. [PMID: 15918677 DOI: 10.1080/07391102.2005.10507047] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 10/28/2022]
Abstract
To study variation in codon usage and base composition among bacteriophages, fourteen mycobacteriophages were used as a model system, and both parameters were determined and compared across these phages and their plating bacterium, M. smegmatis. As all the organisms are GC-rich, the GC content at third codon positions was higher than at second codon positions, and higher than at the combined first and second codon positions, in every organism, indicating that directional mutational pressure operates strongly at the synonymous third codon positions. The Nc plot indicates that codon usage variation in all these organisms is governed by forces other than compositional constraints. Correspondence analysis suggests that: (i) codon usage varies among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages are phage-specific rather than M. smegmatis-specific; (ii) the synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar to one another than are those of the remaining mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages is mainly determined by mutational pressure; and (iv) the genes of comparatively GC-rich genomes are more biased than those of GC-poor genomes. The predominance of C-ending codons in highly expressed genes suggests that translational selection contributes to codon usage variation in those genes. Cluster analysis based on codon usage data also shows two distinct branches for the fourteen mycobacteriophages, with codon usage variation even among the phages within each branch.
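The position-wise GC comparison described in this abstract can be sketched as follows. This is a minimal illustration, assuming an in-frame coding sequence given as a plain string; the function name `gc_by_codon_position` is hypothetical and is not taken from the paper.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the G+C fraction at each codon position."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence shorter than one codon")
    fractions = []
    for offset in range(3):
        bases = seq[offset:usable:3]  # every third base, starting at this position
        fractions.append(sum(b in "GC" for b in bases) / len(bases))
    return tuple(fractions)
```

For the two-codon sequence `ATGGCC`, the call returns `(0.5, 0.5, 1.0)`: GC3 exceeds GC1 and GC2, which is the pattern the abstract reports for GC-rich genomes.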
Affiliation(s)
- K Sahu
- Bioinformatics Centre, Bose Institute, P1/12 - CIT Scheme VII M, Calcutta 700 054, India
|
37
|
Manoj S, Babiuk LA, van Drunen Littel-van den Hurk S. Approaches to enhance the efficacy of DNA vaccines. Crit Rev Clin Lab Sci 2004; 41:1-39. [PMID: 15077722 DOI: 10.1080/10408360490269251] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.5] [Indexed: 01/08/2023]
Abstract
DNA vaccines consist of antigen-encoding bacterial plasmids that are capable of inducing antigen-specific immune responses upon inoculation into a host. This method of immunization is advantageous in terms of simplicity, adaptability, and cost of vaccine production. However, the entry of DNA vaccines and expression of antigen are subjected to physical and biochemical barriers imposed by the host. In small animals such as mice, the host-imposed impediments have not prevented DNA vaccines from inducing long-lasting, protective humoral, and cellular immune responses. In contrast, these barriers appear to be more difficult to overcome in large animals and humans. The focus of this article is to summarize the limitations of DNA vaccines and to provide a comprehensive review on the different strategies developed to enhance the efficacy of DNA vaccines. Several of these strategies, such as altering codon bias of the encoded gene, changing the cellular localization of the expressed antigen, and optimizing delivery and formulation of the plasmid, have led to improvements in DNA vaccine efficacy in large animals. However, solutions for increasing the amount of plasmid that eventually enters the nucleus and is available for transcription of the transgene still need to be found. The overall conclusions from these studies suggest that, provided these critical improvements are made, DNA vaccines may find important clinical and practical applications in the field of vaccination.
Affiliation(s)
- Sharmila Manoj
- Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Canada
|
38
|
Klasen M, Wabl M. Silent point mutation in DsRed resulting in enhanced relative fluorescence intensity. Biotechniques 2004; 36:236-8. [PMID: 14989087 DOI: 10.2144/04362bm06] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022] Open
Affiliation(s)
- Maik Klasen
- Department of Microbiology and Immunology, University of California, San Francisco, CA 94143-0670, USA.
|
39
|
Buckley CO, Stephens D, Herring PA, Jackson JH. %(G+C) variation and prediction by a model of bacterial gene transfer and codon adaptation. OMICS 2003; 6:259-72. [PMID: 12427277 DOI: 10.1089/15362310260256918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/13/2022]
Abstract
The %(G + C) of bacterial genomes ranges from 25% in Mycoplasma to 75% in Micrococcus. Our model for horizontal gene flow enabled a theoretical study of the adaptation of relative codon frequency to match the pattern of the tRNA set of a new host. This study explored the dynamic relationship of %(G + C) to vectors of relative codon frequency (F(gamma)), relative amino acid coding frequency (F(alpha)), and absolute codon frequency (F(|gamma|)) in chromosomes of nine, fully sequenced bacterial genomes that varied widely in %(G + C). At constant F(alpha), the theoretical maximum average range possible was %(G + C) = 37.4 +/- 0.9%. In simulations of F(gamma) adaptation to a new host following hypothetical gene transfer, we modeled %(G + C) as a function of F(gamma) and F(alpha). The simulation revealed that %(G + C) is dependent on F(gamma) and F(alpha) in an explicit relationship described in this paper. We conclude that (1) F(gamma) and F(alpha) determine %(G + C), and (2) the degree of adaptation of %(G + C) in a transferred gene depends upon the degree of F(gamma) equilibration and the similarity of F(alpha) of the transferred gene to that of the new host.
Affiliation(s)
- Cedric O Buckley
- Theoretical & Computational Biology Group, Michigan State University, East Lansing, Michigan 48824, USA.
|
40
|
Perrière G, Thioulouse J. Use and misuse of correspondence analysis in codon usage studies. Nucleic Acids Res 2002; 30:4548-55. [PMID: 12384602 PMCID: PMC137129 DOI: 10.1093/nar/gkf565] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.2] [Indexed: 11/13/2022] Open
Abstract
Correspondence analysis has frequently been used for codon usage studies but this method is often misused. Because amino acid composition exerts constraints on codon usage, it is common to use tables containing relative codon frequencies (or ratios of frequencies) instead of simple codon counts to get rid of these amino acid biases. The problem is that some important properties of correspondence analysis, such as rows weighting, are lost in the process. Moreover, the use of relative measures sometimes introduces other biases and often diminishes the quantity of information to analyse, occasionally resulting in interpretation errors. For instance, in the case of an organism such as Borrelia burgdorferi, the use of relative measures led to the conclusion that there was no translational selection, while analyses based on codon counts show that there is a possibility of a selective effect at that level. In this paper, we expose these problems and we propose alternative strategies to correspondence analysis for studying codon usage biases when amino acid composition effects must be removed.
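The distinction between raw codon counts and relative measures can be made concrete with a short sketch. It assumes the standard genetic code and uses relative synonymous codon usage (RSCU) as one common example of a relative measure; the abstract does not name which relative measures it examines, and the helper names `codon_counts` and `rscu` are hypothetical.

```python
from collections import Counter

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codon_counts(cds):
    """Raw codon counts of an in-frame coding sequence."""
    seq = cds.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return Counter(c for c in codons if c in CODON_TO_AA)


def rscu(counts):
    """Observed count divided by the mean count of the codon's synonymous family."""
    families = {}
    for codon, aa in CODON_TO_AA.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts.get(c, 0) for c in family)
        if total == 0:
            continue  # family absent from the gene: RSCU is undefined
        for c in family:
            values[c] = counts.get(c, 0) * len(family) / total
    return values
```

Note what the relative measure discards: a gene with four leucine codons and a gene with four hundred get the same RSCU values if the proportions match, so the information about how much data supports each row is lost. That lost row weighting is the property the abstract highlights.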
Affiliation(s)
- Guy Perrière
- Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard - Lyon 1, 43 Boulevard du 11 Novembre 1918, 69622 Villeurbanne Cedex, France.
|
41
|
Baev D, Li X, Edgerton M. Genetically engineered human salivary histatin genes are functional in Candida albicans: development of a new system for studying histatin candidacidal activity. Microbiology (Reading) 2001; 147:3323-34. [PMID: 11739764 DOI: 10.1099/00221287-147-12-3323] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Histatins are a structurally related family of salivary proteins known as histidine-rich proteins that are produced and secreted by the human major salivary glands. In vitro, histatins are potent cytotoxic proteins with selectivity for pathogenic yeasts including Candida albicans. Studies that investigate the mechanism of action of histatin proteins upon this important human pathogen have used a candidacidal assay in which the histatin is applied extracellularly. In order to develop a model system to study the mechanism of histatin action independently from binding and translocation events, the authors constructed C. albicans strains that contain chromosomally encoded human salivary histatin genes under the control of a regulated promoter. Intracellular expression of either histatin 5 or histatin 3 induced cell killing and ATP release in parallel. Since histatin killing can be initiated solely from intracellular sites, extracellular binding and internalization are preceding transport events. Thus the mechanism of histatin-induced ATP release does not require extracellular binding, and intracellular targets alone can activate ATP release. By employing a codon-optimization strategy it was shown that expression of heterologous sequences in C. albicans can be a useful tool for functional studies.
Affiliation(s)
- D Baev
- Department of Oral Biology, School of Dental Medicine, State University of New York at Buffalo Main Street Campus, 3435 Main Street, Buffalo, NY 14214, USA
|
42
|
Abstract
A duplication of the polypurine tract (PPT) at the center of the human immunodeficiency virus type 1 (HIV-1) genome (the cPPT) has been shown to prime a separate plus-strand initiation and to result in a plus-strand displacement (DNA flap) that plays a role in nuclear import of the viral preintegration complex. Feline immunodeficiency virus (FIV) is a lentivirus that infects nondividing cells, causes progressive CD4(+) T-cell depletion, and has been used as a substrate for lentiviral vectors. However, the PPT sequence is not duplicated elsewhere in the FIV genome and a central plus-strand initiation or strand displacement has not been identified. Using Southern blotting of S1 nuclease-digested FIV preintegration complexes isolated from infected cells, we detected a single-strand discontinuity at the approximate center of the reverse-transcribed genome. Primer extension analyses assigned the gap to the plus strand, and mapped the 5' terminus of the downstream (D+) segment to a guanine residue in a purine-rich tract in pol (AAAAGAAGAGGTAGGA). RACE experiments then mapped the 3' terminus of the upstream plus (U+)-strand segment to a T nucleotide located 88 nucleotides downstream of the D+ strand 5' terminus, thereby identifying the extent of D+ strand displacement and the central termination sequence of this virus. Unlike HIV, the FIV cPPT is significantly divergent in sequence from its 3' counterpart (AAAAAAGAAAAAAGGGTGG) and contains one and in some cases two pyrimidines. An invariant thymidine located -2 to the D+ strand origin is neither required nor optimal for codon usage at this position. Although the mapped cPPTs of FIV and HIV-1 act in cis, they encode homologous amino acids in integrase.
Affiliation(s)
- T Whitwam
- Molecular Medicine Program and Division of Infectious Diseases, Mayo Clinic, Rochester, Minnesota 55905, USA
|
43
|
Knight RD, Freeland SJ, Landweber LF. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol 2001; 2:RESEARCH0010. [PMID: 11305938 PMCID: PMC31479 DOI: 10.1186/gb-2001-2-4-research0010] [Citation(s) in RCA: 201] [Impact Index Per Article: 8.4] [Received: 11/21/2000] [Revised: 02/01/2001] [Accepted: 02/13/2001] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition. RESULTS Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure. CONCLUSIONS Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.
Affiliation(s)
- Robin D Knight
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Stephen J Freeland
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Laura F Landweber
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
|
44
|
Abstract
A coding sequence is defined as a DNA sequence coding the primary structure of a protein (a polypeptide). Such a sequence must satisfy a specific constraint: it must code for a functional protein. Because the genetic code is degenerate, there exists, for a given polypeptide, a set of synonymous sequences that would code for the same polypeptide. Translation conditional models are defined on such sets. The aim of this paper is to give a common formalism. Besides the codon bias model, a few other conditional models will be defined. Statistical estimators and comparison methods will be briefly presented. These models can be used for gene classification or to identify notable features in a real sequence. An example will be presented on Escherichia coli genes.
|
45
|
Mathé C, Peresetsky A, Déhais P, Van Montagu M, Rouzé P. Classification of Arabidopsis thaliana gene sequences: clustering of coding sequences into two groups according to codon usage improves gene prediction. J Mol Biol 1999; 285:1977-91. [PMID: 9925779 DOI: 10.1006/jmbi.1998.2451] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
While genomic sequences are accumulating, finding the location of the genes remains a major issue that can be solved for only about half of them by homology searches. Prediction methods are thus required, but unfortunately are not fully satisfying. Most prediction methods implicitly assume a unique model for genes. This is an oversimplification, as demonstrated by the possibility of grouping coding sequences into several classes in Escherichia coli and other genomes. As no classification existed for Arabidopsis thaliana, we classified genes according to the statistical features of their coding sequences. A clustering algorithm using a codon usage model was developed and applied to coding sequences from A. thaliana, E. coli, and a mixture of both. By using it, Arabidopsis sequences were clustered into two classes. The CU1 and CU2 classes differed essentially by the choice of pyrimidine bases at the codon silent sites: CU2 genes often use C whereas CU1 genes prefer T. This classification discriminated the Arabidopsis genes according to their expressiveness, highly expressed genes being clustered in CU2 and genes expected to have a lower expression, such as the regulatory genes, in CU1. The algorithm separated the sequences of the Escherichia-Arabidopsis mixed data set into five classes according to the species, except for one class. This mixed class contained 89% Arabidopsis genes from CU1 and 11% E. coli genes, mostly horizontally transferred. Interestingly, most genes encoding organelle-targeted proteins, except the photosynthetic and photoassimilatory ones, were clustered in CU1. By tailoring the GeneMark CDS prediction algorithm to the observed coding sequence classes, its quality of prediction was greatly improved. Similar improvement can be expected with other prediction systems.
Affiliation(s)
- C Mathé
- Laboratorium voor Genetica Department of Genetics, Flanders Interuniversity Institute for Biotechnology (VIB), Universiteit Gent, Gent, B-9000, Belgium
|
46
|
Zhang S, Stancek M, Isaksson LA. The efficiency of a cis-cleaving ribozyme in an mRNA coding region is influenced by the translating ribosome in vivo. Nucleic Acids Res 1997; 25:4301-6. [PMID: 9336461 PMCID: PMC147047 DOI: 10.1093/nar/25.21.4301] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 02/05/2023] Open
Abstract
A cis-cleaving hammerhead ribozyme (Rz) expression system (3A'-Rz) in Escherichia coli has been constructed that can be used to study the involvement of factors that affect ribozyme cleavage in vivo. The ribozyme sequence is placed in the coding region of 3A' mRNA, which is expressed from a semi-synthetic translation assay gene. The size and the 5'-end sequences of the 3' cleavage fragments were determined and the efficiencies of different Rz variants were measured by quantitative primer extension. It is shown that one of the semi-active constructs (3A'-RzIII) can be used as an indicator for ribosomes that read through or terminate at a stop codon upstream of the Rz hammerhead sequence in the mRNA. Readthrough of the stop codon in an uncleaved mRNA gives a full length 3A' protein. Termination at the stop codon upstream of the ribozyme sequence gives a shortened termination product. However, the mRNA fragment that should arise as a result of the auto-cleavage does not give rise to any detectable corresponding truncated protein. Besides studies on translating ribosomes, the 3A'-Rz system can be used to isolate mutant strains that are changed in ribozyme activity either from internal base alterations, or changed interacting host factors.
Affiliation(s)
- S Zhang
- Department of Microbiology, Stockholm University, S-106 91 Stockholm, Sweden
|
47
|
Kim CH, Oh Y, Lee TH. Codon optimization for high-level expression of human erythropoietin (EPO) in mammalian cells. Gene 1997; 199:293-301. [PMID: 9358069 DOI: 10.1016/s0378-1119(97)00384-3] [Citation(s) in RCA: 109] [Impact Index Per Article: 3.9] [Indexed: 02/05/2023]
Abstract
Codon bias has been observed in many species. The usage of selective codons in a given gene is positively correlated with its expression efficiency. As an experimental approach to study codon-usage effects on heterologous gene expression in mammalian cells, we designed two human erythropoietin (EPO) genes, one in which native codons were systematically substituted with codons frequently found in highly expressed human genes and the other with codons prevalent in yeast genes. Relative performances of the re-engineered EPO genes were evaluated with various combinations of promoters and signal leader sequences. Under the comparable set of combinations, mature EPO gene with human high-frequency codons gave a considerably higher level of expression than that with yeast high-frequency codons. However, the levels of EPO expression varied, depending on the alternate combinations. Since the promoters and the signal leader sequences that we used are known to be equally efficient in gene expression, we hypothesized that the varied expression levels were due to the linear sequence between the promoter and the coding gene sequence. To test this possibility, we designed the EPO gene with hybrid codon usage in which the 5'-proximal region of the EPO gene was synthesized with yeast-biased codons and the rest with human-biased codons. This codon-usage hybrid EPO gene substantially enhanced the level of EPO transcripts and proteins up to 2.9-fold and 13.8-fold, respectively, when compared to the level reached by the original counterpart. Our results suggest that the linear sequence between the promoter and the 5'-proximal region of a gene plays an important role in achieving high-level expression in mammalian cells.
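The hybrid design described here, with one codon preference applied to the 5'-proximal region and another to the rest of the gene, can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the two preference tables are toy placeholders rather than measured codon usage data or the tables used in the study.

```python
def back_translate(protein, preferred):
    """Encode a protein string using one preferred codon per amino acid."""
    return "".join(preferred[aa] for aa in protein)


def hybrid_design(protein, five_prime_table, body_table, n_five_prime):
    """Use one codon table for the first n residues and another for the rest."""
    head = back_translate(protein[:n_five_prime], five_prime_table)
    tail = back_translate(protein[n_five_prime:], body_table)
    return head + tail


# Toy placeholder tables covering three amino acids only (assumed values).
TABLE_A = {"M": "ATG", "L": "TTG", "A": "GCT"}
TABLE_B = {"M": "ATG", "L": "CTG", "A": "GCC"}

design = hybrid_design("MLAL", TABLE_A, TABLE_B, 2)
```

Here `design` is `ATGTTGGCCCTG`: the first two residues follow `TABLE_A` and the last two follow `TABLE_B`, while the encoded protein is unchanged.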
Affiliation(s)
- C H Kim
- Biotech Research Institute, LG Chem, Taejeon, South Korea
|
48
|
Gutiérrez G, Márquez L, Marín A. Preference for guanosine at first codon position in highly expressed Escherichia coli genes. A relationship with translational efficiency. Nucleic Acids Res 1996; 24:2525-7. [PMID: 8692691 PMCID: PMC145967 DOI: 10.1093/nar/24.13.2525] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.4] [Indexed: 02/01/2023] Open
Abstract
The variation in base composition at the three codon sites in relation to gene expressivity, the latter estimated by the Codon Adaptation Index, has been studied in a sample of 1371 Escherichia coli genes. Correlation and regression analyses show that increasing expression levels are accompanied by higher frequencies of base G at first, of base A at second and of base C at third codon positions. However, correlation between expressivity and base compositional biases at each codon site was only significant and positive at first codon position. The preference for G-starting codons as gene expression level increases is discussed in terms of translational optimization.
Affiliation(s)
- G Gutiérrez
- Departamento de Genética, Universidad de Sevilla, Spain
|
49
|
Abstract
Codon usage and base composition in sequences from the A + T-rich genome of Rickettsia prowazekii, a member of the alpha Proteobacteria, have been investigated. Synonymous codon usage patterns are roughly similar among genes, even though the data set includes genes expected to be expressed at very different levels, indicating that translational selection has been ineffective in this species. However, multivariate statistical analysis differentiates genes according to their G + C contents at the first two codon positions. To study this variation, we have compared the amino acid composition patterns of 21 R. prowazekii proteins with that of a homologous set of proteins from Escherichia coli. The analysis shows that individual genes have been affected by biased mutation rates to very different extents: genes encoding proteins highly conserved among other species being the least affected. Overall, protein coding and intergenic spacer regions have G + C content values of 32.5% and 21.4%, respectively. Extrapolation from these values suggests that R. prowazekii has around 800 genes and that 60-70% of the genome may be coding.
Affiliation(s)
- S G Andersson
- Department of Molecular Biology, Uppsala University, Sweden
|
50
|
Andersson SGE, Sharp PM. Codon usage in the Mycobacterium tuberculosis complex. Microbiology (Reading) 1996; 142 (Pt 4):915-925. [PMID: 8936318 DOI: 10.1099/00221287-142-4-915] [Citation(s) in RCA: 80] [Impact Index Per Article: 2.8] [Indexed: 02/03/2023]
Abstract
The usage of alternative synonymous codons in Mycobacterium tuberculosis (and M. bovis) genes has been investigated. This species is a member of the high-G+C Gram-positive bacteria, with a genomic G+C content around 65 mol%. This G+C-richness is reflected in a strong bias towards C- and G-ending codons for every amino acid: overall, the G+C content at the third positions of codons is 83%. However, there is significant variation in codon usage patterns among genes, which appears to be associated with gene expression level. From the variation among genes, putative optimal codons were identified for 15 amino acids. The degree of bias towards optimal codons in an M. tuberculosis gene is correlated with that in homologues from Escherichia coli and Bacillus subtilis. The set of selectively favoured codons seems to be quite highly conserved between M. tuberculosis and another high-G+C Gram-positive bacterium, Corynebacterium glutamicum, even though the genome and overall codon usage of the latter are much less G+C-rich.
Affiliation(s)
- Siv G E Andersson
- Department of Molecular Biology, Biomedical Center, Uppsala University, Uppsala, S-75124, Sweden
- Paul M Sharp
- Department of Genetics, University of Nottingham, Queen's Medical Centre, Nottingham NG7 2UH, UK
|