1
Aliperti Car L, Sánchez IE. Genomic AT Bias Coupled with Amino Acid Metabolism Modulates Codon Usage. J Mol Evol 2025:10.1007/s00239-025-10251-x. [PMID: 40392286 DOI: 10.1007/s00239-025-10251-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Accepted: 05/02/2025] [Indexed: 05/22/2025]
Abstract
Encoding of protein-coding sequences in a genome through evolution leads to characteristic proportions of codons and amino acids. Here, we present a simplified maximum entropy model that groups together codons with the same GC (guanine + cytosine) content and coding for the same amino acid and accounts for the stoichiometry of genetic elements in over 50,000 genomes with seven interpretable parameters. Our model includes both the cost of a codon given a genomic GC content and the metabolic cost of the corresponding amino acid. Both costs are essential for accurate prediction of codon and amino acid abundances. The best implementation of the model includes a universal equilibrium value for the genomic GC content below 50%, as suggested by the literature. It also splits the twenty amino acids into two groups forming strong (bases C and G) or weak (bases A and U) Watson-Crick base pairs with the anticodon, differing in the strength of GC-dependent selection. The entropy-cost trade-off suggests that each organism has sorted out the genome encoding problem given a value for its genomic GC content. The empirical boundaries to this trade-off suggest minimal values for the amino acid and codon entropies, which may limit the GC content of natural genomes.
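The codon and amino acid entropies mentioned in this abstract can be illustrated with a plain Shannon entropy over observed counts. The sketch below is only that: it is not the authors' maximum entropy model, and the function names `shannon_entropy` and `codon_counts` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a mapping from item to count."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total)
        for n in counts.values()
        if n > 0
    )


def codon_counts(cds):
    """Count complete, in-frame codons of a coding sequence (assumed DNA)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Toy sequence, not real data: ATG appears twice, GCT once.
toy_entropy = shannon_entropy(codon_counts("ATGATGGCT"))
```

A uniform distribution over all codons gives the maximum entropy, and stronger codon bias gives lower values, which is the quantity the abstract describes as bounded from below.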
Affiliation(s)
- Lucio Aliperti Car
  - Instituto de Química Biológica de La Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Ignacio E Sánchez
  - Instituto de Química Biológica de La Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
2
Yang X, Wang Y, Gong W, Li Y. Comparative Analysis of the Codon Usage Pattern in the Chloroplast Genomes of Gnetales Species. Int J Mol Sci 2024; 25:10622. [PMID: 39408952 PMCID: PMC11477115 DOI: 10.3390/ijms251910622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 09/27/2024] [Accepted: 09/29/2024] [Indexed: 10/20/2024] Open
Abstract
Codon usage bias refers to the preferential use of synonymous codons, a widespread phenomenon found in bacteria, plants, and animals. Codon bias varies among species, families, and groups within kingdoms and between genes within an organism. Codon usage bias (CUB) analysis sheds light on the evolutionary dynamics of various species and optimizes targeted gene expression in heterologous host plants. As a significant order of gymnosperms, species within Gnetales possess extremely high ecological and pharmaceutical values. However, comprehensive analyses of CUB within the chloroplast genomes of Gnetales species remain unexplored. A systematic analysis was conducted to elucidate the codon usage patterns in 13 diverse Gnetales species based on the chloroplast genomes. Our results revealed that chloroplast coding sequences (cp CDSs) in 13 Gnetales species display a marked preference for AT bases and A/T-ending codons. A total of 20 predominantly high-frequency codons and between 2 and 7 optimal codons were identified across these species. The findings from the ENC-plot, PR2-plot, and neutrality analyses suggested that both mutation pressure and natural selection exert influence on the codon bias in these 13 Gnetales species, with natural selection emerging as the predominant influence. Correspondence analysis (COA) demonstrated variation in the codon usage patterns among the Gnetales species and indicated mutation pressure is another factor that could impact CUB. Additionally, our research identified a positive correlation between the measure of idiosyncratic codon usage level of conservatism (MILC) and synonymous codon usage order (SCUO) values, indicative of CUB's potential influence on gene expression. The comparative analysis concerning codon usage frequencies among the 13 Gnetales species and 4 model organisms revealed that Saccharomyces cerevisiae and Nicotiana tabacum were the optimal exogenous expression hosts. 
Furthermore, the cluster and phylogenetic analyses illustrated distinct patterns of differentiation, implying that codons, even with weak or neutral preferences, could affect the evolutionary trajectories of these species. Our results reveal the characteristics of codon usage patterns and contribute to an enhanced comprehension of evolutionary mechanisms in Gnetales species.
Affiliation(s)
- Xiaoming Yang
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
- Yuan Wang
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China
  - Inner Mongolia Academy of Forestry Science, Hohhot 010021, China
- Wenxuan Gong
  - Inner Mongolia Academy of Forestry Science, Hohhot 010021, China
- Yinxiang Li
  - Inner Mongolia Academy of Forestry Science, Hohhot 010021, China
3
He X, Chen J, Li Z. Complete organelle genomes of the threatened aquatic species Scheuchzeria palustris (Scheuchzeriaceae): Insights into adaptation and phylogenomic placement. Ecol Evol 2024; 14:e70248. [PMID: 39219575 PMCID: PMC11364858 DOI: 10.1002/ece3.70248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2024] [Revised: 08/13/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024] Open
Abstract
Scheuchzeria palustris, the only species in the Scheuchzeriaceae family, plays a crucial role in methane production and transportation, influencing the global carbon cycle and maintaining ecosystem stability. However, it is now threatened by human activities and global warming. In this study, we generated new organelle genomes for S. palustris, with the plastome (pt) measuring 158,573 bp and the mitogenome (mt) measuring 420,724 bp. We predicted 296 RNA editing sites in mt protein-coding genes (PCGs) and 142 in pt-PCGs. Notably, abundant RNA editing sites in pt-PCGs likely originated from horizontal gene transfer between the plastome and mitogenome. Additionally, we identified positive selection signals in four mt-PCGs (atp4, ccmB, nad3, and sdh4) and one pt-PCG (rps7), which may contribute to the adaptation of S. palustris to low-temperature and high-altitude environments. Furthermore, we identified 35 mitochondrial plastid DNA (MTPT) segments totaling 58,479 bp, attributed to dispersed repeats near most MTPT. Phylogenetic trees reconstructed from mt- and pt-PCGs showed topologies consistent with the APG IV system. However, the conflicting position of S. palustris can be explained by significant differences in the substitution rates of its mt- and pt-PCGs (p < .001). In conclusion, our study provides vital genomic resources to support future conservation efforts and explores the adaptation mechanisms of S. palustris.
Affiliation(s)
- Xiang‐Yan He
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co‐Funded by Anhui Province and Ministry of Education of the People's Republic of China, School of Ecology and Environment, Anhui Normal University, Wuhu, China
  - Aquatic Plant Research Center, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jin‐Ming Chen
  - Aquatic Plant Research Center, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Zhi‐Zhong Li
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co‐Funded by Anhui Province and Ministry of Education of the People's Republic of China, School of Ecology and Environment, Anhui Normal University, Wuhu, China
4
Lamolle G, Iriarte A, Simón D, Musto H. Amino acid usage and protein expression levels in the flatworm Schistosoma mansoni. Mol Biochem Parasitol 2023; 255:111581. [PMID: 37478919 DOI: 10.1016/j.molbiopara.2023.111581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 07/10/2023] [Accepted: 07/17/2023] [Indexed: 07/23/2023]
Abstract
Schistosoma mansoni is a parasitic flatworm that causes a human disease called schistosomiasis, or bilharzia. At the genomic level, S. mansoni is AT-rich, but has some compositional heterogeneity. Indeed, some regions of its genome are GC-rich, mainly in the regions located near the extreme ends of the chromosomes. Recently, we showed that, despite the strong bias towards A/T ending codons, highly expressed genes tend to use GC-rich codons. Here, we address the following question: are highly expressed sequences biased in their amino acid frequencies? Our analyses show that these sequences in S. mansoni, as in species ranging from bacteria to human, are strongly biased in nucleotide composition. Highly expressed genes tend to use GC-rich codons (in the first and second codon positions), which code the energetically cheapest amino acids. Therefore, we conclude that amino acid usage, at least in highly expressed genes, is strongly shaped by natural selection to avoid energetically expensive residues. Whether this is an adaptation to the parasitic way of life of S. mansoni, is unclear since the same pattern occurs in free-living species.
Affiliation(s)
- Guillermo Lamolle
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
- Andrés Iriarte
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
  - Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Avenida A. Navarro 3051, 11600 Montevideo, Uruguay
- Diego Simón
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Universidad de la República, Mataojo 2055, 11400 Montevideo, Uruguay
  - Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Mataojo 2020, 11400 Montevideo, Uruguay
- Héctor Musto
  - Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
5
Fu Y, Liang F, Li C, Warren A, Shin MK, Li L. Codon Usage Bias Analysis in Macronuclear Genomes of Ciliated Protozoa. Microorganisms 2023; 11:1833. [PMID: 37513005 PMCID: PMC10384029 DOI: 10.3390/microorganisms11071833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2023] [Revised: 07/12/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Ciliated protozoa (ciliates) are unicellular eukaryotes, several of which are important model organisms for molecular biology research. Analyses of codon usage bias (CUB) of the macronuclear (MAC) genome of ciliates can promote a better understanding of the genetic mode and evolutionary history of these organisms and help optimize codons to improve gene editing efficiency in model ciliates. In this study, the following indices were calculated: the guanine-cytosine (GC) content, the frequency of the nucleotides at the third position of codons (T3, C3, A3, G3), the effective number of codons (ENc), GC content at the 3rd position of synonymous codons (GC3s), and the relative synonymous codon usage (RSCU). Parity rule 2 plot analysis, neutrality plot analysis, ENc plot analysis, and correlation analysis were employed to explore the main influencing factors of CUB. The results showed that the GC content in the MAC genomes of each of 21 ciliate species, the genomes of which were relatively complete, was lower than 50%, and the base compositions of GC and GC3s were markedly distinct. Synonymous codon analysis revealed that the codons in most of the 21 ciliates ended with A or T and four codons were the general putative optimal codons. Collectively, our results indicated that most of the ciliates investigated preferred using codons with an AT ending and that codon usage bias was affected by gene mutation and natural selection.
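Three of the indices named in this abstract (GC content, GC3s, and RSCU) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the pipeline used in the study: it assumes the standard genetic code (ciliates often use variant codes in which some stop codons are reassigned, so a real analysis would need the appropriate table), and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), bases in TCAG order.
# Assumption: many ciliates use a variant code; swap the table if needed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_content(seq):
    """Fraction of G and C bases in the whole sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3s(cds):
    """GC fraction at third positions of codons that have synonyms.

    Met (ATG), Trp (TGG) and stop codons are excluded.
    """
    thirds = [
        codon[2]
        for codon in split_codons(cds)
        if CODON_TABLE.get(codon) not in (None, "M", "W", "*")
    ]
    return sum(base in "GC" for base in thirds) / len(thirds)


def rscu(cds):
    """Relative synonymous codon usage: observed count divided by the
    count expected if all synonymous codons were used equally."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence, not real data: Met, Ala, Ala, Ala, Lys, stop.
toy = "ATGGCTGCTGCCAAATAA"
toy_rscu = rscu(toy)
```

An RSCU value above 1 means a codon is used more often than expected under equal synonymous usage, which is how A/T-ending preferences like those reported above are usually read off.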
Affiliation(s)
- Yu Fu
  - Laboratory of Marine Protozoan Biodiversity and Evolution, Marine College, Shandong University, Weihai 264209, China
- Fasheng Liang
  - Laboratory of Marine Protozoan Biodiversity and Evolution, Marine College, Shandong University, Weihai 264209, China
- Congjun Li
  - Laboratory of Marine Protozoan Biodiversity and Evolution, Marine College, Shandong University, Weihai 264209, China
- Alan Warren
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
- Mann Kyoon Shin
  - Department of Biology, University of Ulsan, Ulsan 44610, Republic of Korea
- Lifang Li
  - Laboratory of Marine Protozoan Biodiversity and Evolution, Marine College, Shandong University, Weihai 264209, China
6
Zaidi SEZ, Zaheer R, Thomas K, Abeysekara S, Haight T, Saville L, Stuart-Edwards M, Zovoilis A, McAllister TA. Genomic Characterization of Carbapenem-Resistant Bacteria from Beef Cattle Feedlots. Antibiotics (Basel) 2023; 12:960. [PMID: 37370279 DOI: 10.3390/antibiotics12060960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2023] [Revised: 05/17/2023] [Accepted: 05/24/2023] [Indexed: 06/29/2023] Open
Abstract
Carbapenems are considered a last resort for the treatment of multi-drug-resistant bacterial infections in humans. In this study, we investigated the occurrence of carbapenem-resistant bacteria in feedlots in Alberta, Canada. The presumptive carbapenem-resistant isolates (n = 116) recovered after ertapenem enrichment were subjected to antimicrobial susceptibility testing against 12 different antibiotics, including four carbapenems. Of these, 72% of the isolates (n = 84) showed resistance to ertapenem, while 27% of the isolates (n = 31) were resistant to at least one other carbapenem, with all except one isolate being resistant to at least two other drug classes. Of these 31 isolates, 90% were carbapenemase positive, while a subset of 36 ertapenem-only resistant isolates were carbapenemase negative. The positive isolates belonged to three genera; Pseudomonas, Acinetobacter, and Stenotrophomonas, with the majority being Pseudomonas aeruginosa (n = 20) as identified by 16S rRNA gene sequencing. Whole genome sequencing identified intrinsic carbapenem resistance genes, including blaOXA-50 and its variants (P. aeruginosa), blaOXA-265 (A. haemolyticus), blaOXA-648 (A. lwoffii), blaOXA-278 (A. junii), and blaL1 and blaL2 (S. maltophilia). The acquired carbapenem resistance gene (blaPST-2) was identified in P. saudiphocaensis and P. stutzeri. In a comparative genomic analysis, clinical P. aeruginosa clustered separately from those recovered from bovine feces. In conclusion, despite the use of selective enrichment methods, finding carbapenem-resistant bacteria within a feedlot environment was a rarity.
Affiliation(s)
- Sani-E-Zehra Zaidi
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
  - Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, AB T1K 3M4, Canada
- Rahat Zaheer
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
- Krysty Thomas
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
- Sujeema Abeysekara
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
- Travis Haight
  - Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, AB T1K 3M4, Canada
- Luke Saville
  - Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, AB T1K 3M4, Canada
- Matthew Stuart-Edwards
  - Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, AB T1K 3M4, Canada
- Athanasios Zovoilis
  - Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, AB T1K 3M4, Canada
- Tim A McAllister
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada
7
Panda A, Tuller T. Determinants of associations between codon and amino acid usage patterns of microbial communities and the environment inferred based on a cross-biome metagenomic analysis. NPJ Biofilms Microbiomes 2023; 9:5. [PMID: 36693851 PMCID: PMC9873608 DOI: 10.1038/s41522-023-00372-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 01/11/2023] [Indexed: 01/25/2023] Open
Abstract
Codon and amino acid usage were associated with almost every aspect of microbial life. However, how the environment may impact the codon and amino acid choice of microbial communities at the habitat level is not clearly understood. Therefore, in this study, we analyzed codon and amino acid usage patterns of a large number of environmental samples collected from diverse ecological niches. Our results suggested that samples derived from similar environmental niches, in general, show overall similar codon and amino acid distribution as compared to samples from other habitats. To substantiate the relative impact of the environment, we considered several factors, such as their similarity in GC content, or in functional or taxonomic abundance. Our analysis demonstrated that none of these factors can fully explain the trends that we observed at the codon or amino acid level, implying a direct environmental influence on them. Further, our analysis demonstrated different levels of selection on codon bias in different microbial communities, with the highest bias in host-associated environments such as the digestive system or oral samples and the lowest level of selection in soil and water samples. Considering a large number of metagenomic samples, here we showed that microorganisms collected from similar environmental backgrounds exhibit similar patterns of codon and amino acid usage irrespective of the location or time from where the samples were collected. Thus, our study suggested a direct impact of the environment on codon and amino acid usage of microorganisms that cannot be explained considering the influence of other factors.
Affiliation(s)
- Arup Panda
  - Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel
- Tamir Tuller
  - Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, 69978, Israel
8
Miller JB, Meurs TE, Hodgman MW, Song B, Miller KN, Ebbert MTW, Kauwe JSK, Ridge PG. The Ramp Atlas: facilitating tissue and cell-specific ramp sequence analyses through an intuitive web interface. NAR Genom Bioinform 2022; 4:lqac039. [PMID: 35664804 PMCID: PMC9155233 DOI: 10.1093/nargab/lqac039] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2021] [Revised: 03/01/2022] [Accepted: 05/24/2022] [Indexed: 11/14/2022] Open
Abstract
Ramp sequences occur when the average translational efficiency of codons near the 5′ end of highly expressed genes is significantly lower than the rest of the gene sequence, which counterintuitively increases translational efficiency by decreasing downstream ribosomal collisions. Here, we show that the relative codon adaptiveness within different tissues changes the existence of a ramp sequence without altering the underlying genetic code. We present the first comprehensive analysis of tissue and cell type-specific ramp sequences and report 3108 genes with ramp sequences that change between tissues and cell types, which corresponds with increased gene expression within those tissues and cells. The Ramp Atlas (https://ramps.byu.edu/) allows researchers to query precomputed ramp sequences in 18 388 genes across 62 tissues and 66 cell types and calculate tissue-specific ramp sequences from user-uploaded FASTA files through an intuitive web interface. We used The Ramp Atlas to identify seven SARS-CoV-2 genes and seven human SARS-CoV-2 entry factor genes with tissue-specific ramp sequences that may help explain viral proliferation within those tissues. We anticipate that The Ramp Atlas will facilitate personalized and creative tissue-specific ramp sequence analyses for both human and viral genes that will increase our ability to utilize this often-overlooked regulatory region.
Affiliation(s)
- Justin B Miller
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40504, USA
- Taylor E Meurs
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Matthew W Hodgman
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40504, USA
- Benjamin Song
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Kyle N Miller
  - Department of Computer Science, Utah Valley University, Orem, UT 84058, USA
- Mark T W Ebbert
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40504, USA
- John S K Kauwe
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Perry G Ridge
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
9
Jin YT, Pu DK, Guo HX, Deng Z, Chen LL, Guo FB. T-G-A Deficiency Pattern in Protein-Coding Genes and Its Potential Reason. Front Microbiol 2022; 13:847325. [PMID: 35602045 PMCID: PMC9116502 DOI: 10.3389/fmicb.2022.847325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Accepted: 03/30/2022] [Indexed: 11/20/2022] Open
Abstract
If a stop codon appears within a gene, its translation will be terminated earlier than expected. Misfolding of the prematurely terminated protein will be adverse to the host; hence, all functional genes would tend to avoid intragenic stop codons. Therefore, we hypothesize that the nucleotides corresponding to stop codons will be less frequent at each codon position of genes. Here, we validate this inference by investigating nucleotide frequencies at a large scale: results from 19,911 prokaryote genomes revealed that nucleotides coinciding with stop codons indeed have the lowest frequency in most genomes. Interestingly, genes with any of the three types of stop codon tend to follow a T-G-A deficiency pattern, suggesting that the pressure to avoid intragenic termination is the same and that the major stop codon TGA plays a dominant role in this effect. Finally, a positive correlation between the extent of TGA deficiency and the length in bases was observed in Escherichia coli (E. coli) genes with experimentally verified start sites. This strengthens the support for our hypothesis. The observed T-G-A deficiency pattern would help to understand the evolution of codon usage tactics in extant organisms.
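The position-specific nucleotide frequencies this abstract relies on can be tallied as follows. This is a minimal sketch under assumptions, not the authors' procedure: the function names are hypothetical, each input gene is assumed to be an in-frame DNA coding sequence, and dropping the terminal stop codon before counting (`drop_stop=True`) is an assumption made here so that only internal codons are measured.

```python
from collections import Counter


def position_frequencies(genes, drop_stop=True):
    """Return three dicts with the frequency of A, C, G, T at codon
    positions 1, 2 and 3, pooled over all genes.

    drop_stop is an illustrative assumption: when True, the final codon
    of each gene (normally the stop codon) is not counted.
    """
    counts = [Counter() for _ in range(3)]
    for gene in genes:
        gene = gene.upper()
        end = len(gene) - len(gene) % 3  # ignore a trailing partial codon
        if drop_stop:
            end -= 3
        for start in range(0, end, 3):
            for pos in range(3):
                base = gene[start + pos]
                if base in "ACGT":  # skip ambiguous bases such as N
                    counts[pos][base] += 1
    freqs = []
    for counter in counts:
        total = sum(counter.values())
        freqs.append({b: counter[b] / total for b in "ACGT"} if total else {})
    return freqs


def least_frequent(freqs):
    """Least frequent base at each codon position."""
    return tuple(min(f, key=f.get) for f in freqs)


# Toy genes, not real data.
toy_genes = ["ATGGCTAAATAA", "ATGCCGGAATGA"]
toy_freqs = position_frequencies(toy_genes)
```

A T-G-A deficiency pattern in this representation would correspond to `least_frequent` returning `("T", "G", "A")` for a genome; the two toy genes are far too short to show any meaningful pattern.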
Affiliation(s)
- Yan-Ting Jin
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
  - Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
- Dong-Kai Pu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hai-Xia Guo
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Zixin Deng
  - Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
- Ling-Ling Chen
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Feng-Biao Guo
  - Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
10
Simón D, Cristina J, Musto H. An overview of dinucleotide and codon usage in all viruses. Arch Virol 2022; 167:1443-1448. [PMID: 35467158 DOI: 10.1007/s00705-022-05454-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2022] [Accepted: 04/05/2022] [Indexed: 11/30/2022]
Abstract
Viruses are, by far, the most abundant biological entities on earth. They are found in all known ecological niches and are the causative agents of many important diseases in plants and animals. From an evolutionary point of view, since viruses do not share any orthologous genes, there is a general consensus that they are polyphyletic; that is, they do not have a common ancestor. This means that they appeared several times during the course of evolution. For their life cycle, they are always obligate parasites of a free cellular life form, which can be bacteria, archaea, or eukaryotes. More complexity is added to these entities by the fact that their genetic material can be DNA or RNA (double- or single-stranded) or retrotranscribed. Given these features, we wondered if some general rules can be inferred when studying two basic genomic signatures-dinucleotides and codon usage-analyzing all available complete and non-redundant viral sequences. In spite of the obviously biased sample of sequences available, some general features appear to emerge.
Affiliation(s)
- Diego Simón
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Juan Cristina
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Héctor Musto
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
11
Simón D, Cristina J, Musto H. Nucleotide Composition and Codon Usage Across Viruses and Their Respective Hosts. Front Microbiol 2021; 12:646300. [PMID: 34262534 PMCID: PMC8274242 DOI: 10.3389/fmicb.2021.646300] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 12/26/2020] [Accepted: 06/04/2021] [Indexed: 11/13/2022] Open
Abstract
The genetic material of the three domains of life (Bacteria, Archaea, and Eukaryota) is always double-stranded DNA, and their GC content (molar content of guanine plus cytosine) varies between ≈ 13% and ≈ 75%. Nucleotide composition is the simplest way of characterizing genomes. Despite this simplicity, it has several implications. Indeed, it is the main factor that determines, among other features, dinucleotide frequencies, repeated short DNA sequences, and codon and amino acid usage. Which forces drive this strong variation is still a matter of controversy. For rather obvious reasons, most of the studies concerning this huge variation and its consequences have been done in free-living organisms. However, no recent comprehensive study of all known viruses has been done (that is, concerning all available sequences). Viruses, by far the most abundant biological entities on Earth, are the causative agents of many diseases. An overview of these entities is important also because their genetic material is not always double-stranded DNA: indeed, certain viruses have single-stranded DNA, double-stranded RNA, or single-stranded RNA as genetic material, and/or are retro-transcribing. Therefore, one may wonder if what we have learned about the evolution of GC content and its implications in prokaryotes and eukaryotes also applies to viruses. In this contribution, we attempt to describe compositional properties of ∼ 10,000 viral species: base composition (globally and according to Baltimore classification), correlations among non-coding regions and the three codon positions, and the relationship of the nucleotide frequencies and codon usage of viruses with the same feature of their hosts. This allowed us to determine that the base composition of phages correlates strongly with that of their respective hosts, while that of eukaryotic viruses does not (with fungi and protists as exceptions).
Finally, we discuss some of these results concerning codon usage: reinforcing previous results, we found that phages and hosts exhibit moderate to high correlations, while for eukaryotes and their viruses the correlations are weak or do not exist.
Affiliation(s)
- Diego Simón
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Juan Cristina
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Héctor Musto
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
12
Castillo AI, Almeida RPP. Evidence of gene nucleotide composition favoring replication and growth in a fastidious plant pathogen. G3-GENES GENOMES GENETICS 2021; 11:6170658. [PMID: 33715000 PMCID: PMC8495750 DOI: 10.1093/g3journal/jkab076] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/19/2021] [Accepted: 03/02/2021] [Indexed: 11/13/2022]
Abstract
Nucleotide composition (GC content) varies across bacteria species, genome regions, and specific genes. In Xylella fastidiosa, a vector-borne fastidious plant pathogen infecting multiple crops, GC content ranges between ∼51-52%; however, these values were gathered using limited genomic data. We evaluated GC content variations across X. fastidiosa subspecies fastidiosa (N = 194), subsp. pauca (N = 107), and subsp. multiplex (N = 39). Genomes were classified based on plant host and geographic origin; individual genes within each genome were classified based on gene function, strand, length, ortholog group, Core vs. Accessory, and Recombinant vs. Non-recombinant. GC content was calculated for each gene within each evaluated genome. The effects of genome and gene level variables were evaluated with a mixed effect ANOVA, and the marginal-GC content was calculated for each gene. Also, the correlation between gene-specific GC content vs. natural selection (dN/dS) and recombination/mutation (r/m) was estimated. Our analyses show that intra-genomic changes in nucleotide composition in X. fastidiosa are small and influenced by multiple variables. Higher AT-richness is observed in genes involved in replication and translation, and genes in the leading strand. In addition, we observed a negative correlation between high-AT and dN/dS in subsp. pauca. The relationship between recombination and GC content varied between core and accessory genes. We hypothesize that distinct evolutionary forces and energetic constraints both drive and limit these small variations in nucleotide composition.
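The per-gene GC content calculation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `gc_content`, the toy sequences, and the choice to ignore ambiguous bases such as N are all assumptions made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T/U).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    Returns NaN when the sequence has no unambiguous bases.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")
    total = gc + at
    return gc / total if total else float("nan")


# Hypothetical toy genes; real input would be coding sequences from a genome.
genes = {
    "geneA": "ATGGCGCGTTAA",
    "geneB": "ATGAAATTTTAA",
}

# One GC value per gene, as in a per-gene composition analysis.
per_gene_gc = {name: gc_content(seq) for name, seq in genes.items()}

for name, value in per_gene_gc.items():
    print(f"{name}: GC = {value:.3f}")
```

Per-gene values of this kind could then be fed into whatever statistical model is appropriate; the mixed-effect ANOVA described in the abstract is not reproduced here.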
Affiliation(s)
- Andreina I Castillo
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA
- Rodrigo P P Almeida
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA
|
13
|
Designing of peptide aptamer targeting the receptor-binding domain of spike protein of SARS-CoV-2: an in silico study. Mol Divers 2021; 26:157-169. [PMID: 33389440 PMCID: PMC7778502 DOI: 10.1007/s11030-020-10171-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/26/2020] [Accepted: 12/09/2020] [Indexed: 12/14/2022]
Abstract
Short synthetic peptide molecules which bind to a specific target protein with a high affinity to exert its function are known as peptide aptamers. The high specificity of aptamers with small-molecule targets (metal ions, dyes and theophylline; ATP) is within 1 pM and 1 μM range, whereas with the proteins (thrombin, CD4 and antibodies) it is in the nanomolar range (which is equivalent to monoclonal antibodies). The recently identified coronavirus (SARS-CoV-2) genome encodes for various proteins, such as envelope, membrane, nucleocapsid, and spike protein. Among these, the protein necessary for the virus to enter inside the host cell is spike protein. The work focuses on designing peptide aptamer targeting the spike receptor-binding domain of SARS-CoV-2. The peptide aptamer has been designed by using bacterial Thioredoxin A as the scaffold protein and an 18-residue-long peptide. The tertiary structure of the peptide aptamer is modeled and docked to spike receptor-binding domain of SARS CoV2. Molecular dynamic simulation has been done to check the stability of the aptamer and receptor-binding domain complex. It was observed that the aptamer binds to spike receptor-binding domain of SARS-CoV-2 in a similar pattern as that of ACE2. The aptamer-receptor-binding domain complex was found to be stable in a 100 ns molecular dynamic simulation. The aptamer is also predicted to be non-antigenic, non-allergenic, non-hemolytic, non-inflammatory, water-soluble with high affinity toward ACE2 than serum albumin. Thus, peptide aptamer can be a novel approach for the therapeutic treatment for SARS-CoV-2.
|
14
|
Hodgman MW, Miller JB, Meurs TE, Kauwe JSK. CUBAP: an interactive web portal for analyzing codon usage biases across populations. Nucleic Acids Res 2020; 48:11030-11039. [PMID: 33045750 PMCID: PMC7641757 DOI: 10.1093/nar/gkaa863] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/05/2020] [Revised: 08/18/2020] [Accepted: 09/22/2020] [Indexed: 12/19/2022] Open
Abstract
Synonymous codon usage significantly impacts translational and transcriptional efficiency, gene expression, the secondary structure of both mRNA and proteins, and has been implicated in various diseases. However, population-specific differences in codon usage biases remain largely unexplored. Here, we present a web server, https://cubap.byu.edu, to facilitate analyses of codon usage biases across populations (CUBAP). Using the 1000 Genomes Project, we calculated and visually depict population-specific differences in codon frequencies, codon aversion, identical codon pairing, co-tRNA codon pairing, ramp sequences, and nucleotide composition in 17,634 genes. We found that codon pairing significantly differs between populations in 35.8% of genes, allowing us to successfully predict the place of origin for African and East Asian individuals with 98.8% and 100% accuracy, respectively. We also used CUBAP to identify a significant bias toward decreased CTG pairing in the immunity related GTPase M (IRGM) gene in East Asian and African populations, which may contribute to the decreased association of rs10065172 with Crohn's disease in those populations. CUBAP facilitates in-depth gene-specific and codon-specific visualization that will aid in analyzing candidate genes identified in genome-wide association studies, identifying functional implications of synonymous variants, predicting population-specific impacts of synonymous variants and categorizing genetic biases unique to certain populations.
Affiliation(s)
- Matthew W Hodgman
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Justin B Miller
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Taylor E Meurs
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
- John S K Kauwe
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
|
15
|
Devi A, Chaitanya NSN. In silico designing of multi-epitope vaccine construct against human coronavirus infections. J Biomol Struct Dyn 2020; 39:6903-6917. [PMID: 32772892 PMCID: PMC7484569 DOI: 10.1080/07391102.2020.1804460] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
Abstract
Single stranded RNA viruses were known to cause variety of diseases since many years and are gaining much importance due to pandemic after the identification of a novel corona virus (severe acute respiratory syndrome-coronavirus (SARS-CoV-2)). Seven coronaviruses (CoVs) are known to infect humans and they are OC43 CoV, NL63 CoV, HKU1 CoV, Middle East respiratory syndrome, SARS CoV, and SARS CoV-2. Virus replication weakens the immune system of host thereby altering T-cell count and much of interferon response. Although no vaccine or therapeutic treatment has been approved till now for CoV infection, trials of vaccine against SARS CoV-2 are in progress. One of the epitopes used for vaccine production is of the spike protein on the surface of virus. The work focuses on designing of multi-epitope vaccine construct for treatment of seven human CoV infections using the epitopes present on the spike protein of human CoVs. To address this, immuno-informatics techniques have been employed to design multi-epitope vaccine construct. B- and T-cell epitopes of the spike proteins have been predicted and designed into a multi-epitope vaccine construct. The tertiary structure of the vaccine construct along with the adjuvant has been modelled and the physiochemical properties have been predicted. The multi-epitope vaccine construct has antigenic and non-allergenic property. After validation, refinement and disulphide engineering of the vaccine construct, molecular docking with toll-like receptors (TLRs) have been performed. Molecular dynamics simulation in aqueous environment predicted that the vaccine-TLRs complexes were stable. The vaccine construct is predicted to be able to trigger primary immune response in silico. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Arpita Devi
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, Assam, India
- Nyshadham S N Chaitanya
- Department of Animal Biology, School of Life Sciences, University of Hyderabad, Hyderabad, India
|
16
|
Mittal A, Changani AM, Taparia S, Goel D, Parihar A, Singh I. Structural disorder originates beyond narrow stoichiometric margins of amino acids in naturally occurring folded proteins. J Biomol Struct Dyn 2020; 39:2364-2375. [PMID: 32238088 DOI: 10.1080/07391102.2020.1751299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
Abstract
Rigorous analyses of Euclidean distances between non-peptide bonded residues in structures of several thousand naturally occurring folded proteins yielded a surprising "margin of life" for percentage occurrence of individual amino acids in naturally occurring folded proteins. On one hand, the concept of "margin of life", referring to lower than expected variances in average stoichiometric occurrences of individual amino acids in folded proteins, remains unchallenged since its discovery a decade ago. On the other hand, within this past decade there has been a strong emergence of a gradual paradigm shift in biology, from sequence-structure-function in proteins to sequence-disorder-function, fuelled by discoveries on functional implications of intrinsically disordered proteins (primary sequences that do not form stable structures). Thus the applicability of "margin of life" to peptide-bonded residues in all known natural proteins, adopting stable structures vis-à-vis intrinsically disordered needs to be explored. Therefore in this work, we analyze compositions of the complete naturally occurring primary sequence space (over 560000 sequences) after dividing it into mutually exclusive subsets of structured and intrinsically disordered proteins along with a subset without any structural information. While finding that occurrence of different peptides (up to pentapeptides) is a direct consequence of the relative occurrences of their constituting residues in folded proteins, we report that structural disorder in natural proteins originates beyond the narrow stoichiometric margins of amino acids found in structured proteins. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Aditya Mittal
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India; Supercomputing Facility for Bioinformatics & Computational Biology, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Sakshi Taparia
- Department of Mathematics (Bachelors program in Mathematics & Computing), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Deepanshu Goel
- Department of Biochemical Engineering and Biotechnology (Bachelors program), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Animesh Parihar
- Department of Biochemical Engineering and Biotechnology (Bachelors program), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
- Ishan Singh
- Department of Computer Science & Engineering (Bachelors program Computer Science), Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India
|
17
|
Sami Ullah Asif GM. Draft Genome Sequence of Bacillus safensis Strain Sami, Isolated from Leaf Veins of Ficus religiosa. Microbiol Resour Announc 2019; 8:e00827-19. [PMID: 31753935 PMCID: PMC6872877 DOI: 10.1128/mra.00827-19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2019] [Accepted: 10/25/2019] [Indexed: 11/20/2022] Open
Abstract
Here, I report the draft genome sequence of a novel Bacillus safensis strain, Sami, isolated from leaf veins of Ficus religiosa. F. religiosa is a large tree native to the Indian subcontinent and Indochina. The draft genome of B. safensis is 3.67 Mb.
|
18
|
Comprehensive profiling of codon usage signatures and codon context variations in the genus Ustilago. World J Microbiol Biotechnol 2019; 35:118. [DOI: 10.1007/s11274-019-2693-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/03/2018] [Accepted: 07/07/2019] [Indexed: 02/02/2023]
|
19
|
Flüchter S, Follonier S, Schiel-Bengelsdorf B, Bengelsdorf FR, Zinn M, Dürre P. Anaerobic Production of Poly(3-hydroxybutyrate) and Its Precursor 3-Hydroxybutyrate from Synthesis Gas by Autotrophic Clostridia. Biomacromolecules 2019; 20:3271-3282. [DOI: 10.1021/acs.biomac.9b00342] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 11/29/2022]
Affiliation(s)
- Sebastian Flüchter
- Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Stéphanie Follonier
- Institute of Life Technologies, University of Applied Sciences and Arts Western Switzerland (HES-SO Valais), Route du Rawyl 64, 1950 Sion, Switzerland
- Bettina Schiel-Bengelsdorf
- Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Frank R. Bengelsdorf
- Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Manfred Zinn
- Institute of Life Technologies, University of Applied Sciences and Arts Western Switzerland (HES-SO Valais), Route du Rawyl 64, 1950 Sion, Switzerland
- Peter Dürre
- Institut für Mikrobiologie und Biotechnologie, Universität Ulm, Albert-Einstein-Allee 11, 89081 Ulm, Germany
|
20
|
Du MZ, Liu S, Zeng Z, Alemayehu LA, Wei W, Guo FB. Amino acid compositions contribute to the proteins' evolution under the influence of their abundances and genomic GC content. Sci Rep 2018; 8:7382. [PMID: 29743515 PMCID: PMC5943316 DOI: 10.1038/s41598-018-25364-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/30/2017] [Accepted: 04/16/2018] [Indexed: 12/23/2022] Open
Abstract
Inconsistent results on the association between evolutionary rates and amino acid composition of proteins have been reported in eukaryotes. However, there are few studies of how amino acid composition can influence evolutionary rates in bacteria. Thus, we constructed linear regression models between composition frequencies of amino acids and evolutionary rates for bacteria. Compositions of all amino acids can on average explain 21.5% of the variation in evolutionary rates among 273 investigated bacterial organisms. In five model organisms, amino acid composition contributes more to variation in evolutionary rates than protein abundance, and frequency of optimal codons. The contribution of individual amino acid composition to evolutionary rate varies among organisms. The closer the GC-content of genome to its maximum or minimum, the better the correlation between the amino acid content and the evolutionary rate of proteins would appear in that genome. The types of amino acids that significantly contribute to evolutionary rates can be grouped into GC-rich and AT-rich amino acids. Besides, the amino acid with high composition also contributes more to evolutionary rates than amino acid with low composition in proteome. In summary, amino acid composition significantly contributes to the rate of evolution in bacterial organisms and this in turn is impacted by GC-content.
Affiliation(s)
- Meng-Ze Du
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Shuo Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Zhi Zeng
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Labena Abraham Alemayehu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Wen Wei
- School of Life Sciences, Chongqing University, Chongqing, China
- Feng-Biao Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China; Centre for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China; Key Laboratory for Neuroinformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
|
21
|
Zhang D, Hu P, Liu T, Wang J, Jiang S, Xu Q, Chen L. GC bias lead to increased small amino acids and random coils of proteins in cold-water fishes. BMC Genomics 2018; 19:315. [PMID: 29720106 PMCID: PMC5930961 DOI: 10.1186/s12864-018-4684-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/06/2017] [Accepted: 04/16/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Temperature adaptation of biological molecules is fundamental in evolutionary studies but remains unsolved. Fishes living in cold water are adapted to low temperatures through adaptive modification of their biological molecules, which enables their functioning in extreme cold. To study nucleotide and amino acid preference in cold-water fishes, we investigated the substitution asymmetry of codons and amino acids in protein-coding DNA sequences between cold-water fishes and tropical fishes. The former include two Antarctic fishes, Dissostichus mawsoni (Antarctic toothfish) and Gymnodraco acuticeps (Antarctic dragonfish), and two temperate fishes, Gadus morhua (Atlantic cod) and Gasterosteus aculeatus (stickleback), and the latter include three tropical fishes, Danio rerio (zebrafish), Oreochromis niloticus (Nile tilapia) and Xiphophorus maculatus (platyfish). RESULTS Cold-water fishes showed a preference for guanines and cytosines (GCs) in both synonymous and nonsynonymous codon substitutions when compared with tropical fishes. Amino acids coded by GC-rich codons are favored in the temperate fishes, while those coded by AT-rich codons are disfavored. Similar trends were discovered in Antarctic fishes but were statistically weaker. The preference for GC-rich codons in nonsynonymous substitutions tends to increase the ratio of small amino acids in proteins, which was demonstrated by biased small amino acid substitutions in the cold-water species when compared with the tropical species, especially in the temperate species. Prediction and comparison of the secondary structure of the proteomes showed that the frequency of random coils is significantly larger in the cold-water fish proteomes than in those of the tropical fishes.
CONCLUSIONS Our results suggested that natural selection in cold temperature might favor biased GC content in the coding DNA sequences, which lead to increased frequency of small amino acids and consequently increased random coils in the proteomes of cold-water fishes.
Affiliation(s)
- Dongsheng Zhang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
- Peng Hu
- Department of Genetics, University of Pennsylvania, Philadelphia, USA
- Taigang Liu
- College of Informatics, Shanghai Ocean University, Shanghai, People's Republic of China
- Jian Wang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
- Shouwen Jiang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
- Qianghua Xu
- College of Marine Sciences, Shanghai Ocean University, Shanghai, People's Republic of China
- Liangbiao Chen
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
|
22
|
Sun J, Meng Z, Wu K, Liu B, Zhang S, Liu Y, Wang Y, Zheng H, Huang J, Zhou P. Tracing the origin of Treponema pallidum in China using next-generation sequencing. Oncotarget 2018; 7:42904-42918. [PMID: 27344187 PMCID: PMC5189996 DOI: 10.18632/oncotarget.10154] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 12/20/2015] [Accepted: 06/01/2016] [Indexed: 12/29/2022] Open
Abstract
Syphilis is a systemic sexually transmitted disease caused by Treponema pallidum ssp. pallidum (TPA). The origin and genetic background of Chinese TPA strains remain unclear. We identified a total of 329 single-nucleotide variants (SNVs) in eight Chinese TPA strains using next-generation sequencing. All of the TPA strains were clustered into three lineages, and Chinese TPA strains were grouped in Lineage 2 based on phylogenetic analysis. The phylogeographical data showed that TPA strains originated earlier than did T. pallidum ssp. pertenue (TPE) and T. pallidum ssp. endemicum (TPN) strains and that Chinese TPA strains might be derived from recombination between Lineage 1 and Lineage 3. Moreover, we found through a homology modeling analysis that a nonsynonymous substitution (I415F) in the PBP3 protein might affect the structural flexibility of PBP3 and the binding constant for substrates based on its possible association with penicillin resistance in T. pallidum. Our findings provide new insight into the molecular foundation of the evolutionary origin of TPA and support the development of novel diagnostic/therapeutic technology for syphilis.
Affiliation(s)
- Jun Sun
- STD Institute, Shanghai Skin Disease Hospital, Shanghai, China
- Zhefeng Meng
- Oncology Bioinformatics Center, Minhang Hospital, Fudan University, Shanghai, China
- Kaiqi Wu
- School of Laboratory Medicine, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Biao Liu
- School of Laboratory Medicine, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Sufang Zhang
- Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China
- Yudan Liu
- Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China
- Yuezhu Wang
- Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China
- Huajun Zheng
- Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China
- Jian Huang
- Shanghai-MOST Key Laboratory for Disease and Health Genomics, Chinese National Human Genome Center and National Engineering Center for Biochip at Shanghai, Shanghai, China; Key Laboratory of Systems Biomedicine (Ministry of Education) and Collaborative Innovation Center of Systems Biomedicine, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Pingyu Zhou
- STD Institute, Shanghai Skin Disease Hospital, Shanghai, China; Shanghai Skin Disease Hospital, Clinical School of Anhui Medical University, Shanghai, China
|
23
|
Gajbhiye S, Patra P, Yadav MK. New insights into the factors affecting synonymous codon usage in human infecting Plasmodium species. Acta Trop 2017; 176:29-33. [PMID: 28751162 DOI: 10.1016/j.actatropica.2017.07.025] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 06/06/2017] [Revised: 07/18/2017] [Accepted: 07/21/2017] [Indexed: 02/07/2023]
Abstract
Codon usage bias is the non-random usage of synonymous codons for coding amino acids. Synonymous sites are under weak selection, and codon usage bias is maintained by the equilibrium among mutational bias, genetic drift and selection pressure. Differential codon usage choices are also relevant to human-infecting Plasmodium species. Recently, P. knowlesi has switched from its natural host, long-tailed macaques, and started infecting humans. This review focuses on the comparative analysis of codon usage choices among the human-infecting P. falciparum and P. vivax along with P. knowlesi, based on their coding sequence data. The variation in GC content, amino acid frequencies, effective number of codons and other factors plays a crucial role in determining synonymous codon choices. Codon choices are more similar between P. vivax and P. knowlesi than between either of them and P. falciparum. This study suggests that synonymous codon choice modulates gene expression level, mRNA stability, ribosome speed, protein folding, and translation efficiency and accuracy in Plasmodium species, and provides valuable information on codon usage patterns to facilitate gene cloning as well as expression and transfection studies for malaria-causing species.
|
24
|
Xu W, Xing T, Zhao M, Yin X, Xia G, Wang M. Synonymous codon usage bias in plant mitochondrial genes is associated with intron number and mirrors species evolution. PLoS One 2015; 10:e0131508. [PMID: 26110418 PMCID: PMC4481540 DOI: 10.1371/journal.pone.0131508] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/31/2015] [Accepted: 06/03/2015] [Indexed: 11/21/2022] Open
Abstract
Synonymous codon usage bias (SCUB), the non-uniform usage of synonymous codons, occurs in nearly all organisms. We previously found that SCUB is correlated with both intron number and exon position in the plant nuclear genome but not in the plastid genome; SCUB in both the nuclear and plastid genomes can mirror evolutionary specialization. However, whether these rules hold in the mitochondrial genome has not been addressed. Here, we present an analysis of SCUB in the mitochondrial genome, based on 24 plant species ranging from algae to land plants. The frequencies of NNA and NNT (A- and T-ending codons) are higher than those of NNG and NNC, with the strongest preference in bryophytes and the weakest in land plants, suggesting an association between SCUB and plant evolution. The preference for NNA and NNT is more evident in genes harboring a greater number of introns in land plants, but the bias toward NNA and NNT is even among exons. The pattern of SCUB in the mitochondrial genome differs in some respects from that present in both the nuclear and plastid genomes.
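The NNA/NNT versus NNG/NNC comparison in this abstract amounts to counting the nucleotide at the third position of each codon. A minimal Python sketch follows; it is not the authors' method, and the function name `third_position_freqs`, the toy sequence, and the decision to count every in-frame codon (start and stop codons included) are assumptions made for illustration.

```python
from collections import Counter


def third_position_freqs(cds: str) -> dict:
    """Return the frequency of A, C, G and T at the third position of codons.

    The sequence is read in frame from its first base; a trailing partial
    codon is ignored, and codons whose third base is ambiguous are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    thirds = [cds[i + 2] for i in range(0, usable, 3)]
    counts = Counter(base for base in thirds if base in "ACGT")
    total = sum(counts.values())
    return {base: (counts[base] / total if total else float("nan")) for base in "ACGT"}


# Hypothetical toy coding sequence: ATG GCA GCT AAA TTT GGC TAA
freqs = third_position_freqs("ATGGCAGCTAAATTTGGCTAA")

# Combined share of A- and T-ending codons (NNA + NNT).
at_ending = freqs["A"] + freqs["T"]
print(f"NNA + NNT = {at_ending:.3f}, NNG + NNC = {freqs['G'] + freqs['C']:.3f}")
```

A real analysis would typically exclude stop codons and amino acids encoded by a single codon (Met, Trp) before comparing these frequencies; that filtering is left out to keep the sketch short.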
Affiliation(s)
- Wenjing Xu
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Tian Xing
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Mingming Zhao
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Xunhao Yin
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Guangmin Xia
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Mengcheng Wang
- The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
|