1. Inglis LK, Grigson SR, Roach MJ, Edwards RA. Prophages as a source of antimicrobial resistance genes in the human microbiome. bioRxiv: the preprint server for biology 2025:2025.03.19.644263. [PMID: 40166311 PMCID: PMC11957107 DOI: 10.1101/2025.03.19.644263]
Abstract
Prophages, viruses that integrate into bacterial genomes, are ubiquitous in the microbial realm. Prophages contribute significantly to horizontal gene transfer, including the potential spread of antimicrobial resistance (AMR) genes, because they can collect host genes. Understanding their role in the human microbiome is essential for fully understanding AMR dynamics and possible clinical implications. We analysed almost 15,000 bacterial genomes for prophages and AMR genes. The bacteria were isolated from diverse human body sites and geographical regions, and their genomes were retrieved from GenBank. AMR genes were detected in 6.6% of bacterial genomes, with a higher prevalence in people with symptomatic diseases. We found a wide variety of AMR genes conferring resistance to multiple drug classes. We discovered AMR genes previously associated with plasmids, such as blaOXA-23 in Acinetobacter baumannii prophages, or found in prophages of species in which they had not previously been described, such as mefA-msrD in Gardnerella prophages, suggesting prophage-mediated transfer of AMR genes. Prophages encoding AMR genes were found at varying frequencies across body sites and geographical regions, with Asia showing the highest diversity of AMR genes.
Affiliation(s)
- Laura K Inglis: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, SA, 5042, Australia
- Susanna R Grigson: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, SA, 5042, Australia
- Michael J Roach: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, SA, 5042, Australia; Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Bedford Park, SA, 5042, Australia
- Robert A Edwards: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, SA, 5042, Australia
2. Mallawaarachchi V, Wickramarachchi A, Xue H, Papudeshi B, Grigson SR, Bouras G, Prahl RE, Kaphle A, Verich A, Talamantes-Becerra B, Dinsdale EA, Edwards RA. Solving genomic puzzles: computational methods for metagenomic binning. Brief Bioinform 2024; 25:bbae372. [PMID: 39082646 PMCID: PMC11289683 DOI: 10.1093/bib/bbae372] [Received: 04/18/2024] [Revised: 06/05/2024] [Accepted: 07/15/2024]
Abstract
Metagenomics involves the study of genetic material obtained directly from communities of microorganisms living in natural environments. The field of metagenomics has provided valuable insights into the structure, diversity and ecology of microbial communities. Once an environmental sample is sequenced and processed, metagenomic binning clusters the sequences into bins representing different taxonomic groups such as species, genera, or higher levels. Several computational tools have been developed to automate the process of metagenomic binning. These tools have enabled the recovery of novel draft genomes of microorganisms allowing us to study their behaviors and functions within microbial communities. This review classifies and analyzes different approaches of metagenomic binning and different refinement, visualization, and evaluation techniques used by these methods. Furthermore, the review highlights the current challenges and areas of improvement present within the field of research.
Affiliation(s)
- Vijini Mallawaarachchi: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Anuradha Wickramarachchi: Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Hansheng Xue: School of Computing, National University of Singapore, Singapore 119077, Singapore
- Bhavya Papudeshi: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Susanna R Grigson: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- George Bouras: Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia; The Department of Surgery - Otolaryngology Head and Neck Surgery, University of Adelaide and the Basil Hetzel Institute for Translational Health Research, Central Adelaide Local Health Network, Adelaide, SA 5011, Australia
- Rosa E Prahl: Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Anubhav Kaphle: Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Andrey Verich: Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia; The Kirby Institute, The University of New South Wales, Randwick, Sydney, NSW 2052, Australia
- Berenice Talamantes-Becerra: Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Elizabeth A Dinsdale: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Robert A Edwards: Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
3. Singh UB, Deb S, Rani L, Gupta R, Verma S, Kumari L, Bhardwaj D, Bala K, Ahmed J, Gaurav S, Perumalla S, Nizam M, Mishra A, Stephenraj J, Shukla J, Nayer J, Aggarwal P, Kabra M, Ahuja V, Chaudhry R, Sinha S, Guleria R. Phylogeny and evolution of SARS-CoV-2 during Delta and Omicron variant waves in India. J Biomol Struct Dyn 2024; 42:4769-4781. [PMID: 37318006 DOI: 10.1080/07391102.2023.2222832] [Received: 11/30/2022] [Accepted: 06/02/2023]
Abstract
SARS-CoV-2 evolution has continued to generate variants responsible for new pandemic waves locally and globally. Varying disease presentation and severity have been ascribed to inherent variant characteristics and vaccine immunity. This study analyzed genomic data from 305 whole genome sequences from SARS-CoV-2 patients before and through the third wave in India. The Delta variant was reported in patients without comorbidity (97%), while Omicron BA.2 was reported in patients with comorbidity (77%). Tissue adaptation studies indicated a higher propensity of Omicron variants for bronchial tissue than for lung, contrary to observations in Delta variants from Delhi. Codon usage patterns distinguished the prevalent variants, clustering them separately: Omicron BA.2 isolated in February grouped away from December strains, and all BA.2 after December had acquired a new mutation, S959P in ORF1b (44.3% of BA.2 in the study), indicating ongoing evolution. Loss of critical spike mutations in Omicron BA.2 and gain of immune evasion mutations, including G142D (reported in Delta but absent in BA.1) and S371F instead of the S371L seen in BA.1, could explain the very brief period of BA.1 in December 2021, followed by complete replacement by BA.2. The higher propensity of Omicron variants for bronchial tissue probably ensured increased transmission, while Omicron BA.2 became the prevalent variant possibly due to an evolutionary trade-off. Virus evolution continues to shape the epidemic and its culmination. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Urvashi B Singh: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Sushanta Deb: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Lata Rani: Central Core Research Facility, All India Institute of Medical Sciences, New Delhi, India
- Ritu Gupta: Department of Laboratory Oncology, All India Institute of Medical Sciences, New Delhi, India
- Sunita Verma: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Lata Kumari: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Deepika Bhardwaj: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Kiran Bala: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Jawed Ahmed: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Sudesh Gaurav: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Sowjanya Perumalla: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Md Nizam: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Anwita Mishra: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- J Stephenraj: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Jyoti Shukla: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Jamshed Nayer: Department of Emergency Medicine, All India Institute of Medical Sciences, New Delhi, India
- Praveen Aggarwal: Department of Emergency Medicine, All India Institute of Medical Sciences, New Delhi, India
- Madhulika Kabra: Department of Paediatrics, All India Institute of Medical Sciences, New Delhi, India
- Vineet Ahuja: Department of Gastroenterology, All India Institute of Medical Sciences, New Delhi, India
- Rama Chaudhry: Department of Microbiology, All India Institute of Medical Sciences, New Delhi, India
- Subrata Sinha: Department of Biochemistry, All India Institute of Medical Sciences, New Delhi, India
- Randeep Guleria: Department of Pulmonary, Critical Care & Sleep Medicine, All India Institute of Medical Sciences, New Delhi, India
4. Khandia R, Pandey MK, Rzhepakovsky IV, Khan AA, Alexiou A. Synonymous Codon Variant Analysis for Autophagic Genes Dysregulated in Neurodegeneration. Mol Neurobiol 2023; 60:2252-2267. [PMID: 36637744 DOI: 10.1007/s12035-022-03081-1] [Received: 07/03/2022] [Accepted: 09/27/2022]
Abstract
Neurodegenerative disorders are often a culmination of the accumulation of abnormally folded proteins and defective organelles. Autophagy is a process that removes these defective proteins, organelles, and harmful substances from the body, and it works to maintain homeostasis. If autophagic removal of defective proteins is interfered with, neuronal health is affected. Some autophagic genes are specifically associated with neurodegenerative phenotypes. Non-functional or mutated gene copies, or copies carrying silent mutations (often termed synonymous variants), might explain this. Although these synonymous variants code for identical proteins, they can differ in translation rate, stability, and gene expression profile. Hence, it would be interesting to study the pattern of synonymous variant usage. This study examines synonymous variant usage in various transcripts of the autophagic genes ATG5, ATG7, ATG8A, ATG16, and ATG17/FIP200, which are reported to cause neurodegeneration if dysregulated. These genes were analyzed for their synonymous variant usage; nucleotide composition; any possible nucleotide skew in a gene; physical properties of the autophagic proteins, including GRAVY and AROMA; hydropathicity; instability index; frequency of acidic, basic, and neutral amino acids; and gene expression level. The study will help in understanding the various evolutionary forces acting on these genes and the possible augmentation of a gene if it shows unusual behavior.
Affiliation(s)
- Rekha Khandia: Department of Biochemistry and Genetics, Barkatullah University, Bhopal, 462026, India
- Megha Katare Pandey: Department of Translational Medicine, All India Institute of Medical Sciences, Bhopal, 462020, India
- Azmat Ali Khan: Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, 11451, Saudi Arabia
- Athanasios Alexiou: Novel Global Community Educational Foundation, Hebersham, Australia; AFNP Med, Wien, Austria
5. Maguire F, Jia B, Gray KL, Lau WYV, Beiko RG, Brinkman FSL. Metagenome-assembled genome binning methods with short reads disproportionately fail for plasmids and genomic Islands. Microb Genom 2020; 6:mgen000436. [PMID: 33001022 PMCID: PMC7660262 DOI: 10.1099/mgen.0.000436] [Received: 05/23/2020] [Accepted: 09/04/2020]
Abstract
Metagenomic methods enable the simultaneous characterization of microbial communities without time-consuming and bias-inducing culturing. Metagenome-assembled genome (MAG) binning methods aim to reassemble individual genomes from these data. However, the recovery of mobile genetic elements (MGEs), such as plasmids and genomic islands (GIs), by binning has not been well characterized. Given the association of antimicrobial resistance (AMR) genes and virulence factor (VF) genes with MGEs, studying their transmission is a public-health priority. The variable copy number and sequence composition of MGEs make them potentially problematic for MAG binning methods. To systematically investigate this issue, we simulated a low-complexity metagenome comprising 30 GI-rich and plasmid-containing bacterial genomes. MAGs were then recovered using 12 current prediction pipelines and evaluated. While 82-94% of chromosomes could be correctly recovered and binned, only 38-44% of GIs and 1-29% of plasmid sequences were found. Strikingly, no plasmid-borne VF or AMR genes were recovered, and only 0-45% of AMR or VF genes within GIs. We conclude that short-read MAG approaches, without further optimization, are largely ineffective for the analysis of mobile genes, including those of public-health importance, such as AMR and VF genes. We propose that researchers should explore developing methods that optimize for this issue and consider also using unassembled short reads and/or long-read approaches to more fully characterize metagenomic data.
Affiliation(s)
- Finlay Maguire: Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia, B3H 4R2, Canada
- Baofeng Jia: Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Kristen L. Gray: Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Wing Yin Venus Lau: Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Robert G. Beiko: Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia, B3H 4R2, Canada
- Fiona S. L. Brinkman: Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
6. Codon Usage Optimization in the Prokaryotic Tree of Life: How Synonymous Codons Are Differentially Selected in Sequence Domains with Different Expression Levels and Degrees of Conservation. mBio 2020; 11:mBio.00766-20. [PMID: 32694138 PMCID: PMC7374057 DOI: 10.1128/mbio.00766-20]
Abstract
Prokaryotic genomes, the current heritage of the most ancient life forms on earth, comprise diverse gene sets, all characterized by varied origins, ancestries, and spatial-temporal expression patterns. Such genetic diversity has for a long time raised the question of how cells shape their coding strategies to optimize protein demands (i.e., product abundance) and accuracy (i.e., translation fidelity) through the use of the same genetic code in genomes with GC contents that range from less than 20 to more than 80%. Here, we present evidence on how codon usage is adjusted in the prokaryotic tree of life and on how specific biases have operated to improve translation. Through the use of proteome data, we characterized conserved and variable sequence domains in genes of either high or low expression level and quantitated the relative weight of efficiency and accuracy, as well as their interaction, in shaping codon usage in prokaryotes. Prokaryote genomes exhibit a wide range of GC contents and codon usages, both resulting from an interaction between mutational bias and natural selection. In order to investigate the basis underlying specific codon changes, we performed a comprehensive analysis of 29 different prokaryote families. The analysis of core gene sets with increasing ancestries in each family lineage revealed that the codon usages became progressively more adapted to the tRNA pools. While, as previously reported, highly expressed genes presented the most optimized codon usage, the singletons contained the least selectively favored codons. The results showed that codons with the highest translational adaptation were usually preferentially enriched. In agreement with previous reports, a C bias in 2- to 3-fold pyrimidine-ending codons and a U bias in 4-fold codons occurred in all families, irrespective of the global genomic GC content. Furthermore, the U biases suggested that U3-mRNA–U34-tRNA interactions were responsible for a prominent codon optimization in both the most ancestral core and the highly expressed genes. A comparative analysis of sequences that encode conserved (cr) or variable (vr) translated products, with each one being under high (HEP) and low (LEP) expression levels, demonstrated that efficiency was more relevant (by a factor of 2) than accuracy in shaping codon usage. Finally, analysis of the third position of codons (GC3) revealed that in genomes with global GC contents higher than 35 to 40%, selection favored a GC3 increase, whereas in genomes with very low GC contents, a decrease in GC3 occurred. A comprehensive final model is presented in which all patterns of codon usage variations are condensed in four distinct behavioral groups.
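The GC3 measure referred to in this abstract can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence and counts every codon, including Met, Trp, and stop codons; some definitions (such as GC3s) exclude these, and the abstract does not say which variant the authors used. The function name `gc3` and the toy sequence are hypothetical.

```python
def gc3(cds: str) -> float:
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence; any trailing
    partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence has no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Hypothetical toy sequence: ATG GCC AAA TTT GGC (3 of 5 codons end in G or C)
example = "ATGGCCAAATTTGGC"
print(gc3(example))  # 0.6
```

Comparing this value with the overall GC content of the same genome is one simple way to look for the kind of third-position shift the abstract describes.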
7. Singh P, Venkatesan A, Padmanabhan P, Gulyas B, Dass J FP. Codon usage of human hepatitis C virus clearance genes in relation to its expression. J Cell Biochem 2019; 121:534-544. [PMID: 31310376 DOI: 10.1002/jcb.29290] [Received: 12/18/2018] [Accepted: 03/15/2019]
Abstract
Hepatitis C virus (HCV) infection is among the leading causes of hepatocellular carcinoma and liver cirrhosis globally, with a high economic burden. The disease progression is well established, but less is known about spontaneous clearance of HCV infection. This study seeks to establish the relationship between codon bias and expression of HCV clearance candidate genes in normal and HCV-infected liver tissues. A total of 112 coding sequences comprising 151,679 codons were subjected to the computation of codon indices, namely relative synonymous codon usage, effective number of codons (Nc), frequency of optimal codons, codon adaptation index, codon bias index, and base composition. The codon indices GC3s and GC12, together with hydropathicity and aromaticity, implicate both mutational and translational selection in the candidate gene set. This was further correlated with the differentially expressed genes among the selected genes using BioGPS. A significant correlation was observed between gene expression in normal and cancerous liver tissues and codon bias (Nc). Gene expression is also correlated with relative codon bias values, indicating that the CCL5, APOA2, CD28, IFITM1, and TNFSF4 genes have higher expression. These results are encouraging for selecting highly responsive genes in HCV clearance. However, additional genes may also contribute to clearance alongside the first-line defensive genes mentioned above.
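Of the codon indices listed in this abstract, relative synonymous codon usage (RSCU) is the simplest to show concretely: for each codon, the observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below is an illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code, skips stop codons and single-codon amino acids (Met, Trp), and the function name `rscu` and toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon for an in-frame coding sequence.

    Amino acids that never occur in `cds` are omitted from the result.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or len(codons) == 1 or total == 0:
            continue
        for c in codons:
            # observed / (total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


# Hypothetical toy sequence: four leucine codons, three CTG and one CTA.
toy = rscu("CTGCTGCTGCTA")
print(toy["CTG"], toy["CTA"], toy["CTT"])  # 4.5 1.5 0.0
```

An RSCU above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks one used less often.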
Affiliation(s)
- Pratichi Singh: Department of Integrative Biology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, India
- Arthi Venkatesan: Department of Integrative Biology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, India
- Parasuraman Padmanabhan: Centre for Neuroimaging Research at NTU (CeNReN), Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Balazs Gulyas: Centre for Neuroimaging Research at NTU (CeNReN), Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Febin Prabhu Dass J: Department of Integrative Biology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu, India
8. Codon Usage Heterogeneity in the Multipartite Prokaryote Genome: Selection-Based Coding Bias Associated with Gene Location, Expression Level, and Ancestry. mBio 2019; 10:mBio.00505-19. [PMID: 31138741 PMCID: PMC6538778 DOI: 10.1128/mbio.00505-19]
Abstract
Prokaryotes represent an ancestral lineage in the tree of life and constitute optimal resources for investigating the evolution of genomes in unicellular organisms. Many bacterial species possess multipartite genomes offering opportunities to study functional variations among replicons, how and where new genes integrate into a genome, and how genetic information within a lineage becomes encoded and evolves. To analyze these issues, we focused on the model soil bacterium Sinorhizobium meliloti, which harbors a chromosome, a chromid (pSymB), a megaplasmid (pSymA), and, in many strains, one or more accessory plasmids. The analysis of several genomes, together with 1.4 Mb of accessory plasmid DNA that we purified and sequenced, revealed clearly different functional profiles associated with each genomic entity. pSymA, in particular, exhibited remarkable interstrain variation and a high density of singletons (unique, exclusive genes) featuring functionalities and modal codon usages that were very similar to those of the plasmidome. All this evidence reinforces the idea of a close relationship between pSymA and the plasmidome. Correspondence analyses revealed that adaptation of codon usages to the translational machinery increased from plasmidome to pSymA to pSymB to chromosome, corresponding as such to the ancestry of each replicon in the lineage. We demonstrated that chromosomal core genes gradually adapted to the translational machinery, reminiscent of observations in several bacterial taxa for genes with high expression levels. Such findings indicate a previously undiscovered codon usage adaptation associated with the chromosomal core information that likely operates to improve bacterial fitness. 
We present a comprehensive model illustrating the central findings described here, discussed in the context of the changes occurring during the evolution of a multipartite prokaryote genome.
IMPORTANCE: Bacterial genomes usually include many thousands of genes which are expressed with diverse spatial-temporal patterns and intensities. It is well established that highly expressed genes, such as those encoding the ribosomal and other translation-related proteins (RTRPs), have accommodated their codon usage to optimize translation efficiency and accuracy. Using a bioinformatic approach, we identify core-gene sets with different ancestries and demonstrate that selection processes that optimize codon usage are not restricted to RTRPs but extend to a genome-wide scale. Such findings highlight, for the first time, a previously undiscovered adaptation strategy associated with the chromosomal-core information. In contrast to the translationally more adapted genes, singletons (i.e., exclusive genes, including those of the plasmidome) appear as the gene pool with the least-ameliorated codon usage in the lineage. A comprehensive summary describing the inter- and intra-replicon heterogeneity of codon usages in a complex prokaryote genome is presented.
9. Nguyen M, Long SW, McDermott PF, Olsen RJ, Olson R, Stevens RL, Tyson GH, Zhao S, Davis JJ. Using Machine Learning To Predict Antimicrobial MICs and Associated Genomic Features for Nontyphoidal Salmonella. J Clin Microbiol 2019; 57:e01260-18. [PMID: 30333126 PMCID: PMC6355527 DOI: 10.1128/jcm.01260-18] [Received: 08/03/2018] [Accepted: 09/25/2018]
Abstract
Nontyphoidal Salmonella species are the leading bacterial cause of foodborne disease in the United States. Whole-genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate extreme gradient boosting (XGBoost)-based machine learning models for predicting MICs for 15 antibiotics. The MIC prediction models had an overall average accuracy of 95% within ±1 2-fold dilution step (confidence interval, 95% to 95%), an average very major error rate of 2.7% (confidence interval, 2.4% to 3.0%), and an average major error rate of 0.1% (confidence interval, 0.1% to 0.2%). The model predicted MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for the training sets, we show that highly accurate MIC prediction models can be generated with less than 500 genomes. We also show that our approach for predicting MICs is stable over time, despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Our strategy for developing whole-genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
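The headline metric in this abstract, accuracy within ±1 two-fold dilution step, can be made concrete with a short Python sketch. It assumes MICs are positive values on a two-fold dilution series and treats a prediction as correct when it differs from the observed MIC by at most one doubling; the abstract does not describe the authors' exact evaluation code, so the function names and the example values below are hypothetical.

```python
import math


def within_one_dilution(predicted: float, observed: float) -> bool:
    """True if the two MICs differ by at most one two-fold dilution step."""
    steps = abs(math.log2(predicted) - math.log2(observed))
    return steps <= 1.0 + 1e-9  # small tolerance for floating-point error


def dilution_accuracy(pairs: list[tuple[float, float]]) -> float:
    """Share of (predicted, observed) MIC pairs within +/-1 dilution step."""
    if not pairs:
        raise ValueError("no MIC pairs supplied")
    hits = sum(within_one_dilution(p, o) for p, o in pairs)
    return hits / len(pairs)


# Hypothetical (predicted, observed) MICs in ug/mL; only (16, 4) is off by two steps.
pairs = [(4, 8), (2, 2), (16, 4), (0.5, 1)]
print(dilution_accuracy(pairs))  # 0.75
```

Working on the log2 scale keeps the tolerance symmetric: a prediction of 4 against an observed 8 is treated the same as 8 against 4.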
Affiliation(s)
- Marcus Nguyen: University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, Illinois, USA
- S Wesley Long: Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, USA; Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, New York, USA
- Patrick F McDermott: U.S. Food and Drug Administration, Center for Veterinary Medicine, Office of Research, Laurel, Maryland, USA
- Randall J Olsen: Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, Texas, USA; Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, New York, USA
- Robert Olson: University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, Illinois, USA
- Rick L Stevens: Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, Illinois, USA; Department of Computer Science, University of Chicago, Chicago, Illinois, USA
- Gregory H Tyson: U.S. Food and Drug Administration, Center for Veterinary Medicine, Office of Research, Laurel, Maryland, USA
- Shaohua Zhao: U.S. Food and Drug Administration, Center for Veterinary Medicine, Office of Research, Laurel, Maryland, USA
- James J Davis: University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, Illinois, USA
10. Yano H, Shintani M, Tomita M, Suzuki H, Oshima T. Reconsidering plasmid maintenance factors for computational plasmid design. Comput Struct Biotechnol J 2018; 17:70-81. [PMID: 30619542 PMCID: PMC6312765 DOI: 10.1016/j.csbj.2018.12.001] [Received: 09/23/2018] [Revised: 12/08/2018] [Accepted: 12/09/2018]
Abstract
Plasmids are genetic parasites of microorganisms. The genomes of naturally occurring plasmids are expected to be polished via natural selection to achieve long-term persistence in the microbial cell population. However, plasmid genomes are extremely diverse, and the rules governing plasmid genomes are not fully understood. Therefore, computationally designing plasmid genomes optimized for model and nonmodel organisms remains challenging. Here, we summarize current knowledge of the plasmid genome organization and the factors that can affect plasmid persistence, with the aim of constructing synthetic plasmids for use in gram-negative bacteria. Then, we introduce publicly available resources, plasmid data, and bioinformatics tools that are useful for computational plasmid design.
Collapse
Affiliation(s)
- Hirokazu Yano
- Graduate School of Life Sciences, Tohoku University, 2-1-1, Katahira, Aoba-ku, Sendai 980-8577, Japan
| | - Masaki Shintani
- Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, 3-5-1, Hamamatsu 432-8561, Japan
- Department of Bioscience, Graduate School of Science and Technology, Shizuoka University, 3-5-1, Hamamatsu 432-8561, Japan
| | - Masaru Tomita
- Institute for Advanced Biosciences, Keio University, 14-1, Baba-cho, Tsuruoka, Yamagata 997-0035, Japan
- Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-0882, Japan
| | - Haruo Suzuki
- Institute for Advanced Biosciences, Keio University, 14-1, Baba-cho, Tsuruoka, Yamagata 997-0035, Japan
- Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-0882, Japan
| | - Taku Oshima
- Department of Biotechnology, Toyama Prefectural University, 5180, Kurokawa, Imizu, Toyama 939-0398, Japan
| |
|
11
|
McInally SG, Hagen KD, Nosala C, Williams J, Nguyen K, Booker J, Jones K, Dawson SC. Robust and stable transcriptional repression in Giardia using CRISPRi. Mol Biol Cell 2018; 30:119-130. [PMID: 30379614 PMCID: PMC6337905 DOI: 10.1091/mbc.e18-09-0605] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Indexed: 12/21/2022] Open
Abstract
Giardia lamblia is a binucleate protistan parasite causing significant diarrheal disease worldwide. An inability to target Cas9 to both nuclei, combined with the lack of nonhomologous end joining and markers for positive selection, has stalled the adaptation of CRISPR/Cas9-mediated genetic tools for this widespread parasite. CRISPR interference (CRISPRi) is a modification of the CRISPR/Cas9 system that directs catalytically inactive Cas9 (dCas9) to target loci for stable transcriptional repression. Using a Giardia nuclear localization signal to target dCas9 to both nuclei, we developed efficient and stable CRISPRi-mediated transcriptional repression of exogenous and endogenous genes in Giardia. Specifically, CRISPRi knockdown of kinesin-2a and kinesin-13 causes severe flagellar length defects that mirror defects with morpholino knockdown. Knockdown of the ventral disk MBP protein also causes severe structural defects that are highly prevalent and persist in the population more than 5 d longer than defects associated with transient morpholino-based knockdown. By expressing two guide RNAs in tandem to simultaneously knock down kinesin-13 and MBP, we created a stable dual knockdown strain with both flagellar length and disk defects. The efficiency and simplicity of CRISPRi in polyploid Giardia allows rapid evaluation of knockdown phenotypes and highlights the utility of CRISPRi for emerging model systems.
Affiliation(s)
- S G McInally
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - K D Hagen
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - C Nosala
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - J Williams
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - K Nguyen
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - J Booker
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - K Jones
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| | - Scott C Dawson
- Department of Microbiology and Molecular Genetics, University of California, Davis, Davis, CA 95616
| |
|
12
|
Prey Range and Genome Evolution of Halobacteriovorax marinus Predatory Bacteria from an Estuary. mSphere 2018; 3:mSphere00508-17. [PMID: 29359184 PMCID: PMC5760749 DOI: 10.1128/msphere.00508-17] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/30/2017] [Accepted: 12/05/2017] [Indexed: 02/04/2023] Open
Abstract
Halobacteriovorax strains are saltwater-adapted predatory bacteria that attack Gram-negative bacteria and may play an important role in shaping microbial communities. To understand how Halobacteriovorax strains impact ecosystems and develop them as biocontrol agents, it is important to characterize variation in predation phenotypes and investigate Halobacteriovorax genome evolution. We isolated Halobacteriovorax marinus BE01 from an estuary in Rhode Island using Vibrio from the same site as prey. Small, fast-moving, attack-phase BE01 cells attach to and invade prey cells, consistent with the intraperiplasmic predation strategy of the H. marinus type strain, SJ. BE01 is a prey generalist, forming plaques on Vibrio strains from the estuary, Pseudomonas from soil, and Escherichia coli. Genome analysis revealed extremely high conservation of gene order and amino acid sequences between BE01 and SJ, suggesting strong selective pressure to maintain the genome in this H. marinus lineage. Despite this, we identified two regions of gene content difference that likely resulted from horizontal gene transfer. Analysis of modal codon usage frequencies supports the hypothesis that these regions were acquired from bacteria with different codon usage biases than H. marinus. In one of these regions, BE01 and SJ carry different genes associated with mobile genetic elements. Acquired functions in BE01 include the dnd operon, which encodes a pathway for DNA modification, and a suite of genes involved in membrane synthesis and regulation of gene expression that was likely acquired from another Halobacteriovorax lineage. This analysis provides further evidence that horizontal gene transfer plays an important role in genome evolution in predatory bacteria.
IMPORTANCE Predatory bacteria attack and digest other bacteria and therefore may play a role in shaping microbial communities. To investigate phenotypic and genotypic variation in saltwater-adapted predatory bacteria, we isolated Halobacteriovorax marinus BE01 from an estuary in Rhode Island, assayed whether it could attack different prey bacteria, and sequenced and analyzed its genome. We found that BE01 is a prey generalist, attacking bacteria from different phylogenetic groups and environments. Gene order and amino acid sequences are highly conserved between BE01 and the H. marinus type strain, SJ. By comparative genomics, we detected two regions of gene content difference that likely occurred via horizontal gene transfer events. Acquired genes encode functions such as modification of DNA, membrane synthesis and regulation of gene expression. Understanding genome evolution and variation in predation phenotypes among predatory bacteria will inform their development as biocontrol agents and clarify how they impact microbial communities.
|
13
|
Akhter S, Aziz RK, Kashef MT, Ibrahim ES, Bailey B, Edwards RA. Kullback Leibler divergence in complete bacterial and phage genomes. PeerJ 2017; 5:e4026. [PMID: 29204318 PMCID: PMC5712468 DOI: 10.7717/peerj.4026] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/30/2017] [Accepted: 10/22/2017] [Indexed: 12/11/2022] Open
Abstract
The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses.
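The metric described in this abstract can be illustrated with a minimal Python sketch. It assumes each genome is given as a list of protein sequences and that the "mean amino acid content" is the unweighted average of the per-genome frequency profiles. The function names, the base-2 logarithm, and the averaging scheme are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_frequencies(proteins):
    """Relative amino acid frequencies over all proteins of one genome."""
    counts = Counter()
    for seq in proteins:
        # Ignore ambiguous or non-standard symbols such as X or *.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_profile(profiles):
    """Unweighted mean of several frequency profiles (an assumption)."""
    n = len(profiles)
    return {aa: sum(p[aa] for p in profiles) / n for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """D_KL(p || q) in bits; terms with p[aa] == 0 contribute nothing."""
    return sum(
        p[aa] * math.log2(p[aa] / q[aa])
        for aa in AMINO_ACIDS
        if p[aa] > 0
    )


# Toy data only: two made-up "genomes" with very different compositions.
genome_a = aa_frequencies(["MKKLLA", "MAAAK"])
genome_b = aa_frequencies(["MGGGW", "MWWGG"])
mean = mean_profile([genome_a, genome_b])
print(kl_divergence(genome_a, mean), kl_divergence(genome_b, mean))
```

Because the mean profile includes every genome being compared, it is non-zero wherever any individual profile is non-zero, so the logarithm is always defined in this setup.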
Affiliation(s)
- Sajia Akhter
- Computational Science Research Center, San Diego State University, San Diego, CA, USA
| | - Ramy K Aziz
- Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt; Department of Computer Science, San Diego State University, San Diego, CA, United States of America
| | - Mona T Kashef
- Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt
| | - Eslam S Ibrahim
- Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt
| | - Barbara Bailey
- Department of Mathematics & Statistics, San Diego State University, San Diego, CA, USA
| | - Robert A Edwards
- Computational Science Research Center, San Diego State University, San Diego, CA, USA; Department of Computer Science, San Diego State University, San Diego, CA, United States of America; Department of Mathematics & Statistics, San Diego State University, San Diego, CA, USA; Department of Biology, San Diego State University, San Diego, CA, USA
| |
|
14
|
Fares M. Identifying Evolution Signatures in Molecules. NATURAL SELECTION 2014:9-27. [DOI: 10.1201/b17795-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
|
15
|
Ainsworth S, Stockdale S, Bottacini F, Mahony J, van Sinderen D. The Lactococcus lactis plasmidome: much learnt, yet still lots to discover. FEMS Microbiol Rev 2014; 38:1066-88. [PMID: 24861818 DOI: 10.1111/1574-6976.12074] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 02/13/2014] [Revised: 04/17/2014] [Accepted: 05/07/2014] [Indexed: 01/20/2023] Open
Abstract
Lactococcus lactis is used extensively worldwide for the production of a variety of fermented dairy products. The ability of L. lactis to successfully grow and acidify milk has long been known to be reliant on a number of plasmid-encoded traits. The recent availability of low-cost, high-quality genome sequencing, and the quest for novel, technologically desirable characteristics, such as novel flavour development and increased stress tolerance, has led to a steady increase in the number of available lactococcal plasmid sequences. We will review both well-known and very recent discoveries regarding plasmid-encoded traits of biotechnological significance. The acquired lactococcal plasmid sequence information has in recent years progressed our understanding of the origin of lactococcal dairy starter cultures. Salient points on the acquisition and evolution of lactococcal plasmids will be discussed in this review, as well as prospects of finding novel plasmid-encoded functions.
Affiliation(s)
- Stuart Ainsworth
- Department of Microbiology, University College Cork, Cork, Ireland
| | | | | | | | | |
|
16
|
Ecotype diversity and conversion in Photobacterium profundum strains. PLoS One 2014; 9:e96953. [PMID: 24824441 PMCID: PMC4019646 DOI: 10.1371/journal.pone.0096953] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 02/04/2014] [Accepted: 04/12/2014] [Indexed: 12/03/2022] Open
Abstract
Photobacterium profundum is a cosmopolitan marine bacterium capable of growth at low temperature and high hydrostatic pressure. Multiple strains of P. profundum have been isolated from different depths of the ocean and display remarkable differences in their physiological responses to pressure. The genome sequence of the deep-sea piezopsychrophilic strain Photobacterium profundum SS9 has provided some clues regarding the genetic features required for growth in the deep sea. The sequenced genome of Photobacterium profundum strain 3TCK, a non-piezophilic strain isolated from a shallow-water environment, is now available and its analysis expands the identification of unique genomic features that correlate to environmental differences and define the Hutchinsonian niche of each strain. These differences range from variations in gene content to specific gene sequences under positive selection. Genome plasticity between Photobacterium bathytypes was investigated when strain 3TCK-specific genes involved in photorepair were introduced to SS9, demonstrating that horizontal gene transfer can provide a mechanism for rapid colonisation of new environments.
|
17
|
Wan PJ, Yang L, Wang WX, Fan JM, Fu Q, Li GQ. Constructing the major biosynthesis pathways for amino acids in the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae), based on the transcriptome data. INSECT MOLECULAR BIOLOGY 2014; 23:152-64. [PMID: 24330026 DOI: 10.1111/imb.12069] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 05/15/2023]
Abstract
Nilaparvata lugens is a serious phloem-feeding pest of rice throughout Asia. Rice phloem sap can meet its nutrition requirement for sugars but not for some essential amino acids such as isoleucine, leucine, methionine, phenylalanine, tryptophan, lysine, arginine and histidine. N. lugens harbours yeast-like symbionts in mycetocytes formed by abdominal fat body cells. Removal of the symbionts results in negative physiological effects, suggesting that the symbionts play a pivotal role in nitrogen metabolism. In the present paper, 521 mRNA expressed sequence tags (ESTs) encoding 126 enzymes involved in amino acid biosynthesis were identified based on transcriptome data, reverse transcription (RT)-PCR and rapid amplification of cDNA ends. Similarity analysis, codon usage bias, along with tissue-biased expression and phylogenetic analysis of a subset of ESTs, suggest that 437 ESTs out of the 521 originate from symbionts, and the remaining 84 mRNA fragments come from N. lugens. Accordingly, the biosynthesis pathways for 20 amino acids were manually constructed. It is postulated that both N. lugens and its symbiont can independently assimilate ammonia and biosynthesize seven non-essential amino acids: glutamate; glutamine; aspartate; asparagine; alanine; serine; and glycine. N. lugens and symbiont enzymes may work collaboratively to catalyse the biosynthesis of proline, methionine, valine, leucine, isoleucine, phenylalanine and tyrosine. We infer from this that symbionts function in the biosynthesis of lysine, arginine, tryptophan, threonine, histidine and cysteine. Our data support the previously proposed hypothesis, i.e. the yeast-like symbionts compensate for, at least partially, the amino acid needs of N. lugens.
Affiliation(s)
- P-J Wan
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou, China; Education Ministry Key Laboratory of Integrated Management of Crop Diseases and Pests, College of Plant Protection, Nanjing Agricultural University, Nanjing, China
| | | | | | | | | | | |
|
18
|
Borziak K, Posner MG, Upadhyay A, Danson MJ, Bagby S, Dorus S. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea. PLoS One 2014; 9:e87063. [PMID: 24489835 PMCID: PMC3904984 DOI: 10.1371/journal.pone.0087063] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/10/2013] [Accepted: 12/18/2013] [Indexed: 02/04/2023] Open
Abstract
Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. 
Given the highly variable retention of gene repertoires across the archaea, the extension of comparative genomic pathway profiling to broader metabolic and homeostasis networks should be useful in revealing characteristics from metagenomic datasets related to adaptations to diverse environments.
Affiliation(s)
- Kirill Borziak
- Department of Biology, Syracuse University, Syracuse, New York, United States of America
| | - Mareike G. Posner
- Department of Biology & Biochemistry, University of Bath, Claverton Down, United Kingdom
| | - Abhishek Upadhyay
- Department of Biology & Biochemistry, University of Bath, Claverton Down, United Kingdom
| | - Michael J. Danson
- Department of Biology & Biochemistry, University of Bath, Claverton Down, United Kingdom
- Centre for Extremophile Research, University of Bath, Claverton Down, United Kingdom
| | - Stefan Bagby
- Department of Biology & Biochemistry, University of Bath, Claverton Down, United Kingdom
| | - Steve Dorus
- Department of Biology, Syracuse University, Syracuse, New York, United States of America
| |
|
19
|
Davis JJ, Xia F, Overbeek RA, Olsen GJ. Genomes of the class Erysipelotrichia clarify the firmicute origin of the class Mollicutes. Int J Syst Evol Microbiol 2013; 63:2727-2741. [PMID: 23606477 PMCID: PMC3749518 DOI: 10.1099/ijs.0.048983-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022] Open
Abstract
The tree of life is paramount for achieving an integrated understanding of microbial evolution and the relationships between physiology, genealogy and genomics. It provides the framework for interpreting environmental sequence data, whether applied to microbial ecology or to human health. However, there remain many instances where there is ambiguity in our understanding of the phylogeny of major lineages, and/or confounding nomenclature. Here we apply recent genomic sequence data to examine the evolutionary history of members of the classes Mollicutes (phylum Tenericutes) and Erysipelotrichia (phylum Firmicutes). Consistent with previous analyses, we find evidence of a specific relationship between them in molecular phylogenies and signatures of the 16S rRNA, 23S rRNA, ribosomal proteins and aminoacyl-tRNA synthetase proteins. Furthermore, by mapping functions over the phylogenetic tree we find that the erysipelotrichia lineages are involved in various stages of genomic reduction, having lost (often repeatedly) a variety of metabolic functions and the ability to form endospores. Although molecular phylogeny has driven numerous taxonomic revisions, we find it puzzling that the most recent taxonomic revision of the phyla Firmicutes and Tenericutes has further separated them into distinct phyla, rather than reflecting their common roots.
Affiliation(s)
- James J Davis
- Department of Microbiology and Institute for Genomic Biology, University of Illinois at Urbana-Champaign, USA
| | | | - Ross A Overbeek
- Fellowship for Interpretation of Genomes, Burr Ridge, IL, USA
| | - Gary J Olsen
- Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, USA; Department of Microbiology and Institute for Genomic Biology, University of Illinois at Urbana-Champaign, USA
| |
|
20
|
Wegmann U, Overweg K, Jeanson S, Gasson M, Shearman C. Molecular characterization and structural instability of the industrially important composite metabolic plasmid pLP712. MICROBIOLOGY-SGM 2012; 158:2936-2945. [PMID: 23023974 DOI: 10.1099/mic.0.062554-0] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
The widely used plasmid-free Lactococcus lactis strain MG1363 was derived from the industrial dairy starter strain NCDO712. This strain carries a 55.39 kb plasmid encoding genes for lactose catabolism and a serine proteinase involved in casein degradation. We report the DNA sequencing and annotation of pLP712, which revealed additional metabolic genes, including peptidase F, d-lactate dehydrogenase and α-keto acid dehydrogenase (E3 complex). Comparison of pLP712 with other large lactococcal lactose and/or proteinase plasmids from L. lactis subsp. cremoris SK11 (pSK11L, pSK11P) and the plant strain L. lactis NCDO1867 (pGdh442) revealed their close relationship. The plasmid appears to have evolved through a series of genetic events as a composite of pGdh442, pSK11L and pSK11P. We describe in detail a scenario by which the metabolic genes relevant to the growth of its host in a milk environment have been unified on one replicon, reflecting the evolution of L. lactis as it changed its biological niche from plants to dairy environments. The extensive structural instability of pLP712 allows easy isolation of derivative plasmids lacking genes for casein degradation and/or lactose catabolism. Plasmid pLP712 is transferable by transduction and conjugation, and both of these processes result in significant molecular rearrangements. We report the detailed molecular analysis of insertion sequence element-mediated genetic rearrangements within pLP712 and several different mechanisms, including homologous recombination and adjacent deletion. Analysis of the integration of the lactose operon into the chromosome highlights the fluidity of the MG1363 integration hotspot and the potential for frequent movement of genes between plasmids and chromosomes in Lactococcus.
Affiliation(s)
- Udo Wegmann
- Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK
| | - Karin Overweg
- Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK
| | - Sophie Jeanson
- Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK
| | - Mike Gasson
- Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK
| | - Claire Shearman
- Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK
| |
|
21
|
Friedman R, Ely B. Codon usage methods for horizontal gene transfer detection generate an abundance of false positive and false negative results. Curr Microbiol 2012; 65:639-42. [PMID: 23010940 DOI: 10.1007/s00284-012-0205-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 05/29/2012] [Accepted: 07/07/2012] [Indexed: 11/24/2022]
Abstract
Bacteria acquire new DNA in a process known as horizontal gene transfer (HGT). To investigate the evolutionary impact of this transfer of DNA, various methods have been developed to detect past HGT events. For example, codon usage-based methods detect the presence of transferred genes by identifying atypical patterns of codon usage. However, some inherited genes exhibit atypical codon usage and some transferred genes have codon usage patterns similar to those of the inherited genes. In this study, we used a comparative phylogenetic approach with Methylobacterium and Caulobacter species to demonstrate that even well-designed codon usage methods fail to detect many HGT events and generate a high rate of false positives (60-75 %) and false negatives (23-61 %). Therefore, we recommend caution when employing codon usage methods to identify transferred genes and suggest that the rapidly increasing availability of bacterial genome sequences makes the phylogenetic approach the method of choice.
Affiliation(s)
- Robert Friedman
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | | |
|
22
|
Mellitzer A, Weis R, Glieder A, Flicker K. Expression of lignocellulolytic enzymes in Pichia pastoris. Microb Cell Fact 2012; 11:61. [PMID: 22583625 PMCID: PMC3503753 DOI: 10.1186/1475-2859-11-61] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 11/09/2011] [Accepted: 04/21/2012] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Sustainable utilization of plant biomass as a renewable source for fuels and chemical building blocks requires a complex mixture of diverse enzymes, including hydrolases, which comprise the largest class of lignocellulolytic enzymes. These enzymes need to be available in large amounts at a low price to allow sustainable and economic biotechnological processes. Over the past years Pichia pastoris has become an attractive host for the cost-efficient production and engineering of heterologous (eukaryotic) proteins due to several advantages. RESULTS In this paper, codon-optimized genes and synthetic alcohol oxidase 1 promoter variants were used to generate Pichia pastoris strains which individually expressed cellobiohydrolase 1, cellobiohydrolase 2 and beta-mannanase from Trichoderma reesei and xylanase A from Thermomyces lanuginosus. For three of these enzymes we could develop strains capable of secreting gram quantities of enzyme per liter in fed-batch cultivations. Additionally, we compared our achieved yields of secreted enzymes and the corresponding activities to literature data. CONCLUSION In our experiments we could clearly show the importance of gene optimization and strain characterization for successfully improving secretion levels. We also present a basic guideline on how to correctly interpret the interplay of promoter strength and gene dosage for a successful improvement of the secretory production of lignocellulolytic enzymes in Pichia pastoris.
Affiliation(s)
- Andrea Mellitzer
- Institute of Molecular Biotechnology, Graz University of Technology, Graz, Austria
| | | | - Anton Glieder
- ACIB GmbH, Austrian Centre of Industrial Biotechnology, Graz, Austria
| | - Karlheinz Flicker
- ACIB GmbH, Austrian Centre of Industrial Biotechnology, Graz, Austria
| |
|
23
|
Tumu S, Patil A, Towns W, Dyavaiah M, Begley TJ. The gene-specific codon counting database: a genome-based catalog of one-, two-, three-, four- and five-codon combinations present in Saccharomyces cerevisiae genes. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2012; 2012:bas002. [PMID: 22323063 PMCID: PMC3275765 DOI: 10.1093/database/bas002] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 01/19/2023]
Abstract
A codon consists of three nucleotides and functions during translation to dictate the insertion of a specific amino acid in a growing peptide or, in the case of stop codons, to specify the completion of protein synthesis. There are 64 possible single codons and there are 4096 double, 262 144 triple, 16 777 216 quadruple and 1 073 741 824 quintuple codon combinations available for use by specific genes and genomes. In order to evaluate the use of specific single, double, triple, quadruple and quintuple codon combinations in genes and gene networks, we have developed a codon counting tool and employed it to analyze 5780 Saccharomyces cerevisiae genes. We have also developed visualization approaches, including codon painting, combination and bar graphs, and have used them to identify distinct codon usage patterns in specific genes and groups of genes. Using our developed Gene-Specific Codon Counting Database, we have identified extreme codon runs in specific genes. We have also demonstrated that specific codon combinations or usage patterns are over-represented in genes whose corresponding proteins belong to ribosome or translation-associated biological processes. Our resulting database provides a mineable list of multi-codon data and can be used to identify unique sequence runs and codon usage patterns in individual and functionally linked groups of genes. Database URL: http://www.cs.albany.edu/~tumu/GSCC.html
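The combination counts quoted in this abstract are the powers 64^n for n = 1 to 5. A short Python sketch, under stated assumptions, checks them and counts n-codon combinations in a single coding sequence. It assumes an in-frame sequence and overlapping (sliding) windows of consecutive codons; the function names are hypothetical and this is not the database's own counting tool.

```python
from collections import Counter


def possible_combinations(n):
    """Number of distinct runs of n consecutive codons (64 codons each)."""
    return 64 ** n


def codon_combinations(cds, n):
    """Count overlapping runs of n consecutive in-frame codons in a CDS.

    Trailing nucleotides that do not fill a whole codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    return Counter(
        tuple(codons[i:i + n]) for i in range(len(codons) - n + 1)
    )


# The five values match the figures given in the abstract.
print([possible_combinations(n) for n in range(1, 6)])

# Toy sequence (not a real gene): ATG AAA AAA AAA TAA
pairs = codon_combinations("ATGAAAAAAAAATAA", 2)
print(pairs.most_common(1))
```

With sliding windows, a run of three identical codons contributes two counts of the same two-codon combination, which is one simple way to surface the "extreme codon runs" the abstract mentions.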
Affiliation(s)
- Sudheer Tumu
- Department of Computer Science, University at Albany, State University of New York, Albany, NY 12222, USA
| | | | | | | | | |
|
24
|
Similarity of genes horizontally acquired by Escherichia coli and Salmonella enterica is evidence of a supraspecies pangenome. Proc Natl Acad Sci U S A 2011; 108:20154-9. [PMID: 22128332 DOI: 10.1073/pnas.1109451108] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Abstract
Most bacterial and archaeal genomes contain many genes with little or no similarity to other genes, a property that impedes identification of gene origins. By comparing the codon usage of genes shared among strains (primarily vertically inherited genes) and genes unique to one strain (primarily recently horizontally acquired genes), we found that the plurality of unique genes in Escherichia coli and Salmonella enterica are much more similar to each other than are their vertically inherited genes. We conclude that E. coli and S. enterica derive these unique genes from a common source, a supraspecies phylogenetic group that includes the organisms themselves. The phylogenetic range of the sharing appears to include other (but not all) members of the Enterobacteriaceae. We found evidence of similar gene sharing in other bacterial and archaeal taxa. Thus, we conclude that frequent gene exchange, particularly that of genetic novelties, extends well beyond accepted species boundaries.
|
25
|
Schmid P, Flegel WA. Codon usage in vertebrates is associated with a low risk of acquiring nonsense mutations. J Transl Med 2011; 9:87. [PMID: 21651781 PMCID: PMC3123582 DOI: 10.1186/1479-5876-9-87] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/31/2011] [Accepted: 06/08/2011] [Indexed: 12/18/2022]
Abstract
BACKGROUND Codon usage in genomes is biased towards specific subsets of codons. Codon usage bias affects translational speed and accuracy, and it is associated with the tRNA levels and the GC content of the genome. Spontaneous mutations drive genomes to a low GC content. Active cellular processes are needed to maintain a high GC content, which influences the codon usage of a species. Loss-of-function mutations, such as nonsense mutations, are the molecular basis of many recessive alleles, which can greatly affect the genome of an organism and are the cause of many genetic diseases in humans. METHODS We developed an event based model to calculate the risk of acquiring nonsense mutations in coding sequences. Complete coding sequences and genomes of 40 eukaryotes were analyzed for GC and CpG content, codon usage, and the associated risk of acquiring nonsense mutations. We included one species per genus for all eukaryotes with available reference sequence. RESULTS We discovered that the codon usage bias detected in genomes of high GC content decreases the risk of acquiring nonsense mutations (Pearson's r = -0.95; P < 0.0001). In the genomes of all examined vertebrates, including humans, this risk was lower than expected (0.93 ± 0.02; mean ± SD) and lower than the risk in genomes of non-vertebrates (1.02 ± 0.13; P = 0.019). CONCLUSIONS While the maintenance of a high GC content is energetically costly, it is associated with a codon usage bias harboring a low risk of acquiring nonsense mutations. The reduced exposure to this risk may contribute to the fitness of vertebrates.
Affiliation(s)
- Pirmin Schmid
- National Institutes of Health, Clinical Center, Bethesda, MD, USA
|
26
|
Yeoman CJ, Yildirim S, Thomas SM, Durkin AS, Torralba M, Sutton G, Buhay CJ, Ding Y, Dugan-Rocha SP, Muzny DM, Qin X, Gibbs RA, Leigh SR, Stumpf R, White BA, Highlander SK, Nelson KE, Wilson BA. Comparative genomics of Gardnerella vaginalis strains reveals substantial differences in metabolic and virulence potential. PLoS One 2010; 5:e12411. [PMID: 20865041 PMCID: PMC2928729 DOI: 10.1371/journal.pone.0012411] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Received: 06/05/2010] [Accepted: 07/22/2010] [Indexed: 01/16/2023]
Abstract
BACKGROUND Gardnerella vaginalis is described as a common vaginal bacterial species whose presence correlates strongly with bacterial vaginosis (BV). Here we report the genome sequencing and comparative analyses of three strains of G. vaginalis. Strains 317 (ATCC 14019) and 594 (ATCC 14018) were isolated from the vaginal tracts of women with symptomatic BV, while Strain 409-05 was isolated from a healthy, asymptomatic individual with a Nugent score of 9. PRINCIPAL FINDINGS Substantial genomic rearrangement and heterogeneity were observed that appeared to have resulted from both mobile elements and substantial lateral gene transfer. These genomic differences translated to differences in metabolic potential. All strains are equipped with significant virulence potential, including genes encoding the previously described vaginolysin, pili for cytoadhesion, EPS biosynthetic genes for biofilm formation, and antimicrobial resistance systems. We also observed systems promoting multi-drug and lantibiotic extrusion. All G. vaginalis strains possess a large number of genes that may enhance their ability to compete with and exclude other vaginal colonists. These include up to six toxin-antitoxin systems and up to nine additional antitoxins lacking cognate toxins, several of which are clustered within each genome. All strains encode bacteriocidal toxins, including two lysozyme-like toxins produced uniquely by strain 409-05. Interestingly, the BV isolates encode numerous proteins not found in strain 409-05 that likely increase their pathogenic potential. These include enzymes enabling mucin degradation, a trait previously described to strongly correlate with BV, although commonly attributed to non-G. vaginalis species. CONCLUSIONS Collectively, our results indicate that all three strains are able to thrive in vaginal environments, and therein the BV isolates are capable of occupying a niche that is unique from 409-05.
Each strain has significant virulence potential, although genomic and metabolic differences, such as the ability to degrade mucin, indicate that the detection of G. vaginalis in the vaginal tract provides only partial information on the physiological potential of the organism.
Affiliation(s)
- Carl J. Yeoman
- Institute for Genomic Biology, University of Illinois, Urbana, Illinois, United States of America
- Suleyman Yildirim
- Institute for Genomic Biology, University of Illinois, Urbana, Illinois, United States of America
- Susan M. Thomas
- Institute for Genomic Biology, University of Illinois, Urbana, Illinois, United States of America
- A. Scott Durkin
- J. Craig Venter Institute, Rockville, Maryland, United States of America
- Manolito Torralba
- J. Craig Venter Institute, Rockville, Maryland, United States of America
- Granger Sutton
- J. Craig Venter Institute, Rockville, Maryland, United States of America
- Christian J. Buhay
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Yan Ding
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Shannon P. Dugan-Rocha
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Donna M. Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Xiang Qin
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Steven R. Leigh
- Department of Anthropology, University of Illinois, Urbana, Illinois, United States of America
- Rebecca Stumpf
- Department of Anthropology, University of Illinois, Urbana, Illinois, United States of America
- Bryan A. White
- Institute for Genomic Biology, University of Illinois, Urbana, Illinois, United States of America
- Department of Animal Sciences, University of Illinois, Urbana, Illinois, United States of America
- Sarah K. Highlander
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas, United States of America
- Karen E. Nelson
- J. Craig Venter Institute, Rockville, Maryland, United States of America
- Brenda A. Wilson
- Institute for Genomic Biology, University of Illinois, Urbana, Illinois, United States of America
- Department of Microbiology, University of Illinois, Urbana, Illinois, United States of America
|
27
|
Davis JJ, Olsen GJ. Characterizing the native codon usages of a genome: an axis projection approach. Mol Biol Evol 2010; 28:211-21. [PMID: 20679093 PMCID: PMC3002238 DOI: 10.1093/molbev/msq185] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023]
Abstract
Codon usage can provide insights into the nature of the genes in a genome. Genes that are “native” to a genome (have not been recently acquired by horizontal transfer) range in codon usage from a low-bias “typical” usage to a more biased “high-expression” usage characteristic of genes encoding abundant proteins. Genes that differ from these native codon usages are candidates for foreign genes that have been recently acquired by horizontal gene transfer. In this study, we present a method for characterizing the codon usages of native genes—both typical and highly expressed—within a genome. Each gene is evaluated relative to a half line (or axis) in a 59D space of codon usage. The axis begins at the modal codon usage, the usage that matches the largest number of genes in the genome, and it passes through a point representing the codon usage of a set of genes with expression-related bias. A gene whose codon usage matches (does not significantly differ from) a point on this axis is a candidate native gene, and the location of its projection onto the axis provides a general estimate of its expression level. A gene that differs significantly from all points on the axis is a candidate foreign gene. This automated approach offers significant improvements over existing methods. We illustrate this by analyzing the genomes of Pseudomonas aeruginosa PAO1 and Bacillus anthracis A0248, which can be difficult to analyze with commonly used methods due to their biased base compositions. Finally, we use this approach to measure the proportion of candidate foreign genes in 923 bacterial and archaeal genomes. The organisms with the most homogeneous genomes (containing the fewest candidate foreign genes) are mostly endosymbionts and parasites, though with exceptions that include Pelagibacter ubique and Beutenbergia cavernae. 
The organisms with the most heterogeneous genomes (containing the most candidate foreign genes) include members of the genera Bacteroides, Corynebacterium, Desulfotalea, Neisseria, Xylella, and Thermobaculum.
Affiliation(s)
- James J Davis
- Department of Microbiology, University of Illinois at Urbana-Champaign
|
28
|
Zhou F, Xu Y. cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data. Bioinformatics 2010; 26:2051-2. [PMID: 20538725 PMCID: PMC2916713 DOI: 10.1093/bioinformatics/btq299] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Indexed: 11/23/2022]
Abstract
Summary: Huge amounts of metagenomic sequence data have been produced as a result of the rapidly increasing efforts worldwide in studying microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organisms, possibly from different kingdoms. Computational methods for prediction of genomic elements such as genes are significantly different for chromosomes and plasmids, hence raising the need for separation of chromosomal from plasmid sequences in a metagenome. We present a program for classification of a metagenome set into chromosomal and plasmid sequences, based on their distinguishing pentamer frequencies. On a large training set consisting of all the sequenced prokaryotic chromosomes and plasmids, the program achieves ∼92% in classification accuracy. On a large set of simulated metagenomes with sequence lengths ranging from 300 bp to 100 kbp, the program has classification accuracy from 64.45% to 88.75%. On a large independent test set, the program achieves 88.29% classification accuracy. Availability: The program has been implemented as a standalone prediction program, cBar, which is available at http://csbl.bmb.uga.edu/~ffzhou/cBar Contact: xyn@bmb.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fengfeng Zhou
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602, USA
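The cBar abstract above describes classification based on pentamer frequencies. As a rough illustration of what such a feature vector could look like, the Python sketch below computes normalised 5-mer frequencies for one sequence fragment. It is not the cBar implementation: the function name `pentamer_frequencies`, the handling of ambiguous bases, and the choice not to merge reverse complements are assumptions, and the classifier that would consume these features is not shown.

```python
from collections import Counter
from itertools import product

def pentamer_frequencies(seq: str, k: int = 5) -> dict:
    """Return the relative frequency of every k-mer over A, C, G, T.

    Assumptions: windows overlap by k-1 bases, windows containing
    ambiguous bases (e.g. N) are ignored, and reverse complements
    are counted separately.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]   # 4**k features
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

if __name__ == "__main__":
    freqs = pentamer_frequencies("ACGTACGTAC")   # hypothetical toy fragment
    print(len(freqs))                            # 1024
```

Each fragment then maps to a fixed-length vector of 1024 values, which is the kind of input a chromosome-versus-plasmid classifier could be trained on.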
|
29
|
Codon Usage Patterns in Corynebacterium glutamicum: Mutational Bias, Natural Selection and Amino Acid Conservation. Comp Funct Genomics 2010; 2010:343569. [PMID: 20445740 PMCID: PMC2860111 DOI: 10.1155/2010/343569] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 10/12/2009] [Revised: 01/29/2010] [Accepted: 02/04/2010] [Indexed: 11/17/2022]
Abstract
The alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, have been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C and/or G ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference in the two sets of genes was that the C ending codons predominate in putatively highly expressed genes, suggesting that the C ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes have a tendency to locate on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes by synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors such as the length of genes and hydrophobicity that might influence codon usage and found their contributions to be weak.
|