1
Baranowski B, Pawłowski K. Protein family neighborhood analyzer-ProFaNA. PeerJ 2023; 11:e15715. [PMID: 37492397 PMCID: PMC10364804 DOI: 10.7717/peerj.15715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 06/16/2023] [Indexed: 07/27/2023]
Abstract
Background: Functionally related genes are well known to be often grouped in close vicinity in the genomes, particularly in prokaryotes. Notwithstanding the diverse evolutionary mechanisms leading to this phenomenon, it can be used to predict functions of uncharacterized genes. Methods: Here, we provide a simple but robust statistical approach that leverages the vast amounts of genomic data available today. Considering a protein domain as a functional unit, one can explore other functional units (domains) that significantly often occur within the genomic neighborhoods of the queried domain. This analysis can be performed across different taxonomic levels. Provisions can also be made to correct for the uneven sampling of the taxonomic space by genomic sequencing projects that often focus on large numbers of very closely related strains, e.g., pathogenic ones. To this end, an optional procedure for averaging occurrences within subtaxa is available. Results: Several examples show this approach can provide useful functional predictions for uncharacterized gene families, and how to combine this information with other approaches. The method is made available as a web server at http://bioinfo.sggw.edu.pl/neighborhood_analysis.
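The effect of averaging occurrences within subtaxa can be illustrated with a minimal sketch. It is not the ProFaNA implementation: the input format, the function name `neighbor_frequencies`, and the domain labels are assumptions made for illustration, and the statistical test the abstract mentions is not reproduced here.

```python
from collections import defaultdict


def neighbor_frequencies(neighborhoods, average_within_taxa=False):
    """Return, per domain, the fraction of neighborhoods that contain it.

    neighborhoods: list of (taxon, set_of_domains) pairs, one pair per
    genomic neighborhood of the queried domain (hypothetical format).
    With average_within_taxa=True, fractions are computed inside each
    taxon first and then averaged, so a heavily sequenced taxon counts once.
    """
    if not neighborhoods:
        return {}

    if not average_within_taxa:
        # Pooled counting: every neighborhood has the same weight.
        counts = defaultdict(int)
        for _, domains in neighborhoods:
            for domain in domains:
                counts[domain] += 1
        return {d: c / len(neighborhoods) for d, c in counts.items()}

    # Group neighborhoods by taxon, then average the per-taxon fractions.
    by_taxon = defaultdict(list)
    for taxon, domains in neighborhoods:
        by_taxon[taxon].append(domains)

    freqs = defaultdict(float)
    for groups in by_taxon.values():
        for domain in set().union(*groups):
            fraction = sum(domain in g for g in groups) / len(groups)
            freqs[domain] += fraction / len(by_taxon)
    return dict(freqs)


# Three near-identical strains of taxon A and a single genome of taxon B.
example = [
    ("A", {"domX"}),
    ("A", {"domX"}),
    ("A", {"domX"}),
    ("B", {"domY"}),
]
pooled = neighbor_frequencies(example)
averaged = neighbor_frequencies(example, average_within_taxa=True)
```

In this toy input, pooled counting lets the three strains of taxon A dominate, whereas averaging gives both taxa equal weight.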
Affiliation(s)
- Bartosz Baranowski
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences, Warszawa, Poland
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warszawa, Poland
- Krzysztof Pawłowski
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences, Warszawa, Poland
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, United States
  - Department of Translational Sciences, Lund University, Lund, Sweden
2
Cotroneo CE, Gormley IC, Shields DC, Salter-Townshend M. Computational modelling of chromosomally clustering protein domains in bacteria. BMC Bioinformatics 2021; 22:593. [PMID: 34906073 PMCID: PMC8670047 DOI: 10.1186/s12859-021-04512-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2021] [Accepted: 11/16/2021] [Indexed: 01/19/2023]
Abstract
BACKGROUND In bacteria, genes with related functions-such as those involved in the metabolism of the same compound or in infection processes-are often physically close on the genome and form groups called clusters. The enrichment of such clusters over various distantly related bacteria can be used to predict the roles of genes of unknown function that cluster with characterised genes. There is no obvious rule to define a cluster, given their variability in size and intergenic distances, and the definition of what comprises a "gene", since genes can gain and lose domains over time. Protein domains can cluster within a gene, or in adjacent genes of related function, and in both cases these are chromosomally clustered. Here, we model the distances between pairs of protein domain coding regions across a wide range of bacteria and archaea via a probabilistic two component mixture model, without imposing arbitrary thresholds in terms of gene numbers or distances. RESULTS We trained our model using matched gene ontology terms to label functionally related pairs and assess the stability of the parameters of the model across 14,178 archaeal and bacterial strains. We found that the parameters of our mixture model are remarkably stable across bacteria and archaea, except for endosymbionts and obligate intracellular pathogens. Obligate pathogens have smaller genomes, and although they vary, on average do not show noticeably different clustering distances; the main difference in the parameter estimates is that a far greater proportion of the genes sharing ontology terms are clustered. This may reflect that these genomes are enriched for complexes encoded by clustered core housekeeping genes, as a proportion of the total genes. Given the overall stability of the parameter estimates, we then used the mean parameter estimates across the entire dataset to investigate which gene ontology terms are most frequently associated with clustered genes. 
CONCLUSIONS Given the stability of the mixture model across species, it may be used to predict bacterial gene clusters that are shared across multiple species, in addition to giving insights into the evolutionary pressures on the chromosomal locations of genes in different species.
Affiliation(s)
- Chiara E Cotroneo
  - School of Medicine, University College Dublin, Dublin, Ireland
  - Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
- Denis C Shields
  - School of Medicine, University College Dublin, Dublin, Ireland
  - Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
3
Shikura N, Darbon E, Esnault C, Deniset-Besseau A, Xu D, Lejeune C, Jacquet E, Nhiri N, Sago L, Cornu D, Werten S, Martel C, Virolle MJ. The Phosin PptA Plays a Negative Role in the Regulation of Antibiotic Production in Streptomyces lividans. Antibiotics (Basel) 2021; 10:325. [PMID: 33804592 PMCID: PMC8003754 DOI: 10.3390/antibiotics10030325] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/01/2021] [Revised: 03/16/2021] [Accepted: 03/17/2021] [Indexed: 12/30/2022]
Abstract
In Streptomyces, antibiotic biosynthesis is triggered in phosphate limitation that is usually correlated with energetic stress. Polyphosphates constitute an important reservoir of phosphate and energy and a better understanding of their role in the regulation of antibiotic biosynthesis is of crucial importance. We previously characterized a gene, SLI_4384/ppk, encoding a polyphosphate kinase, whose disruption greatly enhanced the weak antibiotic production of Streptomyces lividans. In the condition of energetic stress, Ppk utilizes polyP as phosphate and energy donor, to generate ATP from ADP. In this paper, we established that ppk is co-transcribed with its two downstream genes, SLI_4383, encoding a phosin called PptA possessing a CHAD domain constituting a polyphosphate binding module and SLI_4382 encoding a nudix hydrolase. The expression of the ppk/pptA/SLI_4382 operon was shown to be under the positive control of the two-component system PhoR/PhoP and thus mainly expressed in condition of phosphate limitation. However, pptA and SLI_4382 can also be transcribed alone from their own promoter. The deletion of pptA resulted into earlier and stronger actinorhodin production and lower lipid content than the disruption of ppk, whereas the deletion of SLI_4382 had no obvious phenotypical consequences. The disruption of ppk was shown to have a polar effect on the expression of pptA, suggesting that the phenotype of the ppk mutant might be linked, at least in part, to the weak expression of pptA in this strain. Interestingly, the expression of phoR/phoP and that of the genes of the pho regulon involved in phosphate supply or saving were strongly up-regulated in pptA and ppk mutants, revealing that both mutants suffer from phosphate stress. 
Considering the presence of a polyphosphate binding module in PptA, but absence of similarities between PptA and known exo-polyphosphatases, we proposed that PptA constitutes an accessory factor for exopolyphosphatases or general phosphatases involved in the degradation of polyphosphates into phosphate.
Affiliation(s)
- Noriyasu Shikura
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Emmanuelle Darbon
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Catherine Esnault
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Ariane Deniset-Besseau
  - Laboratoire de Chimie Physique (LCP), CNRS UMR 8000, Université Paris-Saclay, 91405 Orsay, France
- Delin Xu
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
  - Department of Ecology, Institute of Hydrobiology, School of Life Science and Technology, Key Laboratory of Eutrophication and Red Tide Prevention of Guangdong Higher Education Institutes, Engineering Research Center of Tropical and Subtropical Aquatic Ecological Engineering, Ministry of Education, Jinan University, Guangzhou 510632, China
- Clara Lejeune
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Eric Jacquet
  - Institut de Chimie des Substances Naturelles, CNRS, Université Paris Saclay, 91190 Gif-sur-Yvette, France
- Naima Nhiri
  - Institut de Chimie des Substances Naturelles, CNRS, Université Paris Saclay, 91190 Gif-sur-Yvette, France
- Laila Sago
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- David Cornu
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Sebastiaan Werten
  - Institute of Biological Chemistry, Biocenter, Medical University of Innsbruck, Innrain 80, 6020 Innsbruck, Austria
- Cécile Martel
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Marie-Joelle Virolle
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
4
Differential Gene Expression Patterns of Yersinia pestis and Yersinia pseudotuberculosis during Infection and Biofilm Formation in the Flea Digestive Tract. mSystems 2019; 4:mSystems00217-18. [PMID: 30801031 PMCID: PMC6381227 DOI: 10.1128/msystems.00217-18] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/19/2018] [Accepted: 01/27/2019] [Indexed: 01/01/2023]
Abstract
Yersinia pestis, the etiologic agent of plague, emerged as a fleaborne pathogen only within the last 6,000 years. Just five simple genetic changes in the Yersinia pseudotuberculosis progenitor, which served to eliminate toxicity to fleas and to enhance survival and biofilm formation in the flea digestive tract, were key to the transition to the arthropodborne transmission route. To gain a deeper understanding of the genetic basis for the development of a transmissible biofilm infection in the flea foregut, we evaluated additional gene differences and performed in vivo transcriptional profiling of Y. pestis, a Y. pseudotuberculosis wild-type strain (unable to form biofilm in the flea foregut), and a Y. pseudotuberculosis mutant strain (able to produce foregut-blocking biofilm in fleas) recovered from fleas 1 day and 14 days after an infectious blood meal. Surprisingly, the Y. pseudotuberculosis mutations that increased c-di-GMP levels and enabled biofilm development in the flea did not change the expression levels of the hms genes responsible for the synthesis and export of the extracellular polysaccharide matrix required for mature biofilm formation. The Y. pseudotuberculosis mutant uniquely expressed much higher levels of Yersinia type VI secretion system 4 (T6SS-4) in the flea, and this locus was required for flea blockage by Y. pseudotuberculosis but not for blockage by Y. pestis. Significant differences between the two species in expression of several metabolism genes, the Psa fimbrial genes, quorum sensing-related genes, transcription regulation genes, and stress response genes were evident during flea infection.
IMPORTANCE Y. pestis emerged as a highly virulent, arthropod-transmitted pathogen on the basis of relatively few and discrete genetic changes from Y. pseudotuberculosis. Parallel comparisons of the in vitro and in vivo transcriptomes of Y. pestis and two Y. pseudotuberculosis variants that produce a nontransmissible infection and a transmissible infection of the flea vector, respectively, provided insights into how Y. pestis has adapted to life in its flea vector and point to evolutionary changes in the regulation of metabolic and biofilm development pathways in these two closely related species.
5
Abstract
Bacteria encode a variety of adaptations that enable them to survive during zinc starvation, a condition which is encountered both in natural environments and inside the human host. In Vibrio cholerae, the causative agent of the diarrheal disease cholera, we have identified a novel member of this zinc starvation response, a cell wall hydrolase that retains function and is conditionally essential for cell growth in low-zinc environments. Other Gram-negative bacteria contain homologs that appear to be under similar regulatory control. These findings are significant because they represent, to our knowledge, the first evidence that zinc homeostasis influences cell wall turnover. Anti-infective therapies commonly target the bacterial cell wall; therefore, an improved understanding of how the cell wall adapts to host-induced zinc starvation could lead to new antibiotic development. Such therapeutic interventions are required to combat the rising threat of drug-resistant infections. The cell wall is a strong, yet flexible, meshwork of peptidoglycan (PG) that gives a bacterium structural integrity. To accommodate a growing cell, the wall is remodeled by both PG synthesis and degradation. Vibrio cholerae encodes a group of three nearly identical zinc-dependent endopeptidases (EPs) that are predicted to hydrolyze PG to facilitate cell growth. Two of these (ShyA and ShyC) are conditionally essential housekeeping EPs, while the third (ShyB) is not expressed under standard laboratory conditions. To investigate the role of ShyB, we conducted a transposon screen to identify mutations that activate shyB transcription. We found that shyB is induced as part of the Zur-mediated zinc starvation response, a mode of regulation not previously reported for cell wall lytic enzymes. In vivo, ShyB alone was sufficient to sustain cell growth in low-zinc environments. 
In vitro, ShyB retained its d,d-endopeptidase activity against purified sacculi in the presence of the metal chelator EDTA at concentrations that inhibit ShyA and ShyC. This insensitivity to metal chelation is likely what enables ShyB to substitute for other EPs during zinc starvation. Our survey of transcriptomic data from diverse bacteria identified other candidate Zur-regulated EPs, suggesting that this adaptation to zinc starvation is employed by other Gram-negative bacteria.
6
BLAST-XYPlot Viewer: A Tool for Performing BLAST in Whole-Genome Sequenced Bacteria/Archaea and Visualize Whole Results Simultaneously. G3-GENES GENOMES GENETICS 2018; 8:2167-2172. [PMID: 29789313 PMCID: PMC6027881 DOI: 10.1534/g3.118.200220] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
One of the most commonly used tools to compare protein or DNA sequences against databases is BLAST. We introduce a web tool that allows the performance of BLAST-searches of protein/DNA sequences in whole-genome sequenced bacteria/archaea, and displays a large amount of BLAST-results simultaneously. The circular bacterial replicons are projected as horizontal lines with fixed length of 360, representing the degrees of a circle. A coordinate system is created with length of the replicon along the x-axis and the number of replicon used on the y-axis. When a query sequence matches with a gene/protein of a particular replicon, the BLAST-results are depicted as an "x,y" position in a specially adapted plot. This tool allows the visualization of the results from the whole data to a particular gene/protein in real time with low computational resources.
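The coordinate scheme described in this abstract can be sketched as follows. This is a minimal illustration rather than the tool's code: the function name `project_hit` and its parameters are hypothetical, and the abstract does not say how the tool picks the position of a hit within a gene.

```python
def project_hit(position_bp, replicon_length_bp, replicon_index):
    """Map a hit on a circular replicon to an (x, y) plot position.

    x is the position rescaled to a fixed length of 360 (degrees of a
    circle); y is the index of the replicon the hit belongs to.
    """
    if replicon_length_bp <= 0:
        raise ValueError("replicon length must be positive")
    # Wrap around the origin so any coordinate lands in [0, 360).
    fraction = (position_bp % replicon_length_bp) / replicon_length_bp
    return (fraction * 360.0, replicon_index)


# A hit one quarter of the way around the third replicon in the list.
point = project_hit(1_000_000, 4_000_000, 3)
```

Rescaling every replicon to the same length is what lets hits from replicons of different sizes share one x-axis.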
7
COGNAT: a web server for comparative analysis of genomic neighborhoods. Biol Direct 2017; 12:26. [PMID: 29166914 PMCID: PMC5700660 DOI: 10.1186/s13062-017-0196-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/10/2017] [Accepted: 10/26/2017] [Indexed: 11/18/2022]
Abstract
Background: In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. Aim: We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Results: Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). Reviewers: This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
8
Kandlinger F, Plach MG, Merkl R. AGeNNT: annotation of enzyme families by means of refined neighborhood networks. BMC Bioinformatics 2017; 18:274. [PMID: 28545394 PMCID: PMC5445326 DOI: 10.1186/s12859-017-1689-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/13/2016] [Accepted: 05/16/2017] [Indexed: 12/15/2022]
Abstract
BACKGROUND Large enzyme families may contain functionally diverse members that give rise to clusters in a sequence similarity network (SSN). In prokaryotes, the genome neighborhood of a gene-product is indicative of its function and thus, a genome neighborhood network (GNN) deduced for an SSN provides strong clues to the specific function of enzymes constituting the different clusters. The Enzyme Function Initiative ( http://enzymefunction.org/ ) offers services that compute SSNs and GNNs. RESULTS We have implemented AGeNNT that utilizes these services, albeit with datasets purged with respect to unspecific protein functions and overrepresented species. AGeNNT generates refined GNNs (rGNNs) that consist of cluster-nodes representing the sequences under study and Pfam-nodes representing enzyme functions encoded in the respective neighborhoods. For cluster-nodes, AGeNNT summarizes the phylogenetic relationships of the contributing species and a statistic indicates how unique nodes and GNs are within this rGNN. Pfam-nodes are annotated with additional features like GO terms describing protein function. For edges, the coverage is given, which is the relative number of neighborhoods containing the considered enzyme function (Pfam-node). AGeNNT is available at https://github.com/kandlinf/agennt . CONCLUSIONS An rGNN is easier to interpret than a conventional GNN, which commonly contains proteins without enzymatic function and overly specific neighborhoods due to phylogenetic bias. The implemented filter routines and the statistic allow the user to identify those neighborhoods that are most indicative of a specific metabolic capacity. Thus, AGeNNT facilitates to distinguish and annotate functionally different members of enzyme families.
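The edge coverage defined in this abstract, the relative number of neighborhoods containing a given Pfam-node, can be written out as a short sketch. The function name `edge_coverage`, the input format, and the placeholder family labels are assumptions for illustration and are not taken from the AGeNNT code base.

```python
def edge_coverage(neighborhoods, pfam_id):
    """Fraction of a cluster's genome neighborhoods that contain pfam_id.

    neighborhoods: list of sets, each holding the Pfam identifiers found
    in the genome neighborhood of one sequence of the cluster-node
    (hypothetical input format).
    """
    if not neighborhoods:
        return 0.0
    hits = sum(pfam_id in neighborhood for neighborhood in neighborhoods)
    return hits / len(neighborhoods)


# Four neighborhoods of one cluster; the labels are placeholders.
cluster_neighborhoods = [
    {"PF_A", "PF_B"},
    {"PF_A"},
    {"PF_C"},
    {"PF_A", "PF_C"},
]
coverage = edge_coverage(cluster_neighborhoods, "PF_A")
```

A coverage close to 1 would mean the family accompanies nearly every member of the cluster, which is the kind of edge a user would presumably inspect first.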
Affiliation(s)
- Florian Kandlinger
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040 Regensburg, Germany
  - Faculty of Mathematics and Computer Science, University of Hagen, D-58084 Hagen, Germany
- Maximilian G. Plach
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040 Regensburg, Germany
- Rainer Merkl
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040 Regensburg, Germany
9
Tietz JI, Schwalen CJ, Patel PS, Maxson T, Blair PM, Tai HC, Zakai UI, Mitchell DA. A new genome-mining tool redefines the lasso peptide biosynthetic landscape. Nat Chem Biol 2017; 13:470-478. [PMID: 28244986 PMCID: PMC5391289 DOI: 10.1038/nchembio.2319] [Citation(s) in RCA: 289] [Impact Index Per Article: 41.3] [Received: 07/07/2016] [Accepted: 12/06/2016] [Indexed: 12/14/2022]
Abstract
Ribosomally synthesized and post-translationally modified peptide (RiPP) natural products are attractive for genome-driven discovery and re-engineering, but limitations in bioinformatic methods and exponentially increasing genomic data make large-scale mining of RiPP data difficult. We report RODEO (Rapid ORF Description and Evaluation Online), which combines hidden-Markov-model-based analysis, heuristic scoring, and machine learning to identify biosynthetic gene clusters and predict RiPP precursor peptides. We initially focused on lasso peptides, which display intriguing physicochemical properties and bioactivities, but their hypervariability renders them challenging prospects for automated mining. Our approach yielded the most comprehensive mapping to date of lasso peptide space, revealing >1,300 compounds. We characterized the structures and bioactivities of six lasso peptides, prioritized based on predicted structural novelty, including one with an unprecedented handcuff-like topology and another with a citrulline modification exceptionally rare among bacteria. These combined insights significantly expand the knowledge of lasso peptides and, more broadly, provide a framework for future genome-mining efforts.
Affiliation(s)
- Jonathan I Tietz
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Christopher J Schwalen
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Parth S Patel
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Tucker Maxson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Patricia M Blair
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Hua-Chia Tai
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Uzma I Zakai
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Douglas A Mitchell
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
10
Fouts DE, Matthias MA, Adhikarla H, Adler B, Amorim-Santos L, Berg DE, Bulach D, Buschiazzo A, Chang YF, Galloway RL, Haake DA, Haft DH, Hartskeerl R, Ko AI, Levett PN, Matsunaga J, Mechaly AE, Monk JM, Nascimento ALT, Nelson KE, Palsson B, Peacock SJ, Picardeau M, Ricaldi JN, Thaipandungpanit J, Wunder EA, Yang XF, Zhang JJ, Vinetz JM. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira. PLoS Negl Trop Dis 2016; 10:e0004403. [PMID: 26890609 PMCID: PMC4758666 DOI: 10.1371/journal.pntd.0004403] [Citation(s) in RCA: 204] [Impact Index Per Article: 25.5] [Received: 06/19/2015] [Accepted: 01/03/2016] [Indexed: 12/20/2022]
Abstract
Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) and demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. 
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic) vs. non-infectious Leptospira, this work provides new insights into the evolution of a genus of bacterial pathogens. This work will be a comprehensive roadmap for understanding leptospirosis pathogenesis. More generally, it provides new insights into mechanisms by which bacterial pathogens adapt to mammalian hosts.
Collapse
Affiliation(s)
- Derrick E. Fouts
- J. Craig Venter Institute, Rockville, Maryland, United States of America
| | - Michael A. Matthias
- Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
| | - Haritha Adhikarla
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
| | - Ben Adler
- Australian Research Council Centre of Excellence in Structural and Functional Microbial Genomics, Department of Microbiology, Monash University, Clayton, Australia
| | - Luciane Amorim-Santos
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
- Centro de Pesquisas Gonçalo Moniz, Fundação Oswaldo Cruz/MS, Salvador, Bahia, Brazil
| | - Douglas E. Berg
- Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
| | - Dieter Bulach
- Victorian Bioinformatics Consortium, Monash University, Clayton, Victoria, Australia
| | - Alejandro Buschiazzo
- Institut Pasteur de Montevideo, Laboratory of Molecular and Structural Microbiology, Montevideo, Uruguay
- Institut Pasteur, Department of Structural Biology and Chemistry, Paris, France
| | - Yung-Fu Chang
- Department of Population Medicine & Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, United States of America
| | - Renee L. Galloway
- Centers for Disease Control and Prevention (DHHS, CDC, OID, NCEZID, DHCPP, BSPB), Atlanta, Georgia, United States of America
| | - David A. Haake
- VA Greater Los Angeles Healthcare System, Los Angeles, California, United States of America
- David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
| | - Daniel H. Haft
- J. Craig Venter Institute, Rockville, Maryland, United States of America
| | - Rudy Hartskeerl
- WHO/FAO/OIE and National Collaborating Centre for Reference and Research on Leptospirosis, KIT Biomedical Research, Royal Tropical Institute (KIT), Amsterdam, The Netherlands
| | - Albert I. Ko
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
- Centro de Pesquisas Gonçalo Moniz, Fundação Oswaldo Cruz/MS, Salvador, Bahia, Brazil
| | - Paul N. Levett
- Government of Saskatchewan, Disease Control Laboratory Regina, Canada
| | - James Matsunaga
- VA Greater Los Angeles Healthcare System, Los Angeles, California, United States of America
- David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
| | - Ariel E. Mechaly
- Institut Pasteur de Montevideo, Laboratory of Molecular and Structural Microbiology, Montevideo, Uruguay
| | - Jonathan M. Monk
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
| | - Ana L. T. Nascimento
- Centro de Biotecnologia, Instituto Butantan, São Paulo, SP, Brazil
- Programa Interunidades em Biotecnologia, Instituto de Ciências Biomédicas, USP, São Paulo, SP, Brazil
| | - Karen E. Nelson
- J. Craig Venter Institute, Rockville, Maryland, United States of America
| | - Bernhard Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
| | - Sharon J. Peacock
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Mathieu Picardeau
- Institut Pasteur, Biology of Spirochetes Unit, National Reference Centre and WHO Collaborating Center for Leptospirosis, Paris, France
| | - Jessica N. Ricaldi
- Instituto de Medicina Tropical Alexander von Humboldt; Facultad de Medicina Alberto Hurtado, Universidad Peruana Cayetano Heredia, Lima, Peru
| | | | - Elsio A. Wunder
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
- Centro de Pesquisas Gonçalo Moniz, Fundação Oswaldo Cruz/MS, Salvador, Bahia, Brazil
| | - X. Frank Yang
- Department of Microbiology and Immunology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - Jun-Jie Zhang
- Department of Microbiology and Immunology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - Joseph M. Vinetz
- Division of Infectious Diseases, Department of Medicine, University of California San Diego School of Medicine, La Jolla, California, United States of America
- Instituto de Medicina Tropical Alexander von Humboldt; Facultad de Medicina Alberto Hurtado, Universidad Peruana Cayetano Heredia, Lima, Peru
- Instituto de Medicina “Alexander von Humboldt,” Universidad Peruana Cayetano Heredia, Lima, Peru
|
11
|
Pavlopoulos GA, Malliarakis D, Papanikolaou N, Theodosiou T, Enright AJ, Iliopoulos I. Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future. Gigascience 2015; 4:38. [PMID: 26309733 PMCID: PMC4548842 DOI: 10.1186/s13742-015-0077-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 05/26/2015] [Accepted: 08/03/2015] [Indexed: 01/31/2023]
Abstract
"Α picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would enable for enhancing visualization further.
Affiliation(s)
- Georgios A Pavlopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | | | - Nikolas Papanikolaou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | - Theodosis Theodosiou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | - Anton J Enright
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD UK
| | - Ioannis Iliopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
|
12
|
Aurisano J, Reda K, Johnson A, Marai EG, Leigh J. BactoGeNIE: a large-scale comparative genome visualization for big displays. BMC Bioinformatics 2015; 16 Suppl 11:S6. [PMID: 26329021 PMCID: PMC4547189 DOI: 10.1186/1471-2105-16-s11-s6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Abstract
Background The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. Results In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. Conclusions BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.
|
13
|
Brilli M, Liò P, Lacroix V, Sagot MF. Short and long-term genome stability analysis of prokaryotic genomes. BMC Genomics 2013; 14:309. [PMID: 23651581 PMCID: PMC3683328 DOI: 10.1186/1471-2164-14-309] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/13/2012] [Accepted: 04/11/2013] [Indexed: 11/21/2022]
Abstract
Background Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often enables the characterization of pathogens. There is therefore a strong interest in understanding the variability of this trait and the possible correlations with life-style. Two kinds of events affect genome organization: on one hand translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but they alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. Results We developed an approach where we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In a first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species allowed us to identify genomes with diverging gene order with respect to their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In a second part of the text, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. Conclusion In this work, we studied genome organization divergence taking into account the contribution of both genome order rearrangements and genome content dynamics. By studying species with multiple sequenced genomes available, we were able to explore genome organization stability at different time-scales and to find significant differences for pathogen and non-pathogen species. The output of our framework also allows the identification of conserved gene clusters and/or partial occurrences thereof, making it possible to explore how gene clusters assembled during evolution.
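To make the notion of backbone gene order conservation concrete, the following is a minimal sketch, not the estimator used in the paper. It assumes each chromosome is circular and given as an ordered list of gene identifiers, and it scores conservation as the fraction of neighbor pairs among shared genes that are adjacent in both genomes. The function names `backbone_adjacencies` and `backbone_stability` are hypothetical.

```python
def backbone_adjacencies(order, shared):
    """Neighbor pairs among shared genes on a circular chromosome.

    Genes outside `shared` are skipped, so insertions and deletions
    do not change the result; only rearrangements do.
    """
    backbone = [gene for gene in order if gene in shared]
    n = len(backbone)
    if n < 2:
        return set()
    # frozenset makes each pair orientation-independent
    return {frozenset((backbone[i], backbone[(i + 1) % n])) for i in range(n)}


def backbone_stability(genome_a, genome_b):
    """Fraction of backbone adjacencies of genome_a also found in genome_b.

    Illustrative score only (an assumption of this sketch).
    """
    shared = set(genome_a) & set(genome_b)
    adj_a = backbone_adjacencies(genome_a, shared)
    adj_b = backbone_adjacencies(genome_b, shared)
    if not adj_a:
        return 0.0
    return len(adj_a & adj_b) / len(adj_a)


# Hypothetical toy genomes: "x" and "y" are genome-specific insertions,
# and the genes "d" and "e" are swapped in genome_b.
genome_a = ["a", "b", "x", "c", "d", "e"]
genome_b = ["a", "b", "c", "e", "d", "y"]
score = backbone_stability(genome_a, genome_b)
```

In this toy case the insertions leave the score untouched, while the swap of two genes breaks two of the five backbone adjacencies.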
|
14
|
Overmars L, Kerkhoven R, Siezen RJ, Francke C. MGcV: the microbial genomic context viewer for comparative genome analysis. BMC Genomics 2013; 14:209. [PMID: 23547764 PMCID: PMC3639932 DOI: 10.1186/1471-2164-14-209] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 11/22/2012] [Accepted: 03/22/2013] [Indexed: 01/22/2023]
Abstract
Background Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, but also to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, such as their annotation available from NCBI, with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, IDs or annotation) in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked-) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl.
Affiliation(s)
- Lex Overmars
- Centre for Molecular and Biomolecular Informatics, Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 26-28, Nijmegen, 6525GA, The Netherlands.
|
15
|
Medema MH, Takano E, Breitling R. Detecting sequence homology at the gene cluster level with MultiGeneBlast. Mol Biol Evol 2013; 30:1218-23. [PMID: 23412913 PMCID: PMC3670737 DOI: 10.1093/molbev/mst025] [Citation(s) in RCA: 245] [Impact Index Per Article: 22.3] [Indexed: 12/25/2022]
Abstract
The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualization offered by MultiGeneBlast allows users to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions. The tool is fully equipped with applications to generate search databases from GenBank or from the user’s own sequence data. Finally, an architecture search mode allows searching for gene clusters with novel configurations, by detecting genomic regions with any user-specified combination of genes. Sources, precompiled binaries, and a graphical tutorial of MultiGeneBlast are freely available from http://multigeneblast.sourceforge.net/.
Affiliation(s)
- Marnix H Medema
- Department of Microbial Physiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
|
16
|
Oberto J. SyntTax: a web server linking synteny to prokaryotic taxonomy. BMC Bioinformatics 2013; 14:4. [PMID: 23323735 PMCID: PMC3571937 DOI: 10.1186/1471-2105-14-4] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.9] [Received: 10/11/2012] [Accepted: 12/19/2012] [Indexed: 11/25/2022]
Abstract
Background The study of the conservation of gene order or synteny constitutes a powerful methodology to assess the orthology of genomic regions and to predict functional relationships between genes. The exponential growth of microbial genomic databases is expected to improve synteny predictions significantly. Paradoxically, this plethora of genomic data, without information on the relatedness of organisms, could impair the performance of synteny analysis programs. Results In this work, I present SyntTax, a synteny web service designed to take full advantage of the large number of archaeal and bacterial genomes by linking them through taxonomic relationships. SyntTax incorporates a full hierarchical taxonomic tree allowing intuitive access to all completely sequenced prokaryotes. Single or multiple organisms can be chosen on the basis of their lineage by selecting the corresponding rank nodes in the tree. The synteny methodology is built upon our previously described Absynte algorithm with several additional improvements. Conclusions SyntTax aims to produce robust syntenies by providing prompt access to the taxonomic relationships connecting all completely sequenced microbial genomes. The reduction in redundancy offered by lineage selection presents the benefit of increasing accuracy while reducing computation time. This web tool was used to successfully resolve several conserved complex gene clusters described in the literature. In addition, particular features of SyntTax permit the confirmation of the involvement of the four components constituting the E. coli YgjD multiprotein complex responsible for tRNA modification. By analyzing the clustering evolution of alternative gene fusions, new proteins potentially interacting with this complex could be proposed. The web service is available at http://archaea.u-psud.fr/SyntTax.
Affiliation(s)
- Jacques Oberto
- Université Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405, Orsay, France.
|
17
|
Barh D, Gupta K, Jain N, Khatri G, León-Sicairos N, Canizalez-Roman A, Tiwari S, Verma A, Rahangdale S, Shah Hassan S, Rodrigues dos Santos A, Ali A, Carlos Guimarães L, Thiago Jucá Ramos R, Devarapalli P, Barve N, Bakhtiar M, Kumavath R, Ghosh P, Miyoshi A, Silva A, Kumar A, Narayan Misra A, Blum K, Baumbach J, Azevedo V. Conserved host–pathogen PPIs Globally conserved inter-species bacterial PPIs based conserved host-pathogen interactome derived novel target in C. pseudotuberculosis, C. diphtheriae, M. tuberculosis, C. ulcerans, Y. pestis, and E. coli targeted by Piper betel compounds. Integr Biol (Camb) 2013; 5:495-509. [DOI: 10.1039/c2ib20206a] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
Affiliation(s)
- Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- Department of Biosciences and Biotechnology, School of Biotechnology, Fakir Mohan University, Jnan Bigyan Vihar, Balasore, Orissa, India
| | - Krishnakant Gupta
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Neha Jain
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
| | - Gourav Khatri
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Nidia León-Sicairos
- Unidad de investigacion, Facultad de Medicina, Universidad Autónoma de Sinaloa. Cedros y Sauces, Fraccionamiento Fresnos, Culiacán Sinaloa 80246, México
| | - Adrian Canizalez-Roman
- Unidad de investigacion, Facultad de Medicina, Universidad Autónoma de Sinaloa. Cedros y Sauces, Fraccionamiento Fresnos, Culiacán Sinaloa 80246, México
| | - Sandeep Tiwari
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
| | - Ankit Verma
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Sachin Rahangdale
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Syed Shah Hassan
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Amjad Ali
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Luis Carlos Guimarães
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Pratap Devarapalli
- Department of Genomic Science, School of Biological Sciences, Riverside Transit Campus, Central University of Kerala, Kasaragod, India
| | - Neha Barve
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Marriam Bakhtiar
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Ranjith Kumavath
- Department of Genomic Science, School of Biological Sciences, Riverside Transit Campus, Central University of Kerala, Kasaragod, India
| | - Preetam Ghosh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- Department of Computer Science and Center for the Study of Biological Complexity, Virginia Commonwealth University, 401 West Main Street, Room E4234, P.O. Box 843019, Richmond, Virginia 23284-3019, USA
| | - Anderson Miyoshi
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Artur Silva
- Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, PA, Brazil
| | - Anil Kumar
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Amarendra Narayan Misra
- Department of Biosciences and Biotechnology, School of Biotechnology, Fakir Mohan University, Jnan Bigyan Vihar, Balasore, Orissa, India
- Center for Life Sciences, School of Natural Sciences, Central University of Jharkhand, Ranchi, Jharkhand State, India
| | - Kenneth Blum
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- University of Florida, College of Medicine, Gainesville, Florida, USA
- Global Integrated Services Unit University of Vermont Center for Clinical & Translational Science, College of Medicine, Burlington, VT, USA
- Dominion Diagnostics LLC, North Kingstown, Rhode Island, USA
| | - Jan Baumbach
- Computational Biology Group Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense, Denmark
| | - Vasco Azevedo
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
|
18
|
Abstract
Pantothenate, commonly referred to as vitamin B5, is an essential molecule in the metabolism of living organisms and forms the core of coenzyme A. Unlike humans, some bacteria and plants are capable of de novo biosynthesis of pantothenate, making this pathway a potential target for drug development. Francisella tularensis subsp. tularensis Schu S4 is a zoonotic bacterial pathogen that is able to synthesize pantothenate but lacks the known ketopantoate reductase (KPR) genes, panE and ilvC, found in the canonical Escherichia coli pathway. Described herein is a gene encoding a novel KPR, for which we propose the name panG (FTT1388), which is conserved in all sequenced Francisella species and is the sole KPR in Schu S4. Homologs of this KPR are present in other pathogenic bacteria such as Enterococcus faecalis, Coxiella burnetii, and Clostridium difficile. Both the homologous gene from E. faecalis V583 (EF1861) and E. coli panE functionally complemented Francisella novicida lacking any KPR. Furthermore, panG from F. novicida can complement an E. coli KPR double mutant. A Schu S4 ΔpanG strain is a pantothenate auxotroph and was genetically and chemically complemented with panG in trans or with the addition of pantolactone. There was no virulence defect in the Schu S4 ΔpanG strain compared to the wild type in a mouse model of pneumonic tularemia. In summary, we characterized the pantothenate pathway in Francisella novicida and F. tularensis and identified a previously uncharacterized KPR, PanG, that can convert 2-dehydropantoate to pantoate.
|
19
|
Despalins A, Marsit S, Oberto J. Absynte: a web tool to analyze the evolution of orthologous archaeal and bacterial gene clusters. Bioinformatics 2011; 27:2905-6. [PMID: 21840875 DOI: 10.1093/bioinformatics/btr473] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022]
Abstract
SUMMARY Absynte (Archaeal and Bacterial Synteny Explorer) is a web-based service designed to display local syntenies in completely sequenced prokaryotic chromosomes. The genomic contexts are determined with a multiple center star clustering topology on the basis of a user-provided protein sequence and all (or a set of) chromosomes from the publicly available archaeal and bacterial genomes. The results consist of a dynamic web page where a consistent color-coding permits a rapid visual evaluation of the relative positioning of genes with similar sequences within the synteny. Each gene composing the synteny can be further queried interactively using either local or remote databases. Absynte results can be exported in .CSV or high-resolution .PDF formats for printing, archival, further editing or publication purposes. Performance, real-time computation, user-friendliness and daily database updates constitute the principal advantages of Absynte over similar web services. AVAILABILITY http://archaea.u-psud.fr/absynte CONTACT jacques.oberto@igmors.u-psud.fr.
Affiliation(s)
- Arnaud Despalins
- Université Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405 Orsay Cedex, France
|
20
|
Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions. BMC Bioinformatics 2010; 11:461. [PMID: 20843356 PMCID: PMC2949892 DOI: 10.1186/1471-2105-11-461] [Citation(s) in RCA: 189] [Impact Index Per Article: 13.5] [Received: 05/19/2010] [Accepted: 09/15/2010] [Indexed: 01/21/2023]
Abstract
Background The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters. It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs based on both the presence/absence of accessory regions and SNPs within core regions and produces a graphical overview of the output. Panseq also includes a loci selector that calculates the most variable and discriminatory loci among sets of accessory loci or core gene SNPs. Availability Panseq is freely available online at http://76.70.11.198/panseq. Panseq is written in Perl.
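The core/accessory distinction described in this abstract can be illustrated with a small sketch. This is not Panseq's implementation (Panseq is written in Perl and works on sequence data). It assumes the regions present in each genome have already been determined and are supplied as sets of region identifiers, and the function name `split_pan_genome` and the `core_fraction` parameter are hypothetical.

```python
def split_pan_genome(presence, core_fraction=1.0):
    """Partition regions into core and accessory sets.

    presence: dict mapping genome name -> set of region identifiers.
    core_fraction: minimum share of genomes that must carry a region
    for it to count as core (an assumed, user-defined threshold).
    Returns (core, accessory, matrix), where matrix maps each accessory
    region to a list of 0/1 flags in the order of sorted genome names.
    """
    genomes = sorted(presence)
    pan = set().union(*presence.values())
    core, accessory = set(), set()
    for region in pan:
        carriers = sum(region in presence[g] for g in genomes)
        if genomes and carriers / len(genomes) >= core_fraction:
            core.add(region)
        else:
            accessory.add(region)
    # Binary presence/absence table for the accessory regions only
    matrix = {
        region: [int(region in presence[g]) for g in genomes]
        for region in sorted(accessory)
    }
    return core, accessory, matrix


# Hypothetical toy input: three strains and four regions
presence = {
    "strain1": {"r1", "r2", "r3"},
    "strain2": {"r1", "r2"},
    "strain3": {"r1", "r4"},
}
core, accessory, matrix = split_pan_genome(presence)
```

A binary table of this kind is the sort of presence/absence data that the abstract describes being clustered and used for tree construction; lowering `core_fraction` relaxes the definition of core to tolerate regions missing from a few genomes.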
|
21
|
Salse J, Abrouk M, Murat F, Quraishi UM, Feuillet C. Improved criteria and comparative genomics tool provide new insights into grass paleogenomics. Brief Bioinform 2009; 10:619-30. [DOI: 10.1093/bib/bbp037] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022]
|
22
|
Revanna KV, Krishnakumar V, Dong Q. A web-based software system for dynamic gene cluster comparison across multiple genomes. Bioinformatics 2009; 25:956-7. [PMID: 19208612 DOI: 10.1093/bioinformatics/btp078] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
SUMMARY Investigating the conservation of gene clusters across multiple genomes has become a standard practice in the era of comparative genomics. However, all existing software and databases rely heavily on pre-computation to identify homologous genes by genome-wide comparisons. Such pre-computing strategies lack accuracy, and updating the data is computationally intensive. Because most molecular biologists are interested only in a small cluster of genes, we have developed a web-based software system that allows users to upload a list of genes, perform a dynamic search against the genomes of their choice and interactively visualize the gene cluster conservation using a novel multi-genome browser. Our approach avoids expensive genome-wide pre-computing and allows users to dynamically change the search criteria to fit their genes of interest. Our system can be customized for any genome sequences. We have applied it to both prokaryotic and eukaryotic genomes to illustrate its usability. AVAILABILITY Our software is freely available at http://cgcv.cgb.indiana.edu/cgi-bin/index.cgi.
|
23
|
Klein J, Münch R, Biegler I, Haddad I, Retter I, Jahn D. Strepto-DB, a database for comparative genomics of group A (GAS) and B (GBS) streptococci, implemented with the novel database platform 'Open Genome Resource' (OGeR). Nucleic Acids Res 2008; 37:D494-8. [PMID: 18854354 PMCID: PMC2686516 DOI: 10.1093/nar/gkn674] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Streptococci are the causative agents of many human infectious diseases including bacterial pneumonia and meningitis. Here, we present Strepto-DB, a database for the comparative genome analysis of group A (GAS) and group B (GBS) streptococci. The known genomes of various GAS and GBS contain a large fraction of distributed genes that were found absent in other strains or serotypes of the same species. Strepto-DB identifies the homologous proteins deduced from the genomes of interest. It allows for the elucidation of the GAS and GBS core- and pan-genomes via genome-wide comparisons. Moreover, an intergenic region analysis tool provides alignments and predictions for transcription factor binding sites in the non-coding sequences. An interactive genome browser visualizes functional annotations. Strepto-DB (http://oger.tu-bs.de/strepto_db) was created using OGeR, the Open Genome Resource for comparative analysis of prokaryotic genomes. OGeR is a newly developed open source database and tool platform for the web-based storage, distribution, visualization and comparison of prokaryotic genome data. The system automatically creates the dedicated relational database and web interface and imports an arbitrary number of genomes derived from standardized genome files. OGeR can be downloaded at http://oger.tu-bs.de.
Affiliation(s)
- Johannes Klein
- Institute for Microbiology, Technische Universität Braunschweig, Spielmannstrasse 7, 38106 Braunschweig, Germany
|