Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

40
(from Reference Citation Analysis)

Article PDFs (14)

Cited by > 0 (37)

Searched Name

Ivan Moszer

Number	Citation Analysis
1	Plasma microRNA signature in presymptomatic and symptomatic subjects with C9orf72-associated frontotemporal dementia and amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatry 2021;92:485-493. [PMID: 33239440 PMCID: PMC8053348 DOI: 10.1136/jnnp-2020-324647] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/20/2020] [Revised: 09/30/2020] [Accepted: 10/27/2020] [Indexed: 12/13/2022] Abstract: OBJECTIVE To identify potential biomarkers of preclinical and clinical progression in chromosome 9 open reading frame 72 gene (C9orf72)-associated disease by assessing the expression levels of plasma microRNAs (miRNAs) in C9orf72 patients and presymptomatic carriers. METHODS The PREV-DEMALS study is a prospective study including 22 C9orf72 patients, 45 presymptomatic C9orf72 mutation carriers and 43 controls. We assessed the expression levels of 2576 miRNAs, among which 589 were above noise level, in plasma samples of all participants using RNA sequencing. The expression levels of the differentially expressed miRNAs between patients, presymptomatic carriers and controls were further used to build logistic regression classifiers. RESULTS Four miRNAs were differentially expressed between patients and controls: miR-34a-5p and miR-345-5p were overexpressed, while miR-200c-3p and miR-10a-3p were underexpressed in patients. MiR-34a-5p was also overexpressed in presymptomatic carriers compared with healthy controls, suggesting that miR-34a-5p expression is deregulated in cases with C9orf72 mutation. Moreover, miR-345-5p was also overexpressed in patients compared with presymptomatic carriers, which supports the correlation of miR-345-5p expression with the progression of C9orf72-associated disease. Together, miR-200c-3p and miR-10a-3p underexpression might be associated with full-blown disease.
Four presymptomatic subjects in transitional/prodromal stage, close to the disease conversion, exhibited a stronger similarity with the expression levels of patients. CONCLUSIONS We identified a signature of four miRNAs differentially expressed in plasma between clinical conditions that have potential to represent progression biomarkers for C9orf72-associated frontotemporal dementia and amyotrophic lateral sclerosis. This study suggests that dysregulation of miRNAs is dynamically altered throughout neurodegenerative disease progression, and can be detected even long before clinical onset. TRIAL REGISTRATION NUMBER NCT02590276.
2	Converting disease maps into heavyweight ontologies: general methodology and application to Alzheimer's disease. Database: The Journal of Biological Databases and Curation 2021;2021:6137817. [PMID: 33590873 DOI: 10.1093/database/baab004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2020] [Revised: 01/17/2021] [Accepted: 01/27/2021] [Indexed: 11/12/2022] Abstract: Omics technologies offer great promises for improving our understanding of diseases. The integration and interpretation of such data pose major challenges, calling for adequate knowledge models. Disease maps provide curated knowledge about disorders' pathophysiology at the molecular level adapted to omics measurements. However, the expressiveness of disease maps could be increased to help in avoiding ambiguities and misinterpretations and to reinforce their interoperability with other knowledge resources. Ontology is an adequate framework to overcome this limitation, through their axiomatic definitions and logical reasoning properties. We introduce the Disease Map Ontology (DMO), an ontological upper model based on systems biology terms. We then propose to apply DMO to Alzheimer's disease (AD). Specifically, we use it to drive the conversion of AlzPathway, a disease map devoted to AD, into a formal ontology: Alzheimer DMO. We demonstrate that it allows one to deal with issues related to redundancy, naming, consistency, process classification and pathway relationships. Furthermore, we show that it can store and manage multi-omics data. Finally, we expand the model using elements from other resources, such as clinical features contained in the AD Ontology, resulting in an enriched model called ADMO-plus. The current versions of DMO, ADMO and ADMO-plus are freely available at http://bioportal.bioontology.org/ontologies/ADMO.
3	A DNA methylation signature discriminates between excellent and non-response to lithium in patients with bipolar disorder type 1. Sci Rep 2020;10:12239. [PMID: 32699220 PMCID: PMC7376060 DOI: 10.1038/s41598-020-69073-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/10/2020] [Accepted: 07/03/2020] [Indexed: 12/15/2022] Abstract: Lithium (Li) is the cornerstone maintenance treatment for bipolar disorders (BD), but response rates are highly variable. To date, no clinical or biological marker is available to reliably define eligibility criteria for a maintenance treatment with Li. We examined whether the prophylactic response to Li (assessed retrospectively) is associated with distinct blood DNA methylation profiles. Bisulfite-treated total blood DNA samples from individuals with BD type 1 (15 excellent-responders (LiERs) versus 11 non-responders (LiNRs)) were used for targeted enrichment of CpG rich genomic regions followed by high-resolution next-generation sequencing to identify differentially methylated regions (DMRs). After controlling for potential confounders we identified 111 DMRs that significantly differ between LiERs and LiNRs with a significant enrichment in neuronal cell components. Logistic regression and receiver operating curves identified a combination of 7 DMRs with a good discriminatory power for response to Li (Area Under the Curve 0.806). Annotated genes associated with these DMRs include Eukaryotic Translation Initiation Factor 2B Subunit Epsilon (EIF2B5), Von Willebrand Factor A Domain Containing 5B2 (VWA5B2), Ral GTPase Activating Protein Catalytic Alpha Subunit 1 (RALGAPA1). Although preliminary and deserving replication, these results suggest that biomarkers of response to Li may be identified through peripheral epigenetic measures.
4	Long non-coding RNA repertoire and open chromatin regions constitute midbrain dopaminergic neuron - specific molecular signatures. Sci Rep 2019;9:1409. [PMID: 30723217 PMCID: PMC6363776 DOI: 10.1038/s41598-018-37872-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/07/2018] [Accepted: 12/12/2018] [Indexed: 01/24/2023] Abstract: Midbrain dopaminergic (DA) neurons are involved in diverse neurological functions, including control of movements, emotions or reward. In turn, their dysfunctions cause severe clinical manifestations in humans, such as the appearance of motor and cognitive symptoms in Parkinson’s Disease. The physiology and pathophysiology of these neurons are widely studied, mostly with respect to molecular mechanisms implicating protein-coding genes. In contrast, the contribution of non-coding elements of the genome to DA neuron function is poorly investigated. In this study, we isolated DA neurons from E14.5 ventral mesencephalons in mice, and used RNA-seq and ATAC-seq to establish and describe repertoires of long non-coding RNAs (lncRNAs) and putative DNA regulatory regions specific to this neuronal population. We identified 1,294 lncRNAs constituting the repertoire of DA neurons, among which 939 were novel. Most of them were not found in hindbrain serotonergic (5-HT) neurons, indicating a high degree of cell-specificity. This feature was also observed regarding open chromatin regions, as 39% of the ATAC-seq peaks from the DA repertoire were not detected in the 5-HT neurons. Our work provides for the first time DA-specific catalogues of non-coding elements of the genome that will undoubtedly participate in deepening our knowledge regarding DA neuronal development and dysfunctions.
5	Diet-Induced Dysbiosis and Genetic Background Synergize With Cystic Fibrosis Transmembrane Conductance Regulator Deficiency to Promote Cholangiopathy in Mice. Hepatol Commun 2018;2:1533-1549. [PMID: 30556040 PMCID: PMC6287479 DOI: 10.1002/hep4.1266] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/24/2018] [Accepted: 09/19/2018] [Indexed: 02/06/2023] Abstract: The most typical expression of cystic fibrosis (CF)-related liver disease is a cholangiopathy that can progress to cirrhosis. We aimed to determine the potential impact of environmental and genetic factors on the development of CF-related cholangiopathy in mice. Cystic fibrosis transmembrane conductance regulator (Cftr)^-/- mice and Cftr^+/+ littermates in a congenic C57BL/6J background were fed a high medium-chain triglyceride (MCT) diet. Liver histopathology, fecal microbiota, intestinal inflammation and barrier function, bile acid homeostasis, and liver transcriptome were analyzed in 3-month-old males. Subsequently, MCT diet was changed for chow with polyethylene glycol (PEG) and the genetic background for a mixed C57BL/6J;129/Ola background (resulting from three backcrosses), to test their effect on phenotype. C57BL/6J Cftr^-/- mice on an MCT diet developed cholangiopathy features that were associated with dysbiosis, primarily Escherichia coli enrichment, and low-grade intestinal inflammation. Compared with Cftr^+/+ littermates, they displayed increased intestinal permeability and a lack of secondary bile acids together with a low expression of ileal bile acid transporters. Dietary-induced (chow with PEG) changes in gut microbiota composition largely prevented the development of cholangiopathy in Cftr^-/- mice.
Regardless of Cftr status, mice in a mixed C57BL/6J;129/Ola background developed fatty liver under an MCT diet. The Cftr^-/- mice in the mixed background showed no cholangiopathy, which was not explained by a difference in gut microbiota or intestinal permeability, compared with congenic mice. Transcriptomic analysis of the liver revealed differential expression, notably of immune-related genes, in mice of the congenic versus mixed background. In conclusion, our findings suggest that CFTR deficiency causes abnormal intestinal permeability, which, combined with diet-induced dysbiosis and immune-related genetic susceptibility, promotes CF-related cholangiopathy.
6	A strategy for multimodal data integration: application to biomarkers identification in spinocerebellar ataxia. Brief Bioinform 2017;19:1356-1369. [DOI: 10.1093/bib/bbx060] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 11/18/2016] [Indexed: 11/14/2022]
7	Clinical-genetic model predicts incident impulse control disorders in Parkinson's disease. J Neurol Neurosurg Psychiatry 2016;87:1106-11. [PMID: 27076492 PMCID: PMC5098340 DOI: 10.1136/jnnp-2015-312848] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 12/09/2015] [Accepted: 03/23/2016] [Indexed: 11/04/2022] Abstract: OBJECTIVES Impulse control disorders (ICD) are commonly associated with dopamine replacement therapy (DRT) in patients with Parkinson's disease (PD). Our aims were to estimate ICD heritability and to predict ICD by a candidate genetic multivariable panel in patients with PD. METHODS Data from de novo patients with PD, drug-naïve and free of ICD behaviour at baseline, were obtained from the Parkinson's Progression Markers Initiative cohort. Incident ICD behaviour was defined as positive score on the Questionnaire for Impulsive-Compulsive Disorders in PD. ICD heritability was estimated by restricted maximum likelihood analysis on whole exome sequencing data. 13 candidate variants were selected from the DRD2, DRD3, DAT1, COMT, DDC, GRIN2B, ADRA2C, SERT, TPH2, HTR2A, OPRK1 and OPRM1 genes. ICD prediction was evaluated by the area under the curve (AUC) of receiver operating characteristic (ROC) curves. RESULTS Among 276 patients with PD included in the analysis, 86% started DRT, 40% were on dopamine agonists (DA), 19% reported incident ICD behaviour during follow-up. We found heritability of this symptom to be 57%. Adding genotypes from the 13 candidate variants significantly increased ICD predictability (AUC=76%, 95% CI (70% to 83%)) compared to prediction based on clinical variables only (AUC=65%, 95% CI (58% to 73%), p=0.002). The clinical-genetic prediction model reached highest accuracy in patients initiating DA therapy (AUC=87%, 95% CI (80% to 93%)).
OPRK1, HTR2A and DDC genotypes were the strongest genetic predictive factors. CONCLUSIONS Our results show that adding a candidate genetic panel increases ICD predictability, suggesting potential for developing clinical-genetic models to identify patients with PD at increased risk of ICD development and guide DRT management.
8	Genome-wide replication landscape of Candida glabrata. BMC Biol 2015;13:69. [PMID: 26329162 PMCID: PMC4556013 DOI: 10.1186/s12915-015-0177-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/31/2015] [Accepted: 08/05/2015] [Indexed: 11/25/2022] Abstract: Background The opportunistic pathogen Candida glabrata is a member of the Saccharomycetaceae yeasts. Like its close relative Saccharomyces cerevisiae, it underwent a whole-genome duplication followed by an extensive loss of genes. Its genome contains a large number of very long tandem repeats, called megasatellites. In order to determine the whole replication program of the C. glabrata genome and its general chromosomal organization, we used deep-sequencing and chromosome conformation capture experiments. Results We identified 253 replication fork origins, genome wide. Centromeres, HML and HMR loci, and most histone genes are replicated early, whereas natural chromosomal breakpoints are located in late-replicating regions. In addition, 275 autonomously replicating sequences (ARS) were identified during ARS-capture experiments, and their relative fitness was determined during growth competition. Analysis of ARSs allowed us to identify a 17-bp consensus, similar to the S. cerevisiae ARS consensus sequence but slightly more constrained. Megasatellites are not in close proximity to replication origins or termini. Using chromosome conformation capture, we also show that early origins tend to cluster whereas non-subtelomeric megasatellites do not cluster in the yeast nucleus. Conclusions Despite a shorter cell cycle, the C. glabrata replication program shares unexpected striking similarities to S. cerevisiae, in spite of their large evolutionary distance and the presence of highly repetitive large tandem repeats in C. glabrata.
No correlation could be found between the replication program and megasatellites, suggesting that their formation and propagation might not be directly caused by replication fork initiation or termination. Electronic supplementary material The online version of this article (doi:10.1186/s12915-015-0177-6) contains supplementary material, which is available to authorized users.
9	Streptococcus agalactiae clones infecting humans were selected and fixed through the extensive use of tetracycline. Nat Commun 2014;5:4544. [PMID: 25088811 PMCID: PMC4538795 DOI: 10.1038/ncomms5544] [Citation(s) in RCA: 168] [Impact Index Per Article: 16.8] [Received: 02/20/2014] [Accepted: 06/27/2014] [Indexed: 11/17/2022] Abstract: Streptococcus agalactiae (Group B Streptococcus, GBS) is a commensal of the digestive and genitourinary tracts of humans that emerged as the leading cause of bacterial neonatal infections in Europe and North America during the 1960s. Due to the lack of epidemiological and genomic data, the reasons for this emergence are unknown. Here we show by comparative genome analysis and phylogenetic reconstruction of 229 isolates that the rise of human GBS infections corresponds to the selection and worldwide dissemination of only a few clones. The parallel expansion of the clones is preceded by the insertion of integrative and conjugative elements conferring tetracycline resistance (TcR). Thus, we propose that the use of tetracycline from 1948 onwards led in humans to the complete replacement of a diverse GBS population by only few TcR clones particularly well adapted to their host, causing the observed emergence of GBS diseases in neonates.
10	SynTView - an interactive multi-view genome browser for next-generation comparative microorganism genomics. BMC Bioinformatics 2013;14:277. [PMID: 24053737 PMCID: PMC3849071 DOI: 10.1186/1471-2105-14-277] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/25/2013] [Accepted: 09/16/2013] [Indexed: 12/31/2022] Abstract: Background Dynamic visualisation interfaces are required to explore the multiple microbial genome data now available, especially those obtained by high-throughput sequencing — a.k.a. “Next-Generation Sequencing” (NGS) — technologies; they would also be useful for “standard” annotated genomes whose chromosome organizations may be compared. Although various software systems are available, few offer an optimal combination of feature-rich capabilities, non-static user interfaces and multi-genome data handling. Results We developed SynTView, a comparative and interactive viewer for microbial genomes, designed to run as either a web-based tool (Flash technology) or a desktop application (AIR environment). The basis of the program is a generic genome browser with sub-maps holding information about genomic objects (annotations). The software is characterised by the presentation of syntenic organisations of microbial genomes and the visualisation of polymorphism data (typically Single Nucleotide Polymorphisms — SNPs) along these genomes; these features are accessible to the user in an integrated way. A variety of specialised views are available and are all dynamically inter-connected (including linear and circular multi-genome representations, dot plots, phylogenetic profiles, SNP density maps, and more). SynTView is not linked to any particular database, allowing the user to plug his own data into the system seamlessly, and use external web services for added functionalities.
SynTView has now been used in several genome sequencing projects to help biologists make sense out of huge data sets. Conclusions The most important assets of SynTView are: (i) the interactivity due to the Flash technology; (ii) the capabilities for dynamic interaction between many specialised views; and (iii) the flexibility allowing various user data sets to be integrated. It can thus be used to investigate massive amounts of information efficiently at the chromosome level. This innovative approach to data exploration could not be achieved with most existing genome browsers, which are more static and/or do not offer multiple views of multiple genomes. Documentation, tutorials and demonstration sites are available at the URL: http://genopole.pasteur.fr/SynTView.
11	In silico comparison of Yersinia pestis and Yersinia pseudotuberculosis transcriptomes reveals a higher expression level of crucial virulence determinants in the plague bacillus. Int J Med Microbiol 2011;301:105-16. [DOI: 10.1016/j.ijmm.2010.08.013] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 03/07/2010] [Revised: 07/26/2010] [Accepted: 08/04/2010] [Indexed: 10/18/2022]
12	From gene regulation to gene function: regulatory networks in Bacillus subtilis. Comp Funct Genomics 2010;3:37-41. [PMID: 18628883 PMCID: PMC2447243 DOI: 10.1002/cfg.138] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/05/2001] [Accepted: 12/06/2001] [Indexed: 11/30/2022] Abstract: Bacillus subtilis is a sporulating Gram-positive bacterium that lives primarily in the soil and associated water sources. The publication of the B. subtilis genome sequence and subsequent systematic functional analysis and gene regulation programmes, together with an extensive understanding of its biochemistry and physiology, makes this micro-organism a prime candidate in which to model regulatory networks in silico. In this paper we discuss combined molecular biological and bioinformatical approaches that are being developed to model this organism’s responses to changes in its environment.
13	Genoscape: a Cytoscape plug-in to automate the retrieval and integration of gene expression data and molecular networks. Bioinformatics 2009;25:2617-8. [PMID: 19654116 PMCID: PMC2752617 DOI: 10.1093/bioinformatics/btp464] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022] Abstract: Summary: Genoscape is an open-source Cytoscape plug-in that visually integrates gene expression data sets from GenoScript, a transcriptomic database, and KEGG pathways into Cytoscape networks. The generated visualisation highlights gene expression changes and their statistical significance. The plug-in also allows one to browse GenoScript or import transcriptomic data from other sources through tab-separated text files. Genoscape has been successfully used by researchers to investigate the results of gene expression profiling experiments. Availability: Genoscape is open-source software freely available from the Genoscape webpage (http://www.pasteur.fr/recherche/unites/Gim/genoscape/). Installation instructions and tutorial can also be found at this URL. Contact: Mathieu.clement-ziza@biotec.tu-dresden.de; sandrine.rousseau@pasteur.fr Supplementary information: Supplementary data are available at Bioinformatics online. MeSH Headings: Computational Biology/methods; Gene Expression; Gene Expression Profiling/methods; Genomics; Neural Networks, Computer; Software
14	From a consortium sequence to a unified sequence: the Bacillus subtilis 168 reference genome a decade later. Microbiology (Reading, England) 2009;155:1758-1775. [PMID: 19383706 PMCID: PMC2885750 DOI: 10.1099/mic.0.027839-0] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Received: 01/26/2009] [Revised: 02/25/2009] [Accepted: 02/25/2009] [Indexed: 11/18/2022] Abstract: Comparative genomics is the cornerstone of identification of gene functions. The immense number of living organisms precludes experimental identification of functions except in a handful of model organisms. The bacterial domain is split into large branches, among which the Firmicutes occupy a considerable space. Bacillus subtilis has been the model of Firmicutes for decades and its genome has been a reference for more than 10 years. Sequencing the genome involved more than 30 laboratories, with different expertises, in an attempt to make the most of the experimental information that could be associated with the sequence. This had the expected drawback that the sequencing expertise was quite varied among the groups involved, especially at a time when sequencing genomes was extremely hard work. The recent development of very efficient, fast and accurate sequencing techniques, in parallel with the development of high-level annotation platforms, motivated the present resequencing work. The updated sequence has been reannotated in agreement with the UniProt protein knowledge base, keeping in perspective the split between the paleome (genes necessary for sustaining and perpetuating life) and the cenome (genes required for occupation of a niche, suggesting here that B. subtilis is an epiphyte).
This should permit investigators to make reliable inferences to prepare validation experiments in a variety of domains of bacterial growth and development as well as build up accurate phylogenies. MeSH Headings: Bacillus subtilis/physiology; Cell Compartmentation; Databases, Genetic; Ecosystem; Gene Expression Regulation, Bacterial; Genetic Phenomena; Genetic Variation; Genome, Bacterial; Metabolism; Proteome/analysis; Sequence Analysis, DNA
15	CandidaDB: a multi-genome database for Candida species and related Saccharomycotina. Nucleic Acids Res 2007;36:D557-61. [PMID: 18039716 PMCID: PMC2238939 DOI: 10.1093/nar/gkm1010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022] Abstract: CandidaDB (http://genodb.pasteur.fr/CandidaDB) was established in 2002 to provide the first genomic database for the human fungal pathogen Candida albicans. The availability of an increasing number of fully or partially completed genome sequences of related fungal species has opened the path for comparative genomics and prompted us to migrate CandidaDB into a multi-genome database. The new version of CandidaDB houses the latest versions of the genomes of C. albicans strains SC5314 and WO-1 along with six genome sequences from species closely related to C. albicans that all belong to the CTG clade of Saccharomycotina—Candida tropicalis, Candida (Clavispora) lusitaniae, Candida (Pichia) guillermondii, Lodderomyces elongisporus, Debaryomyces hansenii, Pichia stipitis—and the reference Saccharomyces cerevisiae genome. CandidaDB includes sequences coding for 54 170 proteins with annotations collected from other databases, enriched with illustrations of structural features and functional domains and data of comparative analyses. In order to take advantage of the integration of multiple genomes in a unique database, new tools using pre-calculated or user-defined comparisons have been implemented that allow rapid access to comparative analysis at the genomic scale.
16	GenoList: an integrated environment for comparative analysis of microbial genomes. Nucleic Acids Res 2007;36:D469-74. [PMID: 18032431 PMCID: PMC2238853 DOI: 10.1093/nar/gkm1042] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.2] [Indexed: 11/14/2022] Abstract: The multitude of bacterial genome sequences being determined has generated new requirements regarding the development of databases and graphical interfaces: these are needed to organize and retrieve biological information from the comparison of large sets of genomes. GenoList (http://genolist.pasteur.fr/GenoList) is an integrated environment dedicated to querying and analyzing genome data from bacterial species. GenoList inherits from the SubtiList database and web server, the reference data resource for the Bacillus subtilis genome. The data model was extended to hold information about relationships between genomes (e.g. protein families). The web user interface was designed to primarily take into account biologists’ needs and modes of operation. Along with standard query and browsing capabilities, comparative genomics facilities are available, including subtractive proteome analysis. One key feature is the integration of the many tools accessible in the environment. As an example, it is straightforward to identify the genes that are specific to a group of bacteria, export them as a tab-separated list, get their protein sequences and run a multiple alignment on a subset of these sequences. MeSH Headings: Bacterial Proteins/chemistry; Bacterial Proteins/classification; Bacterial Proteins/genetics; Databases, Genetic; Genome, Bacterial; Genomics; Internet; Proteomics; User-Computer Interface
17	Bacillus subtilis genome project: cloning and sequencing of the 97 kb region from 325° to 333°. Mol Microbiol 2006;10:371-384. [PMID: 28776854 DOI: 10.1111/j.1365-2958.1993.tb01963.x] [Citation(s) in RCA: 144] [Impact Index Per Article: 8.0] [Indexed: 11/30/2022] Abstract: In the framework of the European project aimed at the sequencing of the Bacillus subtilis genome the DNA region located between gerB (314°) and sacXV (333°) was assigned to the Institut Pasteur. In this paper we describe the cloning and sequencing of a segment of 97 kb of contiguous DNA. Ninety-two open reading frames were predicted to encode putative proteins among which only forty-two were found to display significant similarities to known proteins present in databanks, e.g. amino acid permeases, proteins involved in cell wall or antibiotic biosynthesis, various regulatory proteins, proteins of several dehydrogenase families and enzymes II of the phosphotransferase system involved in sugar transport. Additional experiments led to the identification of the products of new B. subtilis genes, e.g. galactokinase and an operon involved in thiamine biosynthesis.
18	CandidaDB: a genome database for Candida albicans pathogenomics. Nucleic Acids Res 2005;33:D353-7. [PMID: 15608215 PMCID: PMC540078 DOI: 10.1093/nar/gki124] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.7] [Indexed: 11/17/2022] Abstract: CandidaDB is a database dedicated to the genome of the most prevalent systemic fungal pathogen of humans, Candida albicans. CandidaDB is based on an annotation of the Stanford Genome Technology Center C. albicans genome sequence data by the European Galar Fungail Consortium. CandidaDB Release 2.0 (June 2004) contains information pertaining to Assembly 19 of the genome of C. albicans strain SC5314. The current release contains 6244 annotated entries corresponding to 130 tRNA genes and 5917 protein-coding genes. For these, it provides tentative functional assignments along with numerous pre-run analyses that can assist the researcher in the evaluation of gene function for the purpose of specific or large-scale analysis. CandidaDB is based on GenoList, a generic relational data schema and a World Wide Web interface that has been adapted to the handling of eukaryotic genomes. The interface allows users to browse easily through genome data and retrieve information. CandidaDB also provides more elaborate tools, such as pattern searching, that are tightly connected to the overall browsing system. As the C. albicans genome is diploid and still incompletely assembled, CandidaDB provides tools to browse the genome by individual supercontigs and to examine information about allelic sequences obtained from complementary contigs. CandidaDB is accessible at http://genolist.pasteur.fr/CandidaDB.
Collapse Key Words Collapse MESH Headings Candida albicans/genetics Candida albicans/pathogenicity Databases, Genetic Fungal Proteins/chemistry Fungal Proteins/genetics Fungal Proteins/physiology Genome, Fungal Genomics Internet User-Computer Interface Collapse Grants Collapse
19	Specialized microbial databases for inductive exploration of microbial genome sequences. BMC Genomics 2005;6:14. [PMID: 15698474 PMCID: PMC549560 DOI: 10.1186/1471-2164-6-14] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 12/01/2004] [Accepted: 02/07/2005] [Indexed: 11/10/2022] Abstract Background The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the worldwide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated with related organisms for comparison.
MeSH Headings: Algorithms; Cluster Analysis; Computational Biology/methods; DNA/chemistry; Database Management Systems; Databases, Genetic; Genome; Genome, Bacterial; Internet; Leptospira interrogans/genetics; Operon; Programming Languages; Sequence Analysis, DNA; Software; Staphylococcus epidermidis/genetics; Terminology as Topic.
20	A revised annotation and comparative analysis of Helicobacter pylori genomes. Nucleic Acids Res 2003;31:1704-14. [PMID: 12626712 PMCID: PMC152854 DOI: 10.1093/nar/gkg250] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Abstract Huge amounts of genomic information are currently being generated. Therefore, biologists require structured, exhaustive and comparative databases. The PyloriGene database (http://genolist.pasteur.fr/PyloriGene) was developed to respond to these needs, by integrating and connecting the information generated during the sequencing of two distinct strains of Helicobacter pylori. This led to the need for a general annotation consensus, as the physical and functional annotations of the two strains differed significantly in some cases. A revised functional classification system was created to accommodate the existing data and to make it possible to classify coding sequences (CDS) into several functional categories to harmonize CDS classification. The annotation of the two complete genomes was revised in the light of new data, allowing us to reduce the percentage of hypothetical proteins from approximately 40 to 33%. This resulted in the reassignment of functions for 108 CDS (approximately 7% of all CDS). Interestingly, the functions of only approximately 13% of CDS (222 out of 1658 CDS) were annotated as a result of work done directly on H.pylori genes. Finally, comparison of the two published genomes revealed a significant amount of size variation between corresponding (orthologous) CDS. Most of these size variations were due to natural polymorphisms, although other sources of variation were identified, such as pseudogenes, new genes potentially regulated by slipped-strand mispairing mechanism, or frame-shifts. 113 of these differences were due to different start codon assignments, a common problem when constructing physical annotations.
MeSH Headings: Databases, Nucleic Acid; Genes, Bacterial/genetics; Genome, Bacterial; Helicobacter pylori/genetics; Internet; Species Specificity.
21	SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 2002;30:62-5. [PMID: 11752255 PMCID: PMC99059 DOI: 10.1093/nar/30.1.62] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.4] [Indexed: 11/12/2022] Abstract SubtiList is the reference database dedicated to the genome of Bacillus subtilis 168, the paradigm of Gram-positive endospore-forming bacteria. Developed in the framework of the B.subtilis genome project, SubtiList provides a curated dataset of DNA and protein sequences, combined with the relevant annotations and functional assignments. Information about gene functions and products is continuously updated by linking relevant bibliographic references. Recently, sequence corrections arising from both systematic verifications and submissions by individual scientists were included in the reference genome sequence. SubtiList is based on a generic relational data schema and a World Wide Web interface developed for the handling of bacterial genomes, called GenoList. The World Wide Web interface was designed to allow users to easily browse through genome data and retrieve information according to common biological queries. SubtiList also provides more elaborate tools, such as pattern searching, which are tightly connected to the overall browsing system. SubtiList is accessible at http://genolist.pasteur.fr/SubtiList/. Similar bacterial databases are accessible at http://genolist.pasteur.fr/.
MeSH Headings: Bacillus subtilis/genetics; Bacillus subtilis/physiology; Bacterial Proteins/genetics; Bacterial Proteins/physiology; Database Management Systems; Databases, Genetic; Forecasting; Genome, Bacterial; Information Storage and Retrieval; Internet.
22	Leproma: a Mycobacterium leprae genome browser. Lepr Rev 2001;72:470-7. [PMID: 11826483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
MeSH Headings: Databases, Genetic; Genome, Bacterial; Humans; Internet; Leprosy/microbiology; Mycobacterium leprae/genetics.
23	CotA of Bacillus subtilis is a copper-dependent laccase. J Bacteriol 2001;183:5426-30. [PMID: 11514528 PMCID: PMC95427 DOI: 10.1128/jb.183.18.5426-5430.2001] [Citation(s) in RCA: 255] [Impact Index Per Article: 11.1] [Indexed: 11/20/2022] Abstract The spore coat protein CotA of Bacillus subtilis displays similarities with multicopper oxidases, including manganese oxidases and laccases. B. subtilis is able to oxidize manganese, but neither CotA nor other sporulation proteins are involved. We demonstrate that CotA is a laccase. Syringaldazine, a specific substrate of laccases, reacted with wild-type spores but not with ΔcotA spores. CotA may participate in the biosynthesis of the brown spore pigment, which appears to be a melanin-like product and to protect against UV light.
MeSH Headings: Bacillus subtilis/enzymology; Bacillus subtilis/radiation effects; Bacterial Proteins/chemistry; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Binding Sites; Copper/metabolism; Culture Media; Laccase; Manganese/metabolism; Melanins/metabolism; Oxidoreductases/chemistry; Oxidoreductases/genetics; Oxidoreductases/metabolism; Pigments, Biological/metabolism; Spores, Bacterial/metabolism; Ultraviolet Rays.
24	The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res 2001;29:2145-53. [PMID: 11353084 PMCID: PMC55444 DOI: 10.1093/nar/29.10.2145] [Citation(s) in RCA: 208] [Impact Index Per Article: 9.0] [Received: 01/08/2001] [Revised: 03/19/2001] [Accepted: 03/19/2001] [Indexed: 11/14/2022] Abstract Mycoplasma pulmonis is a wall-less eubacterium belonging to the Mollicutes (trivial name, mycoplasmas) and responsible for murine respiratory diseases. The genome of strain UAB CTIP is composed of a single circular 963 879 bp chromosome with a G + C content of 26.6 mol%, i.e. the lowest reported among bacteria, Ureaplasma urealyticum apart. This genome contains 782 putative coding sequences (CDSs) covering 91.4% of its length and a function could be assigned to 486 CDSs whilst 92 matched the gene sequences of hypothetical proteins, leaving 204 CDSs without significant database match. The genome contains a single set of rRNA genes and only 29 tRNA genes. The replication origin oriC was localized by sequence analysis and by using the G + C skew method. Sequence polymorphisms within stretches of repeated nucleotides generate phase-variable protein antigens whilst a recombinase gene is likely to catalyse the site-specific DNA inversions in major M.pulmonis surface antigens. Furthermore, a hemolysin, secreted nucleases and a glyco-protease are predicted virulence factors. Surprisingly, several of the genes previously reported to be essential for a self-replicating minimal cell are missing in the M.pulmonis genome although this one is larger than the other mycoplasma genomes fully sequenced until now.
MeSH Headings: Animals; Antigens, Bacterial/genetics; Antigens, Bacterial/immunology; Base Composition; Codon, Terminator/genetics; Computational Biology; Evolution, Molecular; Genetic Code; Genome; Genomic Library; Humans; Internet; Lipoproteins/genetics; Mice; Molecular Sequence Data; Mutation/genetics; Mycoplasma/genetics; Mycoplasma/immunology; Mycoplasma/pathogenicity; Open Reading Frames/genetics; Polymorphism, Genetic/genetics; RNA, Bacterial/genetics; Recombination, Genetic/genetics; Repetitive Sequences, Nucleic Acid/genetics; Replication Origin/genetics; Respiratory System/microbiology; Virulence/genetics.
Grants: R01 AI041113 NIAID NIH HHS; R21 AI041113 NIAID NIH HHS; AI41113 NIAID NIH HHS.
25	Implication of gene distribution in the bacterial chromosome for the bacterial cell factory. J Biotechnol 2000;78:209-19. [PMID: 10751682 DOI: 10.1016/s0168-1656(00)00197-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022] Abstract As bacterial genome sequences accumulate, more and more pieces of data suggest that there is a significant correlation between the distribution of genes along the chromosome and the physical architecture of the cell, suggesting that the map of the cell is in the chromosome. Considering sequences and experimental data indicative of cell compartmentalisation, mRNA folding and turnover, as well as known structural features of protein and membrane complexes, we show that preliminary in silico analysis of whole genome sequences strongly substantiates this hypothesis. If there is a correlation between the genome sequence and the cell architecture, it must derive from some selection pressure in the organisms growing in the wild. As a consequence, the underlying constraints should be optimised in genetically modified organisms if one is to expect high product yields. Consequences in terms of gene expression for biotechnology are straightforward: knocking genes out and in genomes should not be randomly performed, but should follow the rules of chromosome organisation.
MeSH Headings: Bacteria/genetics; Biotechnology; Cell Compartmentation; Chromosomes, Bacterial/genetics; Codon/genetics; Gene Expression; Genes, Bacterial; Genome, Bacterial; Models, Genetic; Operon.
26	Mapping the bacterial cell architecture into the chromosome. Philos Trans R Soc Lond B Biol Sci 2000;355:179-90. [PMID: 10724454 PMCID: PMC1692725 DOI: 10.1098/rstb.2000.0557] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Abstract A genome is not a simple collection of genes. We propose here that it can be viewed as being organized as a 'celluloculus' similar to the homunculus of preformists, but pertaining to the category of programmes (or algorithms) rather than to that of architectures or structures: a significant correlation exists between the distribution of genes along the chromosome and the physical architecture of the cell. We review here data supporting this observation, stressing physical constraints operating on the cell's architecture and dynamics, and their consequences in terms of gene and genome structure. If such a correlation exists, it derives from some selection pressure: simple and general physical principles acting at the level of the cell structure are discussed. As a first case in point we see the piling up of planar modules as a stable, entropy-driven, architectural principle that could be at the root of the coupling between the architecture of the cell and the location of genes at specific places in the chromosome. We propose that the specific organization of certain genes whose products have a general tendency to form easily planar modules is a general motor for architectural organization in the bacterial cell. A second mechanism, operating at the transcription level, is described that could account for the efficient building up of complex structures. As an organizing principle we suggest that exploration by biological polymers of the vast space of possible conformation states is constrained by anchoring points. In particular, we suggest that transcription does not always allow the 5'-end of the transcript to go free and explore the many conformations available, but that, in many cases, it remains linked to the transcribing RNA polymerase complex in such a way that loops of RNA, rather than threads with a free end, explore the surrounding medium. In bacteria, extension of the loops throughout the cytoplasm would therefore be mediated by the de novo synthesis of ribosomes in growing cells. Termination of transcription and mRNA turnover would accordingly be expected to be controlled by sequence features at both the 3'- and 5'-ends of the molecule. These concepts are discussed taking into account in silico analysis of genome sequences and experimental data about cell compartmentalization, mRNA folding and turnover, as well as known structural features of protein and membrane complexes.
MeSH Headings: Chromosome Mapping; Chromosomes, Bacterial; Codon; Genome, Bacterial; Prokaryotic Cells/physiology; Protein Biosynthesis.
27	Codon usage and lateral gene transfer in Bacillus subtilis. Curr Opin Microbiol 1999;2:524-8. [PMID: 10508724 DOI: 10.1016/s1369-5274(99)00011-9] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.4] [Indexed: 10/18/2022] Abstract Bacillus subtilis possesses three classes of genes, differing by their codon preference. One class corresponds to prophages or prophage-like elements, indicative of the existence of systematic lateral gene transfer in this organism. The nature of the selection pressure that operates on codon bias is beginning to be understood.
MeSH Headings: Bacillus subtilis/genetics; Codon/genetics; Gene Transfer Techniques; Genes, Bacterial; RNA, Bacterial/genetics; RNA, Bacterial/metabolism; RNA, Transfer/genetics; RNA, Transfer/metabolism; Transformation, Bacterial.
28	The complete genome of Bacillus subtilis: from sequence annotation to data management and analysis. FEBS Lett 1998;430:28-36. [PMID: 9678589 DOI: 10.1016/s0014-5793(98)00620-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023] Abstract The completion of the entire 4.2-Mb genome sequence of the gram-positive bacterium Bacillus subtilis has been a milestone for biological studies on this model organism. This paper describes bioinformatics work related to this joint European and Japanese project: methods and strategies for gene annotation and detection of sequencing errors, using an integrated cooperative computer environment (Imagene); construction of a specialized database for data management and a WWW server for data retrieval (SubtiList); DNA sequence analysis, yielding striking results on oligonucleotide bias, repeated sequences, and codon usage, all landmarks of evolutionary events shaping the B. subtilis genome.
MeSH Headings: Amino Acid Sequence; Bacillus subtilis/genetics; Computer Communication Networks; Databases, Factual; Genome, Bacterial; Humans; Molecular Sequence Data; Sequence Analysis, DNA.
29	Global analysis of genomic texts: the distribution of AGCT tetranucleotides in the Escherichia coli and Bacillus subtilis genomes predicts translational frameshifting and ribosomal hopping in several genes. Electrophoresis 1998;19:515-27. [PMID: 9588797 DOI: 10.1002/elps.1150190411] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Abstract Present availability of the genomic text of bacteria allows assignment of known biological functions to many genes (typically, half of the genome's gene content). It is now time to try and predict new unexpected functions, using inductive procedures that allow correlating the content of the genomic text to possible biological functions. We show here that analysis of the genomes of Escherichia coli and Bacillus subtilis for the distribution of AGCT motifs predicts that genes exist for which the mRNA molecule can be translated as several different proteins synthesized after ribosomal frameshifting or hopping. Among these genes we found that several coded for the same function in E. coli and B. subtilis. We analyzed in depth the situation of the infB gene (experimentally known to specify synthesis of several proteins differing in their translation starts), the aceF/pdhC gene, the eno gene, and the rplI gene. In addition, genes specific to E. coli were also studied: ompA, ompF and tolA (predicting epigenetic variation that could help escape infection by phages or colicins).
MeSH Headings: Acetyltransferases/genetics; Amino Acid Sequence; Bacillus subtilis/genetics; Bacterial Outer Membrane Proteins/genetics; Bacterial Proteins/genetics; Consensus Sequence; Dihydrolipoyllysine-Residue Acetyltransferase; Escherichia coli/genetics; Escherichia coli Proteins; Frameshifting, Ribosomal; Genome, Bacterial; Glyceraldehyde-3-Phosphate Dehydrogenases/genetics; Mathematical Computing; Microsatellite Repeats; Molecular Sequence Data; Peptide Initiation Factors/genetics; Phosphopyruvate Hydratase/genetics; Porins/genetics; Prokaryotic Initiation Factor-2; Pyruvate Dehydrogenase Complex/genetics; RNA, Messenger; Ribosomal Proteins/genetics; Sequence Homology, Amino Acid.
30	The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 1997;390:249-56. [PMID: 9384377 DOI: 10.1038/36786] [Citation(s) in RCA: 2621] [Impact Index Per Article: 97.1] [Indexed: 02/05/2023] Abstract Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.
MeSH Headings: Bacillus subtilis/genetics; Bacillus subtilis/metabolism; Bacterial Proteins/genetics; Cloning, Organism; DNA, Bacterial; Genome, Bacterial; Molecular Sequence Data.
31	The Bacillus subtilis genome from gerBC (311 degrees) to licR (334 degrees). Microbiology (Reading) 1997;143 (Pt 10):3313-3328. [PMID: 9353933 DOI: 10.1099/00221287-143-10-3313] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.9] [Indexed: 02/05/2023] Abstract As part of the international project to sequence the Bacillus subtilis genome, the DNA region located between gerBC (311 degrees) and licR (334 degrees) was assigned to the Institut Pasteur. In this paper, the cloning and sequencing of 176 kb of DNA and the analysis of the sequence of the entire 271 kb region (6.5% of the B. subtilis chromosome) is described; 273 putative coding sequences were identified. Although the complete genome sequences of seven other organisms (five bacteria, one archaeon and the yeast Saccharomyces cerevisiae) are available in public databases, 65 genes from this region of the B. subtilis chromosome encode proteins without significant similarities to other known protein sequences. Among the 208 other genes, 115 have paralogues in the currently known B. subtilis DNA sequences and the products of 178 genes were found to display similarities to protein sequences from public databases for which a function is known. Classification of these genes shows a high proportion of them to be involved in the adaptation to various growth conditions (non-essential cell wall constituents, catabolic and bioenergetic pathways); a small number of the genes are essential or encode anabolic enzymes.
MeSH Headings: Amino Acid Sequence; Bacillus subtilis/genetics; Bacillus subtilis/physiology; Bacterial Proteins/chemistry; Bacterial Proteins/genetics; Bacterial Proteins/physiology; Base Sequence; Chromosome Mapping; Chromosomes, Bacterial/genetics; Cloning, Molecular; Consensus Sequence; DNA, Bacterial/genetics; Genes, Bacterial; Genome, Bacterial; Molecular Sequence Data; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid.
32	The NRSub database: update 1997. Nucleic Acids Res 1997;25:53-6. [PMID: 9016504 PMCID: PMC146364 DOI: 10.1093/nar/25.1.53] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 02/03/2023] Abstract In the context of the international project aiming at sequencing the whole genome of Bacillus subtilis we have developed NRSub, a non-redundant database of sequences from this organism. Starting from the B.subtilis sequences available in the repository collections we have removed all encountered duplications, then we have added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage index). We have also added cross-references with EMBL/GenBank/DDBJ, MEDLINE, SWISS-PROT and ENZYME databases. NRSub is distributed through anonymous FTP as a text file in EMBL format and as an ACNUC database. It is also possible to access the database through two dedicated World Wide Web servers located in France (http://acnuc.univ-lyon1.fr/nrsub/nrsub.html) and in Japan (http://ddbjs4h.genes.nig.ac.jp/).
MeSH Headings: Academies and Institutes; Bacillus subtilis/genetics; Base Sequence; Computer Communication Networks; Databases, Factual; France.
33	The European Bacillus subtilis genome sequencing project: current status and accessibility of the data from a new World Wide Web site. Microbiology (Reading) 1996;142 (Pt 11):2987-91. [PMID: 8969494 DOI: 10.1099/13500872-142-11-2987] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 02/03/2023]
MeSH Headings: Bacillus subtilis/genetics; Bacterial Proteins/classification; Bacterial Proteins/genetics; Base Sequence; Chromosome Mapping; Computer Communication Networks; DNA, Bacterial/genetics; Databases, Factual; Europe; Genome, Bacterial; Molecular Sequence Data; Research; Sequence Analysis, DNA.
34	Uneven distribution of GATC motifs in the Escherichia coli chromosome, its plasmids and its phages. J Mol Biol 1996;257:574-85. [PMID: 8648625 DOI: 10.1006/jmbi.1996.0186] [Citation(s) in RCA: 50] [Impact Index Per Article: 1.8] [Indexed: 02/01/2023] Abstract This work reconsiders the GATC motif distribution in a 1.6 Mb segment of the Escherichia coli genome, compared to its distribution in phages and plasmids. At first sight the distribution of GATC words looks random. But when a realistic model of the chromosome (made of average genes having the same codon usage as in the real chromosome) is used as a theoretical reference, strong biases are observed. GATC pairs such as GATCNNGATC are under-represented while there is a strong positive selection for motifs separated by 10, 19, 70 and 1100 bp. The last class is the only one present in E. coli parasites. It can be ascribed to the triggering sequences of the long-patch mismatch repair system. The 6 bp class overlaps with the consensus of CAP (catabolite activator protein) and FNR (fumarate/nitrate regulator) binding sites, thus accounting for counter-selection. The other classes, which could be targets for a nucleic acid-binding protein, are almost always present inside protein coding sequences, and are members of clusters of GATC motifs. Analysis of the genes containing these motifs suggests that they correspond to a regulatory process monitoring the shift from anaerobic to aerobic growth conditions. In particular this regulation, closing down transcription of a large number of genes involved in intermediary metabolism, would be well suited for the cold and oxygen shift from the mammal's gut to the standard environmental conditions. In this process the methylation status of GATC clusters would be very important for tuning transcription, and a DNA binding protein, probably a member of the cold-shock proteins family, would be needed for alleviating the effects mediated by slackening of the pace of methylation during the shift.
MeSH Headings: Bacteriophages/genetics; Base Sequence; DNA, Bacterial/genetics; Escherichia coli/genetics; Molecular Sequence Data; Oligonucleotides/genetics; Plasmids/genetics; Sequence Analysis, DNA.
35	NRSub: a non-redundant database for Bacillus subtilis. Nucleic Acids Res 1996;24:41-5. [PMID: 8594597 PMCID: PMC145565 DOI: 10.1093/nar/24.1.41] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 01/31/2023] Abstract In the context of the international project aimed at sequencing the whole genome of Bacillus subtilis we have developed a non-redundant, fully annotated database of sequences from this organism. Starting from the B.subtilis sequences available in the EMBL, GenBank and DDBJ collections we have removed all encountered duplications and then added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage, etc.). We have also added cross-references to the EMBL, MEDLINE, SWISS-PROT and ENZYME data banks. The present system results from merging of the NRSub and SubtiList databases and the sequence contigs used in the two systems are identical. NRSub is distributed as a flatfile in EMBL format (which is supported by most sequence analysis software packages) and as an ACNUC database, while SubtiList is distributed as a relational database under 4th Dimension. It is possible to access the data through two dedicated World Wide Web servers located in France and Japan.
MeSH Headings: Bacillus subtilis/genetics; Base Sequence; Computer Communication Networks; Databases, Factual; Genome, Bacterial; Molecular Sequence Data.
36	Anaerobic transcription activation in Bacillus subtilis: identification of distinct FNR-dependent and -independent regulatory mechanisms. EMBO J 1995;14:5984-94. [PMID: 8846791 PMCID: PMC394718 DOI: 10.1002/j.1460-2075.1995.tb00287.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 2.6] [Indexed: 11/08/2022] Abstract Bacillus subtilis is able to grow anaerobically using alternative electron acceptors, including nitrate or fumarate. We characterized an operon encoding the dissimilatory nitrate reductase subunits homologous to the Escherichia coli narGHJI operon and the narK gene encoding a protein with nitrite extrusion activity. Downstream from narK and co-transcribed with it a gene (fnr) encoding a protein homologous to E.coli FNR was found. Disruption of fnr abolished both nitrate and fumarate utilization as electron acceptors and anaerobic induction of narK. Four putative FNR binding sites were found in B.subtilis sequences. The consensus sequence, centred at position -41.5, is identical to the consensus for the DNA site for E.coli CAP. Bs-FNR contained a four cysteine residue cluster at its C-terminal end. This is in contrast to Ec-FNR, where a similar cluster is present at the N-terminal end. It is possible that oxygen modulates the activity of both activators by a similar mechanism involving iron. Unlike in E.coli, where fnr expression is weakly repressed by anaerobiosis, fnr gene expression in B.subtilis is strongly activated by anaerobiosis. We have identified in the narK-fnr intergenic region a promoter activated by anaerobiosis independently of FNR. Thus induction of genes involved in anaerobic respiration requires in B.subtilis at least two levels of regulation: activation of fnr transcription and activation of FNR to induce transcription of FNR-dependent promoters.
MeSH Headings: Amino Acid Sequence; Anaerobiosis; Anion Transport Proteins; Bacillus subtilis/enzymology; Bacillus subtilis/genetics; Bacillus subtilis/metabolism; Bacterial Proteins/chemistry; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Base Sequence; Blotting, Northern; Carrier Proteins/chemistry; Carrier Proteins/genetics; Carrier Proteins/metabolism; Cloning, Molecular; Computer Graphics; Electron Transport/genetics; Escherichia coli Proteins; Gene Expression Regulation, Bacterial; Iron-Sulfur Proteins/chemistry; Iron-Sulfur Proteins/genetics; Iron-Sulfur Proteins/metabolism; Models, Molecular; Molecular Sequence Data; Nitrate Reductase; Nitrate Reductases/genetics; Nitrate Transporters; Promoter Regions, Genetic/genetics; Sequence Alignment; Sequence Analysis; Transcription Factors/chemistry; Transcription Factors/genetics; Transcription Factors/metabolism; Transcription, Genetic/genetics.
37	Analysis of a Bacillus subtilis genome fragment using a co-operative computer system prototype. Gene 1995;165:GC37-51. [PMID: 7489895 DOI: 10.1016/0378-1119(95)00636-k] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 01/25/2023] Abstract Analysis of the huge volume of data generated by large scale sequencing projects requires the construction of new, sophisticated computer systems. These systems should be able to manage the biological data as well as the results of their analysis. They should also help the user to choose the most appropriate methods, and to string them together in order to solve a global analysis task. In this paper we present the prototype of a software system providing an environment for the analysis of large-scale sequence data. As a first step toward this end, this environment has been put to the test within the Bacillus subtilis genome sequencing project. This system integrates both the descriptive knowledge of the entities involved (genes, regulatory signals and the like) and the methodological knowledge comprising an extensible set of analytical methods. A knowledge representation based on two existing object-oriented models is used to implement this integrated system. In addition, the present prototype provides a suitable user interface both for displaying simultaneously the results generated by several methods and for interacting with the objects. We present in this paper the analysis of a B. subtilis genome fragment, present in data libraries but not annotated. Annotation of the genes present in the fragment allowed us to combine the results of several methods used for predicting coding sequences, and to characterize it as comprising a cryptic phage, the skin element. Comparison between the annotation of the skin element and a standard region of the chromosome indicated that local features of the nucleotide sequence could discriminate between phage and non-phage DNA sequence. MeSH Headings: Bacillus subtilis/genetics; Base Sequence; Genome, Bacterial; Molecular Sequence Data; Sequence Analysis; Software.
38	SubtiList: a relational database for the Bacillus subtilis genome. Microbiology (Reading, England) 1995;141 (Pt 2):261-8. [PMID: 7704253 DOI: 10.1099/13500872-141-2-261] [Citation(s) in RCA: 140] [Impact Index Per Article: 4.8] [Indexed: 01/26/2023] Abstract In the framework of the international collaborative project aiming to sequence the whole Bacillus subtilis chromosome, we have created a relational database for managing and analysing information associated with the molecular genetics of this bacterium: SubtiList. It allows recovery of non-redundant DNA sequences of the B. subtilis genome, as well as related information, i.e. genes, proteins, etc. A logical structure has been designed with appropriate links between the different objects, and a set of procedures has been implemented for data updating and management. The database is organized around a core constituted by all known contigs of B. subtilis, i.e. sets of non-redundant sequences created from original entries in the EMBL data library. A user-friendly interface has been developed to make the database easy to consult. Sequence analysis tools have been integrated into the database, such as a program for rapid similarity searching of protein data banks, and a powerful DNA pattern searching program. Thanks to the consistency of SubtiList, we have performed a codon usage analysis by Factorial Correspondence Analysis, and a study of the distribution of the isoelectric points of known proteins of B. subtilis. The SubtiList database is available through anonymous ftp (address 'ftp.pasteur.fr' or IP number 157.99.64.12, directory '/pub/GenomeDB/SubtiList'). MeSH Headings: Algorithms; Bacillus subtilis/genetics; Bacterial Proteins/genetics; DNA, Bacterial/genetics; Databases, Factual; Genetic Code; Genome, Bacterial; Isoelectric Point; Logic; Sequence Analysis, DNA; Software.
39	Bacillus subtilis genome project: cloning and sequencing of the 97 kb region from 325 degrees to 333 degrees. Mol Microbiol 1993;10:371-84. [PMID: 7934828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023] Abstract In the framework of the European project aimed at the sequencing of the Bacillus subtilis genome, the DNA region located between gerB (314 degrees) and sacXY (333 degrees) was assigned to the Institut Pasteur. In this paper we describe the cloning and sequencing of a segment of 97 kb of contiguous DNA. Ninety-two open reading frames were predicted to encode putative proteins, among which only forty-two were found to display significant similarities to known proteins present in databanks, e.g. amino acid permeases, proteins involved in cell wall or antibiotic biosynthesis, various regulatory proteins, proteins of several dehydrogenase families and enzymes II of the phosphotransferase system involved in sugar transport. Additional experiments led to the identification of the products of new B. subtilis genes, e.g. galactokinase and an operon involved in thiamine biosynthesis. MeSH Headings: Amino Acid Sequence; Bacillus subtilis/genetics; Bacterial Proteins/genetics; Base Sequence; Cloning, Molecular; Codon/genetics; Genes, Bacterial/genetics; Genome, Bacterial; Molecular Sequence Data; Open Reading Frames/genetics; RNA-Binding Proteins; Sequence Analysis, DNA; Sequence Homology, Amino Acid; Transcription Factors.
40	Multiple IS insertion sequences near the replication terminus in Escherichia coli K-12. Biochimie 1991;73:1361-74. [PMID: 1665988 DOI: 10.1016/0300-9084(91)90166-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 12/28/2022] Abstract In order to assess the feasibility of semi-automatic procedures for large genome sequencing, a fragment of 9.4 kb of Escherichia coli chromosomal DNA isolated at random was sequenced. It was found to map at 30 min on the chromosome map and to harbour two insertion sequences (IS2 and IS30) as well as several putative coding sequences which had no feature in common with known proteins. MeSH Headings: Amino Acid Sequence; Bacterial Proteins/genetics; Base Sequence; Cloning, Molecular; DNA Replication; DNA Transposable Elements; DNA, Bacterial/genetics; Escherichia coli/genetics; Evaluation Studies as Topic; Molecular Sequence Data; Robotics; Transcription, Genetic.