1
Mixão V, Pinto M, Brendebach H, Sobral D, Dourado Santos J, Radomski N, Majgaard Uldall AS, Bomba A, Pietsch M, Bucciacchio A, de Ruvo A, Castelli P, Iwan E, Simon S, Coipan CE, Linde J, Petrovska L, Kaas RS, Grimstrup Joensen K, Holtsmark Nielsen S, Kiil K, Lagesen K, Di Pasquale A, Gomes JP, Deneke C, Tausch SH, Borges V. Multi-country and intersectoral assessment of cluster congruence between pipelines for genomics surveillance of foodborne pathogens. Nat Commun 2025; 16:3961. [PMID: 40295532 PMCID: PMC12038046 DOI: 10.1038/s41467-025-59246-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Accepted: 04/15/2025] [Indexed: 04/30/2025] Open
Abstract
Different laboratories employ different Whole-Genome Sequencing (WGS) pipelines for Food and Waterborne disease (FWD) surveillance, casting doubt on the comparability of their results and hindering optimal communication at intersectoral and international levels. Through a collaborative effort involving eleven European institutes spanning the food, animal, and human health sectors, we aimed to assess the inter-pipeline clustering congruence across all resolution levels and perform an in-depth comparative analysis of cluster composition at outbreak level for four important foodborne pathogens: Listeria monocytogenes, Salmonella enterica, Escherichia coli, and Campylobacter jejuni. We found a general concordance between allele-based pipelines for all species, except for C. jejuni, where the different resolution power of allele-based schemas led to marked discrepancies. Still, we identified non-negligible differences in outbreak detection and demonstrated how a threshold flexibilization favors the detection of similar outbreak signals by different laboratories. These results, together with the observation that different traditional typing groups (e.g., serotypes) exhibit a remarkably different genetic diversity, represent valuable information for future outbreak case-definitions and WGS-based nomenclature design. This study reinforces the need, while demonstrating the feasibility, of conducting continuous pipeline comparability assessments, and opens good perspectives for a smoother international and intersectoral cooperation towards an efficient One Health FWD surveillance.
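As an illustration of how relaxing a threshold can bring the clusters reported by two pipelines into agreement, the following Python sketch applies single-linkage clustering to toy allelic profiles. The isolate names, allele calls, thresholds, and the choice of single linkage are assumptions made for this example; none of them are taken from the study.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Count loci with different alleles, skipping loci missing (None) in either profile."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def single_linkage_clusters(profiles, threshold):
    """Group isolates connected by a chain of pairwise distances <= threshold."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), set()).add(name)
    return {frozenset(members) for members in clusters.values()}


# Hypothetical allele calls for the same four isolates from two pipelines.
pipeline_a = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 2, 2, 2],
    "iso4": [5, 5, 5, 5, 5, 5],
}
pipeline_b = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 2, 2],
    "iso3": [1, 1, 2, 2, 2, 2],
    "iso4": [5, 5, 5, 5, 5, 5],
}

strict_a = single_linkage_clusters(pipeline_a, threshold=1)
strict_b = single_linkage_clusters(pipeline_b, threshold=1)
relaxed_a = single_linkage_clusters(pipeline_a, threshold=2)
relaxed_b = single_linkage_clusters(pipeline_b, threshold=2)

print("threshold 1 agree:", strict_a == strict_b)
print("threshold 2 agree:", relaxed_a == relaxed_b)
```

In this toy case the two pipelines disagree at the strict threshold and report the same clusters at the relaxed one, which is the kind of effect the abstract describes in general terms.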
Affiliation(s)
- Verónica Mixão
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Miguel Pinto
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Holger Brendebach
- National Study Center for Sequencing, Department of Biological Safety, German Federal Institute for Risk Assessment (BfR), Berlin, Germany
- Daniel Sobral
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- João Dourado Santos
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Nicolas Radomski
- National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: database and bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Arkadiusz Bomba
- Department of Omics Analyses, National Veterinary Research Institute (PIWet), Puławy, Poland
- Michael Pietsch
- Unit of Enteropathogenic Bacteria and Legionella, Robert Koch Institute (RKI), Wernigerode, Germany
- Andrea Bucciacchio
- National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: database and bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Andrea de Ruvo
- National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: database and bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Computer Science, Gran Sasso Science Institute, L'Aquila, Italy
- Pierluigi Castelli
- National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: database and bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Ewelina Iwan
- Department of Omics Analyses, National Veterinary Research Institute (PIWet), Puławy, Poland
- Sandra Simon
- Unit of Enteropathogenic Bacteria and Legionella, Robert Koch Institute (RKI), Wernigerode, Germany
- Claudia E Coipan
- Department for Infectious Diseases, Epidemiology and Surveillance, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Jörg Linde
- Institute of Bacterial Infections and Zoonoses, Friedrich-Loeffler-Institute (FLI), Jena, Germany
- Rolf Sommer Kaas
- National Food Institute, Technical University of Denmark (DTU), Lyngby, Denmark
- Sofie Holtsmark Nielsen
- Department of Bacteria, Parasites & Fungi, Statens Serum Institut (SSI), Copenhagen, Denmark
- Kristoffer Kiil
- Department of Bacteria, Parasites & Fungi, Statens Serum Institut (SSI), Copenhagen, Denmark
- Karin Lagesen
- Section for Epidemiology, Norwegian Veterinary Institute (NVI), Ås, Norway
- Adriano Di Pasquale
- National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: database and bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- João Paulo Gomes
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Veterinary and Animal Research Center (CECAV), Faculty of Veterinary Medicine, Lusófona University, Lisbon, Portugal
- Carlus Deneke
- National Study Center for Sequencing, Department of Biological Safety, German Federal Institute for Risk Assessment (BfR), Berlin, Germany
- Simon H Tausch
- National Study Center for Sequencing, Department of Biological Safety, German Federal Institute for Risk Assessment (BfR), Berlin, Germany
- Vítor Borges
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal.
2
Duchez R, Vingadassalon N, Merda D, Van Nieuwenhuysen T, Byrne B, Kourtis C, Nia Y, Hennekinne JA, Cavaiuolo M. Genetic relatedness of Staphylococcus aureus isolates within food outbreaks by single nucleotide polymorphisms. Int J Food Microbiol 2025; 433:111115. [PMID: 39993362 DOI: 10.1016/j.ijfoodmicro.2025.111115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Revised: 02/14/2025] [Accepted: 02/15/2025] [Indexed: 02/26/2025]
Abstract
Investigation of bacterial food outbreaks by whole genome sequencing can rely on the inspection of the genetic relatedness between isolates through the application of single nucleotide polymorphism (SNP) thresholds. However, there is no consensus for Staphylococcus aureus in the context of food outbreaks. In this study, we propose a SNP cut-off by taking into account the mutation rate and the evolution time of this pathogen in food. Through in vitro microevolution, we determined the mutation rate of three S. aureus strains grown under mimicked food stressing conditions. From the mutation rate, we set a cut-off of 28 SNPs considering 30 days as evolution time based on the average shelf-life of foods contaminated by S. aureus and the timeline for identifying this pathogen in outbreaks. The SNP threshold was applied to retrospectively study ten staphylococcal food outbreaks to assess whether isolates from food and/or of human origin from the same outbreak were epidemiologically related. To interpret SNP distances, phylogenetic tree topologies and bootstraps were integrated and showed that isolates differing by up to 28 SNPs were monophyletic. Our suggested cut-off can be used in outbreak management to identify closely related S. aureus strains.
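To make the use of such a cut-off concrete, the following Python sketch counts pairwise SNP differences in a toy alignment and reports the pairs that fall within 28 SNPs. Only the value 28 comes from the abstract; the sequences, isolate names, and the rule of ignoring ambiguous or gapped positions are assumptions made for illustration, and the sketch does not reproduce how the authors derived the cut-off.

```python
from itertools import combinations

SNP_CUTOFF = 28  # cut-off proposed in the abstract


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in unambiguous and b in unambiguous and a != b
    )


def related_pairs(alignment, cutoff=SNP_CUTOFF):
    """Return (isolate, isolate, distance) for every pair within the cutoff."""
    pairs = []
    for x, y in combinations(sorted(alignment), 2):
        distance = snp_distance(alignment[x], alignment[y])
        if distance <= cutoff:
            pairs.append((x, y, distance))
    return pairs


def mutate(sequence, positions, base="C"):
    """Helper for the toy data: substitute `base` at the given positions."""
    chars = list(sequence)
    for position in positions:
        chars[position] = base
    return "".join(chars)


# Hypothetical 100-bp alignment; real analyses use genome-wide SNP alignments.
reference = "A" * 100
toy_alignment = {
    "food": reference,
    "patient": mutate(reference, [50, 60, 70]),   # 3 SNPs from "food"
    "unrelated": mutate(reference, range(40)),    # 40 SNPs from "food"
}

close_pairs = related_pairs(toy_alignment)
print(close_pairs)
```

As the abstract notes, the authors interpreted SNP distances together with tree topology and bootstrap support, so a distance filter like this one would only be a first screening step.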
Affiliation(s)
- Rémi Duchez
- ANSES, Laboratory for Food Safety, SBCL Unit, Maisons-Alfort location, F-94701 Maisons-Alfort, France
- Noémie Vingadassalon
- ANSES, Laboratory for Food Safety, SBCL Unit, Maisons-Alfort location, F-94701 Maisons-Alfort, France
- Déborah Merda
- ANSES, Laboratory for Food Safety, Shared Support Service for Data Analysis (SPAAD), F-94706 Maisons-Alfort, France
- Brian Byrne
- Department of Agriculture, Food and the Marine, Food Microbiology Division, Backweston Laboratory Campus, Kildare, Ireland
- Christos Kourtis
- State General Laboratory, Food Microbiology Laboratory, Nicosia, Cyprus
- Yacine Nia
- ANSES, Laboratory for Food Safety, SBCL Unit, Maisons-Alfort location, F-94701 Maisons-Alfort, France
- Marina Cavaiuolo
- ANSES, Laboratory for Food Safety, SBCL Unit, Maisons-Alfort location, F-94701 Maisons-Alfort, France.
3
Merda D, Vila-Nova M, Bonis M, Boutigny AL, Brauge T, Cavaiuolo M, Cunty A, Regnier A, Sayeb M, Vingadassalon N, Yvon C, Chesnais V. Unraveling the impact of genome assembly on bacterial typing: a one health perspective. BMC Genomics 2024; 25:1059. [PMID: 39516732 PMCID: PMC11545336 DOI: 10.1186/s12864-024-10982-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Accepted: 10/30/2024] [Indexed: 11/16/2024] Open
Abstract
BACKGROUND In the context of pathogen surveillance, it is crucial to ensure interoperability and harmonized data. Several surveillance systems are designed to compare bacteria and identify outbreak clusters based on core genome MultiLocus Sequence Typing (cgMLST). Among the different approaches available to generate bacterial cgMLST, our research used an assembly-based approach (chewBBACA tool). METHODS Simulations of short-read sequencing were conducted for 5 genomes of 27 pathogens of interest in animal, plant, and human health to evaluate the repeatability and reproducibility of cgMLST. Various quality parameters, such as read quality and depth of sequencing, were applied, and several read simulations and genome assemblies were repeated using three tools: SPAdes, Unicycler and Shovill. In vitro sequencing was also used to evaluate the impact of assembly on cgMLST results for six bacterial species: Bacillus thuringiensis, Listeria monocytogenes, Salmonella enterica, Staphylococcus aureus, Vibrio parahaemolyticus and Xylella fastidiosa. RESULTS The results highlighted variability in cgMLST, which was not only related to the assembly tools but also induced by the intrinsic composition of the genomes themselves. This variability observed in simulated sequencing was further validated with real data for six of the bacterial pathogens studied. CONCLUSION This highlights that the intrinsic genome composition affects assembly and the resulting cgMLST profiles, and that variability in bioinformatics tools can induce a bias in cgMLST profiles. In conclusion, we propose that the completeness of cgMLST schemes should be considered when clustering strains.
Affiliation(s)
- Déborah Merda
- Université Paris Est, ANSES, Laboratory for Food Safety, SPAAD unit, Maisons-Alfort, F-94701, France.
- Meryl Vila-Nova
- Université Paris Est, ANSES, Laboratory for Food Safety, SPAAD unit, Maisons-Alfort, F-94701, France
- Mathilde Bonis
- Université Paris Est, ANSES, Laboratory for Food Safety, SBCL unit, Maisons-Alfort, F-94701, France
- Anne-Laure Boutigny
- ANSES, Plant Health Laboratory, Bacteriology Virology GMO Unit, 7 rue Jean Dixméras, Angers cedex 01, 49044, France
- Thomas Brauge
- ANSES, Laboratory for Food Safety, Bacteriology and Parasitology of Fishery and Aquaculture Products Unit (B3PA), Boulevard du Bassin Napoléon, Boulogne-sur-Mer, France
- Marina Cavaiuolo
- Université Paris Est, ANSES, Laboratory for Food Safety, SBCL unit, Maisons-Alfort, F-94701, France
- Amandine Cunty
- ANSES, Plant Health Laboratory, Bacteriology Virology GMO Unit, 7 rue Jean Dixméras, Angers cedex 01, 49044, France
- Antoine Regnier
- ANSES, Laboratory for Food Safety, Bacteriology and Parasitology of Fishery and Aquaculture Products Unit (B3PA), Boulevard du Bassin Napoléon, Boulogne-sur-Mer, France
- Maroua Sayeb
- Université Paris Est, ANSES, Laboratory for Food Safety, SEL unit, Maisons-Alfort, F-94701, France
- Noémie Vingadassalon
- Université Paris Est, ANSES, Laboratory for Food Safety, SBCL unit, Maisons-Alfort, F-94701, France
- Claire Yvon
- Université Paris Est, ANSES, Laboratory for Food Safety, SEL unit, Maisons-Alfort, F-94701, France
- Virginie Chesnais
- Université Paris Est, ANSES, Laboratory for Food Safety, SPAAD unit, Maisons-Alfort, F-94701, France
4
Halbedel S, Wamp S, Lachmann R, Holzer A, Pietzka A, Ruppitsch W, Wilking H, Flieger A. High density genomic surveillance and risk profiling of clinical Listeria monocytogenes subtypes in Germany. Genome Med 2024; 16:115. [PMID: 39375806 PMCID: PMC11457394 DOI: 10.1186/s13073-024-01389-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 09/24/2024] [Indexed: 10/09/2024] Open
Abstract
BACKGROUND Foodborne infections such as listeriosis caused by the bacterium Listeria monocytogenes represent a significant public health concern, particularly when outbreaks affect many individuals over prolonged time. Systematic collection of pathogen isolates from infected patients, whole genome sequencing (WGS) and phylogenetic analyses allow recognition and termination of outbreaks after source identification and risk profiling of abundant lineages. METHODS We here present a multi-dimensional analysis of > 1800 genome sequences from clinical L. monocytogenes isolates collected in Germany between 2018 and 2021. Different WGS-based subtyping methods were used to determine the population structure with its main phylogenetic sublineages as well as for identification of disease clusters. Clinical frequencies of materno-foetal and brain infections and in vitro infection experiments were used for risk profiling of the most abundant sublineages. These sublineages and large disease clusters were further characterised in terms of their genetic and epidemiological properties. RESULTS The collected isolates covered 62% of all notified cases and belonged to 188 infection clusters. Forty-two percent of these clusters were active for > 12 months, 60% generated cases cross-regionally, including 11 multinational clusters. Thirty-seven percent of the clusters were caused by sequence type (ST) ST6, ST8 and ST1 clones. ST1 was identified as hyper- and ST8, ST14, ST29 as well as ST155 as hypovirulent, while ST6 had average virulence potential. Inactivating mutations were found in several virulence and house-keeping genes, particularly in hypovirulent STs. CONCLUSIONS Our work presents an in-depth analysis of the genomic characteristics of L. monocytogenes isolates that cause disease in Germany. It supports prioritisation of disease clusters for epidemiological investigations and reinforces the need to analyse the mechanisms underlying hyper- and hypovirulence.
Affiliation(s)
- Sven Halbedel
- FG11 Division of Enteropathogenic Bacteria and Legionella, Consultant Laboratory for Listeria, Robert Koch Institute, Burgstrasse 37, Wernigerode, D-38855, Germany.
- Institute for Medical Microbiology and Hospital Hygiene, Otto Von Guericke University Magdeburg, Leipziger Strasse 44, Magdeburg, 39120, Germany.
- Sabrina Wamp
- FG11 Division of Enteropathogenic Bacteria and Legionella, Consultant Laboratory for Listeria, Robert Koch Institute, Burgstrasse 37, Wernigerode, D-38855, Germany
- Raskit Lachmann
- FG35 - Division for Gastrointestinal Infections, Zoonoses and Tropical Infections, Robert Koch Institute, Seestrasse 10, Berlin, 13353, Germany
- Alexandra Holzer
- FG35 - Division for Gastrointestinal Infections, Zoonoses and Tropical Infections, Robert Koch Institute, Seestrasse 10, Berlin, 13353, Germany
- Ariane Pietzka
- Austrian Agency for Health and Food Safety, Institute for Medical Microbiology and Hygiene, Beethovenstraße 6, Graz, 8010, Austria
- Werner Ruppitsch
- Austrian Agency for Health and Food Safety, Institute for Medical Microbiology and Hygiene, Währingerstrasse 25a, Vienna, 1090, Austria
- Hendrik Wilking
- FG35 - Division for Gastrointestinal Infections, Zoonoses and Tropical Infections, Robert Koch Institute, Seestrasse 10, Berlin, 13353, Germany
- Antje Flieger
- FG11 Division of Enteropathogenic Bacteria and Legionella, Consultant Laboratory for Listeria, Robert Koch Institute, Burgstrasse 37, Wernigerode, D-38855, Germany.
5
Krisna MA, Jolley KA, Monteith W, Boubour A, Hamers RL, Brueggemann AB, Harrison OB, Maiden MCJ. Development and implementation of a core genome multilocus sequence typing scheme for Haemophilus influenzae. Microb Genom 2024; 10:001281. [PMID: 39120932 PMCID: PMC11315579 DOI: 10.1099/mgen.0.001281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 07/18/2024] [Indexed: 08/10/2024] Open
Abstract
Haemophilus influenzae is part of the human nasopharyngeal microbiota and a pathogen causing invasive disease. The extensive genetic diversity observed in H. influenzae necessitates discriminatory analytical approaches to evaluate its population structure. This study developed a core genome multilocus sequence typing (cgMLST) scheme for H. influenzae using pangenome analysis tools and validated the cgMLST scheme using datasets consisting of complete reference genomes (N = 14) and high-quality draft H. influenzae genomes (N = 2297). The draft genome dataset was divided into a development dataset (N = 921) and a validation dataset (N = 1376). The development dataset was used to identify potential core genes, and the validation dataset was used to refine the final core gene list to ensure the reliability of the proposed cgMLST scheme. Functional classifications were made for all the resulting core genes. Phylogenetic analyses were performed using both allelic profiles and nucleotide sequence alignments of the core genome to test congruence, as assessed by Spearman's correlation and ordinary least squares linear regression tests. Preliminary analyses using the development dataset identified 1067 core genes, which were refined to 1037 with the validation dataset. More than 70% of core genes were predicted to encode proteins essential for metabolism or genetic information processing. Phylogenetic and statistical analyses indicated that the core genome allelic profile accurately represented phylogenetic relatedness among the isolates (R² = 0.945). We used this cgMLST scheme to define a high-resolution population structure for H. influenzae, which enhances the genomic analysis of this clinically relevant human pathogen.
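A congruence test of this kind can be sketched in a few lines of Python: compute Spearman's rank correlation and the R² of a simple least-squares fit between paired allelic and nucleotide distances. The distance values below are invented for illustration, and the functions are a plain standard-library sketch, not the authors' analysis code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ols_r_squared(x, y):
    """R^2 of a one-predictor least-squares line, which equals Pearson r squared."""
    return pearson(x, y) ** 2


# Hypothetical paired distances for six isolate pairs (not from the study).
allelic_distances = [0, 3, 10, 25, 60, 120]
nucleotide_distances = [0, 4, 15, 31, 90, 260]

rho = spearman(allelic_distances, nucleotide_distances)
r_squared = ols_r_squared(allelic_distances, nucleotide_distances)
print(f"Spearman rho = {rho:.3f}, R^2 = {r_squared:.3f}")
```

Because the toy distances increase together without exception, the rank correlation is perfect even though the relationship is not exactly linear, which is why the two statistics are reported side by side.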
Affiliation(s)
- Made Ananda Krisna
- Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
- Department of Biology, University of Oxford, Oxford, UK
- Oxford University Clinical Research Unit Indonesia, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
- William Monteith
- Department of Biology, University of Oxford, Oxford, UK
- Department of Biology and Biochemistry, University of Bath, Bath, UK
- Alexandra Boubour
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Raph L. Hamers
- Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
- Oxford University Clinical Research Unit Indonesia, Faculty of Medicine Universitas Indonesia, Jakarta, Indonesia
- Odile B. Harrison
- Department of Biology, University of Oxford, Oxford, UK
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
6
Osek J, Wieczorek K. Why does Listeria monocytogenes survive in food and food-production environments? J Vet Res 2023; 67:537-544. [PMID: 38130454 PMCID: PMC10730553 DOI: 10.2478/jvetres-2023-0068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 11/28/2023] [Indexed: 12/23/2023] Open
Abstract
Listeria monocytogenes is one of the most dangerous food-borne pathogens and is responsible for human listeriosis, a severe disease with a high mortality rate, especially among the elderly, pregnant women and newborns. Therefore, this bacterium has an important impact on food safety and public health. It is able to survive and even grow in a temperature range from -0.4°C to 45°C, a broad pH range from 4.6 to 9.5 and at a relatively low water activity (aW < 0.90), and tolerates salt content up to 20%. It is also resistant to ultraviolet light, biocides and heavy metals and forms biofilm structures on a variety of surfaces in food-production environments. These features make it difficult to remove and allow it to persist for a long time, increasing the risk of contamination of food-production facilities and ultimately of food. In the present review, the key mechanisms of the pathogen's survival and stress adaptation have been presented. This information may grant better understanding of bacterial adaptation to food environmental conditions.
Affiliation(s)
- Jacek Osek
- Department of Hygiene of Food of Animal Origin, National Veterinary Research Institute, 24-100 Puławy, Poland
- Kinga Wieczorek
- Department of Hygiene of Food of Animal Origin, National Veterinary Research Institute, 24-100 Puławy, Poland
7
Biguenet A, Bordy A, Atchon A, Hocquet D, Valot B. Introduction and benchmarking of pyMLST: open-source software for assessing bacterial clonality using core genome MLST. Microb Genom 2023; 9. [PMID: 37966168 DOI: 10.1099/mgen.0.001126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2023] Open
Abstract
Core genome multilocus sequence typing (cgMLST) has gained in popularity for bacterial typing since whole-genome sequencing (WGS) has become affordable. We introduce here pyMLST, a new complete, stand-alone, free and open source pipeline for cgMLST analysis. pyMLST can create or import a core genome database. For each gene, the first allele is aligned against the bacterial genome of interest using BLAT. Incomplete genes are aligned using MAFFT. All data are stored in a SQLite database. pyMLST accepts assembled genomes or raw data (with the option pyMLST-KMA) as input. To evaluate our new tool, we selected three genome collections of major bacterial pathogens (Escherichia coli, Pseudomonas aeruginosa and Staphylococcus aureus) and compared them with pyMLST, pyMLST-KMA, ChewBBACA, SeqSphere and the variant calling approach. We compared the sensitivity, precision and false-positive rate for each method with those of the variant calling approach. Minimal spanning trees were generated with each type of software to evaluate their interest in the context of a bacterial outbreak. We found that pyMLST-KMA is a convenient screening method to avoid assembling large bacterial collections. Our data showed that pyMLST (free, open source, available in Galaxy and pipeline ready) performed similarly to the commercial SeqSphere and performed better than ChewBBACA and pyMLST-KMA.
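For readers less familiar with the three benchmarking metrics named above, the following Python sketch computes sensitivity, precision, and false-positive rate from a reference set and a predicted set. The locus names and set contents are hypothetical, and treating one method's calls as the reference is an assumption of this example; none of it uses the pyMLST API.

```python
def confusion_counts(reference_positive, predicted_positive, all_items):
    """Return (tp, fp, fn, tn) for a prediction measured against a reference."""
    reference_positive = set(reference_positive)
    predicted_positive = set(predicted_positive)
    tp = len(reference_positive & predicted_positive)
    fp = len(predicted_positive - reference_positive)
    fn = len(reference_positive - predicted_positive)
    tn = len(set(all_items) - reference_positive - predicted_positive)
    return tp, fp, fn, tn


def benchmark_metrics(reference_positive, predicted_positive, all_items):
    """Compute sensitivity, precision and false-positive rate (None if undefined)."""
    tp, fp, fn, tn = confusion_counts(reference_positive, predicted_positive, all_items)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),          # share of true differences found
        "precision": ratio(tp, tp + fp),            # share of reported differences that are true
        "false_positive_rate": ratio(fp, fp + tn),  # share of unchanged loci reported as different
    }


# Hypothetical example: ten loci, four of which differ according to the reference method.
all_loci = {f"locus{i}" for i in range(1, 11)}
reference_differences = {"locus1", "locus2", "locus3", "locus4"}
tool_differences = {"locus1", "locus2", "locus3", "locus5"}

metrics = benchmark_metrics(reference_differences, tool_differences, all_loci)
print(metrics)
```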
Affiliation(s)
- Adrien Biguenet
- CHU de Besançon, Hygiène Hospitalière, F-25030 Besançon, France
- Université de Franche-Comté, CNRS, Chrono-environnement, F-25000 Besançon, France
- Augustin Bordy
- Université de Franche-Comté, CNRS, Chrono-environnement, F-25000 Besançon, France
- Alban Atchon
- Bioinformatique et Big Data Au Service de La Santé, Université de Franche-Comté, F-25000 Besançon, France
- Didier Hocquet
- CHU de Besançon, Hygiène Hospitalière, F-25030 Besançon, France
- Université de Franche-Comté, CNRS, Chrono-environnement, F-25000 Besançon, France
- Benoit Valot
- Université de Franche-Comté, CNRS, Chrono-environnement, F-25000 Besançon, France
- Bioinformatique et Big Data Au Service de La Santé, Université de Franche-Comté, F-25000 Besançon, France
8
D'Onofrio F, Schirone M, Krasteva I, Tittarelli M, Iannetti L, Pomilio F, Torresi M, Paparella A, D'Alterio N, Luciani M. A comprehensive investigation of protein expression profiles in L. monocytogenes exposed to thermal abuse, mild acid, and salt stress conditions. Front Microbiol 2023; 14:1271787. [PMID: 37876777 PMCID: PMC10591339 DOI: 10.3389/fmicb.2023.1271787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 09/19/2023] [Indexed: 10/26/2023] Open
Abstract
Preventing L. monocytogenes infection is crucial for food safety, considering its widespread presence in the environment and its association with contaminated RTE foods. The pathogen's ability to persist under adverse conditions, for example, in food processing facilities, is linked to virulence and resistance mechanisms, including biofilm formation. In this study, the protein expression patterns of two L. monocytogenes 1/2a strains, grown under environmental stressors (mild acidic pH, thermal abuse, and high concentration of NaCl), were investigated. Protein identification and prediction were performed by nLC-ESI-MS/MS and nine different bioinformatic software programs, respectively. Gene enrichment analysis was carried out by STRING v11.05. A total of 1,215 proteins were identified, of which 335 were non-cytosolic proteins and 265 were immunogenic proteins. Proteomic analysis revealed differences in protein expression between L. monocytogenes strains in stressful conditions. The two strains exhibited unique protein expression profiles linked to stress response, virulence, and pathogenesis. Studying the proteomic profiles of such microorganisms provides information about adaptation and potential treatments, highlighting their genetic diversity and demonstrating the utility of bioinformatics and proteomics for a broader analysis of pathogens.
Affiliation(s)
- Federica D'Onofrio
- Department of Bioscience and Technology for Food, Agriculture and Environment, University of Teramo, Teramo, Italy
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Maria Schirone
- Department of Bioscience and Technology for Food, Agriculture and Environment, University of Teramo, Teramo, Italy
- Ivanka Krasteva
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Manuela Tittarelli
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Luigi Iannetti
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Francesco Pomilio
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Marina Torresi
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Antonello Paparella
- Department of Bioscience and Technology for Food, Agriculture and Environment, University of Teramo, Teramo, Italy
- Nicola D'Alterio
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Mirella Luciani
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise “G. Caporale”, Teramo, Italy
9
Castelli P, De Ruvo A, Bucciacchio A, D'Alterio N, Cammà C, Di Pasquale A, Radomski N. Harmonization of supervised machine learning practices for efficient source attribution of Listeria monocytogenes based on genomic data. BMC Genomics 2023; 24:560. [PMID: 37736708 PMCID: PMC10515079 DOI: 10.1186/s12864-023-09667-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open
Abstract
BACKGROUND Genomic data-based machine learning tools are promising for real-time surveillance activities performing source attribution of foodborne bacteria such as Listeria monocytogenes. Given the heterogeneity of machine learning practices, our aim was to identify those influencing the source prediction performance of the usual holdout method combined with the repeated k-fold cross-validation method. METHODS A large collection of 1 100 L. monocytogenes genomes with known sources was built according to several genomic metrics to ensure authenticity and completeness of genomic profiles. Based on these genomic profiles (i.e. 7-locus alleles, core alleles, accessory genes, core SNPs and pan kmers), we developed a versatile workflow assessing prediction performance of different combinations of training dataset splitting (i.e. 50, 60, 70, 80 and 90%), data preprocessing (i.e. with or without near-zero variance removal), and learning models (i.e. BLR, ERT, RF, SGB, SVM and XGB). The performance metrics included accuracy, Cohen's kappa, F1-score, area under the curves from receiver operating characteristic curve, precision recall curve or precision recall gain curve, and execution time. RESULTS The testing average accuracies from accessory genes and pan kmers were significantly higher than accuracies from core alleles or SNPs. While the accuracies from 70 and 80% of training dataset splitting were not significantly different, those from 80% were significantly higher than the other tested proportions. The near-zero variance removal did not allow to produce results for 7-locus alleles, did not impact significantly the accuracy for core alleles, accessory genes and pan kmers, and decreased significantly accuracy for core SNPs. The SVM and XGB models did not present significant differences in accuracy between each other and reached significantly higher accuracies than BLR, SGB, ERT and RF, in this order of magnitude. 
However, the SVM model required more computing power than the XGB model, especially for large numbers of descriptors such as core SNPs and pan kmers. CONCLUSIONS In addition to recommendations about machine learning practices for L. monocytogenes source attribution based on genomic data, the present study also provides a freely available workflow to solve other balanced or unbalanced multiclass phenotypes from binary and categorical genomic profiles of other microorganisms without source code modifications.
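The combination of a holdout split with repeated k-fold cross-validation can be sketched with the Python standard library alone. In the sketch below, the dataset size (1 100) and the 80% training proportion echo the abstract, while the number of folds, the number of repeats, the seeds, and the lack of stratification by source are assumptions made for illustration; this is not the authors' workflow.

```python
import random


def holdout_split(n_samples, train_fraction, seed):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_k_fold(indices, k, repeats, seed):
    """Yield (repeat, fold, fit_indices, validation_indices) for each split."""
    rng = random.Random(seed)
    for repeat in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for fold_number, validation in enumerate(folds):
            fit = [
                sample
                for other, fold in enumerate(folds)
                if other != fold_number
                for sample in fold
            ]
            yield repeat, fold_number, fit, validation


# 1 100 genomes, 80% kept for training; k = 10 and 3 repeats are assumed values.
train_indices, test_indices = holdout_split(1100, 0.8, seed=0)
splits = list(repeated_k_fold(train_indices, k=10, repeats=3, seed=1))

print(len(train_indices), len(test_indices), len(splits))
```

The held-out test indices never enter the cross-validation loop, so they remain available for a final, independent estimate of prediction performance. For unbalanced classes, a stratified variant that preserves source proportions in each fold would usually be preferable.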
Affiliation(s)
- Pierluigi Castelli
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea De Ruvo
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea Bucciacchio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicola D'Alterio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Cesare Cammà
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Adriano Di Pasquale
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicolas Radomski
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
|
10
|
Hamerlinck H, Aerssens A, Boelens J, Dehaene A, McMahon M, Messiaen AS, Vandendriessche S, Velghe A, Leroux-Roels I, Verhasselt B. Sanitary installations and wastewater plumbing as reservoir for the long-term circulation and transmission of carbapenemase producing Citrobacter freundii clones in a hospital setting. Antimicrob Resist Infect Control 2023; 12:58. [PMID: 37337245 DOI: 10.1186/s13756-023-01261-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/20/2023] [Accepted: 05/29/2023] [Indexed: 06/21/2023]
Abstract
BACKGROUND Accumulating evidence shows a role of the hospital wastewater system in the spread of multidrug-resistant organisms, such as carbapenemase-producing Enterobacterales (CPE). Several sequential outbreaks of CPE on the geriatric ward of the Ghent University hospital have led to an outbreak investigation. Focusing on OXA-48-producing Citrobacter freundii, the most prevalent species, we aimed to track clonal relatedness using whole genome sequencing (WGS). By exploring transmission routes we wanted to improve understanding and (re)introduce targeted preventive measures. METHODS Environmental screening (toilet water, sink and shower drains) was performed between 2017 and 2021. A retrospective selection was made of 53 Citrobacter freundii screening isolates (30 patients and 23 environmental samples). DNA from frozen bacterial isolates was extracted and prepped for shotgun WGS. Core genome multilocus sequence typing was performed with an in-house developed scheme using 3,004 loci. RESULTS The CPE positivity rate of environmental screening samples was 19.0% (73/385). The highest percentages were found in the shower drain samples (38.2%) and the toilet water samples (25.0%). Sink drain samples showed the lowest CPE positivity (3.3%). The WGS data revealed long-term co-existence of three C. freundii clusters derived from patient samples. The biggest cluster (ST22) connects 12 patients and 8 environmental isolates taken between 2018 and 2021 spread across the ward. In an overlapping period, another cluster (ST170) links eight patients and four toilet water isolates connected to the same room. The third C. freundii cluster (ST421) connects two patients hospitalised in the same room but over a period of one and a half years. Additional sampling in 2022 revealed clonal isolates linked to the two largest clusters (ST22, ST170) in the wastewater collection pipes connecting the rooms. CONCLUSIONS Our findings suggest long-term circulation and transmission of carbapenemase-producing C. freundii clones in hospital sanitary installations despite surveillance, daily cleaning and intermittent disinfection protocols. We propose a role for the wastewater drainage system in the spread within and between rooms and for the sanitary installations in the indirect transmission via bioaerosol plumes. To tackle this problem, a multidisciplinary approach is necessary, including careful design and maintenance of the plumbing system.
Affiliation(s)
- Hannelore Hamerlinck
  - Department of Laboratory Medicine, Ghent University Hospital, Ghent, Belgium
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Annelies Aerssens
  - Department of Infection Control, Ghent University Hospital, Ghent, Belgium
- Jerina Boelens
  - Department of Laboratory Medicine, Ghent University Hospital, Ghent, Belgium
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Andrea Dehaene
  - Department of Infection Control, Ghent University Hospital, Ghent, Belgium
- Michael McMahon
  - Department of Infection Control, Ghent University Hospital, Ghent, Belgium
- Anja Velghe
  - Department of Geriatrics, Ghent University Hospital, Ghent, Belgium
- Isabel Leroux-Roels
  - Department of Laboratory Medicine, Ghent University Hospital, Ghent, Belgium
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
  - Department of Infection Control, Ghent University Hospital, Ghent, Belgium
- Bruno Verhasselt
  - Department of Laboratory Medicine, Ghent University Hospital, Ghent, Belgium
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
|
11
|
Delineating Mycobacterium abscessus population structure and transmission employing high-resolution core genome multilocus sequence typing. Nat Commun 2022; 13:4936. [PMID: 35999208 PMCID: PMC9399081 DOI: 10.1038/s41467-022-32122-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 07/19/2022] [Indexed: 11/08/2022]
Abstract
Mycobacterium abscessus is an emerging multidrug-resistant non-tuberculous mycobacterium that causes a wide spectrum of infections and has caused several local outbreaks worldwide. To facilitate standardized prospective molecular surveillance, we established a novel core genome multilocus sequence typing (cgMLST) scheme. Whole genome sequencing data of 1991 isolates were employed to validate the scheme, re-analyze global population structure and set genetic distance thresholds for cluster detection and taxonomic identification. We confirmed and amended the nomenclature of the main dominant circulating clones and found that these also correlate well with traditional 7-loci MLST. Dominant circulating clones could be linked to a corresponding reference genome with less than 250 alleles while 99% of pairwise comparisons between epidemiologically linked isolates were below 25 alleles and 90% below 10 alleles. These thresholds can be used to guide further epidemiological investigations. Overall, the scheme will help to unravel the apparent global spread of certain clonal complexes and as yet undiscovered transmission routes.
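The allele-distance thresholds quoted in the abstract above can be illustrated with a small sketch. The code below counts pairwise allele differences between cgMLST profiles and groups isolates whose distance falls below a threshold. It is an assumption-laden illustration, not the authors' pipeline: single-linkage grouping, the treatment of missing loci (ignored), the function names, the isolate names and the six-locus toy profiles with a toy threshold of 2 are all hypothetical.

```python
MISSING = None  # hypothetical marker for a locus without an allele call


def allele_distance(profile_a, profile_b):
    """Count loci with different alleles, ignoring loci missing in either profile."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not MISSING and b is not MISSING and a != b
    )


def single_linkage_clusters(profiles, threshold):
    """Group isolates linked by chains of pairwise distances below the threshold."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if allele_distance(profiles[a], profiles[b]) < threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy six-locus profiles (real cgMLST schemes use thousands of loci).
toy_profiles = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 2],
    "iso3": [1, 1, MISSING, 1, 2, 2],
    "iso4": [9, 9, 9, 9, 9, 9],
}

clusters = single_linkage_clusters(toy_profiles, threshold=2)
print(clusters)
```

In this toy example iso1 and iso3 differ at two loci, so they are not linked directly, but both are one allele away from iso2 and end up in the same cluster. This chaining behaviour is a property of single linkage and is one reason a threshold should guide, rather than replace, epidemiological investigation.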
|
12
|
Core Genome Multilocus Sequence Typing Scheme for Improved Characterization and Epidemiological Surveillance of Pathogenic Brucella. J Clin Microbiol 2022; 60:e0031122. [PMID: 35852343 PMCID: PMC9387271 DOI: 10.1128/jcm.00311-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Brucellosis poses a significant burden to human and animal health worldwide. Robust and harmonized molecular epidemiological approaches and population studies that include routine disease screening are needed to efficiently track the origin and spread of Brucella strains. Core genome multilocus sequence typing (cgMLST) is a powerful genotyping system commonly used to delineate pathogen transmission routes for disease surveillance and control. Except for Brucella melitensis, cgMLST schemes for Brucella species are currently not established. Here, we describe a novel cgMLST scheme that covers multiple Brucella species. We first determined the phylogenetic breadth of the genus using 612 Brucella genomes. We selected 1,764 genes that were particularly well conserved and typeable in at least 98% of these genomes. We tested the new scheme on 600 genomes and found high agreement with the whole-genome-based single nucleotide polymorphism (SNP) analysis. Next, we applied the scheme to reanalyze the genomes of Brucella strains from epidemiologically linked outbreaks. We demonstrated the applicability of the new scheme for high-resolution typing required in outbreak investigations, as previously reported with whole-genome SNP methods. We also used the novel scheme to define the global population structure of the genus using 1,322 Brucella genomes. Finally, we demonstrated the possibility of tracing the distribution of Brucella strains by performing cluster analysis of cgMLST profiles and found nearly identical cgMLST profiles in different countries. Our results show that a sequencing depth of more than 40-fold is optimal for allele calling with this scheme. In summary, this study describes a novel Brucella-wide cgMLST scheme that is applicable in Brucella molecular epidemiology and helps in accurately tracking and thus controlling the sources of infection. The scheme is publicly accessible and should represent a valuable resource for laboratories with limited computational resources and bioinformatics expertise.
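The abstract above describes selecting genes that are typeable in at least 98% of genomes. A minimal sketch of such a filter follows, assuming a table that maps each genome to its per-locus allele calls, with `None` standing for a locus that could not be typed. The function name, the table layout and the toy data are hypothetical and not taken from the published scheme; the toy cutoff of 0.75 is used only because the example has four genomes.

```python
def typeable_loci(allele_table, min_fraction=0.98):
    """Return loci that have an allele call in at least min_fraction of genomes."""
    n_genomes = len(allele_table)
    # Assumes every genome lists the same loci (hypothetical table layout).
    loci = list(next(iter(allele_table.values())))
    kept = []
    for locus in loci:
        called = sum(
            1 for calls in allele_table.values() if calls.get(locus) is not None
        )
        if called / n_genomes >= min_fraction:
            kept.append(locus)
    return kept


# Toy table: four genomes, three loci; None marks a locus that was not typeable.
toy_table = {
    "g1": {"locA": 1, "locB": 2, "locC": None},
    "g2": {"locA": 1, "locB": None, "locC": None},
    "g3": {"locA": 3, "locB": 2, "locC": 5},
    "g4": {"locA": 1, "locB": 4, "locC": None},
}

kept_loci = typeable_loci(toy_table, min_fraction=0.75)
print(kept_loci)
```

Here locA is called in all four genomes and locB in three of four, so both pass the toy cutoff, while locC is called in only one genome and is dropped.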
|