1
Cardim Falcao R, Edwards MR, Hurst M, Fraser E, Otterstatter M. A Review on Microbiological Source Attribution Methods of Human Salmonellosis: From Subtyping to Whole-Genome Sequencing. Foodborne Pathog Dis 2024; 21:137-146. [PMID: 38032610 PMCID: PMC10924193 DOI: 10.1089/fpd.2023.0075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023] Open
Abstract
Salmonella is one of the main causes of human foodborne illness. It is endemic worldwide, with different animals and animal-based food products as reservoirs and vehicles of infection. Identifying animal reservoirs and potential transmission pathways of Salmonella is essential for prevention and control. There are many approaches for source attribution, each using different statistical models and data streams. Some aim to identify the animal reservoir, while others aim to determine the point at which exposure occurred. With the advance of whole-genome sequencing (WGS) technologies, new source attribution models will greatly benefit from the discriminating power gained with WGS. This review discusses some key source attribution methods and their mathematical and statistical tools. We also highlight recent studies utilizing WGS for source attribution and discuss open questions and challenges in developing new WGS methods. We aim to provide a better understanding of the current state of these methodologies with application to Salmonella and other foodborne pathogens that are common sources of illness in the poultry and human sectors.
Affiliation(s)
- Rebeca Cardim Falcao
  - British Columbia Centre for Disease Control, Vancouver, Canada
  - School of Population and Public Health, The University of British Columbia, Vancouver, Canada
- Megan R Edwards
  - British Columbia Centre for Disease Control, Vancouver, Canada
  - School of Population and Public Health, The University of British Columbia, Vancouver, Canada
- Matt Hurst
  - Public Health Agency of Canada, Guelph, Canada
- Erin Fraser
  - British Columbia Centre for Disease Control, Vancouver, Canada
  - School of Population and Public Health, The University of British Columbia, Vancouver, Canada
- Michael Otterstatter
  - British Columbia Centre for Disease Control, Vancouver, Canada
  - School of Population and Public Health, The University of British Columbia, Vancouver, Canada
2
Castelli P, De Ruvo A, Bucciacchio A, D'Alterio N, Cammà C, Di Pasquale A, Radomski N. Harmonization of supervised machine learning practices for efficient source attribution of Listeria monocytogenes based on genomic data. BMC Genomics 2023; 24:560. [PMID: 37736708 PMCID: PMC10515079 DOI: 10.1186/s12864-023-09667-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open
Abstract
BACKGROUND Genomic data-based machine learning tools are promising for real-time surveillance activities performing source attribution of foodborne bacteria such as Listeria monocytogenes. Given the heterogeneity of machine learning practices, our aim was to identify those influencing the source prediction performance of the usual holdout method combined with the repeated k-fold cross-validation method. METHODS A large collection of 1,100 L. monocytogenes genomes with known sources was built according to several genomic metrics to ensure authenticity and completeness of genomic profiles. Based on these genomic profiles (i.e. 7-locus alleles, core alleles, accessory genes, core SNPs and pan kmers), we developed a versatile workflow assessing the prediction performance of different combinations of training dataset splitting (i.e. 50, 60, 70, 80 and 90%), data preprocessing (i.e. with or without near-zero variance removal), and learning models (i.e. BLR, ERT, RF, SGB, SVM and XGB). The performance metrics included accuracy, Cohen's kappa, F1-score, areas under the receiver operating characteristic, precision-recall and precision-recall-gain curves, and execution time. RESULTS The average testing accuracies from accessory genes and pan kmers were significantly higher than those from core alleles or SNPs. While the accuracies from 70 and 80% of training dataset splitting were not significantly different, those from 80% were significantly higher than those from the other tested proportions. Near-zero variance removal did not allow results to be produced for 7-locus alleles, did not significantly affect the accuracy for core alleles, accessory genes and pan kmers, and significantly decreased accuracy for core SNPs. The SVM and XGB models did not differ significantly in accuracy from each other and reached significantly higher accuracies than BLR, SGB, ERT and RF, in that order. However, the SVM model required more computing power than the XGB model, especially for large numbers of descriptors such as core SNPs and pan kmers. CONCLUSIONS In addition to recommendations about machine learning practices for L. monocytogenes source attribution based on genomic data, the present study also provides a freely available workflow to solve other balanced or unbalanced multiclass phenotypes from binary and categorical genomic profiles of other microorganisms without source code modifications.
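As a rough illustration of the evaluation scheme named in this abstract (a holdout split followed by repeated k-fold cross-validation on the training part, scored with Cohen's kappa), the following standard-library Python sketch may help. It is not the authors' workflow: the function names, the seed handling and the fold construction are assumptions made for illustration only, and no learning model is fitted.

```python
import random
from collections import Counter


def holdout_split(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_kfold(indices, k, repeats, seed=0):
    """Yield (fit, validation) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield fit, folds[i]


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single class everywhere); kappa is undefined,
        # and 0.0 is returned here purely as a convention of this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical usage: 1,100 samples, 80% training, 10 folds repeated 3 times.
train, test = holdout_split(1100, 0.8)
splits = list(repeated_kfold(train, k=10, repeats=3))
```

In practice the fit/validation indices would be passed to whichever classifier is being tuned, and the held-out test indices would be used once at the end; a stratified split is usually preferable for unbalanced source classes, which this sketch does not attempt.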
Affiliation(s)
- Pierluigi Castelli
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea De Ruvo
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Andrea Bucciacchio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicola D'Alterio
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Cesare Cammà
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Adriano Di Pasquale
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
- Nicolas Radomski
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
3
Palma F, Radomski N, Guérin A, Sévellec Y, Félix B, Bridier A, Soumet C, Roussel S, Guillier L. Genomic elements located in the accessory repertoire drive the adaptation to biocides in Listeria monocytogenes strains from different ecological niches. Food Microbiol 2022; 106:103757. [PMID: 35690455 DOI: 10.1016/j.fm.2021.103757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2020] [Revised: 01/04/2021] [Accepted: 01/29/2021] [Indexed: 11/25/2022]
Abstract
In response to the massive use of biocides for controlling Listeria monocytogenes (hereafter Lm) contamination along the food chain, strains showing biocide tolerance have emerged. Here, accessory genomic elements were associated with biocide tolerance through pangenome-wide associations performed on 197 Lm strains from different lineages and of different ecological, geographical and temporal origins. Mobile elements, including prophage-related loci, the Tn6188_qacH transposon and the pLMST6_emrC plasmid, were widespread across lineage I and II food strains and associated with tolerance to benzalkonium chloride (BC), a quaternary ammonium compound (QAC) widely used in food processing. The pLMST6_emrC plasmid was also associated with tolerance to another QAC, didecyldimethylammonium chloride, displaying a pleiotropic effect. While no associations were detected for chemically reactive biocides (alcohols and chlorines), genes encoding cell-surface proteins were associated with BC or polymeric biguanide tolerance. The latter was restricted to lineage I strains from animals and the environment. In conclusion, different genetic markers, whether polygenic or not, appear to have driven the adaptation of Lm to biocides, especially in food strains but also in strains from animals and the environment. These markers could help to monitor and predict the spread of biocide-tolerant Lm genotypes across different ecological niches, ultimately reducing the risk posed by such strains in food industrial settings.
Affiliation(s)
- Federica Palma
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France
- Nicolas Radomski
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France
- Alizée Guérin
  - Fougères Laboratory, Antibiotics, Biocides, Residues and Resistance Unit, ANSES, Fougères, France
- Yann Sévellec
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France
- Benjamin Félix
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France
- Arnaud Bridier
  - Fougères Laboratory, Antibiotics, Biocides, Residues and Resistance Unit, ANSES, Fougères, France
- Christophe Soumet
  - Fougères Laboratory, Antibiotics, Biocides, Residues and Resistance Unit, ANSES, Fougères, France
- Sophie Roussel
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France
- Laurent Guillier
  - Maisons-Alfort Laboratory of food safety, University Paris-Est, ANSES, Maisons-Alfort, France; Maisons-Alfort Risk Assessment Department, University Paris-Est, ANSES, Maisons-Alfort, France
4
WGS-Based Lineage and Antimicrobial Resistance Pattern of Salmonella Typhimurium Isolated during 2000-2017 in Peru. Antibiotics (Basel) 2022; 11:1170. [PMID: 36139949 PMCID: PMC9495214 DOI: 10.3390/antibiotics11091170] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/29/2022] [Revised: 08/20/2022] [Accepted: 08/23/2022] [Indexed: 11/16/2022] Open
Abstract
Salmonella Typhimurium is associated with foodborne diseases worldwide, including in Peru, and its emerging antimicrobial resistance (AMR) is now a global public health problem. Therefore, country-specific monitoring of AMR emergence is vital to control this pathogen, and in this respect whole genome sequence (WGS)-based approaches are better than gene-based analyses. Here, we performed antimicrobial susceptibility testing for ten widely used antibiotics and various WGS-based analyses of 90 S. Typhimurium isolates (human, animal, and environmental) collected from 14 cities of Peru between 2000 and 2017 to understand the lineage and antimicrobial resistance pattern of this pathogen in Peru. Our results suggest that the Peruvian isolates are of the Typhimurium serovar and predominantly belong to sequence type ST19. Genomic diversity analyses indicate an open pan-genome, and at least ten lineages are circulating in Peru. A total of 48.8% and 31.0% of isolates are phenotypically and genotypically resistant, respectively, to at least one antibiotic, while 12.0% are multi-drug resistant (MDR). Genotype-phenotype correlations for the ten tested drugs show >80% accuracy and >90% specificity. Sensitivity above 90% was only achieved for ciprofloxacin and ceftazidime. Two lineages contain the majority of the MDR isolates. A total of 63 different AMR genes are detected, of which 30 are found in 17 different plasmids. Transmissible plasmids such as IncI-gamma/k, IncI1-I(Alpha), Col(pHAD28), IncFIB, IncHI2, and IncI2 that carry AMR genes associated with third-generation antibiotics are also identified. Finally, three new non-synonymous single nucleotide variations (SNVs) for nalidixic acid resistance and eight new SNVs for nitrofurantoin resistance are predicted using genome-wide association studies, comparative genomics, and functional annotation. Our analysis provides for the first time WGS-based details of the circulating S. Typhimurium lineages and their antimicrobial resistance pattern in Peru.
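The accuracy, sensitivity and specificity figures quoted in this abstract are the usual confusion-matrix quantities when the phenotypic susceptibility test is treated as the reference and the genotypic prediction as the test. A minimal Python sketch of that calculation follows; the function name and the example isolates are hypothetical and are not taken from the study.

```python
def genotype_phenotype_metrics(phenotype_resistant, genotype_resistant):
    """Compare genotypic resistance calls against the phenotypic reference.

    Both arguments are equal-length sequences of booleans, one per isolate,
    for a single antibiotic. Returns a dict; a metric is None when its
    denominator is zero.
    """
    tp = fn = fp = tn = 0
    for pheno, geno in zip(phenotype_resistant, genotype_resistant):
        if pheno and geno:
            tp += 1  # resistant, and predicted resistant
        elif pheno and not geno:
            fn += 1  # resistant, but predicted susceptible
        elif not pheno and geno:
            fp += 1  # susceptible, but predicted resistant
        else:
            tn += 1  # susceptible, and predicted susceptible

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fn + fp + tn),
    }


# Hypothetical calls for eight isolates and one antibiotic.
phenotype = [True, True, True, False, False, False, False, False]
genotype = [True, True, False, False, False, False, False, True]
metrics = genotype_phenotype_metrics(phenotype, genotype)
```

With few phenotypically resistant isolates, sensitivity rests on a small denominator, which is one reason it can lag behind specificity for some drugs.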
5
Arnold M, Smith RP, Tang Y, Guzinski J, Petrovska L. Bayesian Source Attribution of Salmonella Typhimurium Isolates From Human Patients and Farm Animals in England and Wales. Front Microbiol 2021; 12:579888. [PMID: 33584605 PMCID: PMC7876086 DOI: 10.3389/fmicb.2021.579888] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/03/2020] [Accepted: 01/07/2021] [Indexed: 12/13/2022] Open
Abstract
The purpose of the study was to apply a Bayesian source attribution model to data from England and Wales on Salmonella Typhimurium (ST) and its monophasic variants (MST), using different subtyping approaches based on sequence data. The data consisted of laboratory-confirmed human cases and mainly livestock samples collected from surveillance or monitoring schemes. Three different subtyping methods were used: 7-locus Multi-Locus Sequence Typing (MLST), core-genome MLST, and Single Nucleotide Polymorphism distance, with the genetic distance over which isolates were grouped together being varied for the latter two approaches. A Bayesian frequency matching method, known as the modified Hald method, was applied to the data from each of the subtyping approaches. Pigs were found to be the main contributor to human infection for ST/MST, with approximately 60% of human cases attributed to them, followed by other mammals (mostly horses) and cattle. The use of different clustering methods based on sequence data had minimal impact on the estimates of source attribution. However, the genetic distance over which isolates were grouped did have an impact: grouping only isolates that were relatively closely related increased uncertainty but tended to give a better model fit.
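The idea behind frequency matching can be sketched with a much simpler, deterministic calculation than the Bayesian modified Hald model used in the study: human cases of each subtype are allocated to sources in proportion to the relative frequency of that subtype among isolates from each source. The Python below is only that simplified proportional allocation, with hypothetical subtype names and counts; it has no source-specific or subtype-specific parameters, no priors and no uncertainty estimates.

```python
def attribute_cases(human_cases, source_counts):
    """Allocate human cases to sources by proportional frequency matching.

    human_cases: dict mapping subtype -> number of human cases.
    source_counts: dict mapping source -> {subtype: number of isolates}.
    Returns a dict mapping source -> attributed cases, plus an "unknown"
    entry for subtypes not observed in any source.
    """
    # Relative frequency of each subtype within each source.
    relative = {}
    for source, counts in source_counts.items():
        total = sum(counts.values())
        relative[source] = (
            {subtype: n / total for subtype, n in counts.items()} if total else {}
        )

    attributed = {source: 0.0 for source in source_counts}
    attributed["unknown"] = 0.0
    for subtype, cases in human_cases.items():
        weights = {s: relative[s].get(subtype, 0.0) for s in relative}
        weight_sum = sum(weights.values())
        if weight_sum == 0:
            attributed["unknown"] += cases
            continue
        for source, weight in weights.items():
            attributed[source] += cases * weight / weight_sum
    return attributed


# Hypothetical data: three subtypes in humans, two sampled sources.
human = {"A": 60, "B": 30, "C": 10}
sources = {"pigs": {"A": 8, "B": 2}, "cattle": {"A": 2, "B": 8}}
result = attribute_cases(human, sources)
```

How isolates are grouped into subtypes changes both dictionaries at once, which is where the choice of genetic distance threshold enters: tighter grouping produces more subtypes with fewer isolates each, and more human cases that match no source.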
Affiliation(s)
- Mark Arnold
  - Department of Epidemiological Sciences, Animal and Plant Health Agency (APHA), Addlestone, United Kingdom
- Richard Piers Smith
  - Department of Epidemiological Sciences, Animal and Plant Health Agency (APHA), Addlestone, United Kingdom
- Yue Tang
  - Department of Bacteriology, Animal and Plant Health Agency (APHA), Addlestone, United Kingdom
- Jaromir Guzinski
  - Department of Bacteriology, Animal and Plant Health Agency (APHA), Addlestone, United Kingdom
- Liljana Petrovska
  - Department of Bacteriology, Animal and Plant Health Agency (APHA), Addlestone, United Kingdom