1
Olsen RJ, Long SW, Vedaraju Y, Tomasdottir S, Erlendsdottir H, Kristinsson KG, Musser JM, Haraldsson G. Intra-host genomic variation of serologically nontypeable Haemophilus influenzae isolates from otitis media. Microbiol Spectr 2025; 13:e0308924. [PMID: 40162758 PMCID: PMC12053901 DOI: 10.1128/spectrum.03089-24] [Received: 12/05/2024] [Accepted: 03/03/2025] [Indexed: 04/02/2025]
Abstract
Serologically nontypeable Haemophilus influenzae is a human pathogen that causes infections ranging in severity from mild otitis media and sinusitis to life-threatening pneumonia, bacteremia, and meningitis. Although intra-host genomic variation in infected humans has been studied, many important questions remain unanswered. To address this knowledge deficit, we sequenced the genomes of 500 isolates recovered from ear drainage fluid collected from 11 Icelandic children with otorrhea. We discovered substantial genomic diversity among the H. influenzae strains infecting each patient. In total, we identified 88 genes that acquired nonsynonymous (amino acid-changing) or nonsense (protein-truncating) single-nucleotide polymorphisms, insertions, or deletions in at least one isolate. Of these, 13 genes were recurrently polymorphic. The polymorphic genes encoded proteins with varied inferred functions, including carbohydrate metabolism, cell wall biosynthesis, environmental stress response, glycolipid metabolism, iron metabolism, recombination, small molecule transport, and transcription and translation. No amino acid substitutions or protein truncations were identified in proven H. influenzae virulence factors or major transcription regulators. However, many of the polymorphic genes likely contribute to fitness, virulence, or host-pathogen molecular interactions. Our study of intra-host variation in otitis media provides a framework for understanding the genomic adaptability of H. influenzae during human infections. IMPORTANCE Serologically nontypeable H. influenzae is a human pathogen responsible for a range of diseases, including mild otitis media (middle ear infection) and sinusitis, and severe pneumonia, bacteremia, and meningitis. While research has begun advancing our understanding of the population genomic structure of H. influenzae strains infecting humans, little is known about intra-host genomic variation. To address this knowledge gap, we sequenced the genomes of 500 H. influenzae isolates recovered from ear drainage fluid of Icelandic children diagnosed with otitis media. Our findings revealed that intra-host genomic variation involves many different genes encoding proteins with diverse functions. The data provide novel information bearing on the complexity of intra-host diversity and improve our understanding of H. influenzae strain fitness and molecular pathogenesis. This information could generate new hypotheses bearing on host-pathogen interactions and identify new therapeutic and vaccine targets.
Affiliation(s)
- Randall J. Olsen
  - Department of Pathology and Genomic Medicine, Laboratory for Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, USA
  - Department of Pathology and Laboratory Medicine and Microbiology and Immunology, Weill Medical College of Cornell University, New York City, New York, USA
- S. Wesley Long
  - Department of Pathology and Genomic Medicine, Laboratory for Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, USA
  - Department of Pathology and Laboratory Medicine and Microbiology and Immunology, Weill Medical College of Cornell University, New York City, New York, USA
- Yuvanesh Vedaraju
  - Department of Pathology and Genomic Medicine, Laboratory for Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, USA
- Sandra Tomasdottir
  - Department of Clinical Microbiology, Landspítali - the National University Hospital of Iceland, Reykjavik, Iceland
  - Faculty of Medicine, School of Health Science, University of Iceland, Reykjavík, Capital Region, Iceland
- Helga Erlendsdottir
  - Department of Clinical Microbiology, Landspítali - the National University Hospital of Iceland, Reykjavik, Iceland
  - Faculty of Medicine, School of Health Science, University of Iceland, Reykjavík, Capital Region, Iceland
- Karl G. Kristinsson
  - Department of Clinical Microbiology, Landspítali - the National University Hospital of Iceland, Reykjavik, Iceland
  - Faculty of Medicine, School of Health Science, University of Iceland, Reykjavík, Capital Region, Iceland
- James M. Musser
  - Department of Pathology and Genomic Medicine, Laboratory for Molecular and Translational Human Infectious Diseases Research, Center for Infectious Diseases, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, USA
  - Department of Pathology and Laboratory Medicine and Microbiology and Immunology, Weill Medical College of Cornell University, New York City, New York, USA
- Gunnsteinn Haraldsson
  - Department of Clinical Microbiology, Landspítali - the National University Hospital of Iceland, Reykjavik, Iceland
  - Faculty of Medicine, School of Health Science, University of Iceland, Reykjavík, Capital Region, Iceland
2
Rebelo AR, Bortolaia V, Leekitcharoenphon P, Hansen DS, Nielsen HL, Ellermann-Eriksen S, Kemp M, Røder BL, Frimodt-Møller N, Søndergaard TS, Coia JE, Østergaard C, Westh H, Aarestrup FM. One day in Denmark: whole-genome sequence-based analysis of Escherichia coli isolates from clinical settings. J Antimicrob Chemother 2025; 80:1011-1021. [PMID: 39881516 PMCID: PMC11962386 DOI: 10.1093/jac/dkaf028] [Received: 12/14/2023] [Accepted: 01/15/2025] [Indexed: 01/31/2025]
Abstract
BACKGROUND WGS can potentially be routinely used in clinical microbiology settings, especially with the increase in sequencing accuracy and decrease in cost. Escherichia coli is the most common bacterial species analysed in those settings, thus fast and accurate diagnostics can lead to reductions in morbidity, mortality and healthcare costs. OBJECTIVES To evaluate WGS for diagnostics and surveillance in a collection of clinical E. coli; to examine the pool of antimicrobial resistance (AMR) determinants circulating in Denmark and the most frequent STs; and to evaluate core-genome MLST (cgMLST) and SNP-based clustering approaches for detecting genetically related isolates. METHODS We analysed the genomes of 699 E. coli isolates collected throughout all Danish Clinical Microbiology Laboratories. We used rMLST and KmerFinder for species identification, ResFinder for prediction of AMR, and PlasmidFinder for plasmid identification. We used Center for Genomic Epidemiology MLST, cgMLSTFinder and CSI Phylogeny to perform typing and clustering analysis. RESULTS Genetic AMR determinants were detected in 56.2% of isolates. We identified 182 MLSTs, most frequently ST-69, ST-73, ST-95 and ST-131. Using a maximum 15-allele difference as the threshold for genetic relatedness, we identified 23 clusters. SNP-based phylogenetic analysis within clusters revealed from 0 to 13 SNPs, except two cases with 111 and 461 SNPs. CONCLUSIONS WGS data are useful to characterize clinical E. coli isolates, including predicting AMR profiles and subtyping in concordance with surveillance data. We have shown that it is possible to adequately cluster isolates through a cgMLST approach, but it remains necessary to define proper interpretative criteria.
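The threshold-based clustering mentioned in the RESULTS can be illustrated with a minimal sketch. It assumes single-linkage grouping on pairwise allele differences, which the abstract does not specify; the function names, the profile layout, and the toy data are hypothetical, and skipping loci with missing calls is likewise an assumption.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci where both profiles have a call and the alleles differ."""
    return sum(
        1
        for x, y in zip(profile_a, profile_b)
        if x is not None and y is not None and x != y
    )


def cluster_profiles(profiles, threshold=15):
    """Single-linkage clusters: two isolates are linked if they differ
    at no more than `threshold` loci. Singletons are not reported."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if allele_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy allele profiles over 20 hypothetical loci (not real cgMLST data).
profiles = {
    "iso1": [1] * 20,
    "iso2": [1] * 18 + [2, 2],  # differs from iso1 at 2 loci
    "iso3": [3] * 20,           # differs from iso1 and iso2 at all 20 loci
}
clusters = cluster_profiles(profiles, threshold=15)
```

With single linkage, a chain of closely related isolates can join one cluster even when its two ends differ by more than the threshold, which is one reason interpretative criteria matter.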
Affiliation(s)
- Ana Rita Rebelo
  - Technical University of Denmark, National Food Institute, Kongens Lyngby, Denmark
- Valeria Bortolaia
  - Technical University of Denmark, National Food Institute, Kongens Lyngby, Denmark
- Hans Linde Nielsen
  - Department of Clinical Microbiology, Aalborg University Hospital, Aalborg, Denmark
  - Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Michael Kemp
  - Department of Clinical Microbiology, Odense University Hospital, Odense, Denmark
- Bent Løwe Røder
  - Department of Clinical Microbiology, Slagelse Hospital, Slagelse, Denmark
- John Eugenio Coia
  - Department of Clinical Microbiology, Hospital of South West Jutland, Esbjerg, Denmark
- Claus Østergaard
  - Department of Clinical Microbiology, Vejle Hospital, Vejle, Denmark
- Henrik Westh
  - Department of Clinical Microbiology, Hvidovre Hospital, Copenhagen University Hospital—Amager and Hvidovre, Hvidovre, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Frank M Aarestrup
  - Technical University of Denmark, National Food Institute, Kongens Lyngby, Denmark
3
Derelle R, Madon K, Hellewell J, Rodríguez-Bouza V, Arinaminpathy N, Lalvani A, Croucher NJ, Harris SR, Lees JA, Chindelevitch L. Reference-Free Variant Calling with Local Graph Construction with ska lo (SKA). Mol Biol Evol 2025; 42:msaf077. [PMID: 40171940 PMCID: PMC11986325 DOI: 10.1093/molbev/msaf077] [Received: 10/01/2024] [Revised: 02/20/2025] [Accepted: 03/20/2025] [Indexed: 04/04/2025]
Abstract
The study of genomic variants is increasingly important for public health surveillance of pathogens. Traditional variant-calling methods from whole-genome sequencing data rely on reference-based alignment, which can introduce biases and require significant computational resources. Alignment- and reference-free approaches offer an alternative by leveraging k-mer-based methods, but existing implementations often suffer from sensitivity limitations, particularly in high mutation density genomic regions. Here, we present ska lo, a graph-based algorithm that aims to identify within-strain variants in pathogen whole-genome sequencing data by traversing a colored De Bruijn graph and building variant groups (i.e. sets of variant combinations). Through in silico benchmarking and real-world dataset analyses, we demonstrate that ska lo achieves high sensitivity in single-nucleotide polymorphism (SNP) calls while also enabling the detection of insertions and deletions, as well as SNP positioning on a reference genome for recombination analyses. These findings highlight ska lo as a simple, fast, and effective tool for pathogen genomic epidemiology, extending the range of reference-free variant-calling approaches. ska lo is freely available as part of the SKA program (https://github.com/bacpop/ska.rust).
Affiliation(s)
- Romain Derelle
  - NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W2 1PG, UK
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, UK
- Kieran Madon
  - NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W2 1PG, UK
- Joel Hellewell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Víctor Rodríguez-Bouza
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Nimalan Arinaminpathy
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, UK
- Ajit Lalvani
  - NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W2 1PG, UK
- Nicholas J Croucher
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, UK
- Simon R Harris
  - Bill and Melinda Gates Foundation, 62 Buckingham Gate, Westminster, London SW1E 6AJ, UK
- John A Lees
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Leonid Chindelevitch
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, UK
4
Laufer Halpin A, Mathers AJ, Walsh TR, Zingg W, Okeke IN, McDonald LC, Elkins CA, Harbarth S, Peacock SJ, Srinivasan A, Bell M, Pittet D, Cardo D. A framework towards implementation of sequencing for antimicrobial-resistant and other health-care-associated pathogens. Lancet Infect Dis 2025; 25:e235-e244. [PMID: 39832513 DOI: 10.1016/s1473-3099(24)00729-1] [Received: 08/21/2024] [Revised: 10/09/2024] [Accepted: 10/22/2024] [Indexed: 01/22/2025]
Abstract
Antimicrobial resistance continues to be a growing threat globally, specifically in health-care settings in which antimicrobial-resistant pathogens cause a substantial proportion of health-care-associated infections (HAIs). Next-generation sequencing (NGS) and the analysis of the data produced therein (ie, bioinformatics) represent an opportunity to enhance our capacity to address these threats. The 3rd Geneva Infection Prevention and Control Think Tank brought together experts to identify gaps, propose solutions, and set priorities for the use of NGS for HAIs and antimicrobial-resistant pathogens. The major deliverable recommendation from this meeting was a proposed framework for implementing the sequencing of HAI pathogens, specifically those harbouring antimicrobial-resistance mechanisms. The key components of the proposed framework relate to wet laboratory quality, sequence data quality, database and tool selection, bioinformatic analyses, data sharing, and NGS data integration, to support public health and actions for infection prevention and control. In this Personal View we detail and discuss the framework in the context of global implementation, specifically in low-income and middle-income countries.
Affiliation(s)
- Alison Laufer Halpin
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA; US Public Health Service, Rockville, MD, USA
- Timothy R Walsh
  - Department of Zoology, Ineos Oxford Institute for Antimicrobial Resistance, Oxford, UK
- Walter Zingg
  - Division of Infectious Diseases and Hospital Hygiene, Universitätsspital Zürich, Zürich, Switzerland
- Iruka N Okeke
  - Department of Pharmaceutical Microbiology, University of Ibadan, Ibadan, Nigeria
- L Clifford McDonald
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Christopher A Elkins
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Sharon J Peacock
  - Cambridge Biomedical Campus, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Arjun Srinivasan
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Michael Bell
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Didier Pittet
  - Hôpitaux Universitaires de Genève, Geneva, Switzerland
- Denise Cardo
  - Division of Healthcare Quality Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
5
Roberts MD, Davis O, Josephs EB, Williamson RJ. K-mer-based Approaches to Bridging Pangenomics and Population Genetics. Mol Biol Evol 2025; 42:msaf047. [PMID: 40111256 PMCID: PMC11925024 DOI: 10.1093/molbev/msaf047] [Received: 09/18/2024] [Revised: 01/10/2025] [Accepted: 02/04/2025] [Indexed: 03/12/2025]
Abstract
Many commonly studied species now have more than one chromosome-scale genome assembly, revealing a large amount of genetic diversity previously missed by approaches that map short reads to a single reference. However, many species still lack multiple reference genomes and correctly aligning references to build pangenomes can be challenging for many species, limiting our ability to study this missing genomic variation in population genetics. Here, we argue that k-mers are a very useful but underutilized tool for bridging the reference-focused paradigms of population genetics with the reference-free paradigms of pangenomics. We review current literature on the uses of k-mers for performing three core components of most population genetics analyses: identifying, measuring, and explaining patterns of genetic variation. We also demonstrate how different k-mer-based measures of genetic variation behave in population genetic simulations according to the choice of k, depth of sequencing coverage, and degree of data compression. Overall, we find that k-mer-based measures of genetic diversity scale consistently with pairwise nucleotide diversity (π) up to values of about π = 0.025 (R² = 0.97) for neutrally evolving populations. For populations with even more variation, using shorter k-mers will maintain the scalability up to at least π = 0.1. Furthermore, in our simulated populations, k-mer dissimilarity values can be reliably approximated from counting bloom filters, highlighting a potential avenue to decreasing the memory burden of k-mer-based genomic dissimilarity analyses. For future studies, there is a great opportunity to further develop methods for identifying selected loci using k-mers.
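To make the dependence on k concrete, here is a minimal sketch of one k-mer dissimilarity measure. It uses Jaccard dissimilarity between exact k-mer sets, which is an assumption: the abstract does not name the specific measures compared, and the function names and toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_dissimilarity(seq_a, seq_b, k):
    """1 - |A ∩ B| / |A ∪ B| over the k-mer sets of two sequences."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        return 0.0  # both sequences shorter than k
    return 1.0 - len(a & b) / len(union)


# Two toy sequences that differ by a single substitution.
ref = "ACGTACGGTCAGTTAGC"
alt = "ACGTACGCTCAGTTAGC"

d_short = jaccard_dissimilarity(ref, alt, k=3)
d_long = jaccard_dissimilarity(ref, alt, k=7)
print(f"k=3: {d_short:.3f}  k=7: {d_long:.3f}")
```

A single substitution changes every k-mer that overlaps it, so longer k-mers inflate the dissimilarity for the same underlying difference. This is consistent with the abstract's observation that shorter k-mers keep the measure usable at higher diversity.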
Affiliation(s)
- Miles D Roberts
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI 48824, USA
- Olivia Davis
  - Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
- Emily B Josephs
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48824, USA
  - Plant Resilience Institute, Michigan State University, East Lansing, MI 48824, USA
- Robert J Williamson
  - Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
  - Department of Biology and Biomedical Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
6
Choi J, Shin JH, Choi JY, Park KT, Seo MR, Jung SH, Chung YJ, Ko KS. Analysis of Salmonella enterica serovar Enteritidis isolates from South Korea based on whole genome sequences, antibiotic resistance, and virulence assays. Diagn Microbiol Infect Dis 2025; 111:116642. [PMID: 39653628 DOI: 10.1016/j.diagmicrobio.2024.116642] [Received: 09/04/2024] [Revised: 11/29/2024] [Accepted: 11/30/2024] [Indexed: 03/03/2025]
Abstract
We compared 15 Salmonella Enteritidis isolates collected in South Korea in 2023: 11 from chickens and 4 from humans. All isolates belonged to ST11, and they were divided into two clusters, rST3888 and rST1425. All four human isolates were colistin-resistant, and mcr-1 was identified in three isolates.
Affiliation(s)
- Jihyun Choi
  - Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
- Jong Hyun Shin
  - Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
- Ji Young Choi
  - Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
- Kun Taek Park
  - Department of Biotechnology, Inje University, Gimhae 50834, Republic of Korea
- Mi-Ran Seo
  - ConnectaGen, Hanam 12918, Republic of Korea
- Seung-Hyun Jung
  - Department of Biochemistry, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
- Yeun-Jun Chung
  - Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul 06591, Republic of Korea
- Kwan Soo Ko
  - Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
7
Abdel-Glil MY, Brandt C, Pletz MW, Neubauer H, Sprague LD. High intra-laboratory reproducibility of nanopore sequencing in bacterial species underscores advances in its accuracy. Microb Genom 2025; 11:001372. [PMID: 40117330 PMCID: PMC11927881 DOI: 10.1099/mgen.0.001372] [Received: 10/02/2024] [Accepted: 01/30/2025] [Indexed: 03/23/2025]
Abstract
Nanopore sequencing is a third-generation technology known for its portability, real-time analysis and ability to generate long reads. It has great potential for use in clinical diagnostics, but thorough validation is required to address accuracy concerns and ensure reliable and reproducible results. In this study, we automated an open-source workflow (freely available at https://gitlab.com/FLI_Bioinfo/nanobacta) for the assembly of Oxford Nanopore sequencing data and used it to investigate the reproducibility of assembly results under consistent conditions. We used a benchmark dataset of five bacterial reference strains and generated eight technical sequencing replicates of the same DNA using the Ligation and Rapid Barcoding kits together with the Flongle and MinION flow cells. We assessed reproducibility by measuring discrepancies such as substitution and insertion/deletion errors, analysing plasmid recovery results and examining genetic markers and clustering information. We compared the results of genome assemblies with and without short-read polishing. Our results show an average reproducibility accuracy of 99.999955% for nanopore-only assemblies and 99.999996% when the short reads were used for polishing. The genomic analysis results were highly reproducible for the nanopore-only assemblies without short-read polishing in the following areas: identification of genetic markers for antimicrobial resistance and virulence, classical MLST, taxonomic classification, genome completeness and contamination analysis. Interestingly, the clustering information results from the core genome SNP and core genome MLST analyses were also highly reproducible for the nanopore-only assemblies, with pairwise differences of up to two allele differences in core genome MLST and two SNPs in core genome SNP across replicates. After polishing the assemblies with short reads, the pairwise differences for cgMLST were 0 and for cgSNP were 0-1 SNP across replicates. These results highlight the advances in sequencing accuracy of nanopore data without the use of short reads.
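The reported percentages are easier to interpret as expected discrepancies per genome. The short calculation below is only a sketch: it assumes the accuracy figures are per-base agreement rates and uses a hypothetical 5 Mb genome, neither of which is stated in the abstract.

```python
def expected_discrepancies(accuracy_percent, genome_length_bp):
    """Expected number of discrepant bases for a given percent accuracy,
    treating the accuracy as a per-base agreement rate (an assumption)."""
    error_rate = 1.0 - accuracy_percent / 100.0
    return error_rate * genome_length_bp


GENOME_LENGTH_BP = 5_000_000  # hypothetical genome size, not from the abstract

nanopore_only = expected_discrepancies(99.999955, GENOME_LENGTH_BP)
polished = expected_discrepancies(99.999996, GENOME_LENGTH_BP)

print(f"nanopore-only: {nanopore_only:.2f} discrepancies per genome")
print(f"short-read polished: {polished:.2f} discrepancies per genome")
```

Under these assumptions the nanopore-only figure corresponds to roughly two discrepant bases per genome, and the polished figure to well under one.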
Affiliation(s)
- Mostafa Y. Abdel-Glil
  - Institute of Bacterial Infections and Zoonoses, Friedrich-Loeffler-Institut, Naumburger Str. 96A, 07743 Jena, Germany
  - Institute for Infectious Diseases and Infection Control, Jena University Hospital – Friedrich Schiller University, Jena, Germany
- Christian Brandt
  - Institute for Infectious Diseases and Infection Control, Jena University Hospital – Friedrich Schiller University, Jena, Germany
  - InfectoGnostics Research Campus Jena, Center for Applied Research, 07743 Jena, Germany
- Mathias W. Pletz
  - Institute for Infectious Diseases and Infection Control, Jena University Hospital – Friedrich Schiller University, Jena, Germany
- Heinrich Neubauer
  - Institute of Bacterial Infections and Zoonoses, Friedrich-Loeffler-Institut, Naumburger Str. 96A, 07743 Jena, Germany
- Lisa D. Sprague
  - Institute of Bacterial Infections and Zoonoses, Friedrich-Loeffler-Institut, Naumburger Str. 96A, 07743 Jena, Germany
8
Pham VD, Xu ZS, Simpson DJ, Zhang JS, Gänzle MG. Does strain-level persistence of lactobacilli in long-term back-slopped sourdoughs inform on domestication of food-fermenting lactic acid bacteria? Appl Environ Microbiol 2024; 90:e0189224. [PMID: 39503491 PMCID: PMC11654800 DOI: 10.1128/aem.01892-24] [Received: 09/24/2024] [Accepted: 10/18/2024] [Indexed: 12/19/2024]
Abstract
Sourdoughs are maintained by back-slopping over long time periods. To determine strain-level persistence of bacteria, we characterized four sourdoughs from bakeries over a period of 3.3, 11.0, 18.0, and 19.0 years. One sourdough included isolates of Levilactobacillus spp. and Fructilactobacillus spp. that differed by fewer than 10 single-nucleotide polymorphisms (SNPs) from the isolates obtained 3.3 years earlier and thus likely represent the same strain. Isolates of Levilactobacillus parabrevis differed by 200-300 SNPs; their genomes were under positive selection, indicating transmission from an external source. In two other sourdoughs, isolates of Fructilactobacillus sanfranciscensis that were obtained 11 and 18 years apart differed by 19 and 29 SNPs, respectively, again indicating repeated isolation of the same strain. The isolate of Fl. sanfranciscensis from the fourth sourdough differed by 45 SNPs from the isolate obtained 19 years previously. We thus identified strain-level persistence in three out of four long-term back-slopped sourdoughs, making it possible that strains persisted over periods that are long enough to allow bacterial speciation and domestication. IMPORTANCE The assembly of microbial communities in sourdough is shaped by dispersal and selection. Speciation and domestication of fermentation microbes in back-slopped food fermentations have been documented for food-fermenting fungi including sourdough yeasts but not for bacteria, which evolve at a slower rate. Bacterial speciation in food fermentations requires strain-level persistence of fermentation microbes over hundreds or thousands of years. By documenting strain-level persistence in three out of four sourdoughs over a period of up to 18 years, we demonstrate that persistence over hundreds or thousands of years is possible, if not likely. We thus not only open a new perspective on fermentation control in bakeries but also support the possibility that all humans, despite their cultural diversity, share the same fermentation microbes.
Affiliation(s)
- Vi D. Pham
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
- Zhaohui S. Xu
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
- David J. Simpson
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
- Justina S. Zhang
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
- Michael G. Gänzle
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
9
Derelle R, von Wachsmann J, Mäklin T, Hellewell J, Russell T, Lalvani A, Chindelevitch L, Croucher NJ, Harris SR, Lees JA. Seamless, rapid, and accurate analyses of outbreak genomic data using split k-mer analysis. Genome Res 2024; 34:1661-1673. [PMID: 39406504 PMCID: PMC11529842 DOI: 10.1101/gr.279449.124] [Received: 04/08/2024] [Accepted: 09/16/2024] [Indexed: 11/01/2024]
Abstract
Sequence variation observed in populations of pathogens can be used for important public health and evolutionary genomic analyses, especially outbreak analysis and transmission reconstruction. Identifying this variation is typically achieved by aligning sequence reads to a reference genome, but this approach is susceptible to reference biases and requires careful filtering of called genotypes. There is a need for tools that can process this growing volume of bacterial genome data, providing rapid results, but that remain simple so they can be used without highly trained bioinformaticians, expensive data analysis, and long-term storage and processing of large files. Here we describe split k-mer analysis (SKA2), a method that supports both reference-free and reference-based mapping to quickly and accurately genotype populations of bacteria using sequencing reads or genome assemblies. SKA2 is highly accurate for closely related samples, and in outbreak simulations, we show superior variant recall compared with reference-based methods, with no false positives. SKA2 can also accurately map variants to a reference and be used with recombination detection methods to rapidly reconstruct vertical evolutionary history. SKA2 is many times faster than comparable methods and can be used to add new genomes to an existing call set, allowing sequential use without the need to reanalyze entire collections. With an inherent absence of reference bias, high accuracy, and a robust implementation, SKA2 has the potential to become the tool of choice for genotyping bacteria. SKA2 is implemented in Rust and is freely available as open-source software.
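The abstract does not spell out the data structure, so the following is only a sketch of the general split k-mer idea: two flanking k-mers act as a key and the base between them is the value, so two samples that share a key but disagree on the middle base indicate a SNP. The function names and toy sequences are hypothetical, and reverse complements, read errors, and the actual SKA2 implementation details are deliberately ignored.

```python
def split_kmers(seq, flank=3):
    """Map (left flank, right flank) -> set of middle bases seen in seq."""
    table = {}
    width = 2 * flank + 1
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        key = (window[:flank], window[flank + 1:])
        table.setdefault(key, set()).add(window[flank])
    return table


def call_snps(seq_a, seq_b, flank=3):
    """Report keys present in both samples whose middle bases differ.
    Keys with more than one middle base in a sample are skipped as ambiguous."""
    a, b = split_kmers(seq_a, flank), split_kmers(seq_b, flank)
    snps = []
    for key in a.keys() & b.keys():
        if len(a[key]) == 1 and len(b[key]) == 1 and a[key] != b[key]:
            snps.append((key, next(iter(a[key])), next(iter(b[key]))))
    return sorted(snps)


# Two toy samples that differ by one substitution (G -> C).
sample_a = "ACGTACGGTCAGTTAGC"
sample_b = "ACGTACGCTCAGTTAGC"
snps = call_snps(sample_a, sample_b)
print(snps)
```

No reference genome is involved at any point in this comparison, which illustrates why such an approach avoids reference bias. The same property means a variant is only visible when both flanks are conserved, so densely clustered variants can be missed by this simple form.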
Affiliation(s)
- Romain Derelle
- NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W21PG, United Kingdom
| | - Johanna von Wachsmann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Tommi Mäklin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Department of Mathematics and Statistics, University of Helsinki, Helsinki 00014, Finland
| | - Joel Hellewell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Timothy Russell
- Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London WC1E 7HT, United Kingdom
| | - Ajit Lalvani
- NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W21PG, United Kingdom
| | - Leonid Chindelevitch
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, United Kingdom
| | - Nicholas J Croucher
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, United Kingdom
| | - Simon R Harris
- Bill and Melinda Gates Foundation, Westminster, London SW1E 6AJ, United Kingdom
| | - John A Lees
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
|
10
|
Ridone P, Baker MAB. Hybrid Exb/Mot stators require substitutions distant from the chimeric pore to power flagellar rotation. J Bacteriol 2024; 206:e0014024. [PMID: 39283106 PMCID: PMC11500575 DOI: 10.1128/jb.00140-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 08/09/2024] [Indexed: 10/01/2024]
Abstract
Powered by ion transport across the cell membrane, conserved ion-powered rotary motors (IRMs) drive bacterial motility by generating torque on the rotor of the bacterial flagellar motor. Homologous heteroheptameric IRMs have been structurally characterized in ion channels such as Tol/Ton/Exb/Gld, and most recently in phage defense systems such as Zor. Functional stator complexes synthesized from chimeras of PomB/MotB (PotB) have been used to study flagellar rotation at low ion-motive force achieved via reduced external sodium concentration. The function of such chimeras is highly sensitive to the location of the fusion site, and these hybrid proteins have thus far been arbitrarily designed. To date, no chimeras have been constructed using interchange of components from Tol/Ton/Exb/Gld and other ion-powered motors with more distant homology. Here, we synthesized chimeras of MotAB, PomAPotB, and ExbBD to assess their capacity for cross-compatibility. We generated motile strains powered by stator complexes with B-subunit chimeras. This motility was further optimized by directed evolution. Whole-genome sequencing of these strains revealed that motility-enhancing residue changes occurred in the A-subunit and at the peptidoglycan binding domain of the B-unit, which could improve motility. Overall, our work highlights the complexity of stator architecture and identifies the challenges associated with the rational design of chimeric IRMs.
IMPORTANCE: Ion-powered rotary motors (IRMs) underpin the rotation of one of nature's oldest wheels, the flagellar motor. Recent structures show that this complex appears to be a fundamental molecular module with diverse biological utility where electrical energy is coupled to torque. Here, we attempted to rationally design chimeric IRMs to explore the cross-compatibility of these ancient motors. We succeeded in making one working chimera of a flagellar motor and a non-flagellar transport system protein. This had only a short hybrid stretch in the ion-conducting channel, and function was subsequently improved through additional substitutions at sites distant from this hybrid pore region. Our goal was to test the cross-compatibility of these homologous systems and highlight challenges arising when engineering new rotary motors.
Affiliation(s)
- Pietro Ridone
- School of Biotechnology and Biomolecular Sciences, UNSW, Kensington, Australia
| | - Matthew A. B. Baker
- School of Biotechnology and Biomolecular Sciences, UNSW, Kensington, Australia
|
11
|
Hall MB, Wick RR, Judd LM, Nguyen AN, Steinig EJ, Xie O, Davies M, Seemann T, Stinear TP, Coin L. Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data. eLife 2024; 13:RP98300. [PMID: 39388235 PMCID: PMC11466455 DOI: 10.7554/elife.98300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Variant calling is fundamental in bacterial genomics, underpinning the identification of disease transmission clusters, the construction of phylogenetic trees, and antimicrobial resistance detection. This study presents a comprehensive benchmarking of variant calling accuracy in bacterial genomes using Oxford Nanopore Technologies (ONT) sequencing data. We evaluated three ONT basecalling models and both simplex (single-strand) and duplex (dual-strand) read types across 14 diverse bacterial species. Our findings reveal that deep learning-based variant callers, particularly Clair3 and DeepVariant, significantly outperform traditional methods and even exceed the accuracy of Illumina sequencing, especially when applied to ONT's super-high accuracy model. ONT's superior performance is attributed to its ability to overcome Illumina's errors, which often arise from difficulties in aligning reads in repetitive and variant-dense genomic regions. Moreover, the use of high-performing variant callers with ONT's super-high accuracy data mitigates ONT's traditional errors in homopolymers. We also investigated the impact of read depth on variant calling, demonstrating that 10× depth of ONT super-accuracy data can achieve precision and recall comparable to, or better than, full-depth Illumina sequencing. These results underscore the potential of ONT sequencing, combined with advanced variant calling algorithms, to replace traditional short-read sequencing methods in bacterial genomics, particularly in resource-limited settings.
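Precision and recall, the two metrics the abstract reports, can be computed from a set of called variants and a truth set. The Python sketch below is a minimal illustration under assumed inputs: variants are represented as hypothetical `(position, ref, alt)` tuples, and the example data are invented, not taken from the benchmark.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for called variants against a truth set.

    Each variant is a (position, ref, alt) tuple; this representation is an
    assumption made for the example.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Invented example: 2 of 3 calls are correct, 2 of 4 true variants are found.
truth = {(101, "A", "G"), (250, "C", "T"), (777, "G", "A"), (900, "T", "C")}
called = {(101, "A", "G"), (250, "C", "T"), (512, "G", "T")}
precision, recall = precision_recall(called, truth)
```

A false positive such as the call at position 512 lowers precision only, while each missed true variant lowers recall only, which is why benchmarks of this kind report both.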
Affiliation(s)
- Michael B Hall
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Ryan R Wick
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Centre for Pathogen Genomics, The University of Melbourne, Melbourne, Australia
| | - Louise M Judd
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Centre for Pathogen Genomics, The University of Melbourne, Melbourne, Australia
| | - An N Nguyen
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Eike J Steinig
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Ouli Xie
- Department of Infectious Diseases, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Monash Infectious Diseases, Monash Health, Melbourne, Australia
| | - Mark Davies
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Torsten Seemann
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Centre for Pathogen Genomics, The University of Melbourne, Melbourne, Australia
| | - Timothy P Stinear
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Centre for Pathogen Genomics, The University of Melbourne, Melbourne, Australia
| | - Lachlan Coin
- Department of Microbiology and Immunology, The University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
|
12
|
Villanueva CD, Bohunická M, Johansen JR. We are doing it wrong: Putting homology before phylogeny in cyanobacterial taxonomy. J Phycol 2024; 60:1071-1089. [PMID: 39152777 DOI: 10.1111/jpy.13491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 07/11/2024] [Accepted: 07/18/2024] [Indexed: 08/19/2024]
Abstract
The rapid expansion of whole genome sequencing in bacterial taxonomy has revealed deep evolutionary relationships and speciation signals, but assembly methods often miss true nucleotide diversity in the ribosomal operons. Though it lacks sufficient phylogenetic signal at the species level, the 16S ribosomal RNA gene is still much used in bacterial taxonomy. In cyanobacterial taxonomy, comparisons of 16S-23S Internal Transcribed Spacer (ITS) regions are used to bridge this information gap. Although ITS rRNA region analyses are routinely being used to identify species, researchers often do not identify orthologous operons, which leads to improper comparisons. No method for delineating orthologous operon copies from paralogous ones has been established. A new method for recognizing orthologous ribosomal operons by quantifying the conserved paired nucleotides in a helical domain of the ITS, has been developed. The D1' Index quantifies differences in the ratio of pyrimidines to purines in paired nucleotide sequences of this helix. Comparing 111 operon sequences from 89 strains of Brasilonema, four orthologous operon types were identified. Plotting D1' Index values against the length of helices produced clear separation of orthologs. Most orthologous operons in this study were observed both with and without tRNA genes present. We hypothesize that genomic rearrangement, not gene duplication, is responsible for the variation among orthologs. This new method will allow cyanobacterial taxonomists to utilize ITS rRNA region data more correctly, preventing erroneous taxonomic hypotheses. Moreover, this work could assist genomicists in identifying and preserving evident sequence variability in ribosomal operons, which is an important proxy for evolution in prokaryotes.
Affiliation(s)
- Chelsea D Villanueva
- Department of Biological, Geological, & Environmental Sciences, Cleveland State University, Cleveland, Ohio, USA
- Department of Biology, John Carroll University, University Heights, Ohio, USA
| | - Markéta Bohunická
- Department of Biology, Faculty of Science, University of Hradec Králové, Hradec Králové, Czech Republic
| | - Jeffrey R Johansen
- Department of Biology, John Carroll University, University Heights, Ohio, USA
|
13
|
Choi J, Shin JH, Park S, Choi JY, Baek JY, Huh K, Chung DR, Kwon KT, Seo MR, Jung SH, Chung YJ, Ko KS. Phylogenetic Analysis Based on Whole Genome Sequences, Antibiotic Resistance, and Virulence of Salmonella enterica Clinical Isolates from South Korea. Foodborne Pathog Dis 2024. [PMID: 39269884 DOI: 10.1089/fpd.2024.0020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
Salmonella is a major cause of foodborne disease and frequently causes human salmonellosis in South Korea. In this study, we investigated the genome diversity, antimicrobial resistance, and virulence of clinical isolates of Salmonella enterica from South Korea. We collected 42 S. enterica subsp. enterica isolates from two hospitals in South Korea. Whole genome sequences were determined. Serovars and sequence types (STs) based on multilocus sequence typing (MLST) were identified from whole genome sequences. Phylogenetic trees based on whole genome sequences and a minimum spanning tree based on MLST were constructed. Human serum resistance assays and gentamicin protection assays were performed to assess in vitro virulence. Nineteen serovars were identified among 42 clinical isolates, including nine Salmonella Typhi isolates. There were inconsistencies between serogroups and phylogenetic clusters in the phylogenetic tree and minimum spanning tree, but high clonality of S. Typhi was observed. Salmonella Typhi isolates were divided into two clusters, corresponding to ST1 and ST2. Isolates of serovars Typhimurium and I4,[5],12:i:- clustered into a group, and a hybrid isolate between the two serovars was identified. Four ciprofloxacin-resistant isolates were identified among nine S. Typhi isolates, and all isolates of S. Enteritidis and S. Panama were resistant to colistin. The gentamicin protection assay revealed that serogroup D1 was significantly less virulent than the other serogroups. Our study suggests high diversity of S. enterica clinical isolates from South Korea and non-monophyly of serogroups. In addition, subgroups of S. Typhi isolates and a hybrid isolate between serovars Typhimurium and I4,[5],12:i:- were identified.
Affiliation(s)
- Jihyun Choi
- Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
| | - Jong Hyun Shin
- Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
| | - Suyeon Park
- Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
| | - Ji Young Choi
- Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
| | - Jin Yang Baek
- Asia Pacific Foundation of Infectious Diseases (APFID), Seoul, Republic of Korea
| | - Kyungmin Huh
- Division of Infectious Diseases, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Doo Ryeon Chung
- Division of Infectious Diseases, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Ki Tae Kwon
- Department of Internal Medicine, School of Medicine, Kyungpook National University, Daegu, Republic of Korea
| | - Seung-Hyun Jung
- Department of Biochemistry, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
| | - Yeun-Jun Chung
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
| | - Kwan Soo Ko
- Department of Microbiology, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
|
14
|
Versmessen N, Mispelaere M, Vandekerckhove M, Hermans C, Boelens J, Vranckx K, Van Nieuwerburgh F, Vaneechoutte M, Hulpiau P, Cools P. Average Nucleotide Identity and Digital DNA-DNA Hybridization Analysis Following PromethION Nanopore-Based Whole Genome Sequencing Allows for Accurate Prokaryotic Typing. Diagnostics (Basel) 2024; 14:1800. [PMID: 39202288 PMCID: PMC11353866 DOI: 10.3390/diagnostics14161800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 07/31/2024] [Accepted: 08/13/2024] [Indexed: 09/03/2024]
Abstract
Whole-genome sequencing (WGS) is revolutionizing clinical bacteriology. However, bacterial typing remains investigated by reference techniques with inherent limitations. This stresses the need for alternative methods providing robust and accurate sequence type (ST) classification. This study optimized and evaluated a GridION nanopore sequencing protocol, adapted for the PromethION platform. Forty-eight Escherichia coli clinical isolates with diverse STs were sequenced to assess two alternative typing methods and resistance profiling applications. Multi-locus sequence typing (MLST) was used as the reference typing method. Genomic relatedness was assessed using Average Nucleotide Identity (ANI) and digital DNA-DNA Hybridization (DDH), and cut-offs for discriminative strain resolution were evaluated. WGS-based antibiotic resistance prediction was compared to reference Minimum Inhibitory Concentration (MIC) assays. We found ANI and DDH cut-offs of 99.3% and 94.1%, respectively, which correlated well with MLST classifications and demonstrated potentially higher discriminative resolution than MLST. WGS-based antibiotic resistance prediction showed categorical agreements of ≥ 93% with MIC assays for amoxicillin, ceftazidime, amikacin, tobramycin, and trimethoprim-sulfamethoxazole. Performance was suboptimal (68.8-81.3%) for amoxicillin-clavulanic acid, cefepime, aztreonam, and ciprofloxacin. A minimal sequencing coverage of 12× was required to maintain essential genomic features and typing accuracy. Our protocol allows the integration of PromethION technology in clinical laboratories, with ANI and DDH proving to be accurate and robust alternative typing methods, potentially offering superior resolution. WGS-based antibiotic resistance prediction holds promise for specific antibiotic classes.
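Categorical agreement, the measure used above to compare WGS-based resistance prediction with MIC assays, is the percentage of isolates assigned the same category by both methods. The Python sketch below illustrates the arithmetic with invented calls; the `"R"`/`"S"` labels and the function name are assumptions made for the example, not details from the study.

```python
def categorical_agreement(predicted, measured):
    """Percentage of isolates whose predicted category matches the category
    obtained from the reference assay."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    if not predicted:
        raise ValueError("at least one isolate is required")
    matches = sum(p == m for p, m in zip(predicted, measured))
    return 100.0 * matches / len(predicted)


# Invented calls for eight isolates (R = resistant, S = susceptible).
wgs_prediction = ["R", "S", "S", "R", "S", "R", "S", "S"]
mic_reference = ["R", "S", "R", "R", "S", "R", "S", "S"]
agreement = categorical_agreement(wgs_prediction, mic_reference)  # 7 of 8 match
```

With 48 isolates, as in the study, each discordant isolate changes the agreement by roughly 2.1 percentage points, so the reported ranges correspond to a small number of mismatches per antibiotic.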
Affiliation(s)
- Nick Versmessen
- Laboratory Bacteriology Research, Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Marieke Mispelaere
- Department of Bio-Medical Sciences, HOWEST University of Applied Sciences, 8000 Bruges, Belgium
| | - Cedric Hermans
- Department of Bio-Medical Sciences, HOWEST University of Applied Sciences, 8000 Bruges, Belgium
| | - Jerina Boelens
- Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Department of Laboratory Medicine, Ghent University Hospital, 9000 Ghent, Belgium
| | - Filip Van Nieuwerburgh
- NXTGNT, Department of Pharmaceutics, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Mario Vaneechoutte
- Laboratory Bacteriology Research, Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Paco Hulpiau
- Department of Bio-Medical Sciences, HOWEST University of Applied Sciences, 8000 Bruges, Belgium
| | - Piet Cools
- Laboratory Bacteriology Research, Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Department of Diagnostic Sciences, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
|
15
|
Lavrov KV, Shemyakina AO, Grechishnikova EG, Gerasimova TV, Kalinina TI, Novikov AD, Leonova TE, Ryabchenko LE, Bayburdov TA, Yanenko AS. A new concept of biocatalytic synthesis of acrylic monomers for obtaining water-soluble acrylic heteropolymers. Metab Eng Commun 2024; 18:e00231. [PMID: 38222043 PMCID: PMC10787234 DOI: 10.1016/j.mec.2023.e00231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 11/03/2023] [Accepted: 12/13/2023] [Indexed: 01/16/2024]
Abstract
Rhodococcus strains were designed as model biocatalysts (BCs) for the production of acrylic acid and mixtures of acrylic monomers consisting of acrylamide, acrylic acid, and N-alkylacrylamide (N-isopropylacrylamide). To obtain BC strains, we used, among other approaches, adaptive laboratory evolution (ALE), based on the use of the metabolic pathway of amide utilization. Whole genome sequencing of the strains obtained after ALE, as well as subsequent targeted gene disruption, identified candidate genes for three new amidases that are promising for the development of BCs for the production of acrylic acid from acrylamide. New BCs had two types of amidase activities, acrylamide-hydrolyzing and acrylamide-transferring, and by varying the ratio of these activities in BCs, it is possible to influence the ratio of monomers in the resulting mixtures. Based on these strains, a prototype of a new technological concept for the biocatalytic synthesis of acrylic monomers was developed for the production of water-soluble acrylic heteropolymers containing valuable N-alkylacrylamide units. In addition to the possibility of obtaining mixtures of different compositions, the advantages of the concept are a single starting reagent (acrylamide), more unification of processes (all processes are based on the same type of biocatalyst), and potentially greater safety for personnel and the environment compared to existing chemical technologies.
Affiliation(s)
- Konstantin V. Lavrov
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Anna O. Shemyakina
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Elena G. Grechishnikova
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Tatyana V. Gerasimova
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Tatyana I. Kalinina
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Andrey D. Novikov
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Tatyana E. Leonova
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Ludmila E. Ryabchenko
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
| | - Telman A. Bayburdov
- Saratov Chemical Plant of Acrylic Polymers “AKRYPOL”, 410059, Saratov, Russia
| | - Alexander S. Yanenko
- NRC “Kurchatov Institute”, Kurchatov Genomic Center, 123182, Akademika Kurchatova pl. 1, Moscow, Russia
|
16
|
Bogaerts B, Van den Bossche A, Verhaegen B, Delbrassinne L, Mattheus W, Nouws S, Godfroid M, Hoffman S, Roosens NHC, De Keersmaecker SCJ, Vanneste K. Closing the gap: Oxford Nanopore Technologies R10 sequencing allows comparable results to Illumina sequencing for SNP-based outbreak investigation of bacterial pathogens. J Clin Microbiol 2024; 62:e0157623. [PMID: 38441926 PMCID: PMC11077942 DOI: 10.1128/jcm.01576-23] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/23/2023] [Accepted: 02/09/2024] [Indexed: 03/08/2024]
Abstract
Whole-genome sequencing has become the method of choice for bacterial outbreak investigation, with most clinical and public health laboratories currently routinely using short-read Illumina sequencing. Recently, long-read Oxford Nanopore Technologies (ONT) sequencing has gained prominence and may offer advantages over short-read sequencing, particularly with the recent introduction of the R10 chemistry, which promises much lower error rates than the R9 chemistry. However, limited information is available on its performance for bacterial single-nucleotide polymorphism (SNP)-based outbreak investigation. We present an open-source workflow, Prokaryotic Awesome variant Calling Utility (PACU) (https://github.com/BioinformaticsPlatformWIV-ISP/PACU), for constructing SNP phylogenies using Illumina and/or ONT R9/R10 sequencing data. The workflow was evaluated using outbreak data sets of Shiga toxin-producing Escherichia coli and Listeria monocytogenes by comparing ONT R9 and R10 with Illumina data. The performance of each sequencing technology was evaluated not only separately but also by integrating samples sequenced by different technologies/chemistries into the same phylogenomic analysis. Additionally, the minimum sequencing time required to obtain accurate phylogenetic results using nanopore sequencing was evaluated. PACU allowed accurate identification of outbreak clusters for both species using all technologies/chemistries, but ONT R9 results deviated slightly more from the Illumina results. ONT R10 results showed trends very similar to Illumina, and we found that integrating data sets sequenced by either Illumina or ONT R10 for different isolates into the same analysis produced stable and highly accurate phylogenomic results. The resulting phylogenies for these two outbreaks stabilized after ~20 hours of sequencing for ONT R9 and ~8 hours for ONT R10. 
This study provides a proof of concept for using ONT R10, either in isolation or in combination with Illumina, for rapid and accurate bacterial SNP-based outbreak investigation.
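SNP-based outbreak investigation of the kind described above typically rests on pairwise SNP distances between isolates. The Python sketch below counts differing positions between equal-length aligned sequences and skips positions where either isolate has an uncalled base. The isolate names, the sequences, and the choice to ignore `N` and `-` are assumptions made for illustration; this is not the PACU workflow.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ, skipping any
    position where either sequence has an uncalled base or a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


# Invented core-SNP alignment for three hypothetical isolates.
alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ANGTTCGA",
}
distances = {
    (x, y): snp_distance(alignment[x], alignment[y])
    for x, y in combinations(sorted(alignment), 2)
}
```

Isolates separated by only a few SNPs would be candidates for the same outbreak cluster; the threshold used for that decision is species- and study-specific and is not fixed by this sketch.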
Affiliation(s)
- Bert Bogaerts
- Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
| | - Stéphanie Nouws
- Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
| | - Maxime Godfroid
- Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
| | - Stefan Hoffman
- Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
| | - Kevin Vanneste
- Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
|
17
|
Nieto-Rosado M, Sands K, Portal EAR, Thomson KM, Carvalho MJ, Mathias J, Milton R, Dyer C, Akpulu C, Boostrom I, Hogan P, Saif H, Sanches Ferreira AD, Hender T, Portal B, Andrews R, Watkins WJ, Zahra R, Shirazi H, Muhammad A, Ullah SN, Jan MH, Akif S, Iregbu KC, Modibbo F, Uwaezuoke S, Audu L, Edwin CP, Yusuf AH, Adeleye A, Mukkadas AS, Mazarati JB, Rucogoza A, Gaju L, Mehtar S, Bulabula ANH, Whitelaw A, Roberts L, Chan G, Bekele D, Solomon S, Abayneh M, Metaferia G, Walsh TR. Colonisation of hospital surfaces from low- and middle-income countries by extended spectrum β-lactamase- and carbapenemase-producing bacteria. Nat Commun 2024; 15:2758. [PMID: 38553439 PMCID: PMC10980694 DOI: 10.1038/s41467-024-46684-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 03/06/2024] [Indexed: 04/02/2024]
Abstract
Hospital surfaces can harbour bacterial pathogens, which may disseminate and cause nosocomial infections, contributing towards mortality in low- and middle-income countries (LMICs). During the BARNARDS study, hospital surfaces from neonatal wards were sampled to assess the degree of environmental surface and patient care equipment colonisation by Gram-negative bacteria (GNB) carrying antibiotic resistance genes (ARGs). Here, we perform PCR screening for extended-spectrum β-lactamases (blaCTX-M-15) and carbapenemases (blaNDM, blaOXA-48-like and blaKPC), MALDI-TOF MS identification of GNB carrying ARGs, and further analysis by whole genome sequencing of bacterial isolates. We determine presence of consistently dominant clones and their relatedness to strains causing neonatal sepsis. Higher prevalence of carbapenemases is observed in Pakistan, Bangladesh, and Ethiopia, compared to other countries, and are mostly found in surfaces near the sink drain. Klebsiella pneumoniae, Enterobacter hormaechei, Acinetobacter baumannii, Serratia marcescens and Leclercia adecarboxylata are dominant; ST15 K. pneumoniae is identified from the same ward on multiple occasions suggesting clonal persistence within the same environment, and is found to be identical to isolates causing neonatal sepsis in Pakistan over similar time periods. Our data suggests persistence of dominant clones across multiple time points, highlighting the need for assessment of Infection Prevention and Control guidelines.
Affiliation(s)
- Maria Nieto-Rosado
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Kirsty Sands
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Edward A R Portal
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Kathryn M Thomson
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Maria J Carvalho
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
- Department of Medical Sciences, Institute of Biomedicine, University of Aveiro, Aveiro, Portugal
| | - Jordan Mathias
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Rebecca Milton
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
- Centre for Trials Research, Cardiff University, Cardiff, UK
| | - Calie Dyer
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
- Centre for Trials Research, Cardiff University, Cardiff, UK
| | - Chinenye Akpulu
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Ian Boostrom
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Patrick Hogan
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Habiba Saif
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Ana D Sanches Ferreira
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
- Parasites and Microbes Programme, Wellcome Sanger Institute Hinxton, Hinxton, UK
| | - Thomas Hender
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Barbra Portal
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Robert Andrews
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - W John Watkins
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
| | - Rabaab Zahra
- Department of Microbiology, Quaid-i-Azam University, Islamabad, Pakistan
| | - Haider Shirazi
- Pakistan Institute of Medical Sciences, Islamabad, Pakistan
| | - Adil Muhammad
- Department of Microbiology, Quaid-i-Azam University, Islamabad, Pakistan
| | - Syed Najeeb Ullah
- Department of Microbiology, Quaid-i-Azam University, Islamabad, Pakistan
| | - Muhammad Hilal Jan
- Department of Microbiology, Quaid-i-Azam University, Islamabad, Pakistan
| | - Shermeen Akif
- Department of Microbiology, Quaid-i-Azam University, Islamabad, Pakistan
| | - Chinago P Edwin
- Department of Microbiology, Medway Maritime Hospital NHS Foundation Trust, Gillingham, Kent, UK
- Aminu Kano Teaching Hospital, Kano, Nigeria
| | - Adeola Adeleye
- Murtala Muhammad Specialist Hospital, Kano City, Nigeria
| | - Aniceth Rucogoza
- The National Reference Laboratory, Rwanda Biomedical Centre, Kigali, Rwanda
| | - Lucie Gaju
- The National Reference Laboratory, Rwanda Biomedical Centre, Kigali, Rwanda
| | - Shaheen Mehtar
- Unit of IPC, Stellenbosch University, Cape Town, South Africa
- Infection Control Africa Network, Cape Town, South Africa
| | - Andrew N H Bulabula
- Infection Control Africa Network, Cape Town, South Africa
- Department of Global Health, Stellenbosch University, Cape Town, South Africa
| | - Andrew Whitelaw
- Division of Medical Microbiology, Stellenbosch University, Cape Town, South Africa
- National Health Laboratory Service, Tygerberg Hospital, Cape Town, South Africa
| | - Lauren Roberts
- Division of Medical Microbiology, Stellenbosch University, Cape Town, South Africa
| | - Grace Chan
- Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatrics and Child Health, St Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
| | - Delayehu Bekele
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, USA
- Department of Obstetrics and Gynecology, St Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
| | - Semaria Solomon
- Department of Microbiology, Immunology and Parasitology, St Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
| | - Mahlet Abayneh
- Department of Pediatrics and Child Health, St Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
| | - Gesit Metaferia
- Department of Microbiology, Immunology and Parasitology, St Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
| | - Timothy R Walsh
- Department of Biology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, UK
- Division of Infection and Immunity, Cardiff University, Cardiff, UK
|
18
|
Charron P, Kang M. VariantDetective: an accurate all-in-one pipeline for detecting consensus bacterial SNPs and SVs. Bioinformatics 2024; 40:btae066. [PMID: 38366603 PMCID: PMC10898327 DOI: 10.1093/bioinformatics/btae066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/16/2024] [Accepted: 02/14/2024] [Indexed: 02/18/2024] Open
Abstract
MOTIVATION Genomic variations comprise a spectrum of alterations, ranging from single nucleotide polymorphisms (SNPs) to large-scale structural variants (SVs), which play crucial roles in bacterial evolution and species diversification. Accurately identifying SNPs and SVs is beneficial for subsequent evolutionary and epidemiological studies. This study presents VariantDetective (VD), a novel, user-friendly, and all-in-one pipeline combining SNP and SV calling to generate consensus genomic variants using multiple tools. RESULTS The VD pipeline accepts various file types as input to initiate SNP and/or SV calling, and benchmarking results demonstrate VD's robustness and high accuracy across multiple tested datasets when compared to existing variant calling approaches. AVAILABILITY AND IMPLEMENTATION The source code, test data, and relevant information for VD are freely accessible at https://github.com/OLF-Bioinformatics/VariantDetective under the MIT License.
Affiliation(s)
- Philippe Charron
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
| | - Mingsong Kang
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
|
19
|
Castelli P, De Ruvo A, Bucciacchio A, D'Alterio N, Cammà C, Di Pasquale A, Radomski N. Harmonization of supervised machine learning practices for efficient source attribution of Listeria monocytogenes based on genomic data. BMC Genomics 2023; 24:560. [PMID: 37736708 PMCID: PMC10515079 DOI: 10.1186/s12864-023-09667-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open
Abstract
BACKGROUND Genomic data-based machine learning tools are promising for real-time surveillance activities performing source attribution of foodborne bacteria such as Listeria monocytogenes. Given the heterogeneity of machine learning practices, our aim was to identify those influencing the source prediction performance of the usual holdout method combined with the repeated k-fold cross-validation method. METHODS A large collection of 1,100 L. monocytogenes genomes with known sources was built according to several genomic metrics to ensure authenticity and completeness of genomic profiles. Based on these genomic profiles (i.e. 7-locus alleles, core alleles, accessory genes, core SNPs and pan kmers), we developed a versatile workflow assessing prediction performance of different combinations of training dataset splitting (i.e. 50, 60, 70, 80 and 90%), data preprocessing (i.e. with or without near-zero variance removal), and learning models (i.e. BLR, ERT, RF, SGB, SVM and XGB). The performance metrics included accuracy, Cohen's kappa, F1-score, area under the curves from receiver operating characteristic curve, precision recall curve or precision recall gain curve, and execution time. RESULTS The testing average accuracies from accessory genes and pan kmers were significantly higher than accuracies from core alleles or SNPs. While the accuracies from 70 and 80% of training dataset splitting were not significantly different, those from 80% were significantly higher than the other tested proportions. The near-zero variance removal prevented results from being produced for 7-locus alleles, did not significantly affect the accuracy for core alleles, accessory genes and pan kmers, and significantly decreased accuracy for core SNPs. The SVM and XGB models did not present significant differences in accuracy between each other and reached significantly higher accuracies than BLR, SGB, ERT and RF, in this order of magnitude. However, the SVM model required more computing power than the XGB model, especially for large numbers of descriptors such as core SNPs and pan kmers. CONCLUSIONS In addition to recommendations about machine learning practices for L. monocytogenes source attribution based on genomic data, the present study also provides a freely available workflow to solve other balanced or unbalanced multiclass phenotypes from binary and categorical genomic profiles of other microorganisms without source code modifications.
Affiliation(s)
- Pierluigi Castelli
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Andrea De Ruvo
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Andrea Bucciacchio
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Nicola D'Alterio
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Cesare Cammà
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Adriano Di Pasquale
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
| | - Nicolas Radomski
- Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
|
20
|
Ordóñez CD, Mayoral-Campos C, Egas C, Redrejo-Rodríguez M. A primer-independent DNA polymerase-based method for competent whole-genome amplification of intermediate to high GC sequences. NAR Genom Bioinform 2023; 5:lqad073. [PMID: 37608803 PMCID: PMC10440786 DOI: 10.1093/nargab/lqad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 07/03/2023] [Accepted: 08/09/2023] [Indexed: 08/24/2023] Open
Abstract
Multiple displacement amplification (MDA) has proven to be a useful technique for obtaining large amounts of DNA from tiny samples in genomics and metagenomics. However, MDA has limitations, such as amplification artifacts and biases that can interfere with subsequent quantitative analysis. To overcome these challenges, alternative methods and engineered DNA polymerase variants have been developed. Here, we present new MDA protocols based on the primer-independent DNA polymerase (piPolB), a replicative-like DNA polymerase endowed with DNA priming and proofreading capacities. These new methods were tested on a genome mixture containing diverse sequences with high-GC content, followed by deep sequencing. Protocols relying on piPolB as a single enzyme cannot achieve competent amplification due to its limited processivity and the presence of ab initio DNA synthesis. However, an alternative method called piMDA, which combines piPolB with Φ29 DNA polymerase, allows proficient and faithful amplification of the genomes. In addition, the prior denaturation step commonly performed in MDA protocols is dispensable, resulting in a more straightforward protocol. In summary, piMDA outperforms commercial methods in the amplification of genomes and metagenomes containing high GC sequences and exhibits similar profiling, error rate and variant determination as the non-amplified samples.
Affiliation(s)
- Carlos D Ordóñez
- Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Madrid, Spain
| | - Carmen Mayoral-Campos
- Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM) and Instituto de Investigaciones Biomédicas Sols-Morreale (CSIC-UAM), Madrid, Spain
| | - Conceição Egas
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Biocant, Transfer Technology Association, Cantanhede, Portugal
| | - Modesto Redrejo-Rodríguez
- Departamento de Bioquímica, Universidad Autónoma de Madrid (UAM) and Instituto de Investigaciones Biomédicas Sols-Morreale (CSIC-UAM), Madrid, Spain
|
21
|
Seah YM, Stewart MK, Hoogestraat D, Ryder M, Cookson BT, Salipante SJ, Hoffman NG. In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays. J Clin Microbiol 2023; 61:e0184222. [PMID: 37428072 PMCID: PMC10446864 DOI: 10.1128/jcm.01842-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Accepted: 06/18/2023] [Indexed: 07/11/2023] Open
Abstract
Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in silico workflow to introduce single nucleotide polymorphisms (SNPs) and indels into bacterial reference genomes, and computationally generate sequencing reads based on the mutated genomes. We then applied the method to Mycobacterium tuberculosis H37Rv, Staphylococcus aureus NCTC 8325, and Klebsiella pneumoniae HS11286, and used the synthetic reads as truth sets for evaluating several popular variant callers. Insertions proved especially challenging for most variant callers to correctly identify, relative to deletions and single nucleotide polymorphisms. With adequate read depth, however, variant callers that use high quality soft-clipped reads and base mismatches to perform local realignment consistently had the highest precision and recall in identifying insertions and deletions ranging from 1 to 50 bp. The remaining variant callers had lower recall values associated with identification of insertions greater than 20 bp.
Affiliation(s)
- Yee Mey Seah
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
| | - Mary K. Stewart
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
| | - Daniel Hoogestraat
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
| | - Molly Ryder
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
| | - Brad T. Cookson
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Department of Microbiology, University of Washington, Seattle, Washington, USA
| | - Stephen J. Salipante
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
| | - Noah G. Hoffman
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
|
22
|
Shi ZJ, Nayfach S, Pollard KS. Maast: genotyping thousands of microbial strains efficiently. Genome Biol 2023; 24:186. [PMID: 37563669 PMCID: PMC10416524 DOI: 10.1186/s13059-023-03030-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/07/2022] [Accepted: 07/31/2023] [Indexed: 08/12/2023] Open
Abstract
Existing single nucleotide polymorphism (SNP) genotyping algorithms do not scale for species with thousands of sequenced strains, nor do they account for conspecific redundancy. Here we present a bioinformatics tool, Maast, which empowers population genetic meta-analysis of microbes at an unrivaled scale. Maast implements a novel algorithm to heuristically identify a minimal set of diverse conspecific genomes, then constructs a reliable SNP panel for each species, and enables rapid and accurate genotyping using a hybrid of whole-genome alignment and k-mer exact matching. We demonstrate Maast's utility by genotyping thousands of Helicobacter pylori strains and tracking SARS-CoV-2 diversification.
Affiliation(s)
- Zhou Jason Shi
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
| | - Stephen Nayfach
- Joint Genome Institute, Department of Energy, Walnut Creek, CA, USA
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Katherine S Pollard
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
|
23
|
Gmiter D, Pacak I, Nawrot S, Czerwonka G, Kaca W. Genomes comparison of two Proteus mirabilis clones showing varied swarming ability. Mol Biol Rep 2023; 50:5817-5826. [PMID: 37219671 PMCID: PMC10290045 DOI: 10.1007/s11033-023-08518-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 05/10/2023] [Indexed: 05/24/2023]
Abstract
BACKGROUND Proteus mirabilis is a Gram-negative bacterium most noted for its involvement in catheter-associated urinary tract infections. It is also known for its multicellular migration over solid surfaces, referred to as 'swarming motility'. Here we analyzed the genomic sequences of two P. mirabilis isolates, designated K38 and K39, which exhibit varied swarming ability. METHODS AND RESULTS The isolates' genomes were sequenced using an Illumina NextSeq sequencer, yielding genomes of about 3.94 Mbp with a GC content of 38.6%. The genomes were subjected to in silico comparative investigation. We found that, despite a difference in swarming motility, the isolates showed high genomic relatedness (up to 100% ANI similarity), suggesting that one of the isolates probably originated from the other. CONCLUSIONS The genomic sequences will allow us to investigate the mechanism driving this intriguing phenotypic heterogeneity between closely related P. mirabilis isolates. Phenotypic heterogeneity is an adaptive strategy of bacterial cells to several environmental pressures. It is also an important factor related to their pathogenesis. Therefore, the availability of these genomic sequences will facilitate studies that focus on the host-pathogen interactions during catheter-associated urinary tract infections.
Affiliation(s)
- Dawid Gmiter
- Department of Microbiology, Institute of Biology, Faculty of Natural Sciences, Jan Kochanowski University in Kielce, Kielce, Poland
| | - Ilona Pacak
- Department of Microbiology, Institute of Biology, Faculty of Natural Sciences, Jan Kochanowski University in Kielce, Kielce, Poland
| | - Sylwia Nawrot
- Department of Microbiology, Institute of Biology, Faculty of Natural Sciences, Jan Kochanowski University in Kielce, Kielce, Poland
| | - Grzegorz Czerwonka
- Department of Microbiology, Institute of Biology, Faculty of Natural Sciences, Jan Kochanowski University in Kielce, Kielce, Poland
| | - Wieslaw Kaca
- Department of Microbiology, Institute of Biology, Faculty of Natural Sciences, Jan Kochanowski University in Kielce, Kielce, Poland
|
24
|
Pérez-Llanos FJ, Dreyer V, Barilar I, Utpatel C, Kohl TA, Murcia MI, Homolka S, Merker M, Niemann S. Transmission Dynamics of a Mycobacterium tuberculosis Complex Outbreak in an Indigenous Population in the Colombian Amazon Region. Microbiol Spectr 2023; 11:e0501322. [PMID: 37222610 PMCID: PMC10269451 DOI: 10.1128/spectrum.05013-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 05/04/2023] [Indexed: 05/25/2023] Open
Abstract
Whole genome sequencing (WGS) has become the main tool for studying the transmission of Mycobacterium tuberculosis complex (MTBC) strains; however, the clonal expansion of one strain often limits its application in local MTBC outbreaks. The use of an alternative reference genome and the inclusion of repetitive regions in the analysis could potentially increase the resolution, but the added value has not yet been defined. Here, we leveraged short and long WGS read data of a previously reported MTBC outbreak in the Colombian Amazon Region to analyze possible transmission chains among 74 patients in the indigenous setting of Puerto Nariño (March to October 2016). In total, 90.5% (67/74) of the patients were infected with one distinct MTBC strain belonging to lineage 4.3.3. Employing a reference genome from an outbreak strain and highly confident single nucleotide polymorphisms (SNPs) in repetitive genomic regions, e.g., the proline-glutamic acid/proline-proline-glutamic-acid (PE/PPE) gene family, increased the phylogenetic resolution compared to a classical H37Rv reference mapping approach. Specifically, the number of differentiating SNPs increased from 890 to 1,094, which resulted in a more granular transmission network as judged by an increasing number of individual nodes in a maximum parsimony tree, i.e., 5 versus 9 nodes. We also found heterogeneous alleles at phylogenetically informative sites in 29.9% (20/67) of the outbreak isolates, suggesting that these patients are infected with more than one clone. In conclusion, customized SNP calling thresholds and employment of a local reference genome for a mapping approach can improve the phylogenetic resolution in highly clonal MTBC populations and help elucidate within-host MTBC diversity. IMPORTANCE The Colombian Amazon around Puerto Nariño has a high tuberculosis burden with a prevalence of 1,267/100,000 people in 2016. Recently, an outbreak of Mycobacterium tuberculosis complex (MTBC) bacteria among the indigenous populations was identified with classical MTBC genotyping methods. Here, we employed a whole-genome sequencing-based outbreak investigation in order to improve the phylogenetic resolution and gain new insights into the transmission dynamics in this remote Colombian Amazon Region. The inclusion of well-supported single nucleotide polymorphisms in repetitive regions and a de novo-assembled local reference genome provided a more granular picture of the circulating outbreak strain and revealed new transmission chains. Multiple patients from different settlements were possibly infected with at least two different clones in this high-incidence setting. Thus, our results have the potential to improve molecular surveillance studies in other high-burden settings, especially regions with few clonal multidrug-resistant (MDR) MTBC lineages/clades.
Affiliation(s)
| | - Viola Dreyer
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Ivan Barilar
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Christian Utpatel
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Thomas A. Kohl
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Martha Isabel Murcia
- Grupo MICOBAC-UN, Departamento de Microbiología, Facultad de Medicina, Universidad Nacional de Colombia, Bogotá, Colombia
| | - Susanne Homolka
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
| | - Matthias Merker
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
- Evolution of the Resistome, Research Center Borstel, Borstel, Germany
| | - Stefan Niemann
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
|
25
|
Hussain A, Mazumder R, Ahmed A, Saima U, Phelan JE, Campino S, Ahmed D, Asadulghani M, Clark TG, Mondal D. Genome dynamics of high-risk resistant and hypervirulent Klebsiella pneumoniae clones in Dhaka, Bangladesh. Front Microbiol 2023; 14:1184196. [PMID: 37303793 PMCID: PMC10248448 DOI: 10.3389/fmicb.2023.1184196] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/11/2023] [Accepted: 05/05/2023] [Indexed: 06/13/2023] Open
Abstract
Klebsiella pneumoniae is recognized as an urgent public health threat because of the emergence of difficult-to-treat (DTR) strains and hypervirulent clones, resulting in infections with high morbidity and mortality rates. Despite its prominence, little is known about the genomic epidemiology of K. pneumoniae in resource-limited settings like Bangladesh. We sequenced genomes of 32 K. pneumoniae strains isolated from patient samples at the International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b). Genome sequences were examined for their diversity, population structure, resistome, virulome, MLST, O and K antigens and plasmids. Our results revealed the presence of two K. pneumoniae phylogroups, namely KpI (K. pneumoniae) (97%) and KpII (K. quasipneumoniae) (3%). The genomic characterization revealed that 25% (8/32) of isolates were associated with high-risk multidrug-resistant clones, including ST11, ST14, ST15, ST307, ST231 and ST147. The virulome analysis confirmed the presence of six (19%) hypervirulent K. pneumoniae (hvKp) and 26 (81%) classical K. pneumoniae (cKp) strains. The most common ESBL gene identified was blaCTX-M-15 (50%). Around 9% (3/32) of isolates exhibited a difficult-to-treat phenotype, harboring carbapenem resistance genes (two strains harbored blaNDM-5 plus blaOXA-232, and one isolate harbored blaOXA-181). The most prevalent O antigen was O1 (56%). The capsular polysaccharides K2, K20, K16 and K62 were enriched in the K. pneumoniae population. This study suggests the circulation of the major international high-risk multidrug-resistant and hypervirulent (hvKp) K. pneumoniae clones in Dhaka, Bangladesh. These findings warrant immediate appropriate interventions, which would otherwise lead to a high burden of untreatable life-threatening infections locally.
Affiliation(s)
- Arif Hussain
- Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Razib Mazumder
- Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Abdullah Ahmed
- Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Umme Saima
- Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Jody E. Phelan
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Susana Campino
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Dilruba Ahmed
- Clinical Microbiology and Immunology Laboratory, Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh, Dhaka, Bangladesh
| | - Md Asadulghani
- Biosafety and BSL3 Laboratory, Biosafety Office, International Centre for Diarrhoeal Disease Research, Bangladesh, Dhaka, Bangladesh
| | - Taane G. Clark
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Dinesh Mondal
- Laboratory Sciences and Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
|
26
|
Zhao C, Shi ZJ, Pollard KS. Pitfalls of genotyping microbial communities with rapidly growing genome collections. Cell Syst 2023; 14:160-176.e3. [PMID: 36657438 PMCID: PMC9957970 DOI: 10.1016/j.cels.2022.12.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/30/2022] [Revised: 10/15/2022] [Accepted: 12/19/2022] [Indexed: 01/20/2023]
Abstract
Detecting genetic variants in metagenomic data is a priority for understanding the evolution, ecology, and functional characteristics of microbial communities. Many tools that perform this metagenotyping rely on aligning reads of unknown origin to a database of sequences from many species before calling variants. In this synthesis, we investigate how databases of increasingly diverse and closely related species have pushed the limits of current alignment algorithms, thereby degrading the performance of metagenotyping tools. We identify multi-mapping reads as a prevalent source of errors and illustrate a trade-off between retaining correct alignments versus limiting incorrect alignments, many of which map reads to the wrong species. Then we evaluate several actionable mitigation strategies and review emerging methods showing promise to further improve metagenotyping in response to the rapid growth in genome collections. Our results have implications beyond metagenotyping to the many tools in microbial genomics that depend upon accurate read mapping.
Affiliation(s)
- Chunyu Zhao
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Zhou Jason Shi
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Katherine S Pollard
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA
|
27
|
Zhao C, Dimitrov B, Goldman M, Nayfach S, Pollard KS. MIDAS2: Metagenomic Intra-species Diversity Analysis System. Bioinformatics 2023; 39:btac713. [PMID: 36321886 PMCID: PMC9805558 DOI: 10.1093/bioinformatics/btac713] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/17/2022] [Revised: 10/07/2022] [Accepted: 10/28/2022] [Indexed: 11/07/2022] Open
Abstract
SUMMARY The Metagenomic Intra-Species Diversity Analysis System (MIDAS) is a scalable metagenomic pipeline that identifies single nucleotide variants (SNVs) and gene copy number variants in microbial populations. Here, we present MIDAS2, which addresses the computational challenges presented by increasingly large reference genome databases, while adding functionality for building custom databases and leveraging paired-end reads to improve SNV accuracy. This fast and scalable reengineering of the MIDAS pipeline enables thousands of metagenomic samples to be efficiently genotyped. AVAILABILITY AND IMPLEMENTATION The source code is available at https://github.com/czbiohub/MIDAS2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chunyu Zhao
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
| | | | - Miriam Goldman
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
| | - Stephen Nayfach
- Department of Energy, Joint Genome Institute, Berkeley, CA 94720, USA
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Katherine S Pollard
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
|
28
|
Zhao C, Goldman M, Smith BJ, Pollard KS. Genotyping Microbial Communities with MIDAS2: From Metagenomic Reads to Allele Tables. Curr Protoc 2022; 2:e604. [PMID: 36469554 PMCID: PMC9907011 DOI: 10.1002/cpz1.604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
The Metagenomic Intra-Species Diversity Analysis System 2 (MIDAS2) is a scalable pipeline that identifies single nucleotide variants and gene copy number variants in metagenomes using comprehensive reference databases built from public microbial genome collections (metagenotyping). MIDAS2 is the first metagenotyping tool with functionality to control metagenomic read mapping filters and to customize the reference database to the microbial community, features that improve the precision and recall of detected variants. In this article we present four basic protocols for the most common use cases of MIDAS2, along with supporting protocols for installation and use. In addition, we provide in-depth guidance on adjusting command line parameters, editing the reference database, optimizing hardware utilization, and understanding the metagenotyping results. All the steps of metagenotyping, from raw sequencing reads to population genetic analysis, are demonstrated with example data in two downloadable sequencing libraries of single-end metagenomic reads representing a mixture of multiple bacterial species. This set of protocols empowers users to accurately genotype hundreds of species in thousands of samples, providing rich genetic data for studying the evolution and strain-level ecology of microbial communities. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Species prescreening; Basic Protocol 2: Download MIDAS reference database; Basic Protocol 3: Population single nucleotide variant calling; Basic Protocol 4: Pan-genome copy number variant calling; Support Protocol 1: Installing MIDAS2; Support Protocol 2: Command line inputs; Support Protocol 3: Metagenotyping with a custom collection of genomes; Support Protocol 4: Metagenotyping with advanced parameters.
Affiliation(s)
- Chunyu Zhao
  - Data Science, Chan Zuckerberg Biohub, San Francisco, California
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - These authors contributed equally to this work
- Miriam Goldman
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Biomedical Informatics, University of California San Francisco, San Francisco, California
  - These authors contributed equally to this work
- Byron J. Smith
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Katherine S. Pollard
  - Data Science, Chan Zuckerberg Biohub, San Francisco, California
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California

29
Ridone P, Ishida T, Lin A, Humphreys DT, Giannoulatou E, Sowa Y, Baker MAB. The rapid evolution of flagellar ion selectivity in experimental populations of E. coli. Science Advances 2022; 8:eabq2492. [PMID: 36417540 PMCID: PMC9683732 DOI: 10.1126/sciadv.abq2492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
Determining which cellular processes facilitate adaptation requires a tractable experimental model where an environmental cue can generate variants that rescue function. The bacterial flagellar motor (BFM) is an excellent candidate: an ancient and highly conserved molecular complex for bacterial propulsion toward favorable environments. Motor rotation is often powered by H+ or Na+ ion transit through the torque-generating stator subunit of the motor complex, and ion selectivity has adapted over evolutionary time scales. Here, we used CRISPR engineering to replace the native Escherichia coli H+-powered stator with Na+-powered stator genes and report the spontaneous reversion of our edit in a low-sodium environment. We followed the evolution of the stators during their reversion to H+-powered motility and used both whole-genome and RNA sequencing to identify genes involved in the cell's adaptation. Our transplant of an unfit protein and the cells' rapid response to this edit demonstrate the adaptability of the stator subunit and highlight the hierarchical modularity of the flagellar motor.
Affiliation(s)
- Pietro Ridone
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
- Tsubasa Ishida
  - Department of Frontier Bioscience, Hosei University, Tokyo, Japan
  - Research Center for Micro-Nano Technology, Hosei University, Tokyo, Japan
- Angela Lin
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
- David T. Humphreys
  - Victor Chang Cardiac Research Institute, Sydney, Australia
  - School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Australia
- Yoshiyuki Sowa
  - Department of Frontier Bioscience, Hosei University, Tokyo, Japan
  - Research Center for Micro-Nano Technology, Hosei University, Tokyo, Japan
- Matthew A. B. Baker
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
  - ARC Centre of Excellence in Synthetic Biology, University of New South Wales, Sydney, Australia

30
Bioinformatics in bioscience and bioengineering: Recent advances, applications, and perspectives. J Biosci Bioeng 2022; 134:363-373. [PMID: 36127250 DOI: 10.1016/j.jbiosc.2022.08.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 03/19/2022] [Revised: 07/27/2022] [Accepted: 08/14/2022] [Indexed: 11/24/2022]
Abstract
Recent advances have led to the emergence of highly comprehensive and analytical approaches, such as omics analysis and high-resolution, time-resolved bioimaging analysis. These technologies have made it possible to obtain vast data from a single measurement. Subsequently, large datasets have pioneered the data-driven approach, an alternative to the traditional hypothesis-testing system, for researchers. However, processing, interpreting, and elucidating enormous datasets is no longer possible without computation. Bioinformatics is a field that has developed over long periods, intending to understand biological phenomena using methods collected from information science and statistics, thus solving this proposed research challenge. This review presents the latest methodologies and applications in sequencing, imaging, and mass spectrometry that were developed using bioinformatics. We presented the features of individual techniques and outlines in each part, avoiding the use of complex algorithms and formulas to allow beginning researchers to understand an overview. In the section on sequencing, we focused on comparative genomic, transcriptomic, and bacterial microbiome analyses, which are frequently used as applications of next-generation sequencing. Bioinformatic methods for handling sequence data and case studies were described. In the section on imaging, we introduced the analytical methods and microscopy imaging informatics techniques used in animal cell biology and plant physiology. We introduce informatics technologies for maximizing the value of measured data, including predicting the structure of unknown molecules and untargeted analysis in the section on mass spectrometry. Finally, we discuss the future outlook of this field. We anticipate that this review will assist biologists in using bioinformatics more effectively.

31
Cherchame E, Ilango G, Noël V, Cadel-Six S. Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses. Front Public Health 2022; 10:963188. [PMID: 36159272 PMCID: PMC9493441 DOI: 10.3389/fpubh.2022.963188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 08/02/2022] [Indexed: 01/24/2023] Open
Abstract
Salmonella is the most common cause of gastroenteritis in the world. Over the past 5 years, whole-genome analysis has led to the high-resolution characterization of clinical and foodborne Salmonella responsible for typhoid fever, foodborne illness or contamination of the agro-food chain. Whole-genome analyses are simplified by the availability of high-quality, complete genomes for mapping analysis and for calculating the pairwise distance between genomes, but unfortunately some difficulties may still remain. For some serovars, the complete genome is not available, or some serovars are polyphyletic and knowing the serovar alone is not sufficient for choosing the most appropriate reference genome. For these serovars, it is essential to identify the genetically closest complete genome to be able to carry out precise genome analyses. In this study, we explored the genomic proximity of 650 genomes of the 58 Salmonella enterica subsp. enterica serovars most frequently isolated in humans and from the food chain in the United States (US) and in Europe (EU), with a special focus on France. For each serovar, to take into account their genomic diversity, we included all the multilocus sequence type (MLST) profiles represented in EnteroBase with 10 or more genomes (on 19 July 2021). A phylogenetic analysis using both core- and pan-genome approaches was carried out to identify the genomic proximity of all the Salmonella studied and 20 polyphyletic serovars that have not yet been described in the literature. This study determined the genetic proximity between all 58 serovars studied and revealed polyphyletic serovars, their genomic lineages and MLST profiles. Finally, we enhanced the open-access databases with 73 new genomes and produced a list of high-quality complete reference genomes for 48 S. enterica subsp. enterica serovars among the most isolated in the US, EU, and France.

32
Meumann EM, Krause VL, Baird R, Currie BJ. Using Genomics to Understand the Epidemiology of Infectious Diseases in the Northern Territory of Australia. Trop Med Infect Dis 2022; 7:tropicalmed7080181. [PMID: 36006273 PMCID: PMC9413455 DOI: 10.3390/tropicalmed7080181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/16/2022] Open
Abstract
The Northern Territory (NT) is a geographically remote region of northern and central Australia. Approximately a third of the population are First Nations Australians, many of whom live in remote regions. Due to the physical environment and climate, and scale of social inequity, the rates of many infectious diseases are the highest nationally. Molecular typing and genomic sequencing in research and public health have provided considerable new knowledge on the epidemiology of infectious diseases in the NT. We review the applications of genomic sequencing technology for molecular typing, identification of transmission clusters, phylogenomics, antimicrobial resistance prediction, and pathogen detection. We provide examples where these methodologies have been applied to infectious diseases in the NT and discuss the next steps in public health implementation of this technology.
Affiliation(s)
- Ella M. Meumann
  - Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin 0810, Australia
  - Department of Infectious Diseases, Division of Medicine, Royal Darwin Hospital, Darwin 0810, Australia
- Vicki L. Krause
  - Northern Territory Centre for Disease Control, Northern Territory Government, Darwin 0810, Australia
- Robert Baird
  - Territory Pathology, Royal Darwin Hospital, Darwin 0810, Australia
- Bart J. Currie
  - Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin 0810, Australia
  - Department of Infectious Diseases, Division of Medicine, Royal Darwin Hospital, Darwin 0810, Australia

33
Antibiotic resistance genes in the gut microbiota of mothers and linked neonates with or without sepsis from low- and middle-income countries. Nat Microbiol 2022; 7:1337-1347. [PMID: 35927336 PMCID: PMC9417982 DOI: 10.1038/s41564-022-01184-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 05/04/2021] [Accepted: 06/23/2022] [Indexed: 12/29/2022]
Abstract
Early development of the microbiome has been shown to affect general health and physical development of the infant and, although some studies have been undertaken in high-income countries, there are few studies from low- and middle-income countries. As part of the BARNARDS study, we examined the rectal microbiota of 2,931 neonates (term used up to 60 d) with clinical signs of sepsis and of 15,217 mothers, screening for blaCTX-M-15, blaNDM, blaKPC and blaOXA-48-like genes, which were detected in 56.1%, 18.5%, 0% and 4.1% of neonates’ rectal swabs and 47.1%, 4.6%, 0% and 1.6% of mothers’ rectal swabs, respectively. Carbapenemase-positive bacteria were identified by MALDI-TOF MS and showed a high diversity of bacterial species (57 distinct species/genera) which exhibited resistance to most of the antibiotics tested. Escherichia coli, Klebsiella pneumoniae and Enterobacter cloacae/E. cloacae complex, the most commonly found isolates, were subjected to whole-genome sequencing analysis and revealed close relationships between isolates from different samples, suggesting transmission of bacteria between neonates, and between neonates and mothers. Associations between the carriage of antimicrobial resistance genes (ARGs) and healthcare/environmental factors were identified, and the presence of ARGs was a predictor of neonatal sepsis and adverse birth outcomes.
Analysis of the gut microbiota of mothers and their neonates, as part of the BARNARDS study, reveals associations between β-lactamase gene carriage and neonatal sepsis risk in low-income settings.

34
Hunt M, Letcher B, Malone KM, Nguyen G, Hall MB, Colquhoun RM, Lima L, Schatz MC, Ramakrishnan S, Iqbal Z. Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes. Genome Biol 2022; 23:147. [PMID: 35791022 PMCID: PMC9254434 DOI: 10.1186/s13059-022-02714-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2021] [Accepted: 06/20/2022] [Indexed: 12/30/2022] Open
Abstract
There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables joint genotyping; we demonstrate on a large (N=13k) M. tuberculosis cohort, building a map of non-synonymous SNPs and indels in a region where all such variants are assumed to cause rifampicin resistance. We quantify the correlation with phenotypic resistance and then replicate in a second cohort (N=10k).
Affiliation(s)
- Martin Hunt
  - EMBL-EBI, Cambridge, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Rachel M Colquhoun
  - Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Michael C Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA

35
Sabin S, Morales-Arce AY, Pfeifer SP, Jensen JD. The impact of frequently neglected model violations on bacterial recombination rate estimation: a case study in Mycobacterium canettii and Mycobacterium tuberculosis. G3 (Bethesda, Md.) 2022; 12:jkac055. [PMID: 35253851 PMCID: PMC9073693 DOI: 10.1093/g3journal/jkac055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2021] [Accepted: 02/28/2022] [Indexed: 12/04/2022]
Abstract
Mycobacterium canettii is a causative agent of tuberculosis in humans, along with the members of the Mycobacterium tuberculosis complex. Frequently used as an outgroup to the M. tuberculosis complex in phylogenetic analyses, M. canettii is thought to offer the best proxy for the progenitor species that gave rise to the complex. Here, we leverage whole-genome sequencing data and biologically relevant population genomic models to compare the evolutionary dynamics driving variation in the recombining M. canettii with that in the nonrecombining M. tuberculosis complex, and discuss differences in observed genomic diversity in the light of expected levels of Hill-Robertson interference. In doing so, we highlight the methodological challenges of estimating recombination rates through traditional population genetic approaches using sequences called from populations of microorganisms and evaluate the likely mis-inference that arises owing to a neglect of common model violations including purifying selection, background selection, progeny skew, and population size change. In addition, we compare performance when full within-host polymorphism data are utilized, versus the more common approach of basing analyses on within-host consensus sequences.
Affiliation(s)
- Susanna Sabin
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Ana Y Morales-Arce
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Susanne P Pfeifer
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Jeffrey D Jensen
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA

36
Cherchame E, Guillier L, Lailler R, Vignaud ML, Jourdan-Da Silva N, Le Hello S, Weill FX, Cadel-Six S. Salmonella enterica subsp. enterica Welikade: guideline for phylogenetic analysis of serovars rarely involved in foodborne outbreaks. BMC Genomics 2022; 23:217. [PMID: 35303794 PMCID: PMC8933937 DOI: 10.1186/s12864-022-08439-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/21/2021] [Accepted: 02/23/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Salmonella spp. is a major foodborne pathogen with a wide variety of serovars associated with human cases and food sources. Nevertheless, in Europe a panel of ten serovars is responsible for up to 80% of confirmed human cases. Clustering studies by single nucleotide polymorphism (SNP) core-genome phylogenetic analysis of outbreaks due to these major serovars are simplified by the availability of many complete genomes in the free access databases. This is not the case for outbreaks due to less common serovars, such as Welikade, for which no reference genomes are available. In this study, we propose a method to solve this problem. We propose to perform a core genome MLST (cgMLST) analysis based on hierarchical clustering using the free-access EnteroBase to select the most suitable genome to use as a reference for SNP phylogenetic analysis. In this study, we applied this protocol to a retrospective analysis of a Salmonella enterica serovar Welikade (S. Welikade) foodborne outbreak that occurred in France in 2016. Finally, we compared the cgMLST and SNP analyses. SNP phylogenetic reconstruction was carried out considering the effect of recombination events identified by the ClonalFrameML tool. The accessory genome was also explored by phage content and virulome analyses.
RESULTS: Our findings revealed high clustering concordance using cgMLST and SNP analyses. Nevertheless, SNP analysis allowed for better assessment of the genetic distance among strains. The results revealed epidemic clones of S. Welikade circulating within the poultry and dairy sectors in France, responsible for sporadic and non-sporadic human cases between 2012 and 2019.
CONCLUSIONS: This study increases knowledge on this poorly described serovar and enriches public genome databases with 42 genomes from human and non-human S. Welikade strains, including the isolate collected in 1956 in Sri Lanka, which gave the name to this serovar. This is the first genomic analysis of an outbreak due to S. Welikade described to date.
Affiliation(s)
- Emeline Cherchame
  - Laboratory for Food Safety, French Agency for Food, Environmental and Occupational Health & Safety (ANSES), 94700, Maisons-Alfort, France
  - Present address: Data Analysis Core, Paris Brain Institute, ICM, Paris, France
- Laurent Guillier
  - Laboratory for Food Safety, French Agency for Food, Environmental and Occupational Health & Safety (ANSES), 94700, Maisons-Alfort, France
- Renaud Lailler
  - Laboratory for Food Safety, French Agency for Food, Environmental and Occupational Health & Safety (ANSES), 94700, Maisons-Alfort, France
- Marie-Leone Vignaud
  - Laboratory for Food Safety, French Agency for Food, Environmental and Occupational Health & Safety (ANSES), 94700, Maisons-Alfort, France
- Simon Le Hello
  - Centre National de Référence Des Escherichia Coli, Institut Pasteur, Unité Des Bactéries Pathogènes Entériques, Shigella et Salmonella, 75015, Paris, France
  - Present address: Groupe de Recherche Sur L'Adaptation Microbienne (GRAM 2.0), Normandie Univ, UNICAEN, Caen, France
- François-Xavier Weill
  - Centre National de Référence Des Escherichia Coli, Institut Pasteur, Unité Des Bactéries Pathogènes Entériques, Shigella et Salmonella, 75015, Paris, France
- Sabrina Cadel-Six
  - Laboratory for Food Safety, French Agency for Food, Environmental and Occupational Health & Safety (ANSES), 94700, Maisons-Alfort, France

37
van der Putten BCL, Huijsmans NAH, Mende DR, Schultsz C. Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution. Microb Genom 2022; 8. [PMID: 35290758 PMCID: PMC9176278 DOI: 10.1099/mgen.0.000799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Phylogenetic analyses are widely used in microbiological research, for example to trace the progression of bacterial outbreaks based on whole-genome sequencing data. In practice, multiple analysis steps such as de novo assembly, alignment and phylogenetic inference are combined to form phylogenetic workflows. Comprehensive benchmarking of the accuracy of complete phylogenetic workflows is lacking. To benchmark different phylogenetic workflows, we simulated bacterial evolution under a wide range of evolutionary models, varying the relative rates of substitution, insertion, deletion, gene duplication, gene loss and lateral gene transfer events. The generated datasets corresponded to a genetic diversity usually observed within bacterial species (≥95 % average nucleotide identity). We replicated each simulation three times to assess replicability. In total, we benchmarked 19 distinct phylogenetic workflows using 8 different simulated datasets. We found that recently developed k-mer alignment methods such as kSNP and ska achieve similar accuracy as reference mapping. The high accuracy of k-mer alignment methods can be explained by the large fractions of genomes these methods can align, relative to other approaches. We also found that the choice of de novo assembly algorithm influences the accuracy of phylogenetic reconstruction, with workflows employing SPAdes or skesa outperforming those employing Velvet. Finally, we found that the results of phylogenetic benchmarking are highly variable between replicates. We conclude that for phylogenomic reconstruction, k-mer alignment methods are relevant alternatives to reference mapping at the species level, especially in the absence of suitable reference genomes. We show de novo genome assembly accuracy to be an underappreciated parameter required for accurate phylogenomic reconstruction.
Affiliation(s)
- Boas C L van der Putten
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Niek A H Huijsmans
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Daniel R Mende
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Constance Schultsz
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands

38
Omaleki L, Blackall PJ, Cuddihy T, White RT, Courtice JM, Turni C, Forde BM, Beatson SA. Phase variation in the glycosyltransferase genes of Pasteurella multocida associated with outbreaks of fowl cholera on free-range layer farms. Microb Genom 2022; 8. [PMID: 35266868 PMCID: PMC9176279 DOI: 10.1099/mgen.0.000772] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022] Open
Abstract
Fowl cholera caused by Pasteurella multocida has re-emerged in Australian poultry production since the increasing adoption of free-range production systems. Currently, autogenous killed whole-cell vaccines prepared from the isolates previously obtained from each farm are the main preventative measures used. In this study, we use whole-genome sequencing and phylogenomic analysis to investigate outbreak dynamics, as well as monitoring and comparing the variations in the lipopolysaccharide (LPS) outer core biosynthesis loci of the outbreak and vaccine strains. In total, 73 isolates from two different free-range layer farms were included. Our genomic analysis revealed that all investigated isolates within the two farms (layer A and layer B) carried LPS type L3, albeit with a high degree of genetic diversity between them. Additionally, the isolates belonged to five different sequence types (STs), with isolates belonging to ST9 and ST20 being the most prevalent. The isolates carried ST-specific mutations within their LPS type L3 outer core biosynthesis loci, including frameshift mutations in the outer core heptosyltransferase gene (htpE) (ST7 and ST274) or galactosyltransferase gene (gatG) (ST20). The ST9 isolates could be separated into three groups based on their LPS outer core biosynthesis loci sequences, with evidence for potential phase variation mechanisms identified. The potential phase variation mechanisms included a tandem repeat insertion in natC and a single base deletion in a homopolymer region of gatG. Importantly, our results demonstrated that two of the three ST9 groups shared identical rep-PCR (repetitive extragenic palindromic PCR) patterns, while carrying differences in their LPS outer core biosynthesis loci region. In addition, we found that ST9 isolates either with or without the natC tandem repeat insertion were both associated with a single outbreak, which would indicate the importance of screening more than one isolate within an outbreak. 
Our results strongly suggest the need for a metagenomics culture-independent approach, as well as a genetic typing scheme for LPS, to ensure an appropriate vaccine strain with a matching predicted LPS structure is used.
Affiliation(s)
- Lida Omaleki
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
  - Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
- Patrick J Blackall
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
- Thom Cuddihy
  - QFAB Bioinformatics - Research Computing Centre, University of Queensland, St Lucia, QLD 4072, Australia
  - Present address: University of Queensland Centre for Clinical Research, Royal Brisbane and Women's Hospital Campus, Herston, QLD 4029, Australia
- Rhys T White
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
  - Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
- Jodi M Courtice
  - Division of Research and Innovation, University of Southern Queensland, Toowoomba, QLD 4350, Australia
- Conny Turni
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
- Brian M Forde
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
  - Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
  - Present address: University of Queensland Centre for Clinical Research, Royal Brisbane and Women's Hospital Campus, Herston, QLD 4029, Australia
- Scott A Beatson
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
  - Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia

39
Rapid expansion and extinction of antibiotic resistance mutations during treatment of acute bacterial respiratory infections. Nat Commun 2022; 13:1231. [PMID: 35264582 PMCID: PMC8907320 DOI: 10.1038/s41467-022-28188-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/05/2021] [Accepted: 01/07/2022] [Indexed: 11/18/2022] Open
Abstract
Acute bacterial infections are often treated empirically, with the choice of antibiotic therapy updated during treatment. The effects of such rapid antibiotic switching on the evolution of antibiotic resistance in individual patients are poorly understood. Here we find that low-frequency antibiotic resistance mutations emerge, contract, and even go to extinction within days of changes in therapy. We analyzed Pseudomonas aeruginosa populations in sputum samples collected serially from 7 mechanically ventilated patients at the onset of respiratory infection. Combining short- and long-read sequencing and resistance phenotyping of 420 isolates revealed that while new infections are near-clonal, reflecting a recent colonization bottleneck, resistance mutations could emerge at low frequencies within days of therapy. We then measured the in vivo frequencies of select resistance mutations in intact sputum samples with resistance-targeted deep amplicon sequencing (RETRA-Seq), which revealed that rare resistance mutations not detected by clinically used culture-based methods can increase by nearly 40-fold over 5–12 days in response to antibiotic changes. Conversely, mutations conferring resistance to antibiotics not administered diminish and even go to extinction. Our results underscore how therapy choice shapes the dynamics of low-frequency resistance mutations at short time scales, and the findings provide a possibility for driving resistance mutations to extinction during early stages of infection by designing patient-specific antibiotic cycling strategies informed by deep genomic surveillance.
It remains unclear how rapid antibiotic switching affects the evolution of antibiotic resistance in individual patients. Here, Chung et al. combine short- and long-read sequencing and resistance phenotyping of 420 serial isolates of Pseudomonas aeruginosa collected from the onset of respiratory infection, and show that rare resistance mutations can increase by nearly 40-fold over 5–12 days in response to antibiotic changes, while mutations conferring resistance to antibiotics not administered diminish and even go to extinction.

40
van Dijk LR, Walker BJ, Straub TJ, Worby CJ, Grote A, Schreiber HL, Anyansi C, Pickering AJ, Hultgren SJ, Manson AL, Abeel T, Earl AM. StrainGE: a toolkit to track and characterize low-abundance strains in complex microbial communities. Genome Biol 2022; 23:74. [PMID: 35255937 PMCID: PMC8900328 DOI: 10.1186/s13059-022-02630-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 05/28/2021] [Accepted: 02/09/2022] [Indexed: 01/21/2023] Open
Abstract
Human-associated microbial communities comprise not only complex mixtures of bacterial species, but also mixtures of conspecific strains, the implications of which are mostly unknown since strain level dynamics are underexplored due to the difficulties of studying them. We introduce the Strain Genome Explorer (StrainGE) toolkit, which deconvolves strain mixtures and characterizes component strains at the nucleotide level from short-read metagenomic sequencing with higher sensitivity and resolution than other tools. StrainGE is able to identify strains at 0.1x coverage and detect variants for multiple conspecific strains within a sample from coverages as low as 0.5x.
Affiliation(s)
- Lucas R. van Dijk
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
| | - Bruce J. Walker
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
- Applied Invention, Cambridge, MA USA
| | - Timothy J. Straub
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115 USA
| | - Colin J. Worby
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
| | - Alexandra Grote
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
| | - Henry L. Schreiber
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, MO 63110 USA
- Center for Women’s Infectious Disease Research (CWIDR), Washington University School of Medicine, St. Louis, MO 63110 USA
| | - Christine Anyansi
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
| | - Amy J. Pickering
- Department of Civil and Environmental Engineering, University of California, Berkeley, Berkeley, CA 94720 USA
- Stuart B. Levy Center for Integrated Management of Antimicrobial Resistance (Levy CIMAR), Tufts University, Boston, MA USA
| | - Scott J. Hultgren
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, MO 63110 USA
- Center for Women’s Infectious Disease Research (CWIDR), Washington University School of Medicine, St. Louis, MO 63110 USA
| | - Abigail L. Manson
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
| | - Thomas Abeel
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
| | - Ashlee M. Earl
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142 USA
| |
|
41
|
Baktash A, Corver J, Harmanus C, Smits WK, Fawley W, Wilcox MH, Kumar N, Eyre DW, Indra A, Mellmann A, Kuijper EJ. Comparison of Whole-Genome Sequence-Based Methods and PCR Ribotyping for Subtyping of Clostridioides difficile. J Clin Microbiol 2022; 60:e0173721. [PMID: 34911367 PMCID: PMC8849210 DOI: 10.1128/jcm.01737-21] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 08/10/2021] [Accepted: 11/22/2021] [Indexed: 11/20/2022] Open
Abstract
Clostridioides difficile is the most common cause of antibiotic-associated gastrointestinal infections. Capillary electrophoresis (CE)-PCR ribotyping is currently the gold standard for C. difficile typing but lacks the discriminatory power to study transmission and outbreaks in detail. New molecular methods have the capacity to differentiate better and provide standardized and interlaboratory exchangeable data. Using a well-characterized collection of diverse strains (N = 630; 100 unique ribotypes [RTs]), we compared the discriminatory power of core genome multilocus sequence typing (cgMLST) (SeqSphere and EnteroBase), whole-genome MLST (wgMLST) (EnteroBase), and single-nucleotide polymorphism (SNP) analysis. A unique cgMLST profile (more than six allele differences) was observed in 82 of 100 RTs, indicating that cgMLST could distinguish most, but not all, RTs. Application of cgMLST in two outbreak settings with RT078 and RT181 (known to have low intra-RT allele differences) showed no distinction between outbreak and nonoutbreak strains in contrast to wgMLST and SNP analysis. We conclude that cgMLST has the potential to be an alternative to CE-PCR ribotyping. The method is reproducible, easy to standardize, and offers higher discrimination. However, adjusted cutoff thresholds and epidemiological data are necessary to recognize outbreaks of some specific RTs. We propose to use an allelic threshold of three alleles to identify outbreaks.
Affiliation(s)
- A. Baktash
- Department of Medical Microbiology, Section Experimental Bacteriology, Leiden University Medical Center, Leiden, The Netherlands
| | - J. Corver
- Department of Medical Microbiology, Section Experimental Bacteriology, Leiden University Medical Center, Leiden, The Netherlands
| | - C. Harmanus
- Department of Medical Microbiology, Section Experimental Bacteriology, Leiden University Medical Center, Leiden, The Netherlands
- National Reference Laboratory for Clostridioides difficile, National Institute of Public Health and the Environment, Leiden University Medical Center, Leiden, The Netherlands
| | - W. K. Smits
- Department of Medical Microbiology, Section Experimental Bacteriology, Leiden University Medical Center, Leiden, The Netherlands
| | - W. Fawley
- National Infection Service, Public Health England, and University of Leeds, Leeds, United Kingdom
| | - M. H. Wilcox
- Department of Microbiology, Leeds Teaching Hospitals and University of Leeds, Leeds, United Kingdom
| | - N. Kumar
- Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - D. W. Eyre
- Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
| | - A. Indra
- Paracelsus Medical University of Salzburg, Salzburg, Austria
| | - A. Mellmann
- Institute of Hygiene, University Hospital Münster, and National Reference Center for C. difficile, Münster Branch, Münster, Germany
| | - E. J. Kuijper
- Department of Medical Microbiology, Section Experimental Bacteriology, Leiden University Medical Center, Leiden, The Netherlands
- National Reference Laboratory for Clostridioides difficile, National Institute of Public Health and the Environment, Leiden University Medical Center, Leiden, The Netherlands
| |
|
42
|
Abstract
Whole-genome sequencing (WGS) is a powerful method for detecting drug resistance, genetic diversity, and transmission dynamics of Mycobacterium tuberculosis. Implementation of WGS in public health microbiology laboratories is impeded by a lack of user-friendly, automated, and semiautomated pipelines. We present the COMBAT-TB Workbench, a modular, easy-to-install application that provides a web-based environment for Mycobacterium tuberculosis bioinformatics. The COMBAT-TB Workbench is built using two main software components: the IRIDA platform for its web-based user interface and data management capabilities and the Galaxy bioinformatics workflow platform for workflow execution. These components are combined into a single easy-to-install application using Docker container technology. We implemented two workflows, for M. tuberculosis sample analysis and phylogeny, in Galaxy. Building our workflows involved updating some Galaxy tools (Trimmomatic, snippy, and snp-sites) and writing new Galaxy tools (snp-dists, TB-Profiler, tb_variant_filter, and TB Variant Report). The irida-wf-ga2xml tool was updated to be able to work with recent versions of Galaxy and was further developed into IRIDA plugins for both workflows. In the case of the M. tuberculosis sample analysis, an interface was added to update the metadata stored for each sequence sample with results gleaned from the Galaxy workflow output. Data can be loaded into the COMBAT-TB Workbench via the web interface or via the command line IRIDA uploader tool. The COMBAT-TB Workbench application deploys IRIDA, the COMBAT-TB IRIDA plugins, the MariaDB database, and Galaxy using Docker containers (https://github.com/COMBAT-TB/irida-galaxy-deploy). 
IMPORTANCE While the reduction in the cost of WGS is making sequencing more affordable in lower- and middle-income countries (LMICs), public health laboratories in these countries seldom have access to bioinformaticians and system support engineers adept at using the Linux command line and complex bioinformatics software. The COMBAT-TB Workbench provides an open-source, modular, easy-to-deploy and -use environment for managing and analyzing M. tuberculosis WGS data and thereby makes WGS usable in practice in the LMIC context.
|
43
|
Lozica L, Villumsen KR, Li G, Hu X, Maljković MM, Gottstein Ž. Genomic Analysis of Escherichia coli Longitudinally Isolated from Broiler Breeder Flocks after the Application of an Autogenous Vaccine. Microorganisms 2022; 10:microorganisms10020377. [PMID: 35208834 PMCID: PMC8879504 DOI: 10.3390/microorganisms10020377] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/07/2021] [Revised: 01/29/2022] [Accepted: 02/03/2022] [Indexed: 11/24/2022] Open
Abstract
Escherichia coli is the main bacterial cause of major economic losses and animal welfare issues in poultry production. In this study, we investigate the effect of an autogenous vaccine on E. coli strains longitudinally isolated from broiler breeder flocks on two farms. In total, 115 E. coli isolates were sequenced using Illumina technologies, and compared based on a single-nucleotide polymorphism (SNP) analysis of the core-genome and antimicrobial resistance (AMR) genes they carried. The results showed that SNP-based phylogeny corresponds to a previous multilocus-sequence typing (MLST)-based phylogeny. Highly virulent sequence types (STs), including ST117-F, ST95-B2, ST131-B2 and ST390-B2, showed a higher level of homogeneity. On the other hand, less frequent STs, such as ST1485, ST3232, ST7013 and ST8573, were phylogenetically more distant and carried a higher number of antimicrobial resistance genes in most cases. In total, 25 antimicrobial genes were detected, of which the most prevalent were mdf(A) (100%), sitABCD (71.3%) and tet(A) (13.91%). The frequency of AMR genes showed a decreasing trend over time in both farms. The highest prevalence was detected in strains belonging to the B1 phylogenetic group, confirming the previous notion that commensal strains act as reservoirs and carry more resistance genes than pathogenic strains that are mostly associated with virulence genes.
Affiliation(s)
- Liča Lozica
- Department of Poultry Diseases with Clinic, Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, 10000 Zagreb, Croatia;
| | - Kasper Rømer Villumsen
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Dyrlægevej 88, 1870 Copenhagen, Denmark;
| | - Ganwu Li
- State Key Laboratory of Veterinary Biotechnology, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin 150069, China;
- Department of Veterinary Diagnostic and Production Animal Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA 50011, USA;
| | - Xiao Hu
- Department of Veterinary Diagnostic and Production Animal Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA 50011, USA;
| | - Maja Maurić Maljković
- Department of Animal Breeding and Livestock Production, Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, 10000 Zagreb, Croatia;
| | - Željko Gottstein
- Department of Poultry Diseases with Clinic, Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, 10000 Zagreb, Croatia;
- Correspondence: Tel.: +385-1239-0280
| |
|
44
|
Higgs C, Sherry NL, Seemann T, Horan K, Walpola H, Kinsella P, Bond K, Williamson DA, Marshall C, Kwong JC, Grayson ML, Stinear TP, Gorrie CL, Howden BP. Optimising genomic approaches for identifying vancomycin-resistant Enterococcus faecium transmission in healthcare settings. Nat Commun 2022; 13:509. [PMID: 35082278 PMCID: PMC8792028 DOI: 10.1038/s41467-022-28156-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 06/01/2021] [Accepted: 01/07/2022] [Indexed: 11/08/2022] Open
Abstract
Vancomycin-resistant Enterococcus faecium (VREfm) is a major nosocomial pathogen. Identifying VREfm transmission dynamics permits targeted interventions, and while genomics is increasingly being utilised, methods are not yet standardised or optimised for accuracy. We aimed to develop a standardized genomic method for identifying putative VREfm transmission links. Using comprehensive genomic and epidemiological data from a cohort of 308 VREfm infection or colonization cases, we compared multiple approaches for quantifying genetic relatedness. We showed that clustering by core genome multilocus sequence type (cgMLST) was more informative of population structure than traditional MLST. Pairwise genome comparisons using split k-mer analysis (SKA) provided the high-level resolution needed to infer patient-to-patient transmission. The more common mapping to a reference genome was not sufficiently discriminatory, defining more than three times more genomic transmission events than SKA (3729 compared to 1079 events). Here, we show a standardized genomic framework for inferring VREfm transmission that can be the basis for global deployment of VREfm genomics into routine outbreak detection and investigation.
Affiliation(s)
- Charlie Higgs
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Norelle L Sherry
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Department of Infectious Diseases, Austin Health, Melbourne, VIC, Australia
| | - Torsten Seemann
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Kristy Horan
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Hasini Walpola
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Paul Kinsella
- Department of Microbiology, Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Katherine Bond
- Department of Microbiology, Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Deborah A Williamson
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Department of Microbiology, Royal Melbourne Hospital, Melbourne, VIC, Australia
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital, The Peter Doherty Institute for Infection and Immunity, Melbourne, VIC, Australia
| | - Caroline Marshall
- Victorian Infectious Diseases Service, The Peter Doherty Institute for Infection and Immunity, Melbourne, VIC, Australia
| | - Jason C Kwong
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Department of Infectious Diseases, Austin Health, Melbourne, VIC, Australia
| | - M Lindsay Grayson
- Department of Infectious Diseases, Austin Health, Melbourne, VIC, Australia
- Department of Medicine, Austin Health, The University of Melbourne, Heidelberg, VIC, Australia
| | - Timothy P Stinear
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Claire L Gorrie
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Benjamin P Howden
- Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia.
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology & Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia.
- Department of Infectious Diseases, Austin Health, Melbourne, VIC, Australia.
| |
|
45
|
Field JT, Abrams AJ, Cartee JC, McTavish EJ. Rapid alignment updating with Extensiphy. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Jasper Toscani Field
- Quantitative and Systems Biology Program School of Natural Sciences University of California Merced CA USA
| | - A. Jeanine Abrams
- Division of STD Prevention National Centers for HIV/AIDS Viral Hepatitis, STD, and TB Prevention Atlanta GA USA
| | - John C. Cartee
- Division of STD Prevention National Centers for HIV/AIDS Viral Hepatitis, STD, and TB Prevention Atlanta GA USA
| | - Emily Jane McTavish
- Life and Environmental Sciences Department School of Natural Sciences University of California Merced CA USA
| |
|
46
|
Whole-Genome Sequencing Reveals the High Nosocomial Transmission and Antimicrobial Resistance of Clostridioides difficile in a Single Center in China, a Four-Year Retrospective Study. Microbiol Spectr 2022; 10:e0132221. [PMID: 35019676 PMCID: PMC8754133 DOI: 10.1128/spectrum.01322-21] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/18/2022] Open
Abstract
Clostridioides difficile, which causes life-threatening diarrheal disease, presents an urgent threat to health care systems. In this study, we present a retrospective genomic and epidemiological analysis of C. difficile in a large teaching hospital. First, we collected 894 nonduplicate fecal samples from patients during a whole year to elucidate the C. difficile molecular epidemiology. We then presented a detailed description of the population structure of C. difficile based on 270 isolates separated between 2015 and 2020 and clarified the genetic and phenotypic features by MIC and whole-genome sequencing. We observed a high carriage rate (19.4%, 173/894) of C. difficile among patients in this hospital. The population structure of C. difficile was diverse with a total of 36 distinct STs assigned. In total, 64.8% (175/270) of the isolates were toxigenic, including four CDT-positive (C. difficile transferase) isolates, and 50.4% (135/268) of the isolates were multidrug-resistant. Statistically, the rates of resistance to erythromycin, moxifloxacin, and rifaximin were higher for nontoxigenic isolates. Although no vancomycin-resistant isolates were detected, the MIC for vancomycin was higher for toxigenic isolates (P < 0.01). The in-hospital transmission was observed, with 43.8% (110/251) of isolates being genetically linked to a prior case. However, no strong correlation was detected between the genetic linkage and epidemiological linkage. Asymptomatic colonized patients play the same role in nosocomial transmission as infected patients, raising the issue of routine screening of C. difficile on admission. This work provides an in-depth description of C. difficile in a hospital setting and paves the way for better surveillance and effective prevention of related diseases in China.
IMPORTANCE Clostridioides difficile infections (CDI) are the leading cause of healthcare-associated diarrhea and are known to be resistant to multiple antibiotics. In the past decade, C. difficile has emerged rapidly and has spread globally, causing great concern among American and European countries. However, research on CDI remains limited in China. Here, we characterized the comprehensive spectrum of C. difficile by whole-genome sequencing (WGS) in a Chinese hospital, showing a high detection rate among patients, diverse genome characteristics, a high level of antibiotic resistance, and an unknown nosocomial transmission risk of C. difficile. During the study period, two C. difficile transferase (CDT)-positive isolates belonging to a new multilocus sequence type (ST820) were detected, which have caused serious clinical symptoms. This work describes C. difficile integrally and provides new insight into C. difficile surveillance based on WGS in China.
|
47
|
Goossens SN, Heupink TH, De Vos E, Dippenaar A, De Vos M, Warren R, Van Rie A. Detection of minor variants in Mycobacterium tuberculosis whole genome sequencing data. Brief Bioinform 2021; 23:6484510. [PMID: 34962257 PMCID: PMC8769888 DOI: 10.1093/bib/bbab541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Revised: 11/05/2021] [Accepted: 11/24/2021] [Indexed: 11/25/2022] Open
Abstract
The study of genetic minority variants is fundamental to the understanding of complex processes such as evolution, fitness, transmission, virulence, heteroresistance and drug tolerance in Mycobacterium tuberculosis (Mtb). We evaluated the performance of the variant calling tool LoFreq to detect de novo as well as drug resistance conferring minor variants in both in silico and clinical Mtb next generation sequencing (NGS) data. The in silico simulations demonstrated that LoFreq is a conservative variant caller with very high precision (≥96.7%) over the entire range of depth of coverage tested (30x to 1000x), independent of the type and frequency of the minor variant. Sensitivity increased with increasing depth of coverage and increasing frequency of the variant, and was higher for calling insertion and deletion (indel) variants than for single nucleotide polymorphisms (SNP). The variant frequency limit of detection was 0.5% and 3% for indel and SNP minor variants, respectively. For serial isolates from a patient with DR-TB, LoFreq successfully identified all minor Mtb variants in the Rv0678 gene (allele frequency as low as 3.22% according to targeted deep sequencing) in whole genome sequencing data (median coverage of 62X). In conclusion, LoFreq can successfully detect minor variant populations in Mtb NGS data, thus limiting the need for filtering of possible false positive variants due to sequencing error. The observed performance statistics can be used to determine the limit of detection in existing whole genome sequencing Mtb data and guide the required depth of future studies that aim to investigate the presence of minor variants.
Affiliation(s)
- Sander N Goossens
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium
| | - Tim H Heupink
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium
| | - Elise De Vos
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium
| | - Anzaan Dippenaar
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium
| | | | - Rob Warren
- Department of Science and Innovation-National Research Foundation Centre for Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, South Africa
| | - Annelies Van Rie
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium
| |
|
48
|
Viehweger A, Blumenscheit C, Lippmann N, Wyres KL, Brandt C, Hans JB, Hölzer M, Irber L, Gatermann S, Lübbert C, Pletz MW, Holt KE, König B. Context-aware genomic surveillance reveals hidden transmission of a carbapenemase-producing Klebsiella pneumoniae. Microb Genom 2021; 7:000741. [PMID: 34913861 PMCID: PMC8767333 DOI: 10.1099/mgen.0.000741] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/05/2021] [Accepted: 11/04/2021] [Indexed: 01/18/2023] Open
Abstract
Genomic surveillance can inform effective public health responses to pathogen outbreaks. However, integration of non-local data is rarely done. We investigate two large hospital outbreaks of a carbapenemase-carrying Klebsiella pneumoniae strain in Germany and show the value of contextual data. By screening about 10 000 genomes, over 400 000 metagenomes and two culture collections using in silico and in vitro methods, we identify a total of 415 closely related genomes reported in 28 studies. We identify the relationship between the two outbreaks through time-dated phylogeny, including their respective origin. One of the outbreaks presents extensive hidden transmission, with descendant isolates only identified in other studies. We then leverage the genome collection from this meta-analysis to identify genes under positive selection. We thereby identify an inner membrane transporter (ynjC) with a putative role in colistin resistance. Contextual data from other sources can thus enhance local genomic surveillance at multiple levels and should be integrated by default when available.
Affiliation(s)
- Adrian Viehweger
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| | | | - Norman Lippmann
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| | - Kelly L. Wyres
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Australia
| | - Christian Brandt
- Institute for Infectious Diseases and Infection Control, Jena University Hospital, Jena, Germany
| | - Jörg B. Hans
- National Reference Center for multidrug-resistant Gram-negative bacteria, Department for Medical Microbiology, Ruhr-University Bochum, Bochum, Germany
| | - Martin Hölzer
- Methodology and Research Infrastructure, MF1 Bioinformatics, Robert Koch Institute, Berlin, Germany
| | - Luiz Irber
- Department of Population Health and Reproduction, University of California, Davis, Davis, California, USA
| | - Sören Gatermann
- National Reference Center for multidrug-resistant Gram-negative bacteria, Department for Medical Microbiology, Ruhr-University Bochum, Bochum, Germany
| | - Christoph Lübbert
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine II, University Hospital Leipzig, Leipzig, Germany
| | - Mathias W. Pletz
- Institute for Infectious Diseases and Infection Control, Jena University Hospital, Jena, Germany
| | - Kathryn E. Holt
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Australia
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, UK
| | - Brigitte König
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| |
|
49
|
Bogaerts B, Winand R, Van Braekel J, Hoffman S, Roosens NHC, De Keersmaecker SCJ, Marchal K, Vanneste K. Evaluation of WGS performance for bacterial pathogen characterization with the Illumina technology optimized for time-critical situations. Microb Genom 2021; 7:000699. [PMID: 34739368 PMCID: PMC8743554 DOI: 10.1099/mgen.0.000699] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/07/2021] [Accepted: 09/30/2021] [Indexed: 12/29/2022] Open
Abstract
Whole genome sequencing (WGS) has become the reference standard for bacterial outbreak investigation and pathogen typing, providing a resolution unattainable with conventional molecular methods. Data generated with Illumina sequencers can however only be analysed after the sequencing run has finished, thereby losing valuable time during emergency situations. We evaluated both the effect of decreasing overall run time, and also a protocol to transfer and convert intermediary files generated by Illumina sequencers enabling real-time data analysis for multiple samples part of the same ongoing sequencing run, as soon as the forward reads have been sequenced. To facilitate implementation for laboratories operating under strict quality systems, extensive validation of several bioinformatics assays (16S rRNA species confirmation, gene detection against virulence factor and antimicrobial resistance databases, SNP-based antimicrobial resistance detection, serotype determination, and core genome multilocus sequence typing) for three bacterial pathogens (Mycobacterium tuberculosis, Neisseria meningitidis, and Shiga-toxin producing Escherichia coli) was performed by evaluating performance in function of the two most critical sequencing parameters, i.e. read length and coverage. For the majority of evaluated bioinformatics assays, actionable results could be obtained between 14 and 22 h of sequencing, decreasing the overall sequencing-to-results time by more than half. This study aids in reducing the turn-around time of WGS analysis by facilitating a faster response in time-critical scenarios and provides recommendations for time-optimized WGS with respect to required read length and coverage to achieve a minimum level of performance for the considered bioinformatics assay(s), which can also be used to maximize the cost-effectiveness of routine surveillance sequencing when response time is not essential.
Affiliation(s)
- Bert Bogaerts
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent (9000), Belgium
| | - Raf Winand
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
| | - Julien Van Braekel
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
| | - Stefan Hoffman
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
| | - Nancy H. C. Roosens
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
| | | | - Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent (9000), Belgium
- Department of Information Technology, IDLab, imec, Ghent University, Ghent (9000), Belgium
- Department of Genetics, University of Pretoria, 0001 Pretoria, South Africa
| | - Kevin Vanneste
- Transversal activities in Applied Genomics, Sciensano, Brussels (1050), Belgium
| |
|
50
|
Coll F. Key variables affecting genetic distance calculations in genomic epidemiology. Lancet Microbe 2021; 2:e486-e487. [DOI: 10.1016/s2666-5247(21)00183-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2021] [Accepted: 07/06/2021] [Indexed: 11/29/2022] Open
|