1
|
Sherry NL, Lee JYH, Giulieri SG, Connor CH, Horan K, Lacey JA, Lane CR, Carter GP, Seemann T, Egli A, Stinear TP, Howden BP. Genomics for antimicrobial resistance-progress and future directions. Antimicrob Agents Chemother 2025; 69:e0108224. [PMID: 40227048 PMCID: PMC12057382 DOI: 10.1128/aac.01082-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/15/2025] Open
Abstract
Antimicrobial resistance (AMR) is a critical global public health threat, with bacterial pathogens of primary concern. Pathogen genomics has revolutionized the study of bacterial pathogens and provided deep insights into the mechanisms and dissemination of AMR, with the precision of whole-genome sequencing informing better control strategies. However, generating actionable data from genomic surveillance and diagnostic efforts requires integration at the public health and clinical interface that goes beyond academic efforts to identify resistance mechanisms, undertake post hoc analyses of outbreaks, and share data after research publications. In addition to timely genomics data, consideration also needs to be given to epidemiological sampling frames, analysis, and reporting mechanisms that meet International Organization for Standardization (ISO) standards and generation of reports that are interpretable and actionable for public health and clinical "end-users." Importantly, ensuring all countries have equitable access to data and technology is critical, through timely data sharing following the FAIR principles (findable, accessible, interoperable, and re-usable). In this review, we (i) describe advances in genomic approaches for AMR research and surveillance to understand emergence, evolution, and transmission of AMR and the key requirements to enable this work and (ii) discuss emerging and future applications of genomics at the clinical and public health interface, including barriers to implementation. Harnessing advances in genomics-enhanced AMR research and embedding robust and reproducible workflows within clinical and public health practice promises to maximize the impact of pathogen genomics for AMR globally in the coming decade.
Affiliation(s)
- Norelle L. Sherry
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- WHO Collaborating Centre for Antimicrobial Resistance, Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Department of Infectious Diseases and Immunology, Austin Health, Heidelberg, Victoria, Australia
| | - Jean Y. H. Lee
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Department of Infectious Diseases, Monash Health, Clayton, Victoria, Australia
| | - Stefano G. Giulieri
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Victorian Infectious Diseases Service, Doherty Institute for Infection and Immunity, The Royal Melbourne Hospital, Melbourne, Victoria, Australia
| | - Christopher H. Connor
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Kristy Horan
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Jake A. Lacey
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Courtney R. Lane
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- WHO Collaborating Centre for Antimicrobial Resistance, Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
| | - Glen P. Carter
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Torsten Seemann
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
| | - Adrian Egli
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
| | - Timothy P. Stinear
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Benjamin P. Howden
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- WHO Collaborating Centre for Antimicrobial Resistance, Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Department of Infectious Diseases and Immunology, Austin Health, Heidelberg, Victoria, Australia
- Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Microbiology Department, Royal Melbourne Hospital, Melbourne, Victoria, Australia
|
2
|
Chibani S, Yacoub E, Boujemaa S, Mardassi H, Guglielmini J, Vaysse A, Khadraoui N, Mlik B, Ben Abdelmoumen Mardassi B. A genome-wide investigation of Mycoplasma hominis genes associated with gynecological infections or infertility. Front Microbiol 2025; 16:1561378. [PMID: 40371111 PMCID: PMC12075135 DOI: 10.3389/fmicb.2025.1561378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2025] [Accepted: 03/18/2025] [Indexed: 05/16/2025] Open
Abstract
Background and aim: Mycoplasma hominis is a human pathogenic bacterium that causes a wide range of genital infections and reproductive issues. Previously, based on an extended multilocus sequence typing scheme, we provided evidence for the segregation of M. hominis clinical strains into two distinct pathotypes: gynecological infections or infertility. Here, based on whole genome sequencing (WGS) data, we sought to provide a more refined picture of the phylogenetic relationship between these two M. hominis pathotypes, with the aim of delineating the underlying genetic determinants. Methods: We carried out WGS of 62 Tunisian M. hominis clinical strains collected over a 17-year period. The majority of these clinical strains are associated with infertility (n = 53), and the remaining nine isolates are from gynecological infection cases. An alignment-free distance-based procedure (Jolytree) was used to infer phylogenetic relationships among M. hominis isolates, while the phylogenetic method treeWAS was used to determine the statistical association between pathotypes of interest and genotypes at all loci. Results: The total pangenome of M. hominis strains was found to contain 1,590 genes including 966 core genes and 592 accessory genes, representing 60 and 37% of the total genome, respectively. Collectively, phylogenetic analyses based on WGS confirmed the distinction between the two M. hominis pathotypes. Strikingly, genome-wide association analyses identified four virulence genes associated with gynecological infections, mainly involved in nucleotide salvage pathways and tolerance to oxidative stress, while five genes were associated with infertility cases, two of which are implicated in biofilm formation. Conclusion: In sum, this study further established the categorization of M. hominis into two pathotypes and led to the identification of the associated genetic loci, thus holding out promising prospects for a better understanding of the differential interaction of M. hominis with its host.
Affiliation(s)
- Salim Chibani
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Elhem Yacoub
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Safa Boujemaa
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Helmi Mardassi
- Unit of Typing and Genetics of Mycobacteria, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnology Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Julien Guglielmini
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
| | - Amaury Vaysse
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
| | - Nadine Khadraoui
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Béhija Mlik
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
| | - Boutheina Ben Abdelmoumen Mardassi
- Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnological Development, Pasteur Institute of Tunis, University of Tunis-El Manar, Tunis, Tunisia
|
3
|
Bujdoš D, Walter J, O'Toole PW. aurora: a machine learning GWAS tool for analyzing microbial habitat adaptation. Genome Biol 2025; 26:66. [PMID: 40122838 PMCID: PMC11930000 DOI: 10.1186/s13059-025-03524-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 03/03/2025] [Indexed: 03/25/2025] Open
Abstract
A primary goal of microbial genome-wide association studies is identifying genomic variants associated with a particular habitat. Existing tools fail to identify known causal variants if the analyzed trait shaped the phylogeny. Furthermore, due to inclusion of allochthonous strains or metadata errors, the stated sources of strains in public databases are often incorrect, and strains may not be adapted to the habitat from which they were isolated. We describe a new tool, aurora, that identifies autochthonous strains and the genes associated with habitats while acknowledging the potential role of the habitat adaptation trait in shaping phylogeny.
Affiliation(s)
- Dalimil Bujdoš
- APC Microbiome Ireland, University College Cork, National University of Ireland, Cork, Ireland
- School of Microbiology, University College Cork, National University of Ireland, Cork, Ireland
| | - Jens Walter
- APC Microbiome Ireland, University College Cork, National University of Ireland, Cork, Ireland
- School of Microbiology, University College Cork, National University of Ireland, Cork, Ireland
- Department of Medicine, University College Cork, National University of Ireland, Cork, Ireland
| | - Paul W O'Toole
- APC Microbiome Ireland, University College Cork, National University of Ireland, Cork, Ireland
- School of Microbiology, University College Cork, National University of Ireland, Cork, Ireland
|
4
|
Tsoumtsa Meda L, Lagarde J, Guillier L, Roussel S, Douarre PE. Using GWAS and Machine Learning to Identify and Predict Genetic Variants Associated with Foodborne Bacteria Phenotypic Traits. Methods Mol Biol 2025; 2852:223-253. [PMID: 39235748 DOI: 10.1007/978-1-0716-4100-2_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2024]
Abstract
One of the main challenges in food microbiology is to prevent the risk of outbreaks by avoiding the distribution of food contaminated by bacteria. This requires constant monitoring of the circulating strains throughout the food production chain. Bacterial genomes contain signatures of natural evolution and adaptive markers that can be exploited to better understand the behavior of pathogens in the food industry. The monitoring of foodborne strains can therefore be facilitated by the use of these genomic markers, which can rapidly provide essential information on isolated strains, such as the source of contamination, risk of illness, potential for biofilm formation, and tolerance or resistance to biocides. The increasing availability of large genome datasets is enhancing the understanding of the genetic basis of complex traits such as host adaptation, virulence, and persistence. Genome-wide association studies have shown very promising results in the discovery of genomic markers that can be integrated into rapid detection tools. In addition, machine learning has successfully predicted phenotypes and classified important traits. Genome-wide association and machine learning tools therefore have the potential to support decision-making processes aimed at reducing the burden of foodborne diseases. The aim of this chapter is to provide knowledge on the use of these two methods in food microbiology and to recommend their use in the field.
Affiliation(s)
- Landry Tsoumtsa Meda
- ACTALIA, La Roche-sur-Foron, France
- ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
| | - Jean Lagarde
- ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
- INRAE, Unit of Process Optimisation in Food, Agriculture and the Environment (UR OPAALE), Rennes, France
| | | | - Sophie Roussel
- ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
| | - Pierre-Emmanuel Douarre
- ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
|
5
|
Hernández-García M, Barbero-Herranz R, Bastón-Paz N, Díez-Aguilar M, López-Collazo E, Márquez-Garrido FJ, Hernández-Pérez JM, Baquero F, Ekkelenkamp MB, Fluit AC, Fuentes-Valverde V, Moscoso M, Bou G, del Campo R, Cantón R, Avendaño-Ortiz J. Unravelling the mechanisms causing murepavadin resistance in Pseudomonas aeruginosa: lipopolysaccharide alterations and its consequences. Front Cell Infect Microbiol 2024; 14:1446626. [PMID: 39711784 PMCID: PMC11659217 DOI: 10.3389/fcimb.2024.1446626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 11/18/2024] [Indexed: 12/24/2024] Open
Abstract
Introduction: Murepavadin is an antimicrobial peptide (AMP) in clinical development that selectively targets Pseudomonas aeruginosa LptD and whose resistance profile remains unknown. We aimed to explore genomic modifications and consequences underlying murepavadin and/or colistin susceptibility. Methods: To define genomic mechanisms underlying resistance, we used two approaches: 1) a genome-wide association study (GWAS) in a P. aeruginosa clinical collection (n=496), considering >0.25 mg/L as a tentative cut-off for murepavadin acquired resistance; 2) a paired genomic comparison in a subset of 5 isolates and their isogenic murepavadin-resistant mutants obtained in vitro. Lipid-A composition, immunogenicity, and the effects of cathelicidin and indolicidin on bacterial growth were also tested in this last subset of isolates. Murepavadin MICs were determined in ΔlpxL1 and ΔlpxL2 knock-out mutants obtained from an auxotrophic PAO1 derivative. Results: GWAS revealed a missense variant (A→G p.Thr260Ala in the hisJ gene) associated with murepavadin resistance, although both resistant and susceptible strains harbored it (21% and 12%, respectively; OR=1.92, p=0.012 in χ² test). Among the isolate subset, murepavadin-resistant mutants with deletions in the lpxL1 and lpxL2 genes showed lower abundance of hexa-acylated lipid-A (m/z 1616, 1632). 4-aminoarabinose addition was found only in colistin-resistant isolates, irrespective of murepavadin susceptibility. Accordingly, ΔlpxL1 and ΔlpxL2 mutants exhibited higher murepavadin MICs than the parental PAO1 auxotroph strain (2 and 4 vs 0.5 mg/L, respectively). Lipopolysaccharide from murepavadin-resistant mutants triggered lower inflammatory responses in human monocytes. Those with lpxL mutations and hexa-acylated lipid-A loss also exhibited greater growth reduction when exposed to the host-derived AMPs cathelicidin and indolicidin. Discussion: High-level murepavadin resistance seems to be linked to lpxL1 and lpxL2 mutations and lower hexa-acylated lipid-A, corresponding to lower inflammatory induction and higher susceptibility to host-derived AMPs. Although GWAS identified one variant associated with the murepavadin-resistant phenotype, the data revealed that no single genetic event underlies this phenotype. Our study provides insight into the mechanisms underlying murepavadin susceptibility.
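The odds ratio quoted in the Results can be roughly sanity-checked from the two carriage percentages alone. The sketch below is an illustration and not the authors' analysis: the function name `odds_ratio` is hypothetical, and because the abstract gives only rounded percentages (21% and 12%) without per-group counts, the result is approximate and the χ² test itself cannot be reproduced here.

```python
def odds_ratio(p_resistant: float, p_susceptible: float) -> float:
    """Odds of carrying the variant in resistant isolates divided by
    the odds of carrying it in susceptible isolates."""
    odds_resistant = p_resistant / (1.0 - p_resistant)
    odds_susceptible = p_susceptible / (1.0 - p_susceptible)
    return odds_resistant / odds_susceptible


# Rounded carriage proportions taken from the abstract (assumed inputs).
approx_or = odds_ratio(0.21, 0.12)
print(f"approximate OR = {approx_or:.2f}")
```

With these rounded inputs the estimate comes out near 1.95, close to the reported 1.92; the small gap is presumably due to rounding of the percentages.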
Affiliation(s)
- Marta Hernández-García
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
| | - Raquel Barbero-Herranz
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
| | - Natalia Bastón-Paz
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
| | - María Díez-Aguilar
- Servicio de Microbiología y Parasitología, Hospital Universitario La Princesa, Madrid, Spain
| | - Eduardo López-Collazo
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Innate Immune Response Group, IdiPAZ, Madrid, Spain
| | | | - José María Hernández-Pérez
- Plataforma de Proteómica y Metabolómica, Instituto de Investigación Germans Trias i Pujol, Badalona, Spain
| | - Fernando Baquero
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
- CIBER de Epidemiología y Salud Pública, Instituto de Salud Carlos III, Madrid, Spain
| | - Miquel B. Ekkelenkamp
- Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, Netherlands
| | - Ad C. Fluit
- Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, Netherlands
| | - Víctor Fuentes-Valverde
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
- Department of Microbiology, University Hospital A Coruña (CHUAC)-Biomedical Research Institute A Coruña (INIBIC), A Coruña, Spain
| | - Miriam Moscoso
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
- Department of Microbiology, University Hospital A Coruña (CHUAC)-Biomedical Research Institute A Coruña (INIBIC), A Coruña, Spain
| | - Germán Bou
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
- Department of Microbiology, University Hospital A Coruña (CHUAC)-Biomedical Research Institute A Coruña (INIBIC), A Coruña, Spain
| | - Rosa del Campo
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
| | - Rafael Cantón
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
| | - José Avendaño-Ortiz
- Servicio de Microbiología, Hospital Universitario Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
|
6
|
Schadron T, van den Beld M, Mughini-Gras L, Franz E. Use of whole genome sequencing for surveillance and control of foodborne diseases: status quo and quo vadis. Front Microbiol 2024; 15:1460335. [PMID: 39345263 PMCID: PMC11427404 DOI: 10.3389/fmicb.2024.1460335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Accepted: 08/27/2024] [Indexed: 10/01/2024] Open
Abstract
Improvements in sequencing quality, availability, speed and cost have led to an increased presence of genomics in infectious disease applications. Nevertheless, there are still hurdles to the optimal use of WGS for public health purposes. Here, we discuss the current state ("status quo") and future directions ("quo vadis") based on literature regarding the use of genomics in surveillance, hazard characterization and source attribution of foodborne pathogens. The future directions include the application of new techniques, such as machine learning and network approaches, that may overcome the current shortcomings. These shortcomings include the use of fixed genomic distances in cluster delineation, disentangling similarity or lack thereof in source attribution, and difficulties ascertaining function in hazard characterization. Although the aforementioned methods are relatively easy to apply technically, an overarching challenge is the inference and biological/epidemiological interpretation of these large amounts of high-resolution data. Understanding the context in terms of bacterial isolate and host diversity makes it possible to assess how representative the sources and isolates in the dataset are, which in turn defines the level of certainty associated with defining clusters, sources and risks. This also underscores the importance of metadata (clinical, epidemiological, and biological) when using genomics for public health purposes.
Affiliation(s)
- Tristan Schadron
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
| | - Maaike van den Beld
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
| | - Lapo Mughini-Gras
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, Netherlands
| | - Eelco Franz
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
|
7
|
Pham NP, Gingras H, Godin C, Feng J, Groppi A, Nikolski M, Leprohon P, Ouellette M. Holistic understanding of trimethoprim resistance in Streptococcus pneumoniae using an integrative approach of genome-wide association study, resistance reconstruction, and machine learning. mBio 2024; 15:e0136024. [PMID: 39120145 PMCID: PMC11389379 DOI: 10.1128/mbio.01360-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 07/08/2024] [Indexed: 08/10/2024] Open
Abstract
Antimicrobial resistance (AMR) is a public health threat worldwide. Next-generation sequencing (NGS) has opened unprecedented opportunities to accelerate AMR mechanism discovery and diagnostics. Here, we present an integrative approach to investigate trimethoprim (TMP) resistance in the key pathogen Streptococcus pneumoniae. We explored a collection of 662 S. pneumoniae genomes by conducting a genome-wide association study (GWAS), followed by functional validation using resistance reconstruction experiments, combined with machine learning (ML) approaches to predict TMP minimum inhibitory concentration (MIC). Our study showed that multiple additive mutations in the folA and sulA loci are responsible for TMP non-susceptibility in S. pneumoniae and can be used as key features to build ML models for digital MIC prediction, reaching an average accuracy of 86.3% within ±1 twofold dilution. Our roadmap of in silico analysis, wet-lab validation, and diagnostic tool building could be adapted to explore AMR in other bacteria-antibiotic combinations. IMPORTANCE In the age of next-generation sequencing (NGS), while data-driven methods such as genome-wide association study (GWAS) and machine learning (ML) excel at finding patterns, functional validation can be challenging due to the high numbers of candidate variants. We designed an integrative approach combining a GWAS on S. pneumoniae clinical isolates, followed by whole-genome transformation coupled with NGS to functionally characterize a large set of GWAS candidates. Our study validated several phenotypic folA mutations beyond the standard Ile100Leu mutation, and showed that the overexpression of the sulA locus produces trimethoprim (TMP) resistance in Streptococcus pneumoniae. These validated loci, when used to build ML models, were found to be the best inputs for predicting TMP minimum inhibitory concentrations. Integrative approaches can bridge the genotype-phenotype gap by providing biological insights that can be incorporated in ML models for accurate prediction of drug susceptibility.
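The "accuracy within ±1 twofold dilution" metric mentioned in this abstract can be sketched as follows. This is a minimal illustration of the general metric, not the authors' code: the function name `within_one_dilution` and the example MIC values are hypothetical, and MICs are assumed to be positive numbers in mg/L.

```python
import math


def within_one_dilution(predicted_mics, observed_mics):
    """Fraction of isolates whose predicted MIC lies within one twofold
    dilution step (a factor of 2 either way) of the observed MIC."""
    if len(predicted_mics) != len(observed_mics) or not observed_mics:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for predicted, observed in zip(predicted_mics, observed_mics):
        # One twofold dilution step corresponds to 1 unit on the log2 scale.
        steps_apart = abs(math.log2(predicted) - math.log2(observed))
        if steps_apart <= 1.0 + 1e-9:
            hits += 1
    return hits / len(observed_mics)


# Made-up values for illustration only (mg/L).
observed = [0.5, 1, 2, 8, 16]
predicted = [1, 1, 8, 4, 16]
accuracy = within_one_dilution(predicted, observed)
print(accuracy)
```

In this toy example four of the five predictions fall within one dilution step; the third (8 vs 2 mg/L) is two steps away and counts as a miss.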
Affiliation(s)
- Nguyen-Phuong Pham
- Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
| | - Hélène Gingras
- Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
| | - Chantal Godin
- Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
| | - Jie Feng
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
| | - Alexis Groppi
- Bordeaux Bioinformatics Center and CNRS, Institut de Biochimie et Génétique Cellulaires (IBGC) UMR 5095, Université de Bordeaux, Bordeaux, France
| | - Macha Nikolski
- Bordeaux Bioinformatics Center and CNRS, Institut de Biochimie et Génétique Cellulaires (IBGC) UMR 5095, Université de Bordeaux, Bordeaux, France
| | - Philippe Leprohon
- Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
| | - Marc Ouellette
- Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
|
8
|
Roder T, Pimentel G, Fuchsmann P, Stern MT, von Ah U, Vergères G, Peischl S, Brynildsrud O, Bruggmann R, Bär C. Scoary2: rapid association of phenotypic multi-omics data with microbial pan-genomes. Genome Biol 2024; 25:93. [PMID: 38605417 PMCID: PMC11007987 DOI: 10.1186/s13059-024-03233-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 03/29/2024] [Indexed: 04/13/2024] Open
Abstract
Unraveling bacterial gene function drives progress in various areas, such as food production, pharmacology, and ecology. While omics technologies capture high-dimensional phenotypic data, linking them to genomic data is challenging, leaving 40-60% of bacterial genes undescribed. To address this bottleneck, we introduce Scoary2, ultra-fast software for microbial genome-wide association studies (mGWAS). With its data exploration app and improved performance, Scoary2 is the first tool to enable the study of large phenotypic datasets using mGWAS. As proof of concept, we explore the metabolome of yogurts, each produced with a different Propionibacterium freudenreichii strain, and discover two genes affecting carnitine metabolism.
Affiliation(s)
- Thomas Roder
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, CH-3012, Bern, Switzerland
| | - Grégory Pimentel
- Methods development and analytics, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Pascal Fuchsmann
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Mireille Tena Stern
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Ueli von Ah
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Guy Vergères
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Stephan Peischl
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland
| | - Ola Brynildsrud
- Norwegian Institute of Public Health, Oslo and Norwegian University of Life Science, Ås, Norway
| | - Rémy Bruggmann
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland
| | - Cornelia Bär
- Methods development and analytics, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
|
9
|
Batisti Biffignandi G, Chindelevitch L, Corbella M, Feil EJ, Sassera D, Lees JA. Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae. Microb Genom 2024; 10:001222. [PMID: 38529944 PMCID: PMC10995625 DOI: 10.1099/mgen.0.001222] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/23/2023] [Accepted: 03/07/2024] [Indexed: 03/27/2024] Open
Abstract
Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. 
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that incrementing the population database is pivotal for the future clinical implementation of these models to support routine machine-learning based diagnostics.
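The recommendation in the abstract above (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the log2 transform of MIC values, and the cut-off of six distinct concentration levels are assumptions made for illustration only, since the abstract does not state a threshold.

```python
import math


def choose_framing(mics, min_levels_for_regression=6):
    """Pick a learning framing from the number of distinct MIC levels.

    The default cut-off of 6 is a hypothetical value, not one taken
    from the paper.
    """
    n_levels = len(set(mics))
    if n_levels >= min_levels_for_regression:
        return "regression"
    return "classification"


def encode_targets(mics, min_levels_for_regression=6):
    """Return (framing, targets) for a list of MIC values in mg/L.

    Regression: targets are log2(MIC), an assumed convention because
    MICs are usually measured on a twofold dilution series.
    Classification: targets are ordinal class indices, one per
    distinct observed MIC level.
    """
    framing = choose_framing(mics, min_levels_for_regression)
    if framing == "regression":
        return framing, [math.log2(m) for m in mics]
    ordered_levels = sorted(set(mics))
    class_index = {level: i for i, level in enumerate(ordered_levels)}
    return framing, [class_index[m] for m in mics]


# Eight distinct levels: treated as a continuous trait.
wide_panel = [0.25, 0.5, 1, 2, 4, 8, 16, 32]
# Three distinct levels: treated as a categorical trait.
narrow_panel = [2, 4, 8, 4, 2]

wide_result = encode_targets(wide_panel)
narrow_result = encode_targets(narrow_panel)
```

The sketch deliberately ignores left- and right-censored values, which the abstract identifies as a further complication of real MIC data.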
Affiliation(s)
- Gherard Batisti Biffignandi
  - Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
  - MRC Centre for Global Infectious Disease Analysis, Imperial College, London, England, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Leonid Chindelevitch
  - MRC Centre for Global Infectious Disease Analysis, Imperial College, London, England, UK
- Marta Corbella
  - Microbiology and Virology Unit, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- Edward J. Feil
  - The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
- Davide Sassera
  - Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
  - Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- John A. Lees
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
|
10
|
Torres Ortiz A, Grandjean L. Phylogenetic Survival Analysis. Methods Mol Biol 2024; 2833:121-128. [PMID: 38949706 DOI: 10.1007/978-1-0716-3981-8_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Going back in time through a phylogenetic tree makes it possible to evaluate ancestral genomes and assess their potential to acquire key polymorphisms of interest over evolutionary time. Knowledge of this kind may allow for the emergence of key traits to be predicted and pre-empted from currently circulating strains in the future. Here, we present a novel genome-wide survival analysis and use the emergence of drug resistance in Mycobacterium tuberculosis as an example to demonstrate the potential and utility of the technique.
Affiliation(s)
- Arturo Torres Ortiz
  - Department of Infection, Immunity and Inflammation, Institute of Child Health, University College London, London, UK
- Louis Grandjean
  - Department of Infection, Immunity and Inflammation, Institute of Child Health, University College London, London, UK
|
11
|
Burgaya J, Marin J, Royer G, Condamine B, Gachet B, Clermont O, Jaureguy F, Burdet C, Lefort A, de Lastours V, Denamur E, Galardini M, Blanquart F. The bacterial genetic determinants of Escherichia coli capacity to cause bloodstream infections in humans. PLoS Genet 2023; 19:e1010842. [PMID: 37531401 PMCID: PMC10395866 DOI: 10.1371/journal.pgen.1010842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 06/23/2023] [Indexed: 08/04/2023] Open
Abstract
Escherichia coli is both a highly prevalent commensal and a major opportunistic pathogen causing bloodstream infections (BSI). A systematic analysis characterizing the genomic determinants of extra-intestinal pathogenic vs. commensal isolates in human populations, which could inform mechanisms of pathogenesis, diagnostic, prevention and treatment is still lacking. We used a collection of 912 BSI and 370 commensal E. coli isolates collected in France over a 17-year period (2000-2017). We compared their pangenomes, genetic backgrounds (phylogroups, STs, O groups), presence of virulence-associated genes (VAGs) and antimicrobial resistance genes, finding significant differences in all comparisons between commensal and BSI isolates. A machine learning linear model trained on all the genetic variants derived from the pangenome and controlling for population structure reveals similar differences in VAGs, discovers new variants associated with pathogenicity (capacity to cause BSI), and accurately classifies BSI vs. commensal strains. Pathogenicity is a highly heritable trait, with up to 69% of the variance explained by bacterial genetic variants. Lastly, complementing our commensal collection with an older collection from 1980, we predict that pathogenicity continuously increased through 1980, 2000, to 2010. Together our findings imply that E. coli exhibit substantial genetic variation contributing to the transition between commensalism and pathogenicity and that this species evolved towards higher pathogenicity.
Affiliation(s)
- Judit Burgaya
  - Institute for Molecular Bacteriology, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
  - Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany
- Julie Marin
  - Université Sorbonne Paris Nord, INSERM, IAME, Bobigny, France
- Guilhem Royer
  - Université Paris Cité, INSERM, IAME, Paris, France
  - Département de Prévention, Diagnostic et Traitement des Infections, Hôpital Henri Mondor, Créteil, France
  - Unité Ecologie et Evolution de la Résistance aux Antibiotiques, Institut Pasteur, UMR CNRS 6047, Université Paris-Cité, Paris, France
- Agnès Lefort
  - Université Paris Cité, INSERM, IAME, Paris, France
- Erick Denamur
  - Université Paris Cité, INSERM, IAME, Paris, France
  - Laboratoire de Génétique Moléculaire, Hôpital Bichat, AP-HP, Paris, France
- Marco Galardini
  - Institute for Molecular Bacteriology, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
  - Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany
- François Blanquart
  - Center for Interdisciplinary Research in Biology, Collège de France, CNRS UMR7241 / INSERM U1050, PSL Research University, Paris, France
|
12
|
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
|
13
|
Palmieri N, Apostolakos I, Paudel S, Hess M. The genetic network underlying the evolution of pathogenicity in avian Escherichia coli. Front Vet Sci 2023; 10:1195585. [PMID: 37415967 PMCID: PMC10321414 DOI: 10.3389/fvets.2023.1195585] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/29/2023] [Accepted: 06/05/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction Colibacillosis is a worldwide prevalent disease in poultry production linked to Escherichia coli strains that belong to the avian pathogenic E. coli (APEC) pathotype. While many virulence factors have been linked to APEC isolates, no single gene or set of genes has been found to be exclusively associated with the pathotype. Moreover, a comprehensive description of the biological processes linked to APEC pathogenicity is currently lacking. Methods In this study, we compiled a dataset of 2015 high-quality avian E. coli genomes from pathogenic and commensal isolates, based on publications from 2000 to 2021. We then conducted a genome-wide association study (GWAS) and integrated candidate gene identification with available protein-protein interaction data to decipher the genetic network underlying the biological processes connected to APEC pathogenicity. Results Our GWAS identified variations in gene content for 13 genes and SNPs in 3 different genes associated with APEC isolates, suggesting both gene-level and SNP-level variations contribute to APEC pathogenicity. Integrating protein-protein interaction data, we found that 15 of these genes clustered in the same genetic network, suggesting the pathogenicity of APEC might be due to the interplay of different regulated pathways. We also found novel candidate genes including an uncharacterized multi-pass membrane protein (yciC) and the outer membrane porin (ompD) as linked to APEC isolates. Discussion Our findings suggest that convergent pathways related to nutrient uptake from host cells and defense from host immune system play a major role in APEC pathogenicity. In addition, the dataset curated in this study represents a comprehensive historical genomic collection of avian E. coli isolates and constitutes a valuable resource for their comparative genomics investigations.
Affiliation(s)
- Nicola Palmieri
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
- Surya Paudel
  - Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Michael Hess
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
|
14
|
Hachani A, Giulieri SG, Guérillot R, Walsh CJ, Herisse M, Soe YM, Baines SL, Thomas DR, Cheung SD, Hayes AS, Cho E, Newton HJ, Pidot S, Massey RC, Howden BP, Stinear TP. A high-throughput cytotoxicity screening platform reveals agr-independent mutations in bacteraemia-associated Staphylococcus aureus that promote intracellular persistence. eLife 2023; 12:e84778. [PMID: 37289634 PMCID: PMC10259494 DOI: 10.7554/elife.84778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 05/23/2023] [Indexed: 06/10/2023] Open
Abstract
Staphylococcus aureus infections are associated with high mortality rates. Often considered an extracellular pathogen, S. aureus can persist and replicate within host cells, evading immune responses, and causing host cell death. Classical methods for assessing S. aureus cytotoxicity are limited by testing culture supernatants and endpoint measurements that do not capture the phenotypic diversity of intracellular bacteria. Using a well-established epithelial cell line model, we have developed a platform called InToxSa (intracellular toxicity of S. aureus) to quantify intracellular cytotoxic S. aureus phenotypes. Studying a panel of 387 S. aureus bacteraemia isolates, and combined with comparative, statistical, and functional genomics, our platform identified mutations in S. aureus clinical isolates that reduced bacterial cytotoxicity and promoted intracellular persistence. In addition to numerous convergent mutations in the Agr quorum sensing system, our approach detected mutations in other loci that also impacted cytotoxicity and intracellular persistence. We discovered that clinical mutations in ausA, encoding the aureusimine non-ribosomal peptide synthetase, reduced S. aureus cytotoxicity, and increased intracellular persistence. InToxSa is a versatile, high-throughput cell-based phenomics platform and we showcase its utility by identifying clinically relevant S. aureus pathoadaptive mutations that promote intracellular residency.
Affiliation(s)
- Abderrahman Hachani
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Stefano G Giulieri
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Romain Guérillot
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Calum J Walsh
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Marion Herisse
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Ye Mon Soe
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Sarah L Baines
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- David R Thomas
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
  - Infection and Immunity Program, Department of Microbiology and Biomedicine Discovery Institute, Monash University, Clayton, Australia
- Shane Doris Cheung
  - Biological Optical Microscopy Platform, University of Melbourne, Melbourne, Australia
- Ashleigh S Hayes
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Ellie Cho
  - Biological Optical Microscopy Platform, University of Melbourne, Melbourne, Australia
- Hayley J Newton
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
  - Infection and Immunity Program, Department of Microbiology and Biomedicine Discovery Institute, Monash University, Clayton, Australia
- Sacha Pidot
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Ruth C Massey
  - School of Microbiology, University College Cork, Cork, Ireland
  - School of Medicine, University College Cork, Cork, Ireland
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
  - School of Cellular and Molecular Medicine, University of Bristol, Bristol, United Kingdom
- Benjamin P Howden
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
  - Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
- Timothy P Stinear
  - Department of Microbiology and Immunology, Doherty Institute, University of Melbourne, Melbourne, Australia
|
15
|
Saber MM, Donner J, Levade I, Acosta N, Parkins MD, Boyle B, Levesque RC, Nguyen D, Shapiro BJ. Single nucleotide variants in Pseudomonas aeruginosa populations from sputum correlate with baseline lung function and predict disease progression in individuals with cystic fibrosis. Microb Genom 2023; 9. [PMID: 37052589 DOI: 10.1099/mgen.0.000981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023] Open
Abstract
The severity and progression of lung disease are highly variable across individuals with cystic fibrosis (CF) and are imperfectly predicted by mutations in the human gene CFTR, lung microbiome variation or other clinical factors. The opportunistic pathogen Pseudomonas aeruginosa (Pa) dominates airway infections in most CF adults. Here we hypothesized that within-host genetic variation of Pa populations would be associated with lung disease severity. To quantify Pa genetic variation within CF sputum samples, we used deep amplicon sequencing (AmpliSeq) of 209 Pa genes previously associated with pathogenesis or adaptation to the CF lung. We trained machine learning models using Pa single nucleotide variants (SNVs), microbiome diversity data and clinical factors to classify lung disease severity at the time of sputum sampling, and to predict lung function decline after 5 years in a cohort of 54 adult CF patients with chronic Pa infection. Models using Pa SNVs alone classified lung disease severity with good sensitivity and specificity (area under the receiver operating characteristic curve: AUROC=0.87). Models were less predictive of lung function decline after 5 years (AUROC=0.74) but still significantly better than random. The addition of clinical data, but not sputum microbiome diversity data, yielded only modest improvements in classifying baseline lung function (AUROC=0.92) and predicting lung function decline (AUROC=0.79), suggesting that Pa AmpliSeq data account for most of the predictive value. Our work provides a proof of principle that Pa genetic variation in sputum tracks lung disease severity, moderately predicts lung function decline and could serve as a disease biomarker among CF patients with chronic Pa infections.
Affiliation(s)
- Morteza M Saber
  - Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada
- Jannik Donner
  - Department of Medicine, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Inès Levade
  - Department of Medicine, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Nicole Acosta
  - Department of Microbiology, Immunology and Infectious Disease, University of Calgary, Calgary, AB, Canada
- Michael D Parkins
  - Department of Microbiology, Immunology and Infectious Disease, University of Calgary, Calgary, AB, Canada
  - Department of Medicine, University of Calgary, Calgary, AB, Canada
- Brian Boyle
  - Integrative Systems Biology Institute, University of Laval, Québec, QC, Canada
- Roger C Levesque
  - Integrative Systems Biology Institute, University of Laval, Québec, QC, Canada
- Dao Nguyen
  - Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada
  - Department of Medicine, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Meakins Christie Laboratories, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- B Jesse Shapiro
  - Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada
  - McGill Genome Centre, Montreal, QC, Canada
|
16
|
Both A, Huang J, Hentschke M, Tobys D, Christner M, Klatte TO, Seifert H, Aepfelbacher M, Rohde H. Genomics of Invasive Cutibacterium acnes Isolates from Deep-Seated Infections. Microbiol Spectr 2023; 11:e0474022. [PMID: 36976006 PMCID: PMC10100948 DOI: 10.1128/spectrum.04740-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 03/03/2023] [Indexed: 03/29/2023] Open
Abstract
Cutibacterium acnes, formerly known as Propionibacterium acnes, is a commensal of the human pilosebaceous unit but also causes deep-seated infection, especially in the context of orthopedic and neurosurgical foreign materials. Interestingly, little is known about the role of specific pathogenicity factors for infection establishment. Here, 86 infection-associated and 103 commensalism-associated isolates of C. acnes were collected from three independent microbiology laboratories. We sequenced the whole genomes of the isolates for genotyping and a genome-wide association study (GWAS). We found that C. acnes subsp. acnes IA1 was the most significant phylotype among the infection isolates (48.3% of all infection isolates; odds ratio [OR] = 1.98 for infection). Among the commensal isolates, C. acnes subsp. acnes IB was the most significant phylotype (40.8% of all commensal isolates; OR = 0.5 for infection). Interestingly, C. acnes subsp. elongatum (III) was rare overall and did not occur at all in infection. The open reading frame-based GWAS (ORF-GWAS) did not show any loci with a strong signal for infection association (no P values of ≤0.05 after adjustment for multiple testing; no logarithmic OR [logOR] of ≥|2|). We concluded that all subspecies and phylotypes of C. acnes, possibly with the exception of C. acnes subsp. elongatum, are able to cause deep-seated infection given favorable conditions, most importantly related to inserted foreign material. Genetic content appears to have a small effect on the likelihood of infection establishment, and functional studies are needed to understand the individual factors contributing to deep-seated infections caused by C. acnes.
IMPORTANCE Opportunistic infections emerging from human skin microbiota are of ever-increasing importance. Cutibacterium acnes, being abundant on the human skin, may cause deep-seated infections (e.g., device-associated infections). Differentiation between invasive (i.e., clinically significant) C. acnes isolates and sole contaminants is often difficult. Identification of genetic markers associated with invasiveness not only would strengthen our knowledge related to pathogenesis but also could open ways to selectively categorize invasive and contaminating isolates in the clinical microbiology lab. We show that in contrast to other opportunistic pathogens (e.g., Staphylococcus epidermidis), invasiveness is apparently a broadly distributed ability across almost all C. acnes subspecies and phylotypes. Thus, our work strongly supports an approach in which clinical significance is judged from clinical context rather than by detecting specific genetic traits.
Affiliation(s)
- Anna Both
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Jiabin Huang
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- David Tobys
  - Institute for Medical Microbiology, Immunology and Hygiene, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  - German Center for Infection Research (DZIF), Partner Site Bonn-Cologne, Cologne, Germany
- Martin Christner
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Till Orla Klatte
  - Department for Trauma Surgery and Orthopedics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Harald Seifert
  - Institute for Medical Microbiology, Immunology and Hygiene, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  - German Center for Infection Research (DZIF), Partner Site Bonn-Cologne, Cologne, Germany
- Martin Aepfelbacher
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Holger Rohde
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
|
17
|
Migration Rates on Swim Plates Vary between Escherichia coli Soil Isolates: Differences Are Associated with Variants in Metabolic Genes. Appl Environ Microbiol 2023; 89:e0172722. [PMID: 36695629 PMCID: PMC9972950 DOI: 10.1128/aem.01727-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023] Open
Abstract
This study investigates migration phenotypes of 265 Escherichia coli soil isolates from the Buffalo River basin in Minnesota, USA. Migration rates on semisolid tryptone swim plates ranged from nonmotile to 190% of the migration rate of a highly motile E. coli K-12 strain. The nonmotile isolate, LGE0550, had mutations in flagellar and chemotaxis genes, including two IS3 elements in the flagellin-encoding gene fliC. A genome-wide association study (GWAS), associating the migration rates with genetic variants in specific genes, yielded two metabolic variants (rygD-serA and metR-metE) with previous implications in chemotaxis. As a novel way of confirming GWAS results, we used minimal medium swim plates to confirm the associations. Other variants in metabolic genes and genes that are associated with biofilm were positively or negatively associated with migration rates. A determination of growth phenotypes on Biolog EcoPlates yielded differential growth for the 10 tested isolates on d-malic acid, putrescine, and d-xylose, all of which are important in the soil environment. IMPORTANCE E. coli is a Gram-negative, facultative anaerobic bacterium whose life cycle includes extra host environments in addition to human, animal, and plant hosts. The bacterium has the genomic capability of being motile. In this context, the significance of this study is severalfold: (i) the great diversity of migration phenotypes that we observed within our isolate collection supports previous (G. NandaKafle, A. A. Christie, S. Vilain, and V. S. Brözel, Front Microbiol 9:762, 2018, https://doi.org/10.3389/fmicb.2018.00762; Y. Somorin, F. Abram, F. Brennan, and C. O'Byrne, Appl Environ Microbiol 82:4628-4640, 2016, https://doi.org/10.1128/AEM.01175-16) ideas of soil promoting phenotypic heterogeneity, (ii) such heterogeneity may facilitate bacterial growth in the many different soil niches, and (iii) such heterogeneity may enable the bacteria to interact with human, animal, and plant hosts.
|
18
|
dessouky YE, Elsayed SW, Abdelsalam NA, Saif NA, Álvarez-Ordóñez A, Elhadidy M. Genomic insights into zoonotic transmission and antimicrobial resistance in Campylobacter jejuni from farm to fork: a one health perspective. Gut Pathog 2022; 14:44. [PMID: 36471447 PMCID: PMC9721040 DOI: 10.1186/s13099-022-00517-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/19/2022] [Accepted: 11/08/2022] [Indexed: 12/09/2022] Open
Abstract
BACKGROUND Campylobacteriosis represents a global public health threat with various socio-economic impacts. Among different Campylobacter species, Campylobacter jejuni (C. jejuni) is considered to be the foremost Campylobacter species responsible for most of gastrointestinal-related infections. Although these species are reported to primarily inhabit birds, its high genetic and phenotypic diversity allowed their adaptation to other animal reservoirs and to the environment that may impact on human infection. MAIN BODY A stringent and consistent surveillance program based on high resolution subtyping is crucial. Recently, different epidemiological investigations have implemented high-throughput sequencing technologies and analytical pipelines for higher resolution subtyping, accurate source attribution, and detection of antimicrobial resistance determinants among these species. In this review, we aim to present a comprehensive overview on the epidemiology, clinical presentation, antibiotic resistance, and transmission dynamics of Campylobacter, with specific focus on C. jejuni. This review also summarizes recent attempts of applying whole-genome sequencing (WGS) coupled with bioinformatic algorithms to identify and provide deeper insights into evolutionary and epidemiological dynamics of C. jejuni precisely along the farm-to-fork continuum. CONCLUSION WGS is a valuable addition to traditional surveillance methods for Campylobacter. It enables accurate typing of this pathogen and allows tracking of its transmission sources. It is also advantageous for in silico characterization of antibiotic resistance and virulence determinants, and hence implementation of control measures for containment of infection.
Affiliation(s)
- Yara El dessouky
- grid.440881.10000 0004 0576 5483Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt ,grid.440881.10000 0004 0576 5483Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
| | - Salma W. Elsayed
- grid.440881.10000 0004 0576 5483Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt ,grid.440881.10000 0004 0576 5483Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt ,grid.7269.a0000 0004 0621 1570Department of Microbiology and Immunology, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
| | - Nehal Adel Abdelsalam
- grid.440881.10000 0004 0576 5483Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt ,grid.440881.10000 0004 0576 5483Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt ,grid.7776.10000 0004 0639 9286Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt
| | - Nehal A. Saif
- grid.440881.10000 0004 0576 5483Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt ,grid.440881.10000 0004 0576 5483Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
| | - Avelino Álvarez-Ordóñez
- grid.4807.b0000 0001 2187 3167Department of Food Hygiene and Technology and Institute of Food Science and Technology, Universidad de León, León, Spain
| | - Mohamed Elhadidy
- grid.440881.10000 0004 0576 5483Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt ,grid.440881.10000 0004 0576 5483Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt ,grid.10251.370000000103426662Department of Bacteriology, Mycology and Immunology, Faculty of Veterinary Medicine, Mansoura University, Mansoura, Egypt
| |
|
19
|
Creasy-Marrazzo A, Saber MM, Kamat M, Bailey LS, Brinkley L, Cato E, Begum Y, Rashid MM, Khan AI, Qadri F, Basso KB, Shapiro BJ, Nelson EJ. Genome-wide association studies reveal distinct genetic correlates and increased heritability of antimicrobial resistance in Vibrio cholerae under anaerobic conditions. Microb Genom 2022; 8:mgen000905. [PMID: 36748512 PMCID: PMC9837564 DOI: 10.1099/mgen.0.000905] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/07/2022] [Accepted: 10/06/2022] [Indexed: 12/07/2022] Open
Abstract
The antibiotic formulary is threatened by high rates of antimicrobial resistance (AMR) among enteropathogens. Enteric bacteria are exposed to anaerobic conditions within the gastrointestinal tract, yet little is known about how oxygen exposure influences AMR. The facultative anaerobe Vibrio cholerae was chosen as a model to address this knowledge gap. We obtained V. cholerae isolates from 66 cholera patients, sequenced their genomes, and grew them under anaerobic and aerobic conditions with and without three clinically relevant antibiotics (ciprofloxacin, azithromycin, doxycycline). For ciprofloxacin and azithromycin, the minimum inhibitory concentration (MIC) increased under anaerobic conditions compared to aerobic conditions. Using standard resistance breakpoints, the odds of classifying isolates as resistant increased over 10 times for ciprofloxacin and 100 times for azithromycin under anaerobic conditions compared to aerobic conditions. For doxycycline, nearly all isolates were sensitive under both conditions. Using genome-wide association studies, we found associations between genetic elements and AMR phenotypes that varied by oxygen exposure and antibiotic concentrations. These AMR phenotypes were more heritable, and the AMR-associated genetic elements were more often discovered, under anaerobic conditions. These AMR-associated genetic elements are promising targets for future mechanistic research. Our findings provide a rationale to determine whether increased MICs under anaerobic conditions are associated with therapeutic failures and/or microbial escape in cholera patients. If so, there may be a need to determine new AMR breakpoints for anaerobic conditions.
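The "10 times" and "100 times" figures in this abstract are odds ratios for a resistant classification under anaerobic versus aerobic growth. The following is a minimal sketch of how such a ratio is computed from a 2×2 table. The counts are hypothetical and chosen only for illustration; they are not the study's data. Because the same isolates were tested under both conditions, the authors may well have used a model that accounts for pairing, which this simple unpaired ratio does not.

```python
def odds_ratio(resistant_a, sensitive_a, resistant_b, sensitive_b):
    """Odds of a resistant call under condition A relative to condition B."""
    cells = [resistant_a, sensitive_a, resistant_b, sensitive_b]
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so the ratio stays finite.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    ra, sa, rb, sb = cells
    return (ra / sa) / (rb / sb)


# Hypothetical counts for 66 isolates (illustrative only, not from the paper).
anaerobic = (44, 22)  # resistant, sensitive -> odds of 2.0
aerobic = (11, 55)    # resistant, sensitive -> odds of 0.2

print(round(odds_ratio(*anaerobic, *aerobic), 2))
```

With these made-up counts the odds of a resistant classification are about ten times higher under anaerobic conditions, which is the kind of contrast the abstract reports for ciprofloxacin.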
Affiliation(s)
- Ashton Creasy-Marrazzo
- Departments of Pediatrics, University of Florida, Gainesville, FL, USA
- Department of Environmental and Global Health, University of Florida, Gainesville, FL, USA
| | - Morteza M. Saber
- Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada
| | - Manasi Kamat
- Department of Chemistry, University of Florida, Gainesville, FL, USA
| | - Laura S. Bailey
- Department of Chemistry, University of Florida, Gainesville, FL, USA
| | - Lindsey Brinkley
- Departments of Pediatrics, University of Florida, Gainesville, FL, USA
| | - Emilee Cato
- Departments of Pediatrics, University of Florida, Gainesville, FL, USA
| | - Yasmin Begum
- Infectious Diseases Division (IDD) and Nutrition and Clinical Services Division (NCSD), International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Md. Mahbubur Rashid
- Infectious Diseases Division (IDD) and Nutrition and Clinical Services Division (NCSD), International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Ashraful I. Khan
- Infectious Diseases Division (IDD) and Nutrition and Clinical Services Division (NCSD), International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Firdausi Qadri
- Infectious Diseases Division (IDD) and Nutrition and Clinical Services Division (NCSD), International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
| | - Kari B. Basso
- Department of Chemistry, University of Florida, Gainesville, FL, USA
| | - B. Jesse Shapiro
- Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada
| | - Eric J. Nelson
- Departments of Pediatrics, University of Florida, Gainesville, FL, USA
| |
|
20
|
Kim JI, Maguire F, Tsang KK, Gouliouris T, Peacock SJ, McAllister TA, McArthur AG, Beiko RG. Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective. Clin Microbiol Rev 2022; 35:e0017921. [PMID: 35612324 PMCID: PMC9491192 DOI: 10.1128/cmr.00179-21] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 12/17/2022] Open
Abstract
Antimicrobial resistance (AMR) is a global health crisis that poses a great threat to modern medicine. Effective prevention strategies are urgently required to slow the emergence and further dissemination of AMR. Given the availability of data sets encompassing hundreds or thousands of pathogen genomes, machine learning (ML) is increasingly being used to predict resistance to different antibiotics in pathogens based on gene content and genome composition. A key objective of this work is to advocate for the incorporation of ML into front-line settings but also highlight the further refinements that are necessary to safely and confidently incorporate these methods. The question of what to predict is not trivial given the existence of different quantitative and qualitative laboratory measures of AMR. ML models typically treat genes as independent predictors, with no consideration of structural and functional linkages; they also may not be accurate when new mutational variants of known AMR genes emerge. Finally, to have the technology trusted by end users in public health settings, ML models need to be transparent and explainable to ensure that the basis for prediction is clear. We strongly advocate that the next set of AMR-ML studies should focus on the refinement of these limitations to be able to bridge the gap to diagnostic implementation.
Affiliation(s)
- Jee In Kim
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Canada
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Canada
- Shared Hospital Laboratory, Toronto, Canada
- Sunnybrook Research Institute, Sunnybrook Health Sciences Centre, Toronto, Canada
| | - Kara K. Tsang
- London School of Hygiene & Tropical Medicine, London, United Kingdom
| | - Theodore Gouliouris
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
- Clinical Microbiology and Public Health Laboratory, Public Health England, Cambridge, United Kingdom
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
| | - Sharon J. Peacock
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Tim A. McAllister
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Canada
| | - Andrew G. McArthur
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Canada
- M.G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
| | - Robert G. Beiko
- Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, Canada
| |
|
21
|
Guillier L, Palma F, Fritsch L. Taking account of genomics in quantitative microbial risk assessment: what methods? what issues? Curr Opin Food Sci 2022. [DOI: 10.1016/j.cofs.2022.100922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
22
|
van der Putten BCL, Huijsmans NAH, Mende DR, Schultsz C. Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution. Microb Genom 2022; 8. [PMID: 35290758 PMCID: PMC9176278 DOI: 10.1099/mgen.0.000799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Phylogenetic analyses are widely used in microbiological research, for example to trace the progression of bacterial outbreaks based on whole-genome sequencing data. In practice, multiple analysis steps such as de novo assembly, alignment and phylogenetic inference are combined to form phylogenetic workflows. Comprehensive benchmarking of the accuracy of complete phylogenetic workflows is lacking. To benchmark different phylogenetic workflows, we simulated bacterial evolution under a wide range of evolutionary models, varying the relative rates of substitution, insertion, deletion, gene duplication, gene loss and lateral gene transfer events. The generated datasets corresponded to a genetic diversity usually observed within bacterial species (≥95% average nucleotide identity). We replicated each simulation three times to assess replicability. In total, we benchmarked 19 distinct phylogenetic workflows using 8 different simulated datasets. We found that recently developed k-mer alignment methods such as kSNP and ska achieve accuracy similar to that of reference mapping. The high accuracy of k-mer alignment methods can be explained by the large fractions of genomes these methods can align, relative to other approaches. We also found that the choice of de novo assembly algorithm influences the accuracy of phylogenetic reconstruction, with workflows employing SPAdes or skesa outperforming those employing Velvet. Finally, we found that the results of phylogenetic benchmarking are highly variable between replicates. We conclude that for phylogenomic reconstruction, k-mer alignment methods are relevant alternatives to reference mapping at the species level, especially in the absence of suitable reference genomes. We show de novo genome assembly accuracy to be an underappreciated parameter required for accurate phylogenomic reconstruction.
Affiliation(s)
- Boas C L van der Putten
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Niek A H Huijsmans
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Daniel R Mende
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Constance Schultsz
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| |
|
23
|
Denamur E, Condamine B, Esposito-Farèse M, Royer G, Clermont O, Laouenan C, Lefort A, de Lastours V, Galardini M, the COLIBAFI, SEPTICOLI groups. Genome wide association study of Escherichia coli bloodstream infection isolates identifies genetic determinants for the portal of entry but not fatal outcome. PLoS Genet 2022; 18:e1010112. [PMID: 35324915 PMCID: PMC8946752 DOI: 10.1371/journal.pgen.1010112] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/14/2021] [Accepted: 02/21/2022] [Indexed: 11/19/2022] Open
Abstract
Escherichia coli is an important cause of bloodstream infections (BSI), which is of concern given its high mortality and increasing worldwide prevalence. Finding bacterial genetic variants that might contribute to patient death is of interest to better understand infection progression and implement diagnostic methods that specifically look for those factors. E. coli samples isolated from patients with BSI are an ideal dataset to systematically search for those variants, as long as the influence of host factors such as comorbidities is taken into account. Here we performed a genome-wide association study (GWAS) using data from 912 patients with E. coli BSI from hospitals in Paris, France. We looked for associations between bacterial genetic variants and three patient outcomes (death at 28 days, septic shock and admission to intensive care unit), as well as two portals of entry (urinary and digestive tract), using various clinical variables from each patient to account for host factors. We did not find any association between genetic variants and patient outcomes, potentially confirming the strong influence of host factors on the course of BSI; however, we found a strong association between the papGII operon and entrance of E. coli through the urinary tract, which demonstrates the power of bacterial GWAS when applied to actual clinical data. Despite the lack of associations between E. coli genetic variants and patient outcomes, we estimate that increasing the sample size by one order of magnitude could lead to the discovery of some putative causal variants. Given the wide adoption of bacterial genome sequencing of clinical isolates, such sample sizes may soon be available.
Affiliation(s)
- Erick Denamur
- Université de Paris, IAME, UMR 1137, INSERM, Paris, France
- Laboratoire de Génétique Moléculaire, Hôpital Bichat, AP-HP, Paris, France
| | | | - Marina Esposito-Farèse
- Département d’épidémiologie, biostatistiques et recherche clinique, Hôpital Bichat, AP-HP, Paris, France
| | - Guilhem Royer
- Université de Paris, IAME, UMR 1137, INSERM, Paris, France
- LABGeM, Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Université Paris-Saclay, Evry, France
- Département de Prévention, Diagnostic et Traitement des Infections, Hôpital Henri Mondor, Créteil, France
| | | | - Cédric Laouenan
- Université de Paris, IAME, UMR 1137, INSERM, Paris, France
- Département d’épidémiologie, biostatistiques et recherche clinique, Hôpital Bichat, AP-HP, Paris, France
| | - Agnès Lefort
- Université de Paris, IAME, UMR 1137, INSERM, Paris, France
- Service de Médecine Interne, Hôpital Beaujon, AP-HP, Clichy, France
| | - Victoire de Lastours
- Université de Paris, IAME, UMR 1137, INSERM, Paris, France
- Service de Médecine Interne, Hôpital Beaujon, AP-HP, Clichy, France
| | - Marco Galardini
- Institute for Molecular Bacteriology, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
- Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany
| | | | | |
|
24
|
Buckley SJ, Harvey RJ. Lessons Learnt From Using the Machine Learning Random Forest Algorithm to Predict Virulence in Streptococcus pyogenes. Front Cell Infect Microbiol 2022; 11:809560. [PMID: 35004362 PMCID: PMC8739889 DOI: 10.3389/fcimb.2021.809560] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2021] [Accepted: 12/13/2021] [Indexed: 11/13/2022] Open
Abstract
Group A Streptococcus (GAS) is a globally significant human pathogen. The extensive variability of the GAS genome, virulence phenotypes and clinical outcomes renders it an excellent candidate for the application of genotype-phenotype association studies in the era of whole-genome sequencing. We have catalogued the distribution and diversity of the transcription regulators of GAS, and employed phylogenetics, concordance metrics and machine learning (ML) to test for associations. In this review, we communicate the lessons learnt in the context of the recent bacterial genotype-phenotype association studies of others that have utilised both genome-wide association studies (GWAS) and ML. We envisage a promising future for the application of GWAS in bacterial genotype-phenotype association studies and foresee the increasing use of ML. However, progress in this field is hindered by several outstanding bottlenecks. These include the shortcomings that are observed when GWAS techniques that have been fine-tuned on human genomes are applied to bacterial genomes. Furthermore, there is a deficit of easy-to-use end-to-end workflows, and a lag in the collection of detailed phenotype and clinical genomic metadata. We propose a novel quality control protocol for the collection of high-quality GAS virulence phenotype coupled to clinical outcome data. Finally, we incorporate this protocol into a workflow for testing genotype-phenotype associations using ML and ‘linked’ patient-microbe genome sets that better represent the infection event.
Affiliation(s)
- Sean J Buckley
- School of Health and Behavioural Sciences, University of the Sunshine Coast, Maroochydore DC, QLD, Australia
| | - Robert J Harvey
- School of Health and Behavioural Sciences, University of the Sunshine Coast, Maroochydore DC, QLD, Australia; Sunshine Coast Health Institute, Birtinya, QLD, Australia
| |
|
25
|
Viehweger A, Blumenscheit C, Lippmann N, Wyres KL, Brandt C, Hans JB, Hölzer M, Irber L, Gatermann S, Lübbert C, Pletz MW, Holt KE, König B. Context-aware genomic surveillance reveals hidden transmission of a carbapenemase-producing Klebsiella pneumoniae. Microb Genom 2021; 7:000741. [PMID: 34913861 PMCID: PMC8767333 DOI: 10.1099/mgen.0.000741] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/05/2021] [Accepted: 11/04/2021] [Indexed: 01/18/2023] Open
Abstract
Genomic surveillance can inform effective public health responses to pathogen outbreaks. However, integration of non-local data is rarely done. We investigate two large hospital outbreaks of a carbapenemase-carrying Klebsiella pneumoniae strain in Germany and show the value of contextual data. By screening about 10 000 genomes, over 400 000 metagenomes and two culture collections using in silico and in vitro methods, we identify a total of 415 closely related genomes reported in 28 studies. We identify the relationship between the two outbreaks through time-dated phylogeny, including their respective origin. One of the outbreaks presents extensive hidden transmission, with descendant isolates only identified in other studies. We then leverage the genome collection from this meta-analysis to identify genes under positive selection. We thereby identify an inner membrane transporter (ynjC) with a putative role in colistin resistance. Contextual data from other sources can thus enhance local genomic surveillance at multiple levels and should be integrated by default when available.
Affiliation(s)
- Adrian Viehweger
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| | | | - Norman Lippmann
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| | - Kelly L. Wyres
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Australia
| | - Christian Brandt
- Institute for Infectious Diseases and Infection Control, Jena University Hospital, Jena, Germany
| | - Jörg B. Hans
- National Reference Center for multidrug-resistant Gram-negative bacteria, Department for Medical Microbiology, Ruhr-University Bochum, Bochum, Germany
| | - Martin Hölzer
- Methodology and Research Infrastructure, MF1 Bioinformatics, Robert Koch Institute, Berlin, Germany
| | - Luiz Irber
- Department of Population Health and Reproduction, University of California, Davis, Davis, California, USA
| | - Sören Gatermann
- National Reference Center for multidrug-resistant Gram-negative bacteria, Department for Medical Microbiology, Ruhr-University Bochum, Bochum, Germany
| | - Christoph Lübbert
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine II, University Hospital Leipzig, Leipzig, Germany
| | - Mathias W. Pletz
- Institute for Infectious Diseases and Infection Control, Jena University Hospital, Jena, Germany
| | - Kathryn E. Holt
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Australia
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, UK
| | - Brigitte König
- Institute of Medical Microbiology and Virology, University Hospital Leipzig, Leipzig, Germany
| |
|
26
|
Abstract
Bacterial genome-wide association studies (bGWAS) capture associations between genomic variation and phenotypic variation. Convergence-based bGWAS methods identify genomic mutations that occur independently multiple times on the phylogenetic tree in the presence of phenotypic variation more often than is expected by chance. This work introduces hogwash, an open source R package that implements three algorithms for convergence-based bGWAS. Hogwash additionally contains two burden testing approaches to perform gene or pathway analysis to improve power and increase convergence detection for related but weakly penetrant genotypes. To identify optimal use cases, we applied hogwash to data simulated with a variety of phylogenetic signals and convergence distributions. These simulated data are publicly available and contain the relevant metadata regarding convergence and phylogenetic signal for each phenotype and genotype. Hogwash is available for download from GitHub.
Affiliation(s)
- Katie Saund
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
| | - Evan S Snitkin
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA; Department of Internal Medicine, Division of Infectious Diseases, University of Michigan, Ann Arbor, Michigan, USA
| |
|
27
|
Zabeti H, Dexter N, Safari AH, Sedaghat N, Libbrecht M, Chindelevitch L. INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis. Algorithms Mol Biol 2021; 16:17. [PMID: 34376217 PMCID: PMC8353837 DOI: 10.1186/s13015-021-00198-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/11/2021] [Accepted: 07/23/2021] [Indexed: 12/13/2022] Open
Abstract
Motivation: Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have higher predictive accuracy, but often lack interpretability and may learn patterns that produce accurate predictions for the wrong reasons. Current interpretable methods may either exhibit a lower accuracy or lack the flexibility needed to generalize them to previously unseen data. Contribution: In this paper we propose a novel technique, inspired by group testing and Boolean compressed sensing, which yields highly accurate predictions and interpretable results, and is flexible enough to be optimized for various evaluation metrics at the same time. Results: We test the predictive accuracy of our approach on five first-line and seven second-line antibiotics used for treating tuberculosis. We find that its accuracy is higher than or comparable to that of commonly used machine learning models, and that it is able to identify variants in genes with previously reported association to drug resistance. Our method is intrinsically interpretable, and can be customized for different evaluation metrics. Our implementation is available at github.com/hoomanzabeti/INGOT_DR and can be installed via the Python Package Index (PyPI) under ingotdr. This package is also compatible with most of the tools in the scikit-learn machine learning library.
|
28
|
Allen JP, Snitkin E, Pincus NB, Hauser AR. Forest and Trees: Exploring Bacterial Virulence with Genome-wide Association Studies and Machine Learning. Trends Microbiol 2021; 29:621-633. [PMID: 33455849 PMCID: PMC8187264 DOI: 10.1016/j.tim.2020.12.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 08/28/2020] [Revised: 12/07/2020] [Accepted: 12/08/2020] [Indexed: 12/15/2022]
Abstract
The advent of inexpensive and rapid sequencing technologies has allowed bacterial whole-genome sequences to be generated at an unprecedented pace. This wealth of information has revealed an unanticipated degree of strain-to-strain genetic diversity within many bacterial species. Awareness of this genetic heterogeneity has corresponded with a greater appreciation of intraspecies variation in virulence. A number of comparative genomic strategies have been developed to link these genotypic and pathogenic differences with the aim of discovering novel virulence factors. Here, we review recent advances in comparative genomic approaches to identify bacterial virulence determinants, with a focus on genome-wide association studies and machine learning.
Affiliation(s)
- Jonathan P Allen
- Department of Microbiology and Immunology, Loyola University Chicago Stritch School of Medicine, Maywood, IL 60153, USA.
| | - Evan Snitkin
- Department of Microbiology and Immunology, Department of Internal Medicine/Division of Infectious Diseases, University of Michigan, Ann Arbor, MI 48109, USA
| | - Nathan B Pincus
- Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Alan R Hauser
- Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA; Department of Medicine/Division of Infectious Diseases, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| |
|
29
|
Liu X, Ma Y, Wang J. Genetic variation and function: revealing potential factors associated with microbial phenotypes. Biophys Rep 2021; 7:111-126. [PMID: 37288143 PMCID: PMC10235906 DOI: 10.52601/bpr.2021.200040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2020] [Accepted: 03/09/2021] [Indexed: 06/09/2023] Open
Abstract
Innovations in sequencing technology have generated voluminous microbial and host genomic data, making it possible to detect genetic variations and analyze the functions they influence. Recently, many studies have linked such genetic variations to phenotypes through association or comparative analysis, which has further advanced our understanding of multiple microbial functions. In this review, we (i) summarize the application of association analysis in microbes such as Mycobacterium tuberculosis, focusing on screening for microbial genetic variants potentially associated with phenotypes such as drug resistance and pathogenesis, and with novel drug targets; (ii) review the application of additional comparative genomic or transcriptomic methods to identify genetic factors associated with microbial functions; (iii) expand the scope to host genetic factors associated with certain microbes or the microbiome, summarizing recently reported host genetic variations associated with microbial phenotypes, including susceptibility to and load after HIV infection, presence or absence of different taxa, and quantitative traits of the microbiome; and (iv) discuss the challenges that may be encountered and the apparent or potentially viable solutions. Gene-function analysis of microbes and the microbiome is still in its infancy, and in order to unleash its full potential, it is necessary to understand its history, current status, and the challenges hindering its development.
Affiliation(s)
- Xiaolin Liu
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yue Ma
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jun Wang
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
30
|
Yamamoto E, Matsunaga H. Exploring Efficient Linear Mixed Models to Detect Quantitative Trait Locus-by-Environment Interactions. G3 (Bethesda) 2021; 11:6237888. [PMID: 33871575 PMCID: PMC8496289 DOI: 10.1093/g3journal/jkab119] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/09/2021] [Accepted: 04/06/2021] [Indexed: 11/29/2022]
Abstract
Genotype-by-environment (G × E) interactions are important for understanding genotype–phenotype relationships. To date, various statistical models have been proposed to account for G × E effects, especially in genomic selection (GS) studies. Generally, GS does not focus on the detection of each quantitative trait locus (QTL), while the genome-wide association study (GWAS) was designed for QTL detection. G × E modeling methods in GS can be included as covariates in GWAS using unified linear mixed models (LMMs). However, the efficacy of G × E modeling methods in GS studies has not been evaluated for GWAS. In this study, we performed a comprehensive comparison of LMMs that integrate the G × E modeling methods to detect both QTL and QTL-by-environment (Q × E) interaction effects. Model efficacy was evaluated using simulation experiments. For the fixed effect terms representing Q × E effects, simultaneous scoring of specific and nonspecific environmental effects was recommended because of the higher recall and improved genomic inflation factor value. For random effects, it was necessary to account for both G × E and genotype-by-trial (G × T) effects to control genomic inflation factor value. Thus, the recommended LMM includes fixed QTL effect terms that simultaneously score specific and nonspecific environmental effects and random effects accounting for both G × E and G × T. The LMM was applied to real tomato phenotype data obtained from two different cropping seasons. We detected not only QTLs with persistent effects across the cropping seasons but also QTLs with Q × E effects. The optimal LMM identified in this study successfully detected more QTLs with Q × E effects.
Affiliation(s)
- Eiji Yamamoto: Graduate School of Agriculture, Meiji University, Kawasaki 214-8571, Japan; PRESTO, Japan Science and Technology Agency, Kawaguchi 332-0012, Japan
- Hiroshi Matsunaga: Institute of Vegetable and Floriculture Science, National Agriculture and Food Research Organization, Ano, Tsu 514-2392, Japan

31
Ma KC, Mortimer TD, Duckett MA, Hicks AL, Wheeler NE, Sánchez-Busó L, Grad YH. Increased power from conditional bacterial genome-wide association identifies macrolide resistance mutations in Neisseria gonorrhoeae. Nat Commun 2020; 11:5374. [PMID: 33097713 PMCID: PMC7584619 DOI: 10.1038/s41467-020-19250-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 04/06/2020] [Accepted: 10/02/2020] [Indexed: 12/21/2022] Open
Abstract
The emergence of resistance to azithromycin complicates treatment of Neisseria gonorrhoeae, the etiologic agent of gonorrhea. Substantial azithromycin resistance remains unexplained after accounting for known resistance mutations. Bacterial genome-wide association studies (GWAS) can identify novel resistance genes but must control for genetic confounders while maintaining power. Here, we show that compared to single-locus GWAS, conducting GWAS conditioned on known resistance mutations reduces the number of false positives and identifies a G70D mutation in the RplD 50S ribosomal protein L4 as significantly associated with increased azithromycin resistance (p-value = 1.08 × 10⁻¹¹). We experimentally confirm our GWAS results and demonstrate that RplD G70D and other macrolide binding site mutations are prevalent (present in 5.42% of 4850 isolates) and widespread (identified in 21/65 countries across two decades). Overall, our findings demonstrate the utility of conditional associations for improving the performance of microbial GWAS and advance our understanding of the genetic basis of macrolide resistance.
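The idea of conditioning an association test on known resistance mutations can be illustrated with a small sketch. It is not the authors' pipeline: it swaps in plain ordinary least squares (no population-structure correction), and the isolate data, variable names, and the `variant_effect` helper are all hypothetical. It shows how a variant that merely co-occurs with a known mutation loses its apparent effect once that mutation is included as a covariate, while an independent variant keeps its effect.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    # Back substitution.
    beta = [0.0] * k
    for c in reversed(range(k)):
        beta[c] = (b[c] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    return beta


def variant_effect(y, variant, conditioned_on=()):
    """Estimated effect of `variant` on y, adjusting for any known mutations."""
    X = [[1.0, *(cov[i] for cov in conditioned_on), variant[i]]
         for i in range(len(y))]
    return ols(X, y)[-1]


# Hypothetical toy data for eight isolates (1 = mutation present).
known = [1, 1, 1, 1, 0, 0, 0, 0]    # established resistance mutation
linked = [1, 1, 1, 0, 0, 0, 1, 0]   # co-occurs with `known`, no effect of its own
novel = [0, 0, 1, 0, 0, 1, 0, 0]    # independent variant with a real effect
log2_mic = [3 * k + n for k, n in zip(known, novel)]

print("linked, single-locus:", variant_effect(log2_mic, linked))
print("linked, conditional: ", variant_effect(log2_mic, linked, [known]))
print("novel, conditional:  ", variant_effect(log2_mic, novel, [known]))
```

In this toy setting the single-locus estimate for `linked` is inflated purely by its correlation with `known`; a real analysis would also need a significance test and a correction for clonal population structure.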
Affiliation(s)
- Kevin C Ma: Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Tatum D Mortimer: Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Marissa A Duckett: Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Allison L Hicks: Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Nicole E Wheeler: Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Leonor Sánchez-Busó: Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Yonatan H Grad: Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Division of Infectious Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA

32
Lai YP, Ioerger TR. Exploiting Homoplasy in Genome-Wide Association Studies to Enhance Identification of Antibiotic-Resistance Mutations in Bacterial Genomes. Evol Bioinform Online 2020; 16:1176934320944932. [PMID: 32782426 PMCID: PMC7385850 DOI: 10.1177/1176934320944932] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/03/2020] [Accepted: 06/30/2020] [Indexed: 12/23/2022] Open
Abstract
Many antibacterial drugs have multiple mechanisms of resistance, which are often represented simultaneously by a mixture of resistance mutations (some more frequent than others) in a clinical population. This presents a challenge for Genome-Wide Association Studies (GWAS) methods, making it difficult to detect less prevalent resistance mechanisms purely through (weak) statistical associations. Homoplasy, or the occurrence of multiple independent mutations at the same site, is often observed with drug resistance mutations and can be a strong indicator of positive selection. However, traditional GWAS methods, such as those based on allele counting or linear regression, are not designed to take homoplasy into account. In this article, we present a new method, called ECAT (for Evolutionary Cluster-based Association Test), that extends traditional regression-based GWAS methods with the ability to take advantage of homoplasy. This is achieved through a preprocessing step which identifies hypervariable regions in the genome exhibiting statistically significant clusters of distinct evolutionary changes, to which association testing by a linear mixed model (LMM) is applied using GEMMA (a well-established LMM-based GWAS tool). Thus, the approach can be viewed as extending GEMMA from the usual site- or gene-level analysis to focusing on clustered regions of mutations. This approach was evaluated on a large collection of more than 600 clinical isolates of multidrug-resistant (MDR) Mycobacterium tuberculosis from Lima, Peru. We show that ECAT does a better job of detecting known resistance mutations for several antitubercular drugs (including less prevalent mutations with weaker associations), compared with (site- or gene-based) GEMMA, as representative of existing GWAS methods. The power of the multiphase approach in ECAT comes from focusing association testing on the hypervariable regions of the genome, which reduces complexity in the model and increases statistical power.
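Homoplasy, as defined in the abstract, can be made concrete by counting how many independent changes a site needs on a phylogeny. The sketch below uses Fitch parsimony on a hypothetical eight-isolate binary tree; it is a generic illustration of homoplasy counting, not the ECAT method, and the tree, isolate names, and alleles are invented for the example.

```python
def fitch(node, alleles):
    """Return (possible ancestral states, minimum number of changes) for a subtree.

    `node` is either a leaf name (str) or a pair (left, right) of subtrees.
    """
    if isinstance(node, str):
        return {alleles[node]}, 0
    left_states, left_changes = fitch(node[0], alleles)
    right_states, right_changes = fitch(node[1], alleles)
    shared = left_states & right_states
    if shared:
        return shared, left_changes + right_changes
    # No shared state: at least one extra change is needed on this branch pair.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical rooted binary tree of eight isolates.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))

# Two sites with the same allele count (two isolates carry T) but different histories.
clonal_site = dict(zip("abcdefgh", "TTCCCCCC"))       # one change, inherited by a and b
homoplasic_site = dict(zip("abcdefgh", "TCCCTCCC"))   # arose separately in a and in e

clonal_changes = fitch(tree, clonal_site)[1]
homoplasic_changes = fitch(tree, homoplasic_site)[1]
print(clonal_changes, homoplasic_changes)
```

A biallelic site that needs more than one change on the tree is homoplasic. Simple allele counting treats the two sites above identically, which is the limitation the abstract describes.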
Affiliation(s)
- Yi-Pin Lai: Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
- Thomas R Ioerger: Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA

33
Lees JA, Mai TT, Galardini M, Wheeler NE, Horsfield ST, Parkhill J, Corander J. Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions. mBio 2020; 11:e01344-20. [PMID: 32636251 PMCID: PMC7343994 DOI: 10.1128/mbio.01344-20] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 05/22/2020] [Accepted: 06/05/2020] [Indexed: 12/19/2022] Open
Abstract
Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations create challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially.
IMPORTANCE: Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.
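Elastic net penalization, named in the abstract as the model-fitting step, can be sketched with a small coordinate-descent solver. This is an illustrative toy under stated assumptions, not the authors' implementation: the presence/absence matrix stands in for unitig calls, the phenotype and the values of `lam` and `alpha` are invented, and the objective minimised is (1/2n)·||y − Xb||² + lam·(alpha·||b||₁ + (1 − alpha)/2·||b||²).

```python
def soft_threshold(rho, t):
    """Shrink rho towards zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def elastic_net(X, y, lam, alpha, sweeps=100):
    """Fit an elastic net by cyclic coordinate descent on centred data."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                others = sum(Xc[i][k] * beta[k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - others)
                z += Xc[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    intercept = y_mean - sum(m * b for m, b in zip(col_means, beta))
    return intercept, beta


# Hypothetical presence/absence of three sequence features in eight isolates.
X = [
    [1, 1, 1], [1, 0, 1], [1, 1, 0], [1, 0, 0],
    [0, 1, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0],
]
# Toy phenotype driven by the first feature only.
y = [2.0 * row[0] for row in X]

intercept, beta = elastic_net(X, y, lam=0.1, alpha=0.5)
print(intercept, beta)
```

Because all variants are fitted jointly, the L1 part of the penalty sets the coefficients of uninformative features to exactly zero, which is what makes the selected set interpretable; the L2 part shrinks the retained coefficient below its unpenalised value.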
Affiliation(s)
- John A Lees: MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- T Tien Mai: Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Marco Galardini: Biological Design Center, Boston University, Boston, Massachusetts, USA
- Nicole E Wheeler: Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Samuel T Horsfield: MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Julian Parkhill: Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Jukka Corander: Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway; Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom; Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland

34
Bobay LM. CoreSimul: a forward-in-time simulator of genome evolution for prokaryotes modeling homologous recombination. BMC Bioinformatics 2020; 21:264. [PMID: 32580695 PMCID: PMC7315543 DOI: 10.1186/s12859-020-03619-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/02/2020] [Accepted: 06/19/2020] [Indexed: 12/26/2022] Open
Abstract
Background: Prokaryotes are asexual, but these organisms frequently engage in homologous recombination, a process that differs from meiotic recombination in sexual organisms. Most tools developed to simulate genome evolution either assume sexual reproduction or the complete absence of DNA flux in the population. As a result, very few simulators are adapted to model prokaryotic genome evolution while accounting for recombination. Moreover, many simulators are based on the coalescent, which assumes a neutral model of genomic evolution, and those are best suited for organisms evolving under weak selective pressures, such as animals and plants. In contrast, prokaryotes are thought to be evolving under much stronger selective pressures, suggesting that forward-in-time simulators are better suited for these organisms.
Results: Here, I present CoreSimul, a forward-in-time simulator of core genome evolution for prokaryotes modeling homologous recombination. Simulations are guided by a phylogenetic tree and incorporate different substitution models, including models of codon selection.
Conclusions: CoreSimul is a flexible forward-in-time simulator that constitutes a significant addition to the limited list of available simulators applicable to prokaryote genome evolution.
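What a forward-in-time simulation with homologous recombination involves can be sketched in a few lines. This is a deliberately simplified, neutral Wright-Fisher-style toy, not CoreSimul: it is not guided by a phylogenetic tree and has no codon selection model, and the function names and parameters (`mu`, `rho`, `frag_len`) are hypothetical.

```python
import random

BASES = "ACGT"


def mutate(seq, mu, rng):
    """Substitute each site independently with probability mu."""
    out = list(seq)
    for i, base in enumerate(out):
        if rng.random() < mu:
            out[i] = rng.choice([b for b in BASES if b != base])
    return "".join(out)


def recombine(recipient, donor, frag_len, rng):
    """Overwrite a fragment of the recipient with the homologous donor segment."""
    start = rng.randrange(len(recipient))
    end = min(start + frag_len, len(recipient))
    return recipient[:start] + donor[start:end] + recipient[end:]


def simulate(ancestor, pop_size, generations, mu, rho, frag_len, seed=0):
    """Evolve a clonal population forward in time, one generation per step."""
    rng = random.Random(seed)
    population = [ancestor] * pop_size
    for _ in range(generations):
        # Asexual reproduction: each offspring copies one randomly chosen parent.
        parents = [rng.choice(population) for _ in range(pop_size)]
        offspring = []
        for genome in parents:
            # With probability rho, import a fragment from another genome.
            if rng.random() < rho:
                genome = recombine(genome, rng.choice(parents), frag_len, rng)
            offspring.append(mutate(genome, mu, rng))
        population = offspring
    return population


final = simulate("ACGT" * 5, pop_size=10, generations=20,
                 mu=0.01, rho=0.2, frag_len=5, seed=1)
print(len(set(final)), "distinct genomes after 20 generations")
```

Recombination here only copies existing variation between genomes, so it creates no diversity without mutation; a selection step could be added by weighting the choice of parents by a fitness value.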
Affiliation(s)
- Louis-Marie Bobay: Department of Biology, University of North Carolina Greensboro, 321 McIver Street, PO Box 26170, Greensboro, NC, 27402, USA

35
Gröschel MI, Meehan CJ, Barilar I, Diricks M, Gonzaga A, Steglich M, Conchillo-Solé O, Scherer IC, Mamat U, Luz CF, De Bruyne K, Utpatel C, Yero D, Gibert I, Daura X, Kampmeier S, Rahman NA, Kresken M, van der Werf TS, Alio I, Streit WR, Zhou K, Schwartz T, Rossen JWA, Farhat MR, Schaible UE, Nübel U, Rupp J, Steinmann J, Niemann S, Kohl TA. The phylogenetic landscape and nosocomial spread of the multidrug-resistant opportunist Stenotrophomonas maltophilia. Nat Commun 2020; 11:2044. [PMID: 32341346 PMCID: PMC7184733 DOI: 10.1038/s41467-020-15123-0] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 08/13/2019] [Accepted: 02/15/2020] [Indexed: 02/06/2023] Open
Abstract
Recent studies portend a rising global spread and adaptation of human- or healthcare-associated pathogens. Here, we analyse an international collection of the emerging, multidrug-resistant, opportunistic pathogen Stenotrophomonas maltophilia from 22 countries to infer population structure and clonality at a global level. We show that the S. maltophilia complex is divided into 23 monophyletic lineages, most of which harbour strains of all degrees of human virulence. Lineage Sm6 comprises the highest rate of human-associated strains, linked to key virulence and resistance genes. Transmission analysis identifies potential outbreak events of genetically closely related strains isolated within days or weeks in the same hospitals.
Multidrug resistance of the opportunistic pathogen Stenotrophomonas maltophilia is an increasing problem. Here, analyzing strains from 22 countries, the authors show that the S. maltophilia complex is divided into 23 monophyletic lineages and find evidence for intra-hospital transmission.
Affiliation(s)
- Matthias I Gröschel: Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; Department of Pulmonary Diseases & Tuberculosis, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Conor J Meehan: School of Chemistry and Bioscience, University of Bradford, Bradford, United Kingdom
- Ivan Barilar: Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- Margo Diricks: bioMérieux, Applied Maths NV, Keistraat 120, 9830, St-Martens-Latem, Belgium
- Aitor Gonzaga: Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Matthias Steglich: Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Oscar Conchillo-Solé: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain; Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
- Isabell-Christin Scherer: Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany
- Uwe Mamat: Cellular Microbiology, Research Center Borstel, Borstel, Germany
- Christian F Luz: Department of Medical Microbiology and Infection Prevention, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Katrien De Bruyne: bioMérieux, Applied Maths NV, Keistraat 120, 9830, St-Martens-Latem, Belgium
- Christian Utpatel: Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- Daniel Yero: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain; Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
- Isidre Gibert: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain; Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
- Xavier Daura: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain; Catalan Institution for Research and Advanced Studies, Barcelona, Spain
- Michael Kresken: Antiinfectives Intelligence GmbH, Rheinbach, Germany; Rheinische Fachhochschule Köln gGmbH, Cologne, Germany
- Tjip S van der Werf: Department of Pulmonary Diseases & Tuberculosis, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Ifey Alio: Department of Microbiology and Biotechnology, University of Hamburg, Hamburg, Germany
- Wolfgang R Streit: Department of Microbiology and Biotechnology, University of Hamburg, Hamburg, Germany
- Kai Zhou: Shenzhen Institute of Respiratory Diseases, the First Affiliated Hospital (Shenzhen People's Hospital), Southern University of Science and Technology, Shenzhen, China; Second Clinical Medical College, Jinan University, Shenzhen, China
- Thomas Schwartz: Karlsruhe Institute of Technology, Institute of Functional Interfaces, Eggenstein-Leopoldshafen, Germany
- John W A Rossen: Department of Medical Microbiology and Infection Prevention, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Maha R Farhat: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Pulmonary and Critical Care, Massachusetts General Hospital, Boston, MA, USA
- Ulrich E Schaible: Cellular Microbiology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), partner site Hamburg - Lübeck - Borstel - Riems, Cologne, Germany; Leibniz Research Alliance INFECTIONS'21, Cologne, Germany
- Ulrich Nübel: Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany; Leibniz Research Alliance INFECTIONS'21, Cologne, Germany; German Center for Infection Research (DZIF), partner site Hannover - Braunschweig, Cologne, Germany; Braunschweig Integrated Center of Systems Biology (BRICS), Technical University, Braunschweig, Germany
- Jan Rupp: Department of Infectious Diseases and Microbiology, University Hospital Schleswig-Holstein, Lübeck, Germany; German Center for Infection Research (DZIF), partner site Hamburg - Lübeck - Borstel - Riems, Cologne, Germany
- Joerg Steinmann: Institute of Medical Microbiology, University Medical Center Essen, Essen, Germany; Medical Microbiology and Infection Prevention, Institute of Clinical Hygiene, Paracelsus Medical Private University, Klinikum Nürnberg, Nuremberg, Germany
- Stefan Niemann: Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), partner site Hamburg - Lübeck - Borstel - Riems, Cologne, Germany; Leibniz Research Alliance INFECTIONS'21, Cologne, Germany
- Thomas A Kohl: Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany; German Center for Infection Research (DZIF), partner site Hamburg - Lübeck - Borstel - Riems, Cologne, Germany