1
Podda M, Bonechi S, Palladino A, Scaramuzzino M, Brozzi A, Roma G, Muzzi A, Priami C, Sîrbu A, Bodini M. Classification of Neisseria meningitidis genomes with a bag-of-words approach and machine learning. iScience 2024; 27:109257. [PMID: 38439962 PMCID: PMC10910294 DOI: 10.1016/j.isci.2024.109257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 12/13/2023] [Accepted: 02/13/2024] [Indexed: 03/06/2024] Open
Abstract
Whole genome sequencing of bacteria is important to enable strain classification. Using entire genomes as an input to machine learning (ML) models would allow rapid classification of strains while using information from multiple genetic elements. We developed a "bag-of-words" approach to encode, using SentencePiece or k-mer tokenization, entire bacterial genomes and analyze these with ML. Initial model selection identified SentencePiece with 8,000 and 32,000 words as the best approach for genome tokenization. We then classified in Neisseria meningitidis genomes the capsule B group genotype with 99.6% accuracy and the multifactor invasive phenotype with 90.2% accuracy, in an independent test set. Subsequently, in silico knockouts of 2,808 genes confirmed that the ML model predictions aligned with our current understanding of the underlying biology. To our knowledge, this is the first ML method using entire bacterial genomes to classify strains and identify genes considered relevant by the classifier.
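The k-mer variant of the bag-of-words encoding described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy sequences, the value of k, and the helper names `kmer_tokens` and `bag_of_words` are assumptions made for illustration, and the SentencePiece tokenization the study also evaluates is not shown.

```python
from collections import Counter


def kmer_tokens(sequence, k):
    """Return all overlapping k-mers of a nucleotide sequence (hypothetical helper)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def bag_of_words(sequence, k, vocabulary):
    """Count how often each vocabulary word occurs in the sequence, ignoring order."""
    counts = Counter(kmer_tokens(sequence, k))
    return [counts[word] for word in vocabulary]


# Toy sequences standing in for whole genomes (illustrative only).
genomes = {"toy_a": "ACGTACGTAC", "toy_b": "TTTTACGTTT"}
k = 3  # assumed word length; the abstract does not state the k used

# The vocabulary is the sorted set of k-mers seen across all sequences.
vocabulary = sorted({t for s in genomes.values() for t in kmer_tokens(s, k)})

# One fixed-length count vector per genome, usable as classifier input.
vectors = {name: bag_of_words(seq, k, vocabulary) for name, seq in genomes.items()}

for name, vec in vectors.items():
    print(name, dict(zip(vocabulary, vec)))
```

Each genome becomes a fixed-length vector of word counts regardless of its length, which is what allows a standard classifier to consume whole genomes as input.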
Affiliation(s)
- Marco Podda
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
- Simone Bonechi
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
  - Department of Computer Science, University of Pisa, 56127 Pisa, Italy
- Andrea Palladino
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
- Alessandro Brozzi
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
- Guglielmo Roma
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
- Alessandro Muzzi
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
- Corrado Priami
  - Department of Computer Science, University of Pisa, 56127 Pisa, Italy
- Alina Sîrbu
  - Department of Computer Science, University of Pisa, 56127 Pisa, Italy
- Margherita Bodini
  - Vaccines Discovery Data Sciences, GSK Vaccines, GSK, 53100 Siena, Italy
2
Deschênes T, Tohoundjona FWE, Plante PL, Di Marzo V, Raymond F. Gene-based microbiome representation enhances host phenotype classification. mSystems 2023; 8:e0053123. [PMID: 37404032 PMCID: PMC10469787 DOI: 10.1128/msystems.00531-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 05/24/2023] [Indexed: 07/06/2023] Open
Abstract
With the concomitant advances in both the microbiome and machine learning fields, the gut microbiome has become of great interest for the potential discovery of biomarkers to be used in the classification of the host health status. Shotgun metagenomics data derived from the human microbiome is composed of a high-dimensional set of microbial features. The use of such complex data for the modeling of host-microbiome interactions remains a challenge as retaining de novo content yields a highly granular set of microbial features. In this study, we compared the prediction performances of machine learning approaches according to different types of data representations derived from shotgun metagenomics. These representations include commonly used taxonomic and functional profiles and the more granular gene cluster approach. For the five case-control datasets used in this study (Type 2 diabetes, obesity, liver cirrhosis, colorectal cancer, and inflammatory bowel disease), gene-based approaches, whether used alone or in combination with reference-based data types, yielded classification performance similar to or better than that of the taxonomic and functional profiles. In addition, we show that using subsets of gene families from specific functional categories of genes highlights the importance of these functions on the host phenotype. This study demonstrates that both reference-free microbiome representations and curated metagenomic annotations can provide relevant representations for machine learning based on metagenomic data.
IMPORTANCE Data representation is an essential part of machine learning performance when using metagenomic data. In this work, we show that different microbiome representations provide varied host phenotype classification performance depending on the dataset. In classification tasks, untargeted microbiome gene content can provide similar or improved classification compared to taxonomical profiling. Feature selection based on biological function also improves classification performance for some pathologies. Function-based feature selection combined with interpretable machine learning algorithms can generate new hypotheses that can potentially be assayed mechanistically. This work thus proposes new approaches to represent microbiome data for machine learning that can potentiate the findings associated with metagenomic data.
Affiliation(s)
- Thomas Deschênes
  - Centre Nutrition, Santé et Société (NUTRISS) – Institut sur la Nutrition et les Aliments Fonctionnels (INAF), Université Laval, Québec, Canada
  - Canada Research Excellence Chair on the Microbiome-Endocannabinoidome Axis in Metabolic Health (CERC-MEND), Quebec City, Quebec, Canada
  - Institut Intelligence et Données, Université Laval, Québec, Canada
- Fred Wilfried Elom Tohoundjona
  - Centre Nutrition, Santé et Société (NUTRISS) – Institut sur la Nutrition et les Aliments Fonctionnels (INAF), Université Laval, Québec, Canada
  - Canada Research Excellence Chair on the Microbiome-Endocannabinoidome Axis in Metabolic Health (CERC-MEND), Quebec City, Quebec, Canada
- Pier-Luc Plante
  - Centre Nutrition, Santé et Société (NUTRISS) – Institut sur la Nutrition et les Aliments Fonctionnels (INAF), Université Laval, Québec, Canada
  - Canada Research Excellence Chair on the Microbiome-Endocannabinoidome Axis in Metabolic Health (CERC-MEND), Quebec City, Quebec, Canada
  - Institut Intelligence et Données, Université Laval, Québec, Canada
- Vincenzo Di Marzo
  - Centre Nutrition, Santé et Société (NUTRISS) – Institut sur la Nutrition et les Aliments Fonctionnels (INAF), Université Laval, Québec, Canada
  - Canada Research Excellence Chair on the Microbiome-Endocannabinoidome Axis in Metabolic Health (CERC-MEND), Quebec City, Quebec, Canada
  - École de nutrition, Faculté des sciences de l’agriculture et de l’alimentation (FSAA), Université Laval, Québec, Canada
  - Centre de recherche de l’Institut universitaire de cardiologie et de pneumologie de Québec (IUCPQ), Québec, Canada
  - Département de médecine, Faculté de Médecine, Université Laval, Québec, Canada
  - Joint International Unit on Chemical and Biomolecular Research on the Microbiome and its Impact on Metabolic Health and Nutrition (UMI-MicroMeNu), Quebec City, Canada
- Frédéric Raymond
  - Centre Nutrition, Santé et Société (NUTRISS) – Institut sur la Nutrition et les Aliments Fonctionnels (INAF), Université Laval, Québec, Canada
  - Canada Research Excellence Chair on the Microbiome-Endocannabinoidome Axis in Metabolic Health (CERC-MEND), Quebec City, Quebec, Canada
  - Institut Intelligence et Données, Université Laval, Québec, Canada
  - École de nutrition, Faculté des sciences de l’agriculture et de l’alimentation (FSAA), Université Laval, Québec, Canada
3
Abed JY, Godon T, Mehdaoui F, Plante PL, Boissinot M, Bergeron MG, Bélanger RE, Muckle G, Poliakova N, Ayotte P, Corbeil J, Rousseau E. Gut metagenome profile of the Nunavik Inuit youth is distinct from industrial and non-industrial counterparts. Commun Biol 2022; 5:1415. [PMID: 36566300 PMCID: PMC9790006 DOI: 10.1038/s42003-022-04372-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2022] [Accepted: 12/13/2022] [Indexed: 12/25/2022] Open
Abstract
Comparative metagenomics studies have highlighted differences in microbiome community structure among human populations over diverse lifestyles and environments. With their unique environmental and historical backgrounds, Nunavik Inuit have a distinctive gut microbiome with undocumented health-related implications. Using shotgun metagenomics, we explored the taxonomic and functional structure of the gut microbiome from 275 Nunavik Inuit aged 16 to 30 years. Whole-metagenome analyses revealed that Nunavik Inuit youths have a more diverse microbiome than their non-industrialized and industrialized counterparts. A comparison of k-mer content illustrated the uniqueness of the Nunavik gut microbiome. Short-chain fatty acid-producing species and carbohydrate degradation pathways dominated Inuit metagenomes. Using a random forest classifier, we identified a taxonomic and functional signature unique to the Nunavik gut microbiome that contrasts with other populations. Here, we show that the Nunavik Inuit gut microbiome exhibits high diversity and a distinct community structure.
Affiliation(s)
- Jehane Y. Abed
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
  - Centre de Recherche en Données Massives de l’Université Laval, Québec City, QC Canada
  - Département de microbiologie-infectiologie et d’immunologie, Faculté de médecine, Université Laval, Québec City, QC Canada
- Thibaud Godon
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
  - Centre de Recherche en Données Massives de l’Université Laval, Québec City, QC Canada
- Fadwa Mehdaoui
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
  - Département d’informatique et génie logiciel, Université Laval, Québec City, QC Canada
  - Centre Nutrition, Santé et Société (NUTRISS), Institute of Nutrition and Functional Foods (INAF), Université Laval, Québec City, QC Canada
- Pier-Luc Plante
  - Centre Nutrition, Santé et Société (NUTRISS), Institute of Nutrition and Functional Foods (INAF), Université Laval, Québec City, QC Canada
- Maurice Boissinot
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
- Michel G. Bergeron
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
  - Département de microbiologie-infectiologie et d’immunologie, Faculté de médecine, Université Laval, Québec City, QC Canada
- Richard E. Bélanger
  - Axe santé des populations et pratiques optimales en santé, Centre de recherche du CHU de Québec-Université Laval, Hôpital du Saint-Sacrement, Québec City, QC Canada
  - Département de pédiatrie, Faculté de médecine, Université Laval, Québec City, QC Canada
  - Centre mère-enfant Soleil, CHU de Québec-Université Laval, Département de pédiatrie, Québec City, QC Canada
- Gina Muckle
  - Axe santé des populations et pratiques optimales en santé, Centre de recherche du CHU de Québec-Université Laval, Hôpital du Saint-Sacrement, Québec City, QC Canada
  - École de psychologie, Faculté des sciences sociales, Université Laval, Québec City, QC Canada
- Natalia Poliakova
  - Axe santé des populations et pratiques optimales en santé, Centre de recherche du CHU de Québec-Université Laval, Hôpital du Saint-Sacrement, Québec City, QC Canada
- Pierre Ayotte
  - Axe santé des populations et pratiques optimales en santé, Centre de recherche du CHU de Québec-Université Laval, Hôpital du Saint-Sacrement, Québec City, QC Canada
  - Centre de Toxicologie du Québec, Institut national de santé publique du Québec (INSPQ), Québec City, QC Canada
  - Département de médecine sociale et préventive, Faculté de médecine, Université Laval, Québec City, QC Canada
- Jacques Corbeil
  - Centre de Recherche en Infectiologie de l’Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City, QC Canada
  - Centre de Recherche en Données Massives de l’Université Laval, Québec City, QC Canada
  - Département de Médecine Moléculaire, Faculté de médecine, Université Laval, Québec City, QC Canada
- Elsa Rousseau
  - Centre de Recherche en Données Massives de l’Université Laval, Québec City, QC Canada
  - Département d’informatique et génie logiciel, Université Laval, Québec City, QC Canada
  - Centre Nutrition, Santé et Société (NUTRISS), Institute of Nutrition and Functional Foods (INAF), Université Laval, Québec City, QC Canada
4
George PBL, Rossi F, St-Germain MW, Amato P, Badard T, Bergeron MG, Boissinot M, Charette SJ, Coleman BL, Corbeil J, Culley AI, Gaucher ML, Girard M, Godbout S, Kirychuk SP, Marette A, McGeer A, O’Shaughnessy PT, Parmley EJ, Simard S, Reid-Smith RJ, Topp E, Trudel L, Yao M, Brassard P, Delort AM, Larios AD, Létourneau V, Paquet VE, Pedneau MH, Pic É, Thompson B, Veillette M, Thaler M, Scapino I, Lebeuf M, Baghdadi M, Castillo Toro A, Cayouette AB, Dubois MJ, Durocher AF, Girard SB, Diaz AKC, Khalloufi A, Leclerc S, Lemieux J, Maldonado MP, Pilon G, Murphy CP, Notling CA, Ofori-Darko D, Provencher J, Richer-Fortin A, Turgeon N, Duchaine C. Antimicrobial Resistance in the Environment: Towards Elucidating the Roles of Bioaerosols in Transmission and Detection of Antibacterial Resistance Genes. Antibiotics (Basel) 2022; 11:974. [PMID: 35884228 PMCID: PMC9312183 DOI: 10.3390/antibiotics11070974] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/30/2022] [Revised: 06/30/2022] [Accepted: 07/15/2022] [Indexed: 02/01/2023] Open
Abstract
Antimicrobial resistance (AMR) is continuing to grow across the world. Though often thought of as a mostly public health issue, AMR is also a major agricultural and environmental problem. As such, many researchers refer to it as the preeminent One Health issue. Aerial transport of antimicrobial-resistant bacteria via bioaerosols is still poorly understood. Recent work has highlighted the presence of antibiotic resistance genes in bioaerosols. Emissions of AMR bacteria and genes have been detected from various sources, including wastewater treatment plants, hospitals, and agricultural practices; however, their impacts on the broader environment are poorly understood. Contextualizing the roles of bioaerosols in the dissemination of AMR necessitates a multidisciplinary approach. Environmental factors, industrial and medical practices, as well as ecological principles influence the aerial dissemination of resistant bacteria. This article introduces an ongoing project assessing the presence and fate of AMR in bioaerosols across Canada. Its various sub-studies include the assessment of the emissions of antibiotic resistance genes from many agricultural practices, their long-distance transport, new integrative methods of assessment, and the creation of dissemination models over short and long distances. Results from sub-studies are beginning to be published. Consequently, this paper explains the background behind the development of the various sub-studies and highlights their shared aspects.
Affiliation(s)
- Paul B. L. George
  - Département de Médecine Moléculaire, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
- Florent Rossi
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Institut de Chimie de Clermont-Ferrand, SIGMA Clermont, CNRS, Université Clermont-Auvergne, 63178 Clermont-Ferrand, France
- Magali-Wen St-Germain
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Pierre Amato
  - Institut de Chimie de Clermont-Ferrand, SIGMA Clermont, CNRS, Université Clermont-Auvergne, 63178 Clermont-Ferrand, France
- Thierry Badard
  - Centre de Recherche en Données et Intelligence Géospatiales (CRDIG), Quebec City, QC G1V 0A6, Canada
- Michel G. Bergeron
  - Centre de Recherche en Infectiologie, Centre de Recherche du CHU de Québec-Université Laval, Axe Maladies Infectieuses et Immunitaires, Quebec City, QC G1V 4G2, Canada
- Maurice Boissinot
  - Centre de Recherche en Infectiologie, Centre de Recherche du CHU de Québec-Université Laval, Axe Maladies Infectieuses et Immunitaires, Quebec City, QC G1V 4G2, Canada
- Steve J. Charette
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
- Brenda L. Coleman
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
- Jacques Corbeil
  - Département de Médecine Moléculaire, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche en Infectiologie, Centre de Recherche du CHU de Québec-Université Laval, Axe Maladies Infectieuses et Immunitaires, Quebec City, QC G1V 4G2, Canada
- Alexander I. Culley
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
- Marie-Lou Gaucher
  - Research Chair in Meat Safety, Département de Pathologie et Microbiologie, Université de Montréal, Saint-Hyacinthe, QC J2S 2M2, Canada
- Stéphane Godbout
  - Institut de Recherche et de Développement en Agroenvironnement (IRDA), Quebec City, QC G1P 3W8, Canada
  - Département des Sols et de Génie Agroalimentaire, Université Laval, Quebec City, QC G1V 0A6, Canada
- Shelley P. Kirychuk
  - Department of Medicine, University of Saskatchewan, Saskatoon, SK S7N 0X8, Canada
- André Marette
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
  - Institut sur la Nutrition et les Aliments Fonctionnels, Université Laval, Quebec City, QC G1V 0A6, Canada
- Allison McGeer
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON M5S 1A8, Canada
- Patrick T. O’Shaughnessy
  - Department of Occupational and Environmental Health, The University of Iowa, Iowa City, IA 52246, USA
- E. Jane Parmley
  - Canadian Wildlife Health Cooperative, University of Guelph, Guelph, ON N1G 2W1, Canada
  - Department of Population Medicine, University of Guelph, Guelph, ON N1G 2W1, Canada
- Serge Simard
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Richard J. Reid-Smith
  - Department of Population Medicine, University of Guelph, Guelph, ON N1G 2W1, Canada
  - Centre for Foodborne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Guelph, ON N1G 3W4, Canada
- Edward Topp
  - Agriculture and Agri-Food Canada, London Research and Development Centre, London, ON N5V 4T3, Canada
  - Department of Biology, The University of Western Ontario, London, ON N6A 5B7, Canada
- Luc Trudel
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
- Maosheng Yao
  - State Key Joint Laboratory of Environmental Simulation and Pollution Control, College of Environmental Sciences and Engineering, Peking University, Beijing 100871, China
- Patrick Brassard
  - Département des Sols et de Génie Agroalimentaire, Université Laval, Quebec City, QC G1V 0A6, Canada
- Anne-Marie Delort
  - Institut de Chimie de Clermont-Ferrand, SIGMA Clermont, CNRS, Université Clermont-Auvergne, 63178 Clermont-Ferrand, France
- Araceli D. Larios
  - Institut de Recherche et de Développement en Agroenvironnement (IRDA), Quebec City, QC G1P 3W8, Canada
  - Tecnológico Nacional de México/ITS de Perote, Perote 91270, Mexico
- Valérie Létourneau
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Valérie E. Paquet
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
- Marie-Hélène Pedneau
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Émilie Pic
  - Centre de Recherche en Infectiologie, Centre de Recherche du CHU de Québec-Université Laval, Axe Maladies Infectieuses et Immunitaires, Quebec City, QC G1V 4G2, Canada
- Brooke Thompson
  - Department of Medicine, University of Saskatchewan, Saskatoon, SK S7N 0X8, Canada
- Marc Veillette
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Mary Thaler
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
- Ilaria Scapino
  - Département de Médecine Moléculaire, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Maria Lebeuf
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Mahsa Baghdadi
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Alejandra Castillo Toro
  - Department of Medicine, University of Saskatchewan, Saskatoon, SK S7N 0X8, Canada
- Amélia Bélanger Cayouette
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Marie-Julie Dubois
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
  - Institut sur la Nutrition et les Aliments Fonctionnels, Université Laval, Quebec City, QC G1V 0A6, Canada
- Alicia F. Durocher
  - Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada
  - Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
| | - Sarah B. Girard
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
| | - Andrea Katherín Carranza Diaz
- Institut de Recherche et de Développement en Agroenvironnement (IRDA), Quebec City, QC G1P 3W8, Canada; (S.G.); (A.D.L.); (A.K.C.D.)
- Département des Sols et de Génie Agroalimentaire, Université Laval, Quebec City, QC G1V 0A6, Canada;
| | - Asmaâ Khalloufi
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Research Chair in Meat Safety, Département de Pathologie et Microbiologie, Université de Montréal, Saint-Hyacinthe, QC J2S 2M2, Canada;
| | - Samantha Leclerc
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
| | - Joanie Lemieux
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
- Centre de Recherche en Infectiologie, Centre de Recherche du CHU de Québec-Université Laval, Axe Maladies Infectieuses et Immunitaires, Quebec City, QC G1V 4G2, Canada; (M.G.B.); (M.B.); (É.P.)
| | - Manuel Pérez Maldonado
- Department of Population Medicine, University of Guelph, Guelph, ON N1G 2W1, Canada; (R.J.R.-S.); (M.P.M.)
| | - Geneviève Pilon
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Colleen P. Murphy
- Centre for Foodborne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Guelph, ON N1G 3W4, Canada; (C.P.M.); (D.O.-D.)
| | - Charly A. Notling
- Department of Medicine, University of Saskatchewan, Saskatoon, SK S7N 0X8, Canada; (S.P.K.); (B.T.); (A.C.T.); (C.A.N.)
| | - Daniel Ofori-Darko
- Centre for Foodborne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Guelph, ON N1G 3W4, Canada; (C.P.M.); (D.O.-D.)
| | - Juliette Provencher
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Quebec City, QC G1V 0A6, Canada
| | - Annabelle Richer-Fortin
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
| | - Nathalie Turgeon
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
| | - Caroline Duchaine
- Département de Biochimie, de Microbiologie et de Bio-Informatique, Université Laval, Quebec City, QC G1V 0A6, Canada; (F.R.); (M.-W.S.-G.); (S.J.C.); (A.I.C.); (L.T.); (V.E.P.); (M.T.); (M.B.); (A.B.C.); (A.F.D.); (S.B.G.); (A.K.); (S.L.); (J.L.); (J.P.); (A.R.-F.)
- Centre de Recherche de L’Institut Universitaire de Cardiologie et de Pneumologie de Québec, Quebec City, QC G1V 4G5, Canada; (A.M.); (S.S.); (V.L.); (M.-H.P.); (M.V.); (M.L.); (M.-J.D.); (G.P.); (N.T.)
5
Jungblut AD, Raymond F, Dion MB, Moineau S, Mohit V, Nguyen GQ, Déraspe M, Francovic-Fontaine É, Lovejoy C, Culley AI, Corbeil J, Vincent WF. Genomic diversity and CRISPR-Cas systems in the cyanobacterium Nostoc in the High Arctic. Environ Microbiol 2021; 23:2955-2968. [PMID: 33760341 DOI: 10.1111/1462-2920.15481] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/14/2020] [Accepted: 03/22/2021] [Indexed: 11/27/2022]
Abstract
Nostoc (Nostocales, Cyanobacteria) has a global distribution in the Polar Regions. However, the genomic diversity of Nostoc is little known and there are no genomes available for polar Nostoc. Here we carried out the first genomic analysis of the Nostoc commune morphotype with a recent sample from the High Arctic and a herbarium specimen collected during the British Arctic Expedition (1875-76). Comparisons of the polar genomes with 26 present-day non-polar members of the Nostocales family highlighted that there are pronounced genetic variations among Nostoc strains and species. Osmoprotection and other stress genes were found in all Nostoc strains, but the two Arctic strains had markedly higher numbers of biosynthetic gene clusters for uncharacterised non-ribosomal peptide synthetases, suggesting a high diversity of secondary metabolites. Since viral-host interactions contribute to microbial diversity, we analysed the CRISPR-Cas systems in the Arctic and two temperate Nostoc species. There were a large number of unique repeat-spacer arrays in each genome, indicating diverse histories of viral attack. All Nostoc strains had a subtype I-D system, but the polar specimens also showed evidence of a subtype I-B system that has not been previously reported in cyanobacteria, suggesting diverse cyanobacteria-virus interactions in the Arctic.
Affiliation(s)
- Anne D Jungblut
- Life Sciences Department, Natural History Museum, Cromwell Road, London, SW7 5BD, UK
| | - Frédéric Raymond
- Department of Molecular Medicine and Big Data Research Centre, Université Laval, Quebec, QC, G1V 0A6, Canada.,School of Nutrition and Institute on Nutrition and Functional Foods, Université Laval, Québec City, QC, G1V 0A6, Canada
| | - Moïra B Dion
- Département de Biochimie, de Microbiologie et de Bio-informatique, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Groupe de Recherche en Écologie Buccale, Université Laval, Quebec City, QC, G1V 0A6, Canada
| | - Sylvain Moineau
- Département de Biochimie, de Microbiologie et de Bio-informatique, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Groupe de Recherche en Écologie Buccale, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Félix d'Hérelle Reference Center for Bacterial Viruses, Université Laval, Quebec City, QC, G1V 0A6, Canada
| | - Vani Mohit
- Centre for Northern Studies (CEN), Université Laval, Quebec City, QC, G1V 0A6, Canada.,Takuvik Joint International Laboratory and Institute of Integrative Biology and Systems, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Département de Biologie, Université Laval, Quebec City, QC, G1V 0A6, Canada
| | - Guillaume Quang Nguyen
- School of Nutrition and Institute on Nutrition and Functional Foods, Université Laval, Québec City, QC, G1V 0A6, Canada
| | - Maxime Déraspe
- Department of Molecular Medicine and Big Data Research Centre, Université Laval, Quebec, QC, G1V 0A6, Canada
| | - Élina Francovic-Fontaine
- Department of Molecular Medicine and Big Data Research Centre, Université Laval, Quebec, QC, G1V 0A6, Canada
| | - Connie Lovejoy
- Takuvik Joint International Laboratory and Institute of Integrative Biology and Systems, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Département de Biologie, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Québec-Océan, Université Laval, Quebec City, QC, G1V 0A6, Canada
| | - Alexander I Culley
- Département de Biochimie, de Microbiologie et de Bio-informatique, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Groupe de Recherche en Écologie Buccale, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Centre for Northern Studies (CEN), Université Laval, Quebec City, QC, G1V 0A6, Canada.,Takuvik Joint International Laboratory and Institute of Integrative Biology and Systems, Université Laval, Quebec City, QC, G1V 0A6, Canada
| | - Jacques Corbeil
- Department of Molecular Medicine and Big Data Research Centre, Université Laval, Quebec, QC, G1V 0A6, Canada
| | - Warwick F Vincent
- Centre for Northern Studies (CEN), Université Laval, Quebec City, QC, G1V 0A6, Canada.,Takuvik Joint International Laboratory and Institute of Integrative Biology and Systems, Université Laval, Quebec City, QC, G1V 0A6, Canada.,Département de Biologie, Université Laval, Quebec City, QC, G1V 0A6, Canada
6
Bize A, Midoux C, Mariadassou M, Schbath S, Forterre P, Da Cunha V. Exploring short k-mer profiles in cells and mobile elements from Archaea highlights the major influence of both the ecological niche and evolutionary history. BMC Genomics 2021; 22:186. [PMID: 33726663 PMCID: PMC7962313 DOI: 10.1186/s12864-021-07471-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/13/2020] [Accepted: 02/24/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: K-mer-based methods have greatly advanced in recent years, largely driven by the realization of their biological significance and by the advent of next-generation sequencing. Their speed and their independence from the annotation process are major advantages. Their utility in the study of the mobilome has recently emerged and they seem a priori adapted to the patchy gene distribution and the lack of universal marker genes of viruses and plasmids. To provide a framework for the interpretation of results from k-mer based methods applied to archaea or their mobilome, we analyzed the 5-mer DNA profiles of close to 600 archaeal cells, viruses and plasmids. Archaea is one of the three domains of life. Archaea seem enriched in extremophiles and are associated with a high diversity of viral and plasmid families, many of which are specific to this domain. We explored the dataset structure by multivariate and statistical analyses, seeking to identify the underlying factors.
RESULTS: For cells, the 5-mer profiles were inconsistent with the phylogeny of archaea. At a finer taxonomic level, the influence of the taxonomy and the environmental constraints on 5-mer profiles was very strong. These two factors were interdependent to a significant extent, and the respective weights of their contributions varied according to the clade. A convergent adaptation was observed for the class Halobacteria, for which a strong 5-mer signature was identified. For mobile elements, coevolution with the host had a clear influence on their 5-mer profile. This enabled us to identify one previously known and one new case of recent host transfer based on the atypical composition of the mobile elements involved. Beyond the effect of coevolution, extrachromosomal elements strikingly retain the specific imprint of their own viral or plasmid taxonomic family in their 5-mer profile.
CONCLUSION: This specific imprint confirms that the evolution of extrachromosomal elements is driven by multiple parameters and is not restricted to host adaptation. In addition, we detected only recent host transfer events, suggesting the fast evolution of short k-mer profiles. This calls for caution when using k-mers for host prediction, metagenomic binning or phylogenetic reconstruction.
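As an illustration of what a "5-mer DNA profile" can look like in practice, the following is a minimal Python sketch, not the authors' pipeline. It assumes a single-strand count over the alphabet ACGT, skips windows containing other symbols, and normalizes to relative frequencies; the function name `kmer_profile` is hypothetical, and the abstract does not say whether reverse complements were merged or how profiles were normalized.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=5):
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Assumption: single strand only, windows with non-ACGT symbols ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all 4**k possible k-mers so every profile has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}
```

Profiles built this way are fixed-length vectors (1,024 entries for k = 5), which is what makes them usable as input to multivariate analyses.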
Affiliation(s)
- Ariane Bize
- Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France
| | - Cédric Midoux
- Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France.,Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
| | - Mahendra Mariadassou
- Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
| | - Sophie Schbath
- Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France.,Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
| | - Patrick Forterre
- Institut Pasteur, Unité de Virologie des Archées, Département de Microbiologie, 25 Rue du Docteur Roux, 75015, Paris, France.,Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Violette Da Cunha
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
7
Hornischer K, Khaledi A, Pohl S, Schniederjans M, Pezoldt L, Casilag F, Muthukumarasamy U, Bruchmann S, Thöming J, Kordes A, Häussler S. BACTOME-a reference database to explore the sequence- and gene expression-variation landscape of Pseudomonas aeruginosa clinical isolates. Nucleic Acids Res 2020; 47:D716-D720. [PMID: 30272193 PMCID: PMC6324029 DOI: 10.1093/nar/gky895] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 08/14/2018] [Accepted: 09/21/2018] [Indexed: 12/26/2022] Open
Abstract
Extensive use of next-generation sequencing (NGS) for pathogen profiling has the potential to transform our understanding of how genomic plasticity contributes to phenotypic versatility. However, the storage of large amounts of NGS data and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We introduce BACTOME as a database system that links aligned DNA- and RNA-sequencing reads of clinical Pseudomonas aeruginosa isolates with clinically relevant pathogen phenotypes. The database allows data extraction for any single isolate, gene or phenotype as well as data filtering and phenotypic grouping for specific research questions. With the integration of statistical tools we illustrate the usefulness of a relational database structure for the identification of phenotype-genotype correlations as an essential part of the discovery pipeline in genomic research. Furthermore, the database provides a compilation of DNA sequences and gene expression values of a plethora of clinical isolates to give a consensus DNA sequence and consensus gene expression signature. Deviations from the consensus thereby describe the genomic landscape and the transcriptional plasticity of the species P. aeruginosa. The database is available at https://bactome.helmholtz-hzi.de.
Affiliation(s)
- Klaus Hornischer
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany.,Molecular Health GmbH, D-69115 Heidelberg, Germany
| | - Ariane Khaledi
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Sarah Pohl
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Monika Schniederjans
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Lorena Pezoldt
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Fiordiligie Casilag
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Uthayakumar Muthukumarasamy
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Sebastian Bruchmann
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany.,Pathogen Genomics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
| | - Janne Thöming
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Adrian Kordes
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
| | - Susanne Häussler
- Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.,Institute of Molecular Bacteriology, TWINCORE GmbH, Center for Clinical and Experimental Infection Research, D-30625 Hannover, Germany
8
Perkins V, Vignola S, Lessard MH, Plante PL, Corbeil J, Dugat-Bony E, Frenette M, Labrie S. Phenotypic and Genetic Characterization of the Cheese Ripening Yeast Geotrichum candidum. Front Microbiol 2020; 11:737. [PMID: 32457706 PMCID: PMC7220993 DOI: 10.3389/fmicb.2020.00737] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/25/2019] [Accepted: 03/30/2020] [Indexed: 01/04/2023] Open
Abstract
The yeast Geotrichum candidum (teleomorph Galactomyces candidus) is inoculated onto mold- and smear-ripened cheeses and plays several roles during cheese ripening. Its ability to metabolize proteins, lipids, and organic acids enables its growth on the cheese surface and promotes the development of organoleptic properties. Recent multilocus sequence typing (MLST) and phylogenetic analyses of G. candidum isolates revealed substantial genetic diversity, which may explain its strain-dependent technological capabilities. Here, we aimed to shed light on the phenotypic and genetic diversity among eight G. candidum and three Galactomyces spp. strains of environmental and dairy origin. Phenotypic tests such as carbon assimilation profiles, the ability to grow at 35°C and morphological traits on agar plates allowed us to discriminate G. candidum from Galactomyces spp. The genomes of these isolates were sequenced and assembled; whole genome comparison clustered the G. candidum strains into three subgroups and provided a reliable reference for MLST scheme optimization. Using the whole genome sequence as a reference, we optimized an MLST scheme using six loci that were proposed in two previous MLST schemes. This new MLST scheme allowed us to identify 15 sequence types (STs) out of 41 strains and revealed three major complexes named GeoA, GeoB, and GeoC. The population structure of these 41 strains was evaluated with STRUCTURE and a NeighborNet analysis of the combined six loci, which revealed recombination events between and within the complexes. These results hint that the allele variation conferring the different STs arose from recombination events. Recombination occurred for the six housekeeping genes studied, but most likely occurred throughout the genome. These recombination events may have induced an adaptive divergence between the wild strains and the cheesemaking strains, as observed for other cheese ripening fungi.
Further comparative genomic studies are needed to confirm this phenomenon in G. candidum. In conclusion, the draft assembly of 11 G. candidum/Galactomyces spp. genomes allowed us to optimize a genotyping MLST scheme and, combined with the assessment of their ability to grow under different conditions, provides a reliable tool to cluster and eventually improve the selection of G. candidum strains.
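To make the notion of a sequence type concrete, here is a minimal Python sketch of how strains can be grouped into STs from their allele profiles. It is an illustration only, not the scheme from the paper: the function name `assign_sequence_types`, the strain names, and the allele numbers are all hypothetical, and it assumes each strain is already described by one allele number per locus (six loci in the scheme described above).

```python
def assign_sequence_types(allele_profiles):
    """Map each strain to a sequence type (ST).

    `allele_profiles` maps a strain name to its allele numbers, one per locus.
    Strains with identical allele combinations receive the same ST; STs are
    numbered in order of first appearance (an arbitrary convention).
    """
    st_of_profile = {}
    st_of_strain = {}
    for strain, profile in allele_profiles.items():
        key = tuple(profile)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_strain[strain] = st_of_profile[key]
    return st_of_strain


# Hypothetical allele numbers at six loci for three strains.
example = {
    "strain_a": (1, 2, 1, 1, 3, 1),
    "strain_b": (1, 2, 1, 1, 3, 1),
    "strain_c": (2, 2, 1, 1, 3, 1),
}
sequence_types = assign_sequence_types(example)
```

In this toy input, a single differing allele is enough to place `strain_c` in its own ST, which is why recombination at any one of the loci can generate new STs.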
Affiliation(s)
- Vincent Perkins
- Department of Food Sciences and Nutrition, STELA Dairy Research Center, Institute of Nutrition and Functional Foods, Université Laval, Quebec City, QC, Canada
| | - Stéphanie Vignola
- Department of Food Sciences and Nutrition, STELA Dairy Research Center, Institute of Nutrition and Functional Foods, Université Laval, Quebec City, QC, Canada
| | - Marie-Hélène Lessard
- Department of Food Sciences and Nutrition, STELA Dairy Research Center, Institute of Nutrition and Functional Foods, Université Laval, Quebec City, QC, Canada
| | - Pier-Luc Plante
- Big Data Research Center, Université Laval, Quebec City, QC, Canada
| | - Jacques Corbeil
- Big Data Research Center, Université Laval, Quebec City, QC, Canada
| | - Eric Dugat-Bony
- Department of Food Sciences and Nutrition, STELA Dairy Research Center, Institute of Nutrition and Functional Foods, Université Laval, Quebec City, QC, Canada
- Université Paris-Saclay, INRAE, AgroParisTech, UMR SayFood, Thiverval-Grignon, France
| | - Michel Frenette
- Oral Ecology Research Group, Faculty of Dental Medicine, Université Laval, Quebec City, QC, Canada
- Faculty of Science and Engineering, Department of Biochemistry, Microbiology, and Bioinformatics, Université Laval, Quebec City, QC, Canada
| | - Steve Labrie
- Department of Food Sciences and Nutrition, STELA Dairy Research Center, Institute of Nutrition and Functional Foods, Université Laval, Quebec City, QC, Canada
9
Liu Z, Deng D, Lu H, Sun J, Lv L, Li S, Peng G, Ma X, Li J, Li Z, Rong T, Wang G. Evaluation of Machine Learning Models for Predicting Antimicrobial Resistance of Actinobacillus pleuropneumoniae From Whole Genome Sequences. Front Microbiol 2020; 11:48. [PMID: 32117101 PMCID: PMC7016212 DOI: 10.3389/fmicb.2020.00048] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 05/28/2019] [Accepted: 01/10/2020] [Indexed: 01/05/2023] Open
Abstract
Antimicrobial resistance (AMR) is becoming a huge problem in countries all over the world, and new approaches to identifying strains resistant or susceptible to certain antibiotics are essential in fighting against antibiotic-resistant pathogens. Genotype-based machine learning methods have shown great promise as a diagnostic tool, due to the increasing availability of genomic datasets and antimicrobial susceptibility testing (AST) phenotypes. In this article, Support Vector Machine (SVM) and Set Covering Machine (SCM) models were used to learn and predict resistance to five drugs (Tetracycline, Ampicillin, Sulfisoxazole, Trimethoprim, and Enrofloxacin). The SVM model used the number of co-occurring k-mers between the genome of the isolates and the reference genes to learn and predict the phenotype of the bacteria with respect to a specific antimicrobial, while the SCM model used a greedy approach to construct a conjunction or disjunction of Boolean functions to find the most concise set of k-mers that allows for accurate prediction of the phenotype. Five-fold cross-validation was performed on the training set of the SVM and SCM models to select the best hyperparameter values and avoid model overfitting. The training accuracy (mean cross-validation score) and the testing accuracy of the SVM and SCM models for the five drugs were above 90%, regardless of whether the resistance mechanism was acquired resistance or a point mutation in the chromosome. The correlation between the phenotype and the model predictions for the five drugs indicated that both SVM and SCM models could significantly distinguish the resistant isolates from the sensitive isolates (p < 0.01), and could be used as potential tools in antimicrobial resistance surveillance and clinical diagnosis in veterinary medicine.
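The greedy construction of a conjunction of k-mer rules described above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the authors' code and not any particular SCM implementation: rules are presence or absence of a single k-mer, labels are 1 for resistant and 0 for susceptible, and each step picks the rule that excludes the most remaining susceptible genomes minus a penalty for each resistant genome it wrongly excludes. The names `train_conjunction`, `predict`, `penalty`, and `max_rules`, and the toy genomes, are hypothetical.

```python
def kmer_set(genome, k):
    """Set of all overlapping k-mers in a genome string."""
    return {genome[i:i + k] for i in range(len(genome) - k + 1)}


def train_conjunction(genomes, labels, k=4, penalty=1.0, max_rules=3):
    """Greedily build a conjunction of (kmer, must_be_present) rules.

    A genome is predicted resistant (1) only if every rule holds for it.
    """
    sets = [kmer_set(g, k) for g in genomes]
    all_kmers = sorted(set().union(*sets))
    negatives = {i for i, y in enumerate(labels) if y == 0}
    positives = {i for i, y in enumerate(labels) if y == 1}
    rules = []
    while negatives and len(rules) < max_rules:
        best, best_utility = None, 0.0
        for kmer in all_kmers:
            for present in (True, False):
                # Genomes for which this rule does NOT hold.
                fails = {i for i in range(len(sets)) if (kmer in sets[i]) != present}
                covered = negatives & fails   # susceptible genomes correctly excluded
                lost = positives & fails      # resistant genomes wrongly excluded
                utility = len(covered) - penalty * len(lost)
                if utility > best_utility:
                    best, best_utility = (kmer, present, covered, lost), utility
        if best is None:
            break  # no rule improves the model any further
        kmer, present, covered, lost = best
        rules.append((kmer, present))
        negatives -= covered
        positives -= lost
    return rules


def predict(rules, genome, k=4):
    """Return 1 if all rules hold for the genome, else 0."""
    kmers = kmer_set(genome, k)
    return int(all((kmer in kmers) == present for kmer, present in rules))


# Toy data: the two "resistant" genomes share the k-mer AAAA.
toy_genomes = ["AAAACCCC", "AAAAGGGG", "TTTTCCCC", "TTTTGGGG"]
toy_labels = [1, 1, 0, 0]
toy_rules = train_conjunction(toy_genomes, toy_labels)
```

The appeal of this family of models is that the result is a short, readable rule list; in the toy case a single rule, presence of AAAA, separates the two classes. In practice `penalty` and `max_rules` would be the hyperparameters chosen by cross-validation.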
Affiliation(s)
- Zhichang Liu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Dun Deng
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Huijie Lu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Jian Sun
- National Veterinary Microbiological Drug Resistance Risk Assessment Laboratory, College of Veterinary Medicine, South China Agricultural University, Guangzhou, China
| | - Luchao Lv
- National Veterinary Microbiological Drug Resistance Risk Assessment Laboratory, College of Veterinary Medicine, South China Agricultural University, Guangzhou, China
| | - Shuhong Li
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Guanghui Peng
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Xianyong Ma
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Jiazhou Li
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Zhenming Li
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Ting Rong
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
| | - Gang Wang
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China.,State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China.,Key Laboratory of Animal Nutrition and Feed Science of Ministry of Agriculture (South China), Guangzhou, China.,Guangdong Engineering Technology Research Center of Animal Meat Quality and Safety Control and Evaluation, Guangzhou, China
10
Biswas S, Noyce RS, Babiuk LA, Lung O, Bulach DM, Bowden TR, Boyle DB, Babiuk S, Evans DH. Extended sequencing of vaccine and wild-type capripoxvirus isolates provides insights into genes modulating virulence and host range. Transbound Emerg Dis 2019; 67:80-97. [PMID: 31379093 DOI: 10.1111/tbed.13322] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 06/18/2019] [Revised: 07/22/2019] [Accepted: 07/30/2019] [Indexed: 11/29/2022]
Abstract
The genus Capripoxvirus in the subfamily Chordopoxvirinae, family Poxviridae, comprises sheeppox virus (SPPV), goatpox virus (GTPV) and lumpy skin disease virus (LSDV), which cause the eponymous diseases across parts of Africa, the Middle East and Asia. These diseases cause significant economic losses and can have a devastating impact on the livelihoods and food security of small farm holders. So far, only live classically attenuated SPPV, GTPV and LSDV vaccines are commercially available and the history, safety and efficacy of many have not been well established. Here, we report 13 new capripoxvirus genome sequences, including the hairpin telomeres, from both pathogenic field isolates and vaccine strains. We have also updated the genome annotations to incorporate recent advances in our understanding of poxvirus biology. These new genomes and genes grouped phenetically with other previously sequenced capripoxvirus strains, and these new alignments collectively identified several recurring alterations in genes thought to modulate virulence and host range. In particular, some of the many large capripoxvirus ankyrin and kelch-like proteins are commonly mutated in vaccine strains, while the variola virus B22R-like gene homolog has also been disrupted in many vaccine isolates. Among these vaccine isolates, frameshift mutations are especially common and clearly present a risk of reversion to wild type in vaccines bearing these mutations. A consistent pattern of gene inactivation from LSDV to GTPV and then SPPV is also observed, much like the pattern of gene loss in orthopoxviruses, but, rather surprisingly, the overall genome size of ~150 kbp remains relatively constant. These data provide new insights into the evolution of capripoxviruses and the determinants of pathogenicity and host range. They will find application in the development of new vaccines with better safety, efficacy and trade profiles.
Affiliation(s)
- Siddhartha Biswas
- Department of Medical Microbiology and Immunology, Li Ka Shing Institute of Virology, University of Alberta, Edmonton, AB, Canada
| | - Ryan S Noyce
- Department of Medical Microbiology and Immunology, Li Ka Shing Institute of Virology, University of Alberta, Edmonton, AB, Canada
| | - Lorne A Babiuk
- Department of Agricultural, Food, and Nutritional Sciences, University of Alberta, Edmonton, AB, Canada
| | - Oliver Lung
- National Centre for Foreign Animal Disease (NCFAD), Canadian Food Inspection Agency, Winnipeg, MB, Canada
| | - Dieter M Bulach
- CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Vic., Australia
| | - Timothy R Bowden
- CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Vic., Australia
| | - David B Boyle
- CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Vic., Australia
| | - Shawn Babiuk
- National Centre for Foreign Animal Disease (NCFAD), Canadian Food Inspection Agency, Winnipeg, MB, Canada.,Department of Immunology, University of Manitoba, Winnipeg, MB, Canada
| | - David H Evans
- Department of Medical Microbiology and Immunology, Li Ka Shing Institute of Virology, University of Alberta, Edmonton, AB, Canada
11
Interpretable genotype-to-phenotype classifiers with performance guarantees. Sci Rep 2019; 9:4071. [PMID: 30858411 PMCID: PMC6411721 DOI: 10.1038/s41598-019-40561-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 10/05/2018] [Accepted: 02/19/2019] [Indexed: 01/15/2023] Open
Abstract
Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.