1
Pallen MJ, Ponsero AJ, Telatin A, Moss CJ, Baker D, Heavens D, Davidson GL. Faecal metagenomes of great tits and blue tits provide insights into host, diet, pathogens and microbial biodiversity. Access Microbiol 2025; 7:000910.v3. [PMID: 40302838 PMCID: PMC12038002 DOI: 10.1099/acmi.0.000910.v3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2024] [Accepted: 04/14/2025] [Indexed: 05/02/2025]
Abstract
Background. The vertebrate gut microbiome plays crucial roles in host health and disease. However, there is limited information on the microbiomes of wild birds, most of which is restricted to barcode sequences. We therefore explored the use of shotgun metagenomics on the faecal microbiomes of two wild bird species widely used as model organisms in ecological studies: the great tit (Parus major) and the Eurasian blue tit (Cyanistes caeruleus). Results. Short-read sequencing of five faecal samples generated a metagenomic dataset, revealing substantial variation in composition between samples. Reference-based profiling with Kraken2 identified key differences in the ratios of reads assigned to host, diet and microbes. Some samples showed high abundance of potential pathogens, including siadenoviruses, coccidian parasites and the antimicrobial-resistant bacterial species Serratia fonticola. From metagenome assemblies, we obtained complete mitochondrial genomes from the host species and from Isospora spp., while metagenome-assembled genomes documented new prokaryotic species. Conclusions. Here, we have shown the utility of shotgun metagenomics in uncovering microbial diversity beyond what is possible with 16S rRNA gene sequencing. These findings provide a foundation for future hypothesis testing and microbiome manipulation to improve fitness in wild bird populations. The study also highlights the potential role of wild birds in the dissemination of antimicrobial resistance.
Affiliation(s)
- Mark J. Pallen
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
  - University of East Anglia, Norwich Research Park, Norwich, UK
- Andrea Telatin
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
- Cara-Jane Moss
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
- David Baker
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
- Darren Heavens
  - Earlham Institute, Norwich Research Park, Norwich, Norfolk, NR4 7UZ UK
- Gabrielle L. Davidson
  - University of East Anglia, Norwich Research Park, Norwich, UK
  - University of Cambridge, Downing Street, Cambridge, CB2 3EB, UK
2
Virtanen S, Saqib S, Kanerva T, Ventin-Holmberg R, Nieminen P, Holster T, Kalliala I, Salonen A. Metagenome-validated combined amplicon sequencing and text mining-based annotations for simultaneous profiling of bacteria and fungi: vaginal microbiota and mycobiota in healthy women. Microbiome 2024; 12:273. [PMID: 39731160 DOI: 10.1186/s40168-024-01993-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 11/28/2024] [Indexed: 12/29/2024]
Abstract
BACKGROUND Amplicon sequencing of kingdom-specific tags such as 16S rRNA gene for bacteria and internal transcribed spacer (ITS) region for fungi are widely used for investigating microbial communities. So far most human studies have focused on bacteria while studies on host-associated fungi in health and disease have only recently started to accumulate. To enable cost-effective parallel analysis of bacterial and fungal communities in human and environmental samples, we developed a method where 16S rRNA gene and ITS1 amplicons were pooled together for a single Illumina MiSeq or HiSeq run and analysed after primer-based segregation. Taxonomic assignments were performed with Blast in combination with an iterative text-extraction-based filtration approach, which uses extensive literature records from public databases to select the most probable hits that were further validated by shotgun metagenomic sequencing. RESULTS Using 50 vaginal samples, we show that the combined run provides comparable results on bacterial composition and diversity to conventional 16S rRNA gene amplicon sequencing. The text-extraction-based taxonomic assignment-guided tool provided ecosystem-specific bacterial annotations that were confirmed by shotgun metagenomic sequencing (VIRGO, MetaPhlAn, Kraken2). Fungi were identified in 39/50 samples with ITS sequencing while in the metagenome data fungi largely remained undetected due to their low abundance and database issues. Co-abundance analysis of bacteria and fungi did not show strong between-kingdom correlations within the vaginal ecosystem of healthy women. CONCLUSION Combined amplicon sequencing for bacteria and fungi provides a simple and cost-effective method for simultaneous analysis of microbiota and mycobiota within the same samples. Conventional metagenomic sequencing does not provide sufficient fungal genome coverage for their reliable detection in vaginal samples. 
The text extraction-based annotation tool facilitates ecosystem-specific characterization and interpretation of microbial communities by coupling sequence homology to microbe metadata readily available through public databases.
Affiliation(s)
- Seppo Virtanen
  - Department of Obstetrics and Gynaecology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
- Schahzad Saqib
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
- Tinja Kanerva
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
  - Present Address: Research and Development, Kemira Oyj, Helsinki, Finland
- Rebecka Ventin-Holmberg
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
  - Folkhälsan Research Center, 00250, Helsinki, Finland
- Pekka Nieminen
  - Department of Obstetrics and Gynaecology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Tiina Holster
  - Department of Obstetrics and Gynaecology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Ilkka Kalliala
  - Department of Obstetrics and Gynaecology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
  - Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
- Anne Salonen
  - Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
3
Koslicki D, White S, Ma C, Novikov A. YACHT: an ANI-based statistical test to detect microbial presence/absence in a metagenomic sample. Bioinformatics 2024; 40:btae047. [PMID: 38268451 PMCID: PMC10868342 DOI: 10.1093/bioinformatics/btae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 01/05/2024] [Accepted: 01/22/2024] [Indexed: 01/26/2024]
Abstract
MOTIVATION In metagenomics, the study of environmentally associated microbial communities from their sampled DNA, one of the most fundamental computational tasks is that of determining which genomes from a reference database are present or absent in a given sample metagenome. Existing tools generally return point estimates, with no associated confidence or uncertainty. This has led to practitioners experiencing difficulty when interpreting the results from these tools, particularly for low-abundance organisms as these often reside in the "noisy tail" of incorrect predictions. Furthermore, few tools account for the fact that reference databases are often incomplete and rarely, if ever, contain exact replicas of genomes present in an environmentally derived metagenome. RESULTS We present solutions for these issues by introducing the algorithm YACHT: Yes/No Answers to Community membership via Hypothesis Testing. This approach introduces a statistical framework that accounts for sequence divergence between the reference and sample genomes, in terms of ANI, as well as incomplete sequencing depth, thus providing a hypothesis test for determining the presence or absence of a reference genome in a sample. After introducing our approach, we quantify its statistical power and how this changes with varying parameters. Subsequently, we perform extensive experiments using both simulated and real data to confirm the accuracy and scalability of this approach. AVAILABILITY AND IMPLEMENTATION The source code implementing this approach is available via Conda and at https://github.com/KoslickiLab/YACHT. We also provide the code for reproducing experiments at https://github.com/KoslickiLab/YACHT-reproducibles.
Affiliation(s)
- David Koslicki
  - Department of Computer Science and Engineering, Pennsylvania State University, State College, PA 16802, United States
  - Department of Biology, Pennsylvania State University, State College, PA 16802, United States
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
  - One Health Microbiome Center, Pennsylvania State University, State College, PA 16802, United States
- Stephen White
  - Department of Mathematics, Pennsylvania State University, State College, PA 16802, United States
- Chunyu Ma
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- Alexei Novikov
  - Department of Mathematics, Pennsylvania State University, State College, PA 16802, United States
4
Xie J, Tan B, Zhang Y. A Large-Scale Study into Protist-Animal Interactions Based on Public Genomic Data Using DNA Barcodes. Animals (Basel) 2023; 13:2243. [PMID: 37508021 PMCID: PMC10376638 DOI: 10.3390/ani13142243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 07/06/2023] [Accepted: 07/06/2023] [Indexed: 07/30/2023]
Abstract
With the birth of next-generation sequencing (NGS) technology, genomic data in public databases have increased exponentially. Unfortunately, exogenous contamination or intracellular parasite sequences in assemblies could confuse genomic analysis. Meanwhile, they can provide a valuable resource for studies of host-microbe interactions. Here, we used a strategy based on DNA barcodes to scan protistan contamination in the GenBank WGS/TSA database. The results showed a total of 13,952 metazoan/animal assemblies in GenBank, where 17,036 contigs were found to be protistan contaminants in 1507 assemblies (10.8%), with even higher contamination rates in taxa of Cnidaria (150/281), Crustacea (237/480), and Mollusca (107/410). Taxonomic analysis of the protists derived from these contigs showed variations in abundance and evenness of protistan contamination across different metazoan taxa, reflecting host preferences of Apicomplexa, Ciliophora, Oomycota and Symbiodiniaceae for mammals and birds, Crustacea, insects, and Cnidaria, respectively. Finally, mitochondrial proteins COX1 and CYTB were predicted from these contigs, and the phylogenetic analysis corroborated the protistan origination and heterogeneous distribution of the contaminated contigs. Overall, in this study, we conducted a large-scale scan of protistan contaminant in genomic resources, and the protistan sequences detected will help uncover the protist diversity and relationships of these picoeukaryotes with Metazoa.
Affiliation(s)
- Jiazheng Xie
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Bowen Tan
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Yi Zhang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
5
Gilroy R, Adam ME, Kumar B, Pallen MJ. An initial genomic blueprint of the healthy human oesophageal microbiome. Access Microbiol 2023; 5:acmi000558.v3. [PMID: 37424544 PMCID: PMC10323806 DOI: 10.1099/acmi.0.000558.v3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 05/15/2023] [Indexed: 07/11/2023]
Abstract
Background The oesophageal microbiome is thought to contribute to the pathogenesis of oesophageal cancer. However, investigations using culture and molecular barcodes have provided only a low-resolution view of this important microbial community. We therefore explored the potential of culturomics and metagenomic binning to generate a catalogue of reference genomes from the healthy human oesophageal microbiome, alongside a comparison set from saliva. Results Twenty-two distinct colonial morphotypes from healthy oesophageal samples were genome-sequenced. These fell into twelve species clusters, eleven of which represented previously defined species. Two isolates belonged to a novel species, which we have named Rothia gullae. We performed metagenomic binning of reads generated from UK samples from this study alongside reads generated from Australian samples in a recent study. Metagenomic binning generated 136 medium or high-quality metagenome-assembled genomes (MAGs). MAGs were assigned to 56 species clusters, eight representing novel Candidatus species, which we have named Ca. Granulicatella gullae, Ca. Streptococcus gullae, Ca. Nanosynbacter quadramensis, Ca. Nanosynbacter gullae, Ca. Nanosynbacter colneyensis, Ca. Nanosynbacter norwichensis, Ca. Nanosynococcus oralis and Ca. Haemophilus gullae. Five of these novel species belong to the recently described phylum Patescibacteria. Although members of the Patescibacteria are known to inhabit the oral cavity, this is the first report of their presence in the oesophagus. Eighteen of the metagenomic species were, until recently, identified only by hard-to-remember alphanumeric placeholder designations. Here we illustrate the utility of a set of recently published arbitrary Latinate species names in providing user-friendly taxonomic labels for microbiome analyses. Our non-redundant species catalogue contained 63 species derived from cultured isolates or MAGs.
Mapping revealed that these species account for around half of the sequences in the oesophageal and saliva metagenomes. Although no species was present in all oesophageal samples, 60 species occurred in at least one oesophageal metagenome from either study, with 50 identified in both cohorts. Conclusions Recovery of genomes and discovery of new species represents an important step forward in our understanding of the oesophageal microbiome. The genes and genomes that we have released into the public domain will provide a base line for future comparative, mechanistic and intervention studies.
Affiliation(s)
- Rachel Gilroy
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
- Mina E. Adam
  - Norfolk & Norwich University Hospitals NHS Foundation Trust, Norwich, UK
  - School of Veterinary Medicine, University of Surrey, Guildford, Surrey, UK
- Bhaskar Kumar
  - Norfolk & Norwich University Hospitals NHS Foundation Trust, Norwich, UK
  - School of Veterinary Medicine, University of Surrey, Guildford, Surrey, UK
- Mark J. Pallen
  - Quadram Institute Bioscience, Norwich Research Park, Norwich, UK
  - School of Veterinary Medicine, University of Surrey, Guildford, Surrey, UK
  - University of East Anglia, Norwich Research Park, Norwich, UK
6
Abstract
Experiments involving metagenomics data are becoming increasingly commonplace. Processing such data requires a unique set of considerations. Quality control of metagenomics data is critical to extracting pertinent insights. In this chapter, we outline some considerations in terms of study design and other confounding factors that can often only be realized at the point of data analysis. We also outline some basic principles of quality control in metagenomics, including overall reproducibility and some good practices to follow. The general quality control of sequencing data is then outlined, and we introduce ways to process this data by using bash scripts and developing pipelines in Snakemake (Python). A significant part of quality control in metagenomics is in analyzing the data to ensure you can spot relationships between variables and to identify when they might be confounded. This chapter provides a walkthrough of analyzing some microbiome data (in the R statistical language) and demonstrates a few ways to identify overall differences and similarities in microbiome data. The chapter concludes with remarks about considering taxonomic results in the context of the study and interrogating sequence alignments using the command line.
Affiliation(s)
- Abraham Gihawi
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Ryan Cardenas
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Rachel Hurst
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
- Daniel S Brewer
  - Bob Champion Research & Education Building, Norwich Medical School, University of East Anglia, Norwich, UK
  - Earlham Institute, Norwich Research Park, Norwich, UK
7
Abstract
Assigning taxonomy remains a challenging topic in microbiome studies, due largely to ambiguity of reads which overlap multiple reference genomes. With the Web of Life (WoL) reference database hosting 10,575 reference genomes and growing, the percentage of ambiguous reads will only increase. The resulting artifacts create both the illusion of co-occurrence and a long tail end of extraneous reference hits that confound interpretation. We introduce genome cover, the fraction of reference genome overlapped by reads, to distinguish these artifacts. We show how to dynamically predict genome cover by read count and examine our model in Staphylococcus aureus monoculture. Our modeling cleanly separates both S. aureus and true contaminants from the false artifacts of reference overlap. We next introduce saturated genome cover, the true fraction of a reference genome overlapped by sample contents. Genome cover may not saturate for low abundance or low prevalence bacteria. We assuage this worry with examination of a large human fecal data set. By compositing the metric across like samples, genome cover saturates even for rare species. We note that it is a threshold on saturated genome cover, not genome cover itself, which indicates a spurious reference hit or distant relative. We present Zebra, a method to compute and threshold the genome cover metric across like samples, a recurrence to estimate genome cover and confirm saturation, and provide guidance for choosing cover thresholds in real world scenarios. Standalone genome cover and integration into Woltka are available: https://github.com/biocore/zebra_filter, https://github.com/qiyunzhu/woltka. IMPORTANCE Taxonomic assignment, assigning sequences to specific taxonomic units, is a crucial processing step in microbiome analyses. Issues in taxonomic assignment affect interpretation of what microbes are present in each sample and may be associated with specific environmental or clinical conditions. 
Assigning importance to a particular taxon relies strongly on independence of assigned counts. The false inclusion of thousands of correlated taxa makes interpretation ambiguous, leading to underconstrained results which cannot be reproduced. The importance sometimes attached to implausible artifacts such as anthrax or bubonic plague is especially problematic. We show that the Zebra filter retrieves only the nearest relatives of sample contents enabling more reproducible and biologically plausible interpretation of metagenomic data.
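The genome cover metric described in this abstract (the fraction of a reference genome overlapped by reads) and its compositing across like samples can be illustrated with a minimal Python sketch. This is not the Zebra implementation: the function names, the half-open interval representation, and the assumption that alignments arrive as `(start, end)` pairs already clipped to the reference are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: half-open [start, end) alignment spans
# in reference coordinates, assumed to lie within the reference.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint spans."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def genome_cover(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of the reference overlapped by at least one read."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


def composite_cover(per_sample: Iterable[Iterable[Interval]],
                    genome_length: int) -> float:
    """Pool alignments from several like samples before computing cover."""
    pooled = [iv for sample in per_sample for iv in sample]
    return genome_cover(pooled, genome_length)


single = genome_cover([(0, 100), (50, 150), (300, 400)], 1000)
pooled = composite_cover([[(0, 100)], [(50, 150)], [(300, 400)]], 1000)
print(single, pooled)
```

Pooling intervals before merging is what lets cover accumulate for a rare species that is only sparsely covered in any one sample; any threshold applied to the pooled value would be a user choice, which the sketch deliberately leaves out.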
8
Garrido-Sanz L, Àngel Senar M, Piñol J. Drastic reduction of false positive species in samples of insects by intersecting the default output of two popular metagenomic classifiers. PLoS One 2022; 17:e0275790. [PMID: 36282811 PMCID: PMC9595558 DOI: 10.1371/journal.pone.0275790] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/10/2022] [Accepted: 09/15/2022] [Indexed: 11/19/2022]
Abstract
The use of high-throughput sequencing to recover short DNA reads of many species has been widely applied in biodiversity studies, either as amplicon metabarcoding or shotgun metagenomics. These reads are assigned to taxa using classifiers. However, for different reasons, the results often contain many false positives. Here we focus on the reduction of false positive species attributable to the classifiers. We benchmarked two popular classifiers, BLASTn followed by MEGAN6 (BM) and Kraken2 (K2), to analyse shotgun sequenced artificial single-species samples of insects. To reduce the number of misclassified reads, we combined the output of the two classifiers in two different ways: (1) by keeping only the reads that were attributed to the same species by both classifiers (intersection approach); and (2) by keeping the reads assigned to some species by any classifier (union approach). In addition, we applied an analytical detection limit to further reduce the number of false positive species. As expected, both metagenomic classifiers used with default parameters generated an unacceptably high number of misidentified species (tens with BM, hundreds with K2). The false positive species were not necessarily phylogenetically close, as some of them belonged to different orders of insects. The union approach failed to reduce the number of false positives, but the intersection approach got rid of most of them. The addition of an analytic detection limit of 0.001 further reduced the number to ca. 0.5 false positive species per sample. The misidentification of species by most classifiers hampers the confidence of the DNA-based methods for assessing the biodiversity of biological samples. Our approach to alleviate the problem is straightforward and significantly reduced the number of reported false positive species.
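The intersection approach and detection limit summarized in this abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the per-read dictionaries, the function names, and the interpretation of the 0.001 limit as a minimum fraction of the retained reads are assumptions, since the abstract does not specify how the limit is computed.

```python
from collections import Counter
from typing import Dict


def intersect_assignments(first: Dict[str, str],
                          second: Dict[str, str]) -> Dict[str, str]:
    """Keep only reads that both classifiers assign to the same species.

    Each input maps a read ID to a species name; reads a classifier left
    unclassified are assumed to be absent from its dictionary.
    """
    return {read: species
            for read, species in first.items()
            if second.get(read) == species}


def apply_detection_limit(assignments: Dict[str, str],
                          limit: float = 0.001) -> Dict[str, int]:
    """Drop species whose share of retained reads falls below `limit`.

    Assumption: the limit is a fraction of the retained reads.
    """
    counts = Counter(assignments.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: n for species, n in counts.items() if n / total >= limit}


# Hypothetical per-read output from the two classifiers.
blast_megan = {"r1": "Apis mellifera", "r2": "Apis mellifera",
               "r3": "Bombus terrestris", "r4": "Musca domestica"}
kraken2 = {"r1": "Apis mellifera", "r2": "Apis mellifera",
           "r3": "Vespa crabro", "r5": "Musca domestica"}

agreed = intersect_assignments(blast_megan, kraken2)
print(apply_detection_limit(agreed))
```

Only reads r1 and r2 survive the intersection here, because r3 is assigned to different species by the two classifiers and r4 and r5 are each seen by only one of them.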
Affiliation(s)
- Lidia Garrido-Sanz
  - Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Josep Piñol
  - Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
  - CREAF, Cerdanyola del Vallès, Spain
9
HAYSTAC: A Bayesian framework for robust and rapid species identification in high-throughput sequencing data. PLoS Comput Biol 2022; 18:e1010493. [PMID: 36178955 PMCID: PMC9555677 DOI: 10.1371/journal.pcbi.1010493] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/28/2022] [Revised: 10/12/2022] [Accepted: 08/16/2022] [Indexed: 11/24/2022]
Abstract
Identification of specific species in metagenomic samples is critical for several key applications, yet many tools available require large computational power and are often prone to false positive identifications. Here we describe High-AccuracY and Scalable Taxonomic Assignment of MetagenomiC data (HAYSTAC), which can estimate the probability that a specific taxon is present in a metagenome. HAYSTAC provides a user-friendly tool to construct databases, based on publicly available genomes, that are used for competitive read mapping. It then uses a novel Bayesian framework to infer the abundance and statistical support for each species identification and provide per-read species classification. Unlike other methods, HAYSTAC is specifically designed to efficiently handle both ancient and modern DNA data, as well as incomplete reference databases, making it possible to run highly accurate hypothesis-driven analyses (i.e., assessing the presence of a specific species) on variably sized reference databases while dramatically improving processing speeds. We tested the performance and accuracy of HAYSTAC using simulated Illumina libraries, both with and without ancient DNA damage, and compared the results to other currently available methods (i.e., Kraken2/Bracken, KrakenUniq, MALT/HOPS, and Sigma). HAYSTAC identified fewer false positives than both Kraken2/Bracken, KrakenUniq and MALT in all simulations, and fewer than Sigma in simulations of ancient data. It uses less memory than Kraken2/Bracken, KrakenUniq as well as MALT both during database construction and sample analysis. Lastly, we used HAYSTAC to search for specific pathogens in two published ancient metagenomic datasets, demonstrating how it can be applied to empirical datasets. HAYSTAC is available from https://github.com/antonisdim/HAYSTAC. 
The emerging field of paleo-metagenomics (i.e., metagenomics from ancient DNA) holds great promise for novel discoveries in fields as diverse as pathogen evolution and paleoenvironmental reconstruction. However, there is presently a lack of computational methods for species identification from microbial communities in both degraded and nondegraded DNA material. Here, we present “HAYSTAC”, a user-friendly software package that implements a novel probabilistic model for species identification in metagenomic data obtained from both degraded and non-degraded DNA material. Through extensive benchmarking, we show that HAYSTAC can be used for accurately profiling the community composition, as well as for direct hypothesis testing for the presence of extremely low-abundance taxa, in complex metagenomic samples. After analysing simulated and publicly available datasets, HAYSTAC consistently produced the lowest number of false positive identifications during taxonomic profiling, produced robust results when databases of restricted size were used, and showed increased sensitivity for pathogen detection compared to other specialist methods. The newly proposed probabilistic model and software employed by HAYSTAC can have a substantial impact on the robust and rapid pathogen discovery in degraded/shallow sequenced metagenomic samples while optimising the use of computational resources.
10
Peimbert M, Alcaraz LD. Where environmental microbiome meets its host: subway and passenger microbiome relationships. Mol Ecol 2022; 32:2602-2618. [PMID: 35318755 DOI: 10.1111/mec.16440] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/09/2021] [Revised: 03/12/2022] [Accepted: 03/16/2022] [Indexed: 12/17/2022]
Abstract
Subways are urban transport systems with high capacity. Every day around the world, there are more than 150 million subway passengers. Since 2013, thousands of microbiome samples from various subways worldwide have been sequenced. Skin bacteria and environmental organisms dominate the subway microbiomes. The literature has revealed common bacterial groups in subway systems; even so, it is possible to identify cities by their microbiome. Low-frequency bacteria are responsible for specific bacterial fingerprints of each subway system. Furthermore, daily subway commuters leave their microbial clouds and interact with other passengers. Microbial exchange is quite fast; the hand microbiome changes within minutes, and after cleaning the handrails, the bacteria are re-established within minutes. To investigate new taxa and metabolic pathways of subway microbial communities, several high-quality metagenomic-assembled genomes (MAG) have been described. Subways are harsh environments unfavorable for microorganism growth. However, recent studies have observed a wide diversity of viable and metabolically active bacteria. Understanding which bacteria are living, dormant, or dead allows us to propose realistic ecological interactions. Questions regarding the relationship between humans and the subway microbiome, particularly the microbiome effects on personal and public health, remain unanswered. This review summarizes our knowledge of subway microbiomes and their relationship with passenger microbiomes.
Affiliation(s)
- Mariana Peimbert
  - Departamento de Ciencias Naturales, Unidad Cuajimalpa, Universidad Autónoma Metropolitana. Ciudad de México, México
- Luis D Alcaraz
  - Departamento de Biología Celular, Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad de México, México
11
The oesophageal microbiome and cancer: hope or hype? Trends Microbiol 2021; 30:322-329. [PMID: 34493428 DOI: 10.1016/j.tim.2021.08.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/13/2021] [Revised: 08/12/2021] [Accepted: 08/16/2021] [Indexed: 02/08/2023]
Abstract
The human oesophagus is home to a complex microbial community, the oesophageal microbiome. Despite decades of work, we still have only a poor, low-resolution view of this community, which makes it hard to distinguish hope from hype when it comes to assessing links between the oesophageal microbiome and cancer. Here we review the potential importance of this microbiome and discuss new approaches, including culturomics, metagenomics, and recovery of whole-genome sequences, that bring renewed hope for an in-depth characterisation of this community that could deliver translational impact.
12
Peterson D, Bonham KS, Rowland S, Pattanayak CW, Klepac-Ceraj V. Comparative Analysis of 16S rRNA Gene and Metagenome Sequencing in Pediatric Gut Microbiomes. Front Microbiol 2021; 12:670336. [PMID: 34335499 PMCID: PMC8320171 DOI: 10.3389/fmicb.2021.670336] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 02/21/2021] [Accepted: 05/28/2021] [Indexed: 01/04/2023]
Abstract
The colonization of the human gut microbiome begins at birth, and over time, these microbial communities become increasingly complex. Most of what we currently know about the human microbiome, especially in early stages of development, was described using culture-independent sequencing methods that allow us to identify the taxonomic composition of microbial communities using genomic techniques, such as amplicon or shotgun metagenomic sequencing. Each method has distinct tradeoffs, but there has not been a direct comparison of the utility of these methods in stool samples from very young children, which have different features than those of adults. We compared the effects of profiling the human infant gut microbiome with 16S rRNA amplicon vs. shotgun metagenomic sequencing techniques in 338 fecal samples; younger than 15, 15-30, and older than 30 months of age. We demonstrate that observed changes in alpha-diversity and beta-diversity with age occur to similar extents using both profiling methods. We also show that 16S rRNA profiling identified a larger number of genera and we find several genera that are missed or underrepresented by each profiling method. We present the link between alpha diversity and shotgun metagenomic sequencing depth for children of different ages. These findings provide a guide for selecting an appropriate method and sequencing depth for the three studied age groups.
Affiliation(s)
- Danielle Peterson
  - Department of Biological Sciences, Wellesley College, Wellesley, MA, United States
- Kevin S Bonham
  - Department of Biological Sciences, Wellesley College, Wellesley, MA, United States
- Sophie Rowland
  - Department of Biological Sciences, Wellesley College, Wellesley, MA, United States
- Cassandra W Pattanayak
  - Department of Mathematics, Quantitative Reasoning Program, and the Quantitative Analysis Institute at Wellesley College, Wellesley, MA, United States
- Vanja Klepac-Ceraj
  - Department of Biological Sciences, Wellesley College, Wellesley, MA, United States
|
13
|
Poore GD, Kopylova E, Zhu Q, Carpenter C, Fraraccio S, Wandro S, Kosciolek T, Janssen S, Metcalf J, Song SJ, Kanbar J, Miller-Montgomery S, Heaton R, Mckay R, Patel SP, Swafford AD, Knight R. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 2020; 579:567-574. [PMID: 32214244 PMCID: PMC7500457 DOI: 10.1038/s41586-020-2095-1] [Citation(s) in RCA: 697] [Impact Index Per Article: 139.4] [Received: 06/07/2019] [Accepted: 02/06/2020] [Indexed: 01/05/2023]
Abstract
Systematic characterization of the cancer microbiome provides the opportunity to develop techniques that exploit non-human, microorganism-derived molecules in the diagnosis of a major human disease. Following recent demonstrations that some types of cancer show substantial microbial contributions [1-10], we re-examined whole-genome and whole-transcriptome sequencing studies in The Cancer Genome Atlas (TCGA) [11] of 33 types of cancer from treatment-naive patients (a total of 18,116 samples) for microbial reads, and found unique microbial signatures in tissue and blood within and between most major types of cancer. These TCGA blood signatures remained predictive when applied to patients with stage Ia-IIc cancer and cancers lacking any genomic alterations currently measured on two commercial-grade cell-free tumour DNA platforms, despite the use of very stringent decontamination analyses that discarded up to 92.3% of total sequence data. In addition, we could discriminate among samples from healthy, cancer-free individuals (n = 69) and those from patients with multiple types of cancer (prostate, lung, and melanoma; 100 samples in total) solely using plasma-derived, cell-free microbial nucleic acids. This potential microbiome-based oncology diagnostic tool warrants further exploration.
Affiliation(s)
- Gregory D Poore
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Evguenia Kopylova
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
  - Clarity Genomics, Beerse, Belgium
- Qiyun Zhu
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
- Carolina Carpenter
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Serena Fraraccio
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Stephen Wandro
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Tomasz Kosciolek
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
  - Malopolska Centre of Biotechnology, Jagiellonian University in Krakow, Krakow, Poland
- Stefan Janssen
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
  - Algorithmic Bioinformatics, Department of Biology and Chemistry, Justus Liebig University Gießen, Gießen, Germany
- Jessica Metcalf
  - Department of Animal Sciences, Colorado State University, Fort Collins, CO, USA
- Se Jin Song
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Jad Kanbar
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Sandrine Miller-Montgomery
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Robert Heaton
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Rana Mckay
  - Moores Cancer Center, University of California San Diego Health, La Jolla, CA, USA
- Sandip Pravin Patel
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
  - Moores Cancer Center, University of California San Diego Health, La Jolla, CA, USA
- Austin D Swafford
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- Rob Knight
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA.
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA.
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
|
14
|
Bouslimani A, da Silva R, Kosciolek T, Janssen S, Callewaert C, Amir A, Dorrestein K, Melnik AV, Zaramela LS, Kim JN, Humphrey G, Schwartz T, Sanders K, Brennan C, Luzzatto-Knaan T, Ackermann G, McDonald D, Zengler K, Knight R, Dorrestein PC. The impact of skin care products on skin chemistry and microbiome dynamics. BMC Biol 2019; 17:47. [PMID: 31189482 PMCID: PMC6560912 DOI: 10.1186/s12915-019-0660-6] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Received: 02/20/2019] [Accepted: 04/30/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Use of skin personal care products on a regular basis is nearly ubiquitous, but their effects on molecular and microbial diversity of the skin are unknown. We evaluated the impact of four beauty products (a facial lotion, a moisturizer, a foot powder, and a deodorant) on 11 volunteers over 9 weeks. RESULTS Mass spectrometry and 16S rRNA inventories of the skin revealed decreases in chemical as well as in bacterial and archaeal diversity on halting deodorant use. Specific compounds from beauty products used before the study remain detectable with half-lives of 0.5-1.9 weeks. The deodorant and foot powder increased molecular, bacterial, and archaeal diversity, while arm and face lotions had little effect on bacterial and archaeal but increased chemical diversity. Personal care product effects last for weeks and produce highly individualized responses, including alterations in steroid and pheromone levels and in bacterial and archaeal ecosystem structure and dynamics. CONCLUSIONS These findings may lead to next-generation precision beauty products and therapies for skin disorders.
Affiliation(s)
- Amina Bouslimani
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA
- Ricardo da Silva
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA
- Tomasz Kosciolek
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Stefan Janssen
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
  - Department for Pediatric Oncology, Hematology and Clinical Immunology, University Children's Hospital, Medical Faculty, Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany
- Chris Callewaert
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
  - Center for Microbial Ecology and Technology, Ghent University, 9000, Ghent, Belgium
- Amnon Amir
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Kathleen Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA
- Alexey V Melnik
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA
- Livia S Zaramela
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Ji-Nu Kim
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Gregory Humphrey
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Tara Schwartz
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Karenina Sanders
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Caitriona Brennan
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Tal Luzzatto-Knaan
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA
- Gail Ackermann
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Daniel McDonald
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
- Karsten Zengler
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92307, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
- Rob Knight
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA.
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92307, USA.
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA.
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, 92093, USA.
- Pieter C Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, San Diego, USA.
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92037, USA.
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92307, USA.
  - Department of Pharmacology, University of California, San Diego, La Jolla, CA, 92037, USA.
|
15
|
Applications and challenges of forensic proteomics. Forensic Sci Int 2019; 297:350-363. [DOI: 10.1016/j.forsciint.2019.01.022] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 11/16/2018] [Revised: 01/09/2019] [Accepted: 01/13/2019] [Indexed: 12/23/2022]
|
16
|
Martí JM. Recentrifuge: Robust comparative analysis and contamination removal for metagenomics. PLoS Comput Biol 2019; 15:e1006967. [PMID: 30958827 PMCID: PMC6472834 DOI: 10.1371/journal.pcbi.1006967] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 09/17/2018] [Revised: 04/18/2019] [Accepted: 03/19/2019] [Indexed: 12/21/2022] Open
Abstract
Metagenomic sequencing is becoming widespread in biomedical and environmental research, and the pace is increasing even more thanks to nanopore sequencing. With a rising number of samples and data per sample, the challenge of efficiently comparing results within a specimen and between specimens arises. Reagents, laboratory, and host related contaminants complicate such analysis. Contamination is particularly critical in low microbial biomass body sites and environments, where it can comprise most of a sample if not all. Recentrifuge implements a robust method for the removal of negative-control and crossover taxa from the rest of samples. With Recentrifuge, researchers can analyze results from taxonomic classifiers using interactive charts with emphasis on the confidence level of the classifications. In addition to contamination-subtracted samples, Recentrifuge provides shared and exclusive taxa per sample, thus enabling robust contamination removal and comparative analysis in environmental and clinical metagenomics. Regarding the first area, Recentrifuge's novel approach has already demonstrated its benefits showing that microbiomes of Arctic and Antarctic solar panels display similar taxonomic profiles. In the clinical field, to confirm Recentrifuge's ability to analyze complex metagenomes, we challenged it with data coming from a metagenomic investigation of RNA in plasma that suffered from critical contamination to the point of preventing any positive conclusion. Recentrifuge provided results that yielded new biological insight into the problem, supporting the growing evidence of a blood microbiota even in healthy individuals, mostly translocated from the gut, the oral cavity, and the genitourinary tract. We also developed a synthetic dataset carefully designed to rate the robust contamination removal algorithm, which demonstrated a significant improvement in specificity while retaining a high sensitivity even in the presence of cross-contaminants. 
Recentrifuge's official website is www.recentrifuge.org. The data and source code are anonymously and freely available on GitHub and PyPI. The computing code is licensed under the AGPLv3. The Recentrifuge Wiki is the most extensive and continually-updated source of documentation for Recentrifuge, covering installation, use cases, testing, and other useful topics.
Affiliation(s)
- Jose Manuel Martí
  - Institute for Integrative Systems Biology (ISysBio), Valencia, Spain
|
17
|
Selection of Appropriate Metagenome Taxonomic Classifiers for Ancient Microbiome Research. mSystems 2018; 3:mSystems00080-18. [PMID: 30035235 PMCID: PMC6050634 DOI: 10.1128/msystems.00080-18] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 05/29/2018] [Accepted: 06/20/2018] [Indexed: 02/01/2023] Open
Abstract
Ancient biomolecules from oral and gut microbiome samples have been shown to be preserved in the archaeological record. Studying ancient microbiome communities using metagenomic techniques offers a unique opportunity to reconstruct the evolutionary trajectories of microbial communities through time. DNA accumulates specific damage over time, which could potentially affect taxonomic classification and our ability to accurately reconstruct community assemblages. It is therefore necessary to assess whether ancient DNA (aDNA) damage patterns affect metagenomic taxonomic profiling. Here, we assessed biases in community structure, diversity, species detection, and relative abundance estimates by five popular metagenomic taxonomic classification programs using in silico-generated data sets with and without aDNA damage. Damage patterns had minimal impact on the taxonomic profiles produced by each program, while false-positive rates and biases were intrinsic to each program. Therefore, the most appropriate classification program is one that minimizes the biases related to the questions being addressed. Metagenomics enables the study of complex microbial communities from myriad sources, including the remains of oral and gut microbiota preserved in archaeological dental calculus and paleofeces, respectively. While accurate taxonomic assignment is essential to this process, DNA damage characteristic of ancient samples (e.g., reduction in fragment size and cytosine deamination) may reduce the accuracy of read taxonomic assignment. Using a set of in silico-generated metagenomic data sets, we investigated how the addition of ancient DNA (aDNA) damage patterns influences microbial taxonomic assignment by five widely used profilers: QIIME/UCLUST, MetaPhlAn2, MIDAS, CLARK-S, and MALT. In silico-generated data sets were designed to mimic dental plaque, consisting of 40, 100, and 200 microbial species/strains, both with and without simulated aDNA damage patterns. 
Following taxonomic assignment, the profiles were evaluated for species presence/absence, relative abundance, alpha diversity, beta diversity, and specific taxonomic assignment biases. Unifrac metrics indicated that both MIDAS and MetaPhlAn2 reconstructed the most accurate community structure. QIIME/UCLUST, CLARK-S, and MALT had the highest number of inaccurate taxonomic assignments; false-positive rates were highest by CLARK-S and QIIME/UCLUST. Filtering out species present at <0.1% abundance greatly increased the accuracy of CLARK-S and MALT. All programs except CLARK-S failed to detect some species from the input file that were in their databases. The addition of ancient DNA damage resulted in minimal differences in species detection and relative abundance between simulated ancient and modern data sets for most programs. Overall, taxonomic profiling biases are program specific rather than damage dependent, and the choice of taxonomic classification program should be tailored to specific research questions.
|
18
|
Bazinet AL, Ondov BD, Sommer DD, Ratnayake S. BLAST-based validation of metagenomic sequence assignments. PeerJ 2018; 6:e4892. [PMID: 29868286 PMCID: PMC5978398 DOI: 10.7717/peerj.4892] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/22/2018] [Accepted: 05/13/2018] [Indexed: 12/29/2022] Open
Abstract
When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomic affiliation of individual sequence reads by comparing them to reference database sequences. As metagenomic data sets often consist of millions or billions of reads that need to be compared to reference databases containing millions of sequences, such sequence classification programs typically use search heuristics and databases with reduced sequence diversity to speed up the analysis, which can lead to incorrect assignments. Thus, in a bioforensic setting where correct assignments are paramount, assignments of interest made by "first-pass" classifiers should be confirmed using the most precise methods and comprehensive databases available. In this study we present a BLAST-based method for validating the assignments made by less precise sequence classification programs, with optimal parameters for filtering of BLAST results determined via simulation of sequence reads from genomes of interest, and we apply the method to the detection of four pathogenic organisms. The software implementing the method is open source and freely available.
Affiliation(s)
- Adam L. Bazinet
  - National Biodefense Analysis and Countermeasures Center, Fort Detrick, MD, USA
- Brian D. Ondov
  - National Biodefense Analysis and Countermeasures Center, Fort Detrick, MD, USA
  - National Human Genome Research Institute, Bethesda, MD, USA
- Daniel D. Sommer
  - National Biodefense Analysis and Countermeasures Center, Fort Detrick, MD, USA
|
19
|
|
20
|
Microdiversity of an Abundant Terrestrial Bacterium Encompasses Extensive Variation in Ecologically Relevant Traits. mBio 2017; 8:mBio.01809-17. [PMID: 29138307 PMCID: PMC5686540 DOI: 10.1128/mbio.01809-17] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 11/20/2022] Open
Abstract
Much genetic diversity within a bacterial community is likely obscured by microdiversity within operational taxonomic units (OTUs) defined by 16S rRNA gene sequences. However, it is unclear how variation within this microdiversity influences ecologically relevant traits. Here, we employ a multifaceted approach to investigate microdiversity within the dominant leaf litter bacterium, Curtobacterium, which comprises 7.8% of the bacterial community at a grassland site undergoing global change manipulations. We use cultured bacterial isolates to interpret metagenomic data, collected in situ over 2 years, together with lab-based physiological assays to determine the extent of trait variation within this abundant OTU. The response of Curtobacterium to seasonal variability and the global change manipulations, specifically an increase in relative abundance under decreased water availability, appeared to be conserved across six Curtobacterium lineages identified at this site. Genomic and physiological analyses in the lab revealed that degradation of abundant polymeric carbohydrates within leaf litter, cellulose and xylan, is nearly universal across the genus, which may contribute to its high abundance in grassland leaf litter. However, the degree of carbohydrate utilization and temperature preference for this degradation varied greatly among clades. Overall, we find that traits within Curtobacterium are conserved at different phylogenetic depths. We speculate that similar to bacteria in marine systems, diverse microbes within this taxon may be structured in distinct ecotypes that are key to understanding Curtobacterium abundance and distribution in the environment. Despite the plummeting costs of sequencing, characterizing the fine-scale genetic diversity of a microbial community—and interpreting its functional importance—remains a challenge. 
Indeed, most studies, particularly studies of soil, assess community composition at a broad genetic level by classifying diversity into taxa (OTUs) defined by 16S rRNA sequence similarity. However, these classifications potentially obscure variation in traits that result in fine-scale ecological differentiation among closely related strains. Here, we investigated “microdiversity” in a highly diverse and poorly characterized soil system (leaf litter in a southern Californian grassland). We focused on the most abundant bacterium, Curtobacterium, which by standard methods is grouped into only one OTU. We find that the degree of carbohydrate usage and temperature preference vary within the OTU, whereas its responses to changes in precipitation are relatively uniform. These results suggest that microdiversity may be key to understanding how soil bacterial diversity is linked to ecosystem functioning.
|
21
|
McIntyre ABR, Ounit R, Afshinnekoo E, Prill RJ, Hénaff E, Alexander N, Minot SS, Danko D, Foox J, Ahsanuddin S, Tighe S, Hasan NA, Subramanian P, Moffat K, Levy S, Lonardi S, Greenfield N, Colwell RR, Rosen GL, Mason CE. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers. Genome Biol 2017; 18:182. [PMID: 28934964 PMCID: PMC5609029 DOI: 10.1186/s13059-017-1299-7] [Citation(s) in RCA: 181] [Impact Index Per Article: 22.6] [Received: 02/07/2017] [Accepted: 08/16/2017] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. RESULTS In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages. CONCLUSIONS This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.
Affiliation(s)
- Alexa B R McIntyre
  - Tri-Institutional Program in Computational Biology and Medicine, New York, NY, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Rachid Ounit
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521, USA
- Ebrahim Afshinnekoo
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
  - School of Medicine, New York Medical College, Valhalla, NY, 10595, USA
- Robert J Prill
  - Accelerated Discovery Lab, IBM Almaden Research Center, San Jose, CA, 95120, USA
- Elizabeth Hénaff
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Noah Alexander
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Samuel S Minot
  - One Codex, Reference Genomics, San Francisco, CA, 94103, USA
- David Danko
  - Tri-Institutional Program in Computational Biology and Medicine, New York, NY, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Jonathan Foox
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Sofia Ahsanuddin
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA
- Scott Tighe
  - University of Vermont, Burlington, VT, 05405, USA
- Nur A Hasan
  - CosmosID, Inc, Rockville, MD, 20850, USA
  - Center for Bioinformatics and Computational Biology, University of Maryland Institute for Advanced Computer Studies (UMIACS), College Park, MD, 20742, USA
- Shawn Levy
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA
- Stefano Lonardi
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521, USA
- Nick Greenfield
  - One Codex, Reference Genomics, San Francisco, CA, 94103, USA
- Rita R Colwell
  - CosmosID, Inc, Rockville, MD, 20850, USA
  - Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- Gail L Rosen
  - Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, 19104, USA.
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA.
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, 10021, USA.
  - The Feil Family Brain and Mind Research Institute, New York, NY, 10065, USA.
|
22
|
Ruppé E, Lazarevic V, Girard M, Mouton W, Ferry T, Laurent F, Schrenzel J. Clinical metagenomics of bone and joint infections: a proof of concept study. Sci Rep 2017; 7:7718. [PMID: 28798333 PMCID: PMC5552814 DOI: 10.1038/s41598-017-07546-5] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 02/21/2017] [Accepted: 06/29/2017] [Indexed: 12/19/2022] Open
Abstract
Bone and joint infections (BJI) are severe infections that require a tailored and protracted antibiotic treatment. Yet diagnosis based on culturing samples lacks sensitivity, especially for bacteria that are difficult to culture. Metagenomic sequencing could potentially address those limitations. Here, we assessed the performance of metagenomic sequencing on 24 BJI samples for the identification of pathogens and the prediction of their antibiotic susceptibility. For monomicrobial samples in culture (n = 8), the presence of the pathogen was confirmed by metagenomics in all cases. For polymicrobial samples (n = 16), 32/55 bacteria (58.2%) were found at the species level (and 41/55 [74.5%] at the genus level). Conversely, 273 bacteria not found in culture were identified, 182 being possible pathogens and 91 contaminants. A correct antibiotic susceptibility could be inferred in 94.1% and 76.5% of cases for monomicrobial and polymicrobial samples, respectively. Altogether, we found that clinical metagenomics applied to BJI samples is a potential tool to support conventional culture.
Collapse
Affiliation(s)
- Etienne Ruppé
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland
- Vladimir Lazarevic
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland
- Myriam Girard
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland
- William Mouton
  - Centre International de Recherche en Infectiologie, INSERM U1111, Pathogenesis of staphylococcal infections, University of Lyon 1, Lyon, France
  - Department of Clinical Microbiology, Northern Hospital Group, Hospices Civils de Lyon, Lyon, France
- Tristan Ferry
  - Centre International de Recherche en Infectiologie, INSERM U1111, Pathogenesis of staphylococcal infections, University of Lyon 1, Lyon, France
  - Infectious Diseases Department, Northern Hospital Group, Hospices Civils de Lyon, Lyon, France
- Frédéric Laurent
  - Centre International de Recherche en Infectiologie, INSERM U1111, Pathogenesis of staphylococcal infections, University of Lyon 1, Lyon, France
  - Department of Clinical Microbiology, Northern Hospital Group, Hospices Civils de Lyon, Lyon, France
- Jacques Schrenzel
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland
  - Bacteriology Laboratory, Service of Laboratory Medicine, Department of Genetics and Laboratory Medicine, Geneva University Hospitals, 4 rue Gabrielle-Perret-Gentil, 1205, Geneva, Switzerland
|
23
|
Pettengill JB, Rand H. Segal's Law, 16S rRNA gene sequencing, and the perils of foodborne pathogen detection within the American Gut Project. PeerJ 2017; 5:e3480. [PMID: 28652935 PMCID: PMC5483036 DOI: 10.7717/peerj.3480] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/08/2017] [Accepted: 05/31/2017] [Indexed: 01/15/2023] Open
Abstract
Obtaining human population-level estimates of the prevalence of foodborne pathogens is critical for understanding outbreaks and ameliorating such threats to public health. Estimates are difficult to obtain due to logistic and financial constraints, but citizen science initiatives like that of the American Gut Project (AGP) represent a potential source of information concerning enteric pathogens. With an emphasis on the genera Listeria and Salmonella, we sought to document the prevalence of those two taxa within the AGP samples. The results provided by AGP suggest a surprising 14% and 2% of samples contained Salmonella and Listeria, respectively. However, a reanalysis of those AGP sequences described here indicated that results depend greatly on the algorithm for assigning taxonomy, and differences persisted across both a range of parameter settings and different reference databases (i.e., Greengenes and HITdb). These results are perhaps to be expected given that AGP sequenced the V4 region of the 16S rRNA gene, which may not provide good resolution at the lower taxonomic levels (e.g., species), but it was surprising how often methods differ in classifying reads, even at higher taxonomic ranks (e.g., family). This highlights the misleading conclusions that can be reached when relying on a single method that is not a gold standard; this is the essence of Segal's Law: an individual with one watch knows what time it is but an individual with two is never sure. Our results point to the need for an appropriate molecular marker for the taxonomic resolution of interest, and call for the development of more conservative classification methods that are fit for purpose. Thus, with 16S rRNA gene datasets, one must be cautious regarding the detection of taxonomic groups of public health interest (e.g., culture-independent identification of foodborne pathogens or taxa associated with a given phenotype).
Affiliation(s)
- James B Pettengill
  - Biostatistics and Bioinformatics Staff, Office of Analytics and Outreach, US Food and Drug Administration, College Park, MD, United States of America
- Hugh Rand
  - Biostatistics and Bioinformatics Staff, Office of Analytics and Outreach, US Food and Drug Administration, College Park, MD, United States of America
|
24
|
Ruppé E, Greub G, Schrenzel J. Messages from the first International Conference on Clinical Metagenomics (ICCMg). Microbes Infect 2017; 19:223-228. [PMID: 28161601 DOI: 10.1016/j.micinf.2017.01.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/06/2016] [Revised: 01/12/2017] [Accepted: 01/12/2017] [Indexed: 12/13/2022]
Abstract
Metagenomics has recently entered clinical microbiology, and an increasing number of diagnostic laboratories now offer the sequencing and annotation of bacterial genomes and/or the analysis of clinical samples by direct or PCR-based metagenomics with a short time to results. In this context, the first International Conference on Clinical Metagenomics (ICCMg) was held in Geneva in October 2016, and several key aspects were discussed, including: i) the need for improved resolution, ii) the importance of interpretation given the common occurrence of sequence contaminants, iii) the need for improved bioinformatic pipelines, iv) the bottleneck of DNA extraction, v) the importance of gold standards, vi) the need to further reduce time to results, vii) how to improve data sharing, viii) the applications of bacterial genomics and clinical metagenomics in better adapting therapeutics and ix) the impact of metagenomics and new sequencing technologies in discovering new microbes. Further efforts in terms of reduced turnaround time, improved quality and lower costs are, however, warranted to fully translate metagenomics into clinical applications.
Affiliation(s)
- Etienne Ruppé
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, CH-1205 Geneva, Switzerland
- Gilbert Greub
  - Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Rue du Bugnon 48, 1011 Lausanne, Switzerland
- Jacques Schrenzel
  - Genomic Research Laboratory, Service of Infectious Diseases, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, CH-1205 Geneva, Switzerland
  - Bacteriology Laboratory, Service of Laboratory Medicine, Department of Genetics and Laboratory Medicine, Geneva University Hospitals, 4 Rue Gabrielle-Perret-Gentil, 1205 Geneva, Switzerland
|