1. Naas AE, Solden LM, Norbeck AD, Brewer H, Hagen LH, Heggenes IM, McHardy AC, Mackie RI, Paša-Tolić L, Arntzen MØ, Eijsink VGH, Koropatkin NM, Hess M, Wrighton KC, Pope PB. "Candidatus Paraporphyromonas polyenzymogenes" encodes multi-modular cellulases linked to the type IX secretion system. Microbiome 2018; 6:44. [PMID: 29490697 PMCID: PMC5831590 DOI: 10.1186/s40168-018-0421-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/15/2017] [Accepted: 02/07/2018] [Indexed: 05/07/2023]
Abstract
BACKGROUND In nature, obligate herbivorous ruminants have a close symbiotic relationship with their gastrointestinal microbiome, which proficiently deconstructs plant biomass. Despite decades of research, lignocellulose degradation in the rumen has thus far been attributed to a limited number of culturable microorganisms. Here, we combine meta-omics and enzymology to identify and describe a novel Bacteroidetes family ("Candidatus MH11") composed entirely of uncultivated strains that are predominant in ruminants and only distantly related to previously characterized taxa. RESULTS The first metabolic reconstruction of Ca. MH11-affiliated genome bins, with a particular focus on the provisionally named "Candidatus Paraporphyromonas polyenzymogenes", illustrated their capacity to degrade various lignocellulosic substrates via comprehensive inventories of singular and multi-modular carbohydrate active enzymes (CAZymes). Closer examination revealed an absence of archetypical polysaccharide utilization loci found in human gut microbiota. Instead, we identified many multi-modular CAZymes putatively secreted via the Bacteroidetes-specific type IX secretion system (T9SS). This included cellulases with two or more catalytic domains, which are modular arrangements that are unique to Bacteroidetes species studied to date. Core metabolic proteins from Ca. P. polyenzymogenes were detected in metaproteomic data and were enriched in rumen-incubated plant biomass, indicating that active saccharification and fermentation of complex carbohydrates could be assigned to members of this novel family. Biochemical analysis of selected Ca. P. polyenzymogenes CAZymes further reiterated the cellulolytic activity of this hitherto uncultured bacterium towards linear polymers, such as amorphous and crystalline cellulose as well as mixed linkage β-glucans. CONCLUSION We propose that Ca. P. polyenzymogenes genotypes and other Ca. MH11 members actively degrade plant biomass in the rumen of cows, sheep and most likely other ruminants, utilizing singular and multi-domain catalytic CAZymes secreted through the T9SS. The discovery of a prominent role of multi-modular cellulases in the Gram-negative Bacteroidetes, together with similar findings for Gram-positive cellulosomal bacteria (Ruminococcus flavefaciens) and anaerobic fungi (Orpinomyces sp.), suggests that complex enzymes are essential and have evolved within all major cellulolytic dominions inherent to the rumen.
Affiliation(s)
- A E Naas: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway
- L M Solden: Department of Microbiology, The Ohio State University, Columbus, OH, 43201, USA
- A D Norbeck: Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99354, USA
- H Brewer: Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99354, USA
- L H Hagen: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway
- I M Heggenes: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway
- A C McHardy: Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124, Braunschweig, Germany
- R I Mackie: Institute for Genomic Biology and Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- L Paša-Tolić: Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99354, USA
- M Ø Arntzen: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway
- V G H Eijsink: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway
- N M Koropatkin: Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- M Hess: Department of Animal Science, University of California, Davis, CA, 95616, USA
- K C Wrighton: Department of Microbiology, The Ohio State University, Columbus, OH, 43201, USA
- P B Pope: Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Post Office Box 5003, 1432, Ås, Norway.
2.
Abstract
MOTIVATION Gene assembly is an important step in functional analysis of shotgun metagenomic data. Nonetheless, strain aware assembly remains a challenging task, as current assembly tools often fail to distinguish among strain variants or require closely related reference genomes of the studied species to be available. RESULTS We have developed Snowball, a novel strain aware gene assembler for shotgun metagenomic data that does not require closely related reference genomes to be available. It uses profile hidden Markov models (HMMs) of gene domains of interest to guide the assembly. Our assembler performs gene assembly of individual gene domains based on read overlaps and error correction using read quality scores at the same time, which results in very low per-base error rates. AVAILABILITY AND IMPLEMENTATION The software runs on a user-defined number of processor cores in parallel, runs on a standard laptop and is available under the GPL 3.0 license for installation under Linux or OS X at https://github.com/hzi-bifo/snowball. CONTACT AMC14@helmholtz-hzi.de, a.schoenhuth@cwi.nl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- I Gregor: Department of Algorithmic Bioinformatics, Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany; Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig 38124, Germany
- A Schönhuth: Centrum Wiskunde & Informatica, 1098 XG Amsterdam, The Netherlands
- A C McHardy: Department of Algorithmic Bioinformatics, Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany; Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig 38124, Germany
3. Frank JA, Pan Y, Tooming-Klunderud A, Eijsink VGH, McHardy AC, Nederbragt AJ, Pope PB. Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data. Sci Rep 2016; 6:25373. [PMID: 27156482 PMCID: PMC4860591 DOI: 10.1038/srep25373] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 10/06/2015] [Accepted: 04/12/2016] [Indexed: 01/22/2023] Open
Abstract
DNA assembly is a core methodological step in metagenomic pipelines used to study the structure and function within microbial communities. Here we investigate the utility of Pacific Biosciences long, high-accuracy circular consensus sequencing (CCS) reads for metagenomic projects. We compared the application and performance of both PacBio CCS and Illumina HiSeq data with assembly and taxonomic binning algorithms using metagenomic samples representing a complex microbial community. Eight SMRT cells produced approximately 94 Mb of CCS reads from a biogas reactor microbiome sample that averaged 1319 nt in length and 99.7% accuracy. CCS data assembly generated a number of large contigs (greater than 1 kb) comparable to that assembled from a ~190x larger HiSeq dataset (~18 Gb) produced from the same sample (i.e., approximately 62% of total contigs). Hybrid assemblies using PacBio CCS and HiSeq contigs produced improvements in assembly statistics, including an increase in the average contig length and number of large contigs. The incorporation of CCS data produced significant enhancements in taxonomic binning and genome reconstruction of two dominant phylotypes, which assembled and binned poorly using HiSeq data alone. Collectively, these results illustrate the value of PacBio CCS reads in certain metagenomics applications.
Affiliation(s)
- J A Frank: Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, 1432 Norway
- Y Pan: Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany
- A Tooming-Klunderud: University of Oslo, Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, Blindern, 0316 Norway
- V G H Eijsink: Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, 1432 Norway
- A C McHardy: Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany
- A J Nederbragt: University of Oslo, Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, Blindern, 0316 Norway
- P B Pope: Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, 1432 Norway
4. Dröge J, Gregor I, McHardy AC. Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods. Bioinformatics 2015; 31:817-24. [PMID: 25388150 PMCID: PMC4380030 DOI: 10.1093/bioinformatics/btu745] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 06/03/2014] [Revised: 11/04/2014] [Accepted: 11/05/2014] [Indexed: 01/17/2023] Open
Abstract
MOTIVATION Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets generated by next-generation sequencing technologies, this cannot be performed with de-novo phylogenetic inference methods. We describe an algorithm and the accompanying software, taxator-tk, which performs taxonomic sequence assignment by fast approximate determination of evolutionary neighbors from sequence similarities. RESULTS Taxator-tk was precise in its taxonomic assignment across all ranks and taxa for a range of evolutionary distances and for short as well as for long sequences. In addition to the taxonomic binning of metagenomes, it is well suited for profiling microbial communities from metagenome samples because it identifies bacterial, archaeal and eukaryotic community members without being affected by varying primer binding strengths, as in marker gene amplification, or copy number variations of marker genes across different taxa. Taxator-tk has an efficient, parallelized implementation that allows the assignment of 6 Gb of sequence data per day on a standard multiprocessor system with 10 CPU cores and microbial RefSeq as the genomic reference data.
Affiliation(s)
- J Dröge: Department for Algorithmic Bioinformatics, Heinrich Heine University, Universitätsstraße 1, 40225 Düsseldorf, Germany; Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, University Campus E1 4, 66123 Saarbrücken, Germany; and Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany
- I Gregor: Department for Algorithmic Bioinformatics, Heinrich Heine University, Universitätsstraße 1, 40225 Düsseldorf, Germany; Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, University Campus E1 4, 66123 Saarbrücken, Germany; and Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany
- A C McHardy: Department for Algorithmic Bioinformatics, Heinrich Heine University, Universitätsstraße 1, 40225 Düsseldorf, Germany; Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, University Campus E1 4, 66123 Saarbrücken, Germany; and Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Inhoffenstraße 7, 38124 Braunschweig, Germany
5. Dröge J, McHardy AC. Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. Brief Bioinform 2012; 13:646-55. [DOI: 10.1093/bib/bbs031] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.6] [Indexed: 11/14/2022] Open
6. Pope PB, Smith W, Denman SE, Tringe SG, Barry K, Hugenholtz P, McSweeney CS, McHardy AC, Morrison M. Isolation of Succinivibrionaceae Implicated in Low Methane Emissions from Tammar Wallabies. Science 2011; 333:646-8. [DOI: 10.1126/science.1205760] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.7] [Indexed: 11/02/2022]