1. Zhao D, Salas-Leiva DE, Williams SK, Dunn KA, Shao JD, Roger AJ. Eukfinder: a pipeline to retrieve microbial eukaryote genome sequences from metagenomic data. mBio 2025; 16:e0069925. [PMID: 40207938 PMCID: PMC12077102 DOI: 10.1128/mbio.00699-25] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2025] [Accepted: 03/11/2025] [Indexed: 04/11/2025] Open
Abstract
Whole-genome shotgun (WGS) metagenomic sequencing of microbial communities enables the discovery of the functions, physiologies, and evolutionary histories of prokaryotic and eukaryotic microbes. However, metagenomic studies of microbial eukaryotes lag due to challenges in identifying and assembling high-quality genomes from WGS data. To address this problem, we developed Eukfinder, a bioinformatics pipeline that identifies potential eukaryotic sequences from WGS metagenomic data, with a complementary binning workflow for recovering nuclear and mitochondrial genomes. Eukfinder uses two specialized databases for read/contig classification, customizable to specific data sets or environments. We tested Eukfinder on simulated gut microbiome data sets which included varying numbers of reads from the protist Blastocystis, a human gut commensal. We also applied Eukfinder to previously published human gut microbiome WGS metagenomic data to recover new genomes of Blastocystis. Compared to other workflows, Eukfinder offers the potential to recover high-quality, near-complete genomes of diverse eukaryotes, including different Blastocystis subtypes, without relying on a reference genome. With sufficient sequencing depth, Eukfinder outperforms similar tools for recovering eukaryotic genomes from metagenomic data. Eukfinder is a valuable tool for reference-independent and cultivation-free studies of eukaryotic microbial genomes from environmental WGS metagenomic samples.
IMPORTANCE Advancements in next-generation sequencing have made whole-genome shotgun (WGS) metagenomic sequencing an efficient method for de novo reconstruction of microbial genomes from various environments. Thousands of new prokaryotic genomes have been characterized; however, the large size and complexity of protistan genomes have hindered the use of WGS metagenomics to sample microbial eukaryotic diversity. Eukfinder enables the recovery of eukaryotic microbial genomes from environmental WGS metagenomic samples. Retrieval of high-quality protistan genomes from diverse metagenomic samples increases the number of reference genomes available. This aids future metagenomic investigations into the functions, physiologies, and evolutionary histories of eukaryotic microbes in the gut microbiome and other ecosystems.
Affiliation(s)
- Dandan Zhao
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Dayana E. Salas-Leiva
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry, Cambridge University, Cambridge, England, United Kingdom
- Shelby K. Williams
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Katherine A. Dunn
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Jason D. Shao
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Andrew J. Roger
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
2. Yan M, Andersen TO, Pope PB, Yu Z. Probing the eukaryotic microbes of ruminants with a deep-learning classifier and comprehensive protein databases. Genome Res 2025; 35:368-378. [PMID: 39730187 PMCID: PMC11874962 DOI: 10.1101/gr.279825.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 12/19/2024] [Indexed: 12/29/2024]
Abstract
Metagenomics, particularly genome-resolved metagenomics, has significantly deepened our understanding of microbes, illuminating their taxonomic and functional diversity and roles in ecology, physiology, and evolution. However, eukaryotic populations within various microbiomes, including those in the mammalian gastrointestinal (GI) tract, remain relatively underexplored in metagenomic studies owing to the lack of comprehensive reference genome databases and robust bioinformatic tools. The GI tract of ruminants, particularly the rumen, contains a high eukaryotic biomass but a relatively low diversity of ciliates and fungi, which significantly impacts feed digestion, methane emissions, and rumen microbial ecology. In the present study, we developed GutEuk, a bioinformatics tool that improves upon the currently available Tiara and EukRep in accurately identifying eukaryotic sequences from metagenomes. GutEuk is optimized for high precision across different sequence lengths. It can also distinguish fungal and protozoal sequences, further elucidating their unique ecological, physiological, and nutritional impacts. GutEuk was shown to facilitate comprehensive analyses of protozoa and fungi within more than 1000 rumen metagenomes, revealing a greater genomic diversity among protozoa than previously documented. We further curated several ruminant eukaryotic protein databases, significantly enhancing our ability to distinguish the functional roles of ruminant fungi and protozoa from those of prokaryotes. Overall, the newly developed package GutEuk and its associated databases create new opportunities for the in-depth study of GI tract eukaryotes.
Affiliation(s)
- Ming Yan
  - Department of Animal Sciences, The Ohio State University, Columbus, Ohio 43210, USA
  - Center of Microbiome Science, The Ohio State University, Columbus, Ohio 43210, USA
- Thea O Andersen
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås NO-7491, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås NO-7491, Norway
- Phillip B Pope
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås NO-7491, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås NO-7491, Norway
  - Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology (QUT), Translational Research Institute, Woolloongabba 4102, Queensland, Australia
- Zhongtang Yu
  - Department of Animal Sciences, The Ohio State University, Columbus, Ohio 43210, USA
  - Center of Microbiome Science, The Ohio State University, Columbus, Ohio 43210, USA
3. Michoud G, Peter H, Busi SB, Bourquin M, Kohler TJ, Geers A, Ezzat L, Battin TJ. Mapping the metagenomic diversity of the multi-kingdom glacier-fed stream microbiome. Nat Microbiol 2025; 10:217-230. [PMID: 39747693 DOI: 10.1038/s41564-024-01874-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 10/29/2024] [Indexed: 01/04/2025]
Abstract
Glacier-fed streams (GFS) feature among Earth's most extreme aquatic ecosystems marked by pronounced oligotrophy and environmental fluctuations. Microorganisms mainly organize in biofilms within them, but how they cope with such conditions is unknown. Here, leveraging 156 metagenomes from the Vanishing Glaciers project obtained from sediment samples in GFS from 9 mountain ranges, we report thousands of metagenome-assembled genomes (MAGs) encompassing prokaryotes, algae, fungi and viruses, that shed light on biotic interactions within glacier-fed stream biofilms. A total of 2,855 bacterial MAGs were characterized by diverse strategies to exploit inorganic and organic energy sources, in part via functional redundancy and mixotrophy. We show that biofilms probably become more complex and switch from chemoautotrophy to heterotrophy as algal biomass increases in GFS owing to glacier shrinkage. Our MAG compendium sheds light on the success of microbial life in GFS and provides a resource for future research on a microbiome potentially impacted by climate change.
Affiliation(s)
- Grégoire Michoud
  - River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, ENAC, Ecole Polytechnique Fédérale de Lausanne, Sion, Switzerland
- Hannes Peter
  - River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, ENAC, Ecole Polytechnique Fédérale de Lausanne, Sion, Switzerland
- Massimo Bourquin
  - River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, ENAC, Ecole Polytechnique Fédérale de Lausanne, Sion, Switzerland
- Tyler J Kohler
  - Department of Ecology, Faculty of Science, Charles University, Prague, Czechia
- Aileen Geers
  - River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, ENAC, Ecole Polytechnique Fédérale de Lausanne, Sion, Switzerland
- Leila Ezzat
  - MARBEC, University of Montpellier, CNRS, Ifremer, IRD, Montpellier, France
- Tom J Battin
  - River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, ENAC, Ecole Polytechnique Fédérale de Lausanne, Sion, Switzerland
4. Wei G. Insights into gut fungi in pigs: A comprehensive review. J Anim Physiol Anim Nutr (Berl) 2025; 109:96-112. [PMID: 39154229 DOI: 10.1111/jpn.14036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 06/17/2024] [Accepted: 08/04/2024] [Indexed: 08/19/2024]
Abstract
Fungi in the gut microbiota of mammals play a crucial role in host physiological regulation, including intestinal homeostasis and host immune regulation. However, our understanding of gut fungi in mammals remains limited, especially in economically valuable animals, such as pigs. Therefore, this review first describes the classification and characterisation of fungi, provides insights into the methods used to study gut fungi, and summarises the recent progress on pig gut fungi. Additionally, it discusses the challenges in the study of pig gut fungi and highlights potential perspectives. The aim of this review is to serve as a valuable reference for advancing our knowledge of gut fungi in animals.
Affiliation(s)
- Guanyue Wei
  - National Key Laboratory of Pig Genetic Improvement and Germplasm Innovation, Jiangxi Agricultural University, Nanchang, China
5. Salmaso N, Cerasino L, Pindo M, Boscaini A. Taxonomic and functional metagenomic assessment of a Dolichospermum bloom in a large and deep lake south of the Alps. FEMS Microbiol Ecol 2024; 100:fiae117. [PMID: 39227168 PMCID: PMC11412076 DOI: 10.1093/femsec/fiae117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 08/19/2024] [Accepted: 09/02/2024] [Indexed: 09/05/2024] Open
Abstract
Untargeted genetic approaches can be used to explore the high metabolic versatility of cyanobacteria. In this context, a comprehensive metagenomic shotgun analysis was performed on a population of Dolichospermum lemmermannii collected during a surface bloom in Lake Garda in the summer of 2020. Using a phylogenomic approach, the almost complete metagenome-assembled genome obtained from the analysis clarified the taxonomic position of the species within the genus Dolichospermum and helped frame the taxonomy of this genus within the ADA group (Anabaena/Dolichospermum/Aphanizomenon). In addition to common functional traits represented in the central metabolism of photosynthetic cyanobacteria, the genome annotation uncovered some distinctive and adaptive traits that helped define the factors that promote and maintain bloom-forming heterocytous nitrogen-fixing Nostocales in oligotrophic lakes. In addition, genetic clusters were identified that potentially encode several secondary metabolites that were previously unknown in the populations evolving in the southern Alpine Lake district. These included geosmin, anabaenopeptins, and other bioactive compounds. The results expanded the knowledge of the distinctive competitive traits that drive algal blooms and provided guidance for more targeted analyses of cyanobacterial metabolites with implications for human health and water resource use.
Affiliation(s)
- Nico Salmaso
  - Research and Innovation Centre, Fondazione Edmund Mach, Via Edmund Mach, 1, 38098 San Michele all'Adige, Italy
  - NBFC, National Biodiversity Future Center, Palermo 90133, Italy
- Leonardo Cerasino
  - Research and Innovation Centre, Fondazione Edmund Mach, Via Edmund Mach, 1, 38098 San Michele all'Adige, Italy
- Massimo Pindo
  - Research and Innovation Centre, Fondazione Edmund Mach, Via Edmund Mach, 1, 38098 San Michele all'Adige, Italy
- Adriano Boscaini
  - Research and Innovation Centre, Fondazione Edmund Mach, Via Edmund Mach, 1, 38098 San Michele all'Adige, Italy
6. Osburn ED, McBride SG, Strickland MS. Microbial dark matter could add uncertainties to metagenomic trait estimations. Nat Microbiol 2024; 9:1427-1430. [PMID: 38740929 DOI: 10.1038/s41564-024-01687-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/24/2023] [Accepted: 03/25/2024] [Indexed: 05/16/2024]
Affiliation(s)
- Ernest D Osburn
  - Department of Plant and Soil Sciences, University of Kentucky, Lexington, KY, USA
  - Department of Soil and Water Systems, University of Idaho, Moscow, ID, USA
7. Thøgersen MS, Zervas A, Stougaard P, Ellegaard-Jensen L. Investigating eukaryotic and prokaryotic diversity and functional potential in the cold and alkaline ikaite columns in Greenland. Front Microbiol 2024; 15:1358787. [PMID: 38655082 PMCID: PMC11035741 DOI: 10.3389/fmicb.2024.1358787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/08/2024] [Indexed: 04/26/2024] Open
Abstract
The ikaite columns in the Ikka Fjord, SW Greenland, represent a permanently cold and alkaline environment known to contain a rich bacterial diversity. 16S and 18S rRNA gene amplicon and metagenomic sequencing were used to investigate the microbial diversity in the columns and, for the first time, the eukaryotic and archaeal diversity in ikaite columns was analyzed. The results showed a rich prokaryotic diversity that varied across columns as well as within each column. Seven different archaeal phyla were documented in multiple locations inside the columns. The columns also contained a rich eukaryotic diversity with 27 phyla representing microalgae, protists, fungi, and small animals. Based on metagenomic sequencing, 25 high-quality MAGs were assembled and analyzed for the presence of genes involved in cycling of nitrogen, sulfur, and phosphorus as well as genes encoding carbohydrate-active enzymes (CAZymes), showing a potentially very bioactive microbial community.
8. Nweze JE, Šustr V, Brune A, Angel R. Functional similarity, despite taxonomical divergence in the millipede gut microbiota, points to a common trophic strategy. Microbiome 2024; 12:16. [PMID: 38287457 PMCID: PMC10823672 DOI: 10.1186/s40168-023-01731-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/16/2023] [Accepted: 11/22/2023] [Indexed: 01/31/2024]
Abstract
BACKGROUND Many arthropods rely on their gut microbiome to digest plant material, which is often low in nitrogen but high in complex polysaccharides. Detritivores, such as millipedes, live on a particularly poor diet, but the identity and nutritional contribution of their microbiome are largely unknown. In this study, the hindgut microbiota of the tropical millipede Epibolus pulchripes (large, methane emitting) and the temperate millipede Glomeris connexa (small, non-methane emitting), fed on an identical diet, were studied using comparative metagenomics and metatranscriptomics.
RESULTS The results showed that the microbial load in E. pulchripes is much higher and more diverse than in G. connexa. The microbial communities of the two species differed significantly, with Bacteroidota dominating the hindguts of E. pulchripes and Proteobacteria (Pseudomonadota) in G. connexa. Despite equal sequencing effort, de novo assembly and binning recovered 282 metagenome-assembled genomes (MAGs) from E. pulchripes and 33 from G. connexa, including 90 novel bacterial taxa (81 in E. pulchripes and 9 in G. connexa). However, despite this taxonomic divergence, most of the functions, including carbohydrate hydrolysis, sulfate reduction, and nitrogen cycling, were common to the two species. Members of the Bacteroidota (Bacteroidetes) were the primary agents of complex carbon degradation in E. pulchripes, while members of Proteobacteria dominated in G. connexa. Members of Desulfobacterota were the potential sulfate-reducing bacteria in E. pulchripes. The capacity for dissimilatory nitrate reduction was found in Actinobacteriota (E. pulchripes) and Proteobacteria (both species), but only Proteobacteria possessed the capacity for denitrification (both species). In contrast, some functions were only found in E. pulchripes. These include reductive acetogenesis, found in members of Desulfobacterota and Firmicutes (Bacillota) in E. pulchripes. Also, diazotrophs were only found in E. pulchripes, with a few members of the Firmicutes and Proteobacteria expressing the nifH gene. Interestingly, fungal-cell-wall-degrading glycoside hydrolases (GHs) were among the most abundant carbohydrate-active enzymes (CAZymes) expressed in both millipede species, suggesting that fungal biomass plays an important role in the millipede diet.
CONCLUSIONS Overall, these results provide detailed insights into the genomic capabilities of the microbial community in the hindgut of millipedes and shed light on the ecophysiology of these essential detritivores.
Affiliation(s)
- Julius Eyiuche Nweze
  - Institute of Soil Biology and Biogeochemistry, Biology Centre CAS, České Budějovice, Czechia
  - Faculty of Science, University of South Bohemia, České Budějovice, Czechia
- Vladimír Šustr
  - Institute of Soil Biology and Biogeochemistry, Biology Centre CAS, České Budějovice, Czechia
- Andreas Brune
  - RG Insect Gut Microbiology and Symbiosis, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Roey Angel
  - Institute of Soil Biology and Biogeochemistry, Biology Centre CAS, České Budějovice, Czechia
  - Faculty of Science, University of South Bohemia, České Budějovice, Czechia
9. Violette MJ, Hyland E, Burgener L, Ghosh A, Montoya BM, Kleiner M. Meta-omics reveals role of photosynthesis in microbially induced carbonate precipitation at a CO2-rich geyser. ISME Communications 2024; 4:ycae139. [PMID: 39866677 PMCID: PMC11760937 DOI: 10.1093/ismeco/ycae139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 10/14/2024] [Indexed: 01/28/2025]
Abstract
Microbially induced carbonate precipitation (MICP) is a natural process with potential biotechnological applications to address both carbon sequestration and sustainable construction needs. However, our understanding of the microbial processes involved in MICP is limited to a few well-researched pathways such as ureolytic hydrolysis. To expand our knowledge of MICP, we conducted an omics-based study on sedimentary communities from travertine around the CO2-driven Crystal Geyser near Green River, Utah. Using metagenomics and metaproteomics, we identified the community members and potential metabolic pathways involved in MICP. We found variations in microbial community composition between the two sites we sampled, but Rhodobacterales were consistently the most abundant order, including both chemoheterotrophs and anoxygenic phototrophs. We also identified several highly abundant genera of Cyanobacteriales. The dominance of these community members across both sites and the abundant presence of photosynthesis-related proteins suggest that photosynthesis could play a role in MICP at Crystal Geyser. We also found abundant bacterial proteins involved in phosphorus starvation response at both sites, suggesting that P-limitation shapes both composition and function of the microbial community driving MICP.
Affiliation(s)
- Marlene J Violette
  - Department of Plant and Microbial Biology, North Carolina State University, 112 Derieux Place, Thomas Hall, Raleigh, NC 27607, United States
- Ethan Hyland
  - Department of Marine, Earth, & Atmospheric Sciences, North Carolina State University, 2800 Faucette Drive, Jordan Hall, Raleigh, NC 27607, United States
- Landon Burgener
  - Department of Geological Sciences, Brigham Young University, Carl F. Eyring Science Center, Provo, UT 84602, United States
- Adit Ghosh
  - Department of Earth Sciences, University of Southern California, 3651 Trousdale Pkwy, Los Angeles, CA 90089, United States
- Brina M Montoya
  - Department of Civil, Construction, and Environmental Engineering, North Carolina State University, 915 Partners Way, Fitts Wool Hall, Raleigh, NC 27606, United States
- Manuel Kleiner
  - Department of Plant and Microbial Biology, North Carolina State University, 112 Derieux Place, Thomas Hall, Raleigh, NC 27607, United States
10. Pavlopoulos GA, Baltoumas FA, Liu S, Selvitopi O, Camargo AP, Nayfach S, Azad A, Roux S, Call L, Ivanova NN, Chen IM, Paez-Espino D, Karatzas E, Iliopoulos I, Konstantinidis K, Tiedje JM, Pett-Ridge J, Baker D, Visel A, Ouzounis CA, Ovchinnikov S, Buluç A, Kyrpides NC. Unraveling the functional dark matter through global metagenomics. Nature 2023; 622:594-602. [PMID: 37821698 PMCID: PMC10584684 DOI: 10.1038/s41586-023-06583-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 03/18/2022] [Accepted: 08/30/2023] [Indexed: 10/13/2023]
Abstract
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
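The filtering and grouping steps named in this abstract (a length cutoff, graph-based clustering, and a minimum cluster size) can be illustrated with a toy sketch. This is not the authors' method: connected components via union-find stand in for their massively parallel graph clustering, and the function name `cluster_proteins`, its parameters, and the input layout are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def cluster_proteins(lengths, edges, min_len=35, min_members=100):
    """Toy stand-in for graph-based protein family clustering.

    lengths: dict mapping protein id -> length in amino acids (assumed input layout)
    edges:   iterable of (id_a, id_b) pairs, one per pairwise similarity hit
    Returns a list of sets, one per cluster with more than `min_members` members.
    """
    # Keep only proteins longer than the length cutoff.
    kept = {pid for pid, n in lengths.items() if n > min_len}
    parent = {pid: pid for pid in kept}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of every pair of similar proteins that passed the filter.
    for a, b in edges:
        if a in kept and b in kept:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for pid in kept:
        groups[find(pid)].add(pid)

    # Report only clusters above the size threshold.
    return [members for members in groups.values() if len(members) > min_members]
```

With the defaults, the thresholds mirror the numbers quoted in the abstract (longer than 35 amino acids, more than 100 members); real data at that scale would need a distributed clustering tool rather than an in-memory dictionary.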
Affiliation(s)
- Georgios A Pavlopoulos
  - Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Fotis A Baltoumas
  - Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece
- Sirui Liu
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
- Oguz Selvitopi
  - Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Antonio Pedro Camargo
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Stephen Nayfach
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Ariful Azad
  - Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, USA
- Simon Roux
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Lee Call
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Natalia N Ivanova
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- I Min Chen
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- David Paez-Espino
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece
- Ioannis Iliopoulos
  - Department of Basic Sciences, School of Medicine, University of Crete, Heraklion, Greece
- James M Tiedje
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
- Jennifer Pett-Ridge
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Axel Visel
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christos A Ouzounis
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
  - Biological Computation & Computational Biology Group, Artificial Intelligence & Information Analysis Lab, School of Informatics, Aristotle University of Thessalonica, Thessalonica, Greece
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
- Aydin Buluç
  - Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Nikos C Kyrpides
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
11. Houtkamp IM, van Zijll Langhout M, Bessem M, Pirovano W, Kort R. Multiomics characterisation of the zoo-housed gorilla gut microbiome reveals bacterial community compositions shifts, fungal cellulose-degrading, and archaeal methanogenic activity. Gut Microbiome (Cambridge, England) 2023; 4:e12. [PMID: 39295898 PMCID: PMC11406404 DOI: 10.1017/gmb.2023.11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 05/15/2023] [Accepted: 07/06/2023] [Indexed: 09/21/2024]
Abstract
We carried out a comparative analysis between the bacterial microbiota composition of zoo-housed western lowland gorillas and their wild counterparts through 16S rRNA gene amplicon sequencing. In addition, we characterised the carbohydrate-active and methanogenic potential of the zoo-housed gorilla (ZHG) microbiome through shotgun metagenomics and RNA sequencing. The ZHG microbiota showed increased alpha diversity in terms of bacterial species richness and a distinct composition from that of the wild gorilla microbiota, including a loss of abundant fibre-degrading and hydrogenic Chloroflexi. Metagenomic analysis of the CAZyome indicated predominant oligosaccharide-degrading activity, while RNA sequencing revealed diverse cellulase and hemi-cellulase activities in the ZHG gut, contributing to a total of 268 identified carbohydrate-active enzymes. Metatranscriptome analysis revealed a substantial contribution of 38% of the transcripts from anaerobic fungi and archaea to the gorilla microbiome. This activity originates from cellulose-degrading and hydrogenic fungal species belonging to the class Neocallimastigomycetes, as well as from methylotrophic and hydrogenotrophic methanogenic archaea belonging to the classes Thermoplasmata and Methanobacteria, respectively. Our study shows the added value of RNA sequencing in a multiomics approach and highlights the contribution of eukaryotic and archaeal activities to the gut microbiome of gorillas.
Affiliation(s)
- Isabel M Houtkamp
  - Amsterdam Institute for Life and Environment (A-LIFE), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Mark Bessem
  - Bioinformatics Department, BaseClear, Leiden, The Netherlands
- Walter Pirovano
  - Bioinformatics Department, BaseClear, Leiden, The Netherlands
- Remco Kort
  - Amsterdam Institute for Life and Environment (A-LIFE), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - ARTIS Amsterdam Royal Zoo, Amsterdam, The Netherlands
12. Jiang C, Wang G, Zhang J, Gu S, Wang X, Qin W, Chen K, Yuan D, Chai X, Yang M, Zhou F, Xiong J, Miao W. iGDP: An integrated genome decontamination pipeline for wild ciliated microeukaryotes. Mol Ecol Resour 2023. [PMID: 36912756 DOI: 10.1111/1755-0998.13782] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/27/2022] [Revised: 02/25/2023] [Accepted: 03/08/2023] [Indexed: 03/14/2023]
Abstract
Ciliates are a large group of ubiquitous and highly diverse single-celled eukaryotes that play an essential role in the functioning of microbial food webs. However, their genomic diversity is far from clear because cultivation methods have yet to be developed for most species, so most research is based on wild organisms that almost invariably contain contaminants. Here we establish an integrated Genome Decontamination Pipeline (iGDP) that combines homology search, telomere reads-assisted and clustering approaches to filter contaminated ciliate genome assemblies from wild specimens. We benchmarked the performance of iGDP using genomic data from a contaminated ciliate culture, and the results showed that iGDP recalled 91.9% of the target sequences with 96.9% precision. We also used a synthetic dataset to offer guidelines for the application of iGDP in the removal of various groups of contaminants. iGDP performed better than several popular metagenome binning tools. To further validate the effectiveness of iGDP on real-world data, we applied it to decontaminate genome assemblies of three wild ciliate specimens and obtained high-quality genomes comparable to those of previously well-studied model ciliates. It is anticipated that the newly generated genomes and the established iGDP method will be valuable community resources for detailed studies on ciliate biodiversity, phylogeny, ecology and evolution. The pipeline (https://github.com/GWang2022/iGDP) can be implemented automatically to reduce manual filtering and classification and may be further developed to apply to other microeukaryotes.
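For readers unfamiliar with the two benchmark figures quoted above, the following minimal sketch shows how recall and precision are conventionally computed for a decontamination or binning result. It is not taken from the iGDP code base; the function name `precision_recall` and the contig identifiers in the usage lines are hypothetical.

```python
def precision_recall(predicted_target, true_target):
    """Return (precision, recall) for a set of sequences kept as 'target'.

    predicted_target: ids the pipeline kept as belonging to the target genome
    true_target:      ids that truly belong to the target genome
    """
    predicted = set(predicted_target)
    truth = set(true_target)
    true_positives = len(predicted & truth)
    # Precision: share of kept sequences that really are target.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of true target sequences that were kept.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: four contigs kept, three of them correct, five true targets.
kept = {"c1", "c2", "c3", "x1"}
truth = {"c1", "c2", "c3", "c4", "c5"}
precision, recall = precision_recall(kept, truth)
```

In this toy case precision is 3/4 and recall is 3/5; the abstract's values correspond to the same ratios measured on the authors' contaminated culture data.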
Affiliation(s)
- Chuanqi Jiang
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Guangying Wang
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Jing Zhang
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Siyu Gu
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xueyan Wang
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Weiwei Qin
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Kai Chen
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Dongxia Yuan
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Xiaocui Chai
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Mingkun Yang
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Fang Zhou
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Jie Xiong
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Wei Miao
  - Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Wuhan, China
  - CAS Center for Excellence in Animal Evolution and Genetics, Kunming, China
13
Gabrielli M, Dai Z, Delafont V, Timmers PHA, van der Wielen PWJJ, Antonelli M, Pinto AJ. Identifying Eukaryotes and Factors Influencing Their Biogeography in Drinking Water Metagenomes. Environmental Science & Technology 2023; 57:3645-3660. [PMID: 36827617 PMCID: PMC9996835 DOI: 10.1021/acs.est.2c09010] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/29/2022] [Revised: 02/13/2023] [Accepted: 02/13/2023] [Indexed: 06/18/2023]
Abstract
The biogeography of eukaryotes in drinking water systems is poorly understood relative to that of prokaryotes or viruses, limiting the understanding of their role and management. A challenge with studying complex eukaryotic communities is that metagenomic analysis workflows are currently not as mature as those that focus on prokaryotes or viruses. In this study, we benchmarked different strategies to recover eukaryotic sequences and genomes from metagenomic data and applied the best-performing workflow to explore the factors affecting the relative abundance and diversity of eukaryotic communities in drinking water distribution systems (DWDSs). We developed an ensemble approach exploiting k-mer- and reference-based strategies to improve eukaryotic sequence identification and identified MetaBAT2 as the best-performing binning approach for their clustering. Applying this workflow to the DWDS metagenomes showed that eukaryotic sequences typically constituted small proportions (i.e., <1%) of the overall metagenomic data with higher relative abundances in surface water-fed or chlorinated systems with high residuals. The α and β diversities of eukaryotes were correlated with those of prokaryotic and viral communities, highlighting the common role of environmental/management factors. Finally, a co-occurrence analysis highlighted clusters of eukaryotes whose members' presence and abundance in DWDSs were affected by disinfection strategies, climate conditions, and source water types.
Affiliation(s)
- Marco Gabrielli
  - Dipartimento di Ingegneria Civile e Ambientale - Sezione Ambientale, Politecnico di Milano, Milan 20133, Italy
- Zihan Dai
  - Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Vincent Delafont
  - Laboratoire Ecologie et Biologie des Interactions (EBI), Equipe Microorganismes, Hôtes, Environnements, Université de Poitiers, Poitiers 86073, France
- Peer H. A. Timmers
  - KWR Watercycle Research Institute, 3433 PE Nieuwegein, The Netherlands
  - Department of Microbiology, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
- Paul W. J. J. van der Wielen
  - KWR Watercycle Research Institute, 3433 PE Nieuwegein, The Netherlands
  - Laboratory of Microbiology, Wageningen University, 6700 HB Wageningen, The Netherlands
- Manuela Antonelli
  - Dipartimento di Ingegneria Civile e Ambientale - Sezione Ambientale, Politecnico di Milano, Milan 20133, Italy
- Ameet J. Pinto
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
14
Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level: From metagenomic sequence reads to annotated protein clusters. Frontiers in Bioinformatics 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Metagenomics has enabled access to the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed to explore biodiversity at the protein family level. We provide details on analysis tools and comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- David Paez-Espino
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Nefeli K. Venetsianou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Eleni Aplakidou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Anastasis Oulas
  - The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Robert D. Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
- Nikos C. Kyrpides
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Georgios A. Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
  - Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
  - Hellenic Army Academy, Vari, Greece
15
Saraiva JP, Bartholomäus A, Toscan RB, Baldrian P, Nunes da Rocha U. Recovery of 197 eukaryotic bins reveals major challenges for eukaryote genome reconstruction from terrestrial metagenomes. Mol Ecol Resour 2023. [PMID: 36847735 DOI: 10.1111/1755-0998.13776] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/07/2022] [Revised: 01/23/2023] [Accepted: 02/21/2023] [Indexed: 03/01/2023]
Abstract
As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain unexplored. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes from 6000 metagenomes from terrestrial and some transition environments using the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. Of the 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades, with 83 and 73 bins, respectively. More than 78% of the obtained eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, and anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequent taxon found, while Saccharomyces cerevisiae presented the highest completeness, probably because more reference genomes are available. Current measures of completeness are based on the presence of single-copy genes. However, mapping of the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, the development of tools for dealing with repeat-rich genomes, and improved reference genome databases.
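The contrast this abstract draws between marker-based completeness and chromosome coverage can be made concrete with a small sketch. Everything below is illustrative: the function names, the marker identifiers, and the simplification that contamination equals the share of expected markers found more than once are assumptions, not the procedure used in the study.

```python
def marker_completeness(found_counts, expected_markers):
    """Completeness and contamination (in %) from single-copy marker counts.

    found_counts maps marker name -> number of copies found in the bin.
    expected_markers must be non-empty.
    """
    expected = set(expected_markers)
    present = sum(1 for m in expected if found_counts.get(m, 0) >= 1)
    duplicated = sum(1 for m in expected if found_counts.get(m, 0) > 1)
    n = len(expected)
    return 100.0 * present / n, 100.0 * duplicated / n


def covered_fraction(intervals, chromosome_length):
    """Fraction of a chromosome covered by half-open (start, end) alignments."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip the part already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / chromosome_length


# Hypothetical bin: 2 of 4 markers present, one of them in two copies.
print(marker_completeness({"m1": 1, "m2": 2, "m3": 0}, ["m1", "m2", "m3", "m4"]))
# Hypothetical contig alignments on a 100 bp toy chromosome.
print(covered_fraction([(0, 10), (5, 20), (30, 40)], 100))
```

A bin can score well on the first measure while the second stays low, because single-copy markers occupy only a small part of a chromosome.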
Affiliation(s)
- Joao Pedro Saraiva
  - Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
- Rodolfo Brizola Toscan
  - Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
- Petr Baldrian
  - Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
- Ulisses Nunes da Rocha
  - Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
16
Salazar VW, Shaban B, Quiroga MDM, Turnbull R, Tescari E, Rossetto Marcelino V, Verbruggen H, Lê Cao KA. Metaphor: A workflow for streamlined assembly and binning of metagenomes. Gigascience 2022; 12:giad055. [PMID: 37522759 PMCID: PMC10388702 DOI: 10.1093/gigascience/giad055] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/20/2023] [Revised: 06/05/2023] [Accepted: 07/04/2023] [Indexed: 08/01/2023] Open
Abstract
Recent advances in bioinformatics and high-throughput sequencing have enabled the large-scale recovery of genomes from metagenomes. This has the potential to bring important insights, as researchers can bypass cultivation and analyze genomes sourced directly from environmental samples. There are, however, technical challenges associated with this process, most notably the complexity of the computational workflows required to process metagenomic data, which include dozens of bioinformatics software tools, each with its own set of customizable parameters that affect the final output of the workflow. At the core of these workflows are the processes of assembly (combining the short input reads into longer, contiguous fragments, or contigs) and binning (clustering these contigs into individual genome bins). The limitations of assembly and binning algorithms also pose different challenges depending on the strategy selected to execute them. Both of these processes can be done for each sample separately or by pooling multiple samples to leverage information from a combination of samples. Here we present Metaphor, a fully automated workflow for genome-resolved metagenomics (GRM). Metaphor differs from existing GRM workflows by offering flexible approaches for the assembly and binning of the input data and by combining multiple binning algorithms with a bin refinement step to achieve high-quality genome bins. Moreover, Metaphor generates reports to evaluate the performance of the workflow. We showcase the functionality of Metaphor on different synthetic datasets and the impact of available assembly and binning strategies on the final results.
Affiliation(s)
- Vinícius W Salazar
  - Melbourne Integrative Genomics, School of Mathematics & Statistics, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
- Babak Shaban
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Maria del Mar Quiroga
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Robert Turnbull
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Edoardo Tescari
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Vanessa Rossetto Marcelino
  - Department of Molecular and Translational Sciences, Monash University, Clayton, VIC 3168, Victoria, Australia
  - Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton, VIC 3168, Victoria, Australia
  - School of BioSciences, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC 3052, Victoria, Australia
- Heroen Verbruggen
  - School of BioSciences, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
- Kim-Anh Lê Cao
  - Melbourne Integrative Genomics, School of Mathematics & Statistics, University of Melbourne, Parkville, VIC 3052, Victoria, Australia