1
|
Genomic insights into the cellular specialization of predation in raptorial protists. BMC Biol 2024; 22:107. [PMID: 38715037 PMCID: PMC11077807 DOI: 10.1186/s12915-024-01904-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 04/26/2024] [Indexed: 05/12/2024] Open
Abstract
BACKGROUND Predation is a fundamental mechanism for organisms to acquire energy, and various species have evolved diverse tools to enhance their hunting abilities. Among protozoan predators, raptorial Haptorian ciliates are particularly fascinating as they possess offensive extrusomes known as toxicysts, which are rapidly discharged upon prey contact. However, our understanding of the genetic processes and specific toxins involved in toxicyst formation and discharge is still limited.
RESULTS In this study, we investigated the predation strategies and subcellular structures of seven Haptoria ciliate species and obtained their genome sequences using single-cell sequencing technology. Comparative genomic analysis revealed distinct gene duplications related to membrane transport proteins and hydrolytic enzymes in Haptoria, which play a crucial role in the production and discharge of toxicysts. Transcriptomic analysis further confirmed that genes related to membrane transporters and cellular toxins are more abundantly expressed in Haptoria than in Trichostomatia. Notably, polyketide synthases (PKS) and L-amino acid oxidases (LAAO) were identified as potential toxin genes that underwent extensive duplication events in Haptoria.
CONCLUSIONS Our results shed light on the evolutionary and genomic adaptations underlying the predation strategies of Haptorian ciliates and provide insights into their toxic mechanisms.
2
|
Critical steps in an environmental metaproteomics workflow. Environ Microbiol 2024; 26:e16637. [PMID: 38760994 DOI: 10.1111/1462-2920.16637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/30/2024] [Indexed: 05/20/2024]
Abstract
Environmental metaproteomics is a rapidly advancing field that provides insights into the structure, dynamics, and metabolic activity of microbial communities. As the field is still maturing, it lacks consistent workflows, making it challenging for non-expert researchers to navigate. This review aims to introduce the workflow of environmental metaproteomics. It outlines the standard practices for sample collection, processing, and analysis, and offers strategies to overcome the unique challenges presented by common environmental matrices such as soil, freshwater, marine environments, biofilms, sludge, and symbionts. The review also highlights the bottlenecks in data analysis that are specific to metaproteomics samples and provides suggestions for researchers to obtain high-quality datasets. It includes recent benchmarking studies and descriptions of software packages specifically built for metaproteomics analysis. The article is written without assuming the reader's familiarity with single-organism proteomic workflows, making it accessible to those new to proteomics or mass spectrometry in general. This primer for environmental metaproteomics aims to improve accessibility to this exciting technology and empower researchers to tackle challenging and ambitious research questions. While it is primarily a resource for those new to the field, it should also be useful for established researchers looking to streamline or troubleshoot their metaproteomics experiments.
3
|
Ten common issues with reference sequence databases and how to mitigate them. Front Bioinform 2024; 4:1278228. [PMID: 38560517 PMCID: PMC10978663 DOI: 10.3389/fbinf.2024.1278228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 03/05/2024] [Indexed: 04/04/2024] Open
Abstract
Metagenomic sequencing has revolutionized our understanding of microbiology. While metagenomic tools and approaches have been extensively evaluated and benchmarked, far less attention has been given to the reference sequence database used in metagenomic classification. Issues with reference sequence databases are pervasive. Database contamination is the most recognized issue in the literature; however, it remains relatively unmitigated in most analyses. Other common issues with reference sequence databases include taxonomic errors, inappropriate inclusion and exclusion criteria, and sequence content errors. This review covers ten common issues with reference sequence databases and the potential downstream consequences of these issues. Mitigation measures are discussed for each issue, including bioinformatic tools and database curation strategies. Together, these strategies present a path towards more accurate, reproducible and translatable metagenomic sequencing.
4
|
Isolation and identification of Wickerhamiella tropicalis from blood culture by MALDI-MS. Front Cell Infect Microbiol 2024; 14:1361432. [PMID: 38510957 PMCID: PMC10953818 DOI: 10.3389/fcimb.2024.1361432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 02/20/2024] [Indexed: 03/22/2024] Open
Abstract
Wickerhamiella is a genus of budding yeast that is mainly isolated from environmental samples, and 40 species have been detected. Isolates from human clinical samples usually belong to only three species: W. infanticola, W. pararugosa and W. sorbophila. In this study, we isolated W. tropicalis from a blood sample of a six-year-old female with a history of B-cell precursor lymphoblastic leukemia in Japan in 2022. Though the strain was morphologically identified as a Candida species by routine microbiological examinations, it was subsequently identified as W. tropicalis by sequencing the internal transcribed spacer (ITS) of ribosomal DNA (rDNA). The isolate had amino acid substitutions in ERG11 and FKS1 that are associated with azole and echinocandin resistance, respectively, in Candida species, and it showed intermediate resistance to fluconazole and micafungin. The patient was successfully treated with micafungin. Furthermore, matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) detected three novel peaks that are specific for W. tropicalis, indicating that MALDI-MS analysis is useful for rapid detection of Wickerhamiella species in routine microbiological examinations.
5
|
Quality assessment of gene repertoire annotations with OMArk. Nat Biotechnol 2024:10.1038/s41587-024-02147-w. [PMID: 38383603 DOI: 10.1038/s41587-024-02147-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 01/17/2024] [Indexed: 02/23/2024]
Abstract
In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.
6
|
ContScout: sensitive detection and removal of contamination from annotated genomes. Nat Commun 2024; 15:936. [PMID: 38296951 PMCID: PMC10831095 DOI: 10.1038/s41467-024-45024-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 01/08/2024] [Indexed: 02/02/2024] Open
Abstract
Contamination of genomes is an increasingly recognized problem affecting several downstream applications, from comparative evolutionary genomics to metagenomics. Here we introduce ContScout, a precise tool for eliminating foreign sequences from annotated genomes. It achieves high specificity and sensitivity on synthetic benchmark data even when the contaminant is a closely related species, outperforms competing tools, and can distinguish horizontal gene transfer from contamination. A screen of 844 eukaryotic genomes for contamination identified bacteria as the most common source, followed by fungi and plants. Furthermore, we show that contaminants in ancestral genome reconstructions lead to erroneous early origins of genes and inflate gene loss rates, leading to a false notion of complex ancestral genomes. Taken together, we offer here a tool for sensitive removal of foreign proteins, identify and remove contaminants from diverse eukaryotic genomes and evaluate their impact on phylogenomic analyses.
7
|
A genome catalog of the early-life human skin microbiome. Genome Biol 2023; 24:252. [PMID: 37946302 PMCID: PMC10636849 DOI: 10.1186/s13059-023-03090-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 10/17/2023] [Indexed: 11/12/2023] Open
Abstract
BACKGROUND Metagenome-assembled genomes have greatly expanded the reference genomes for the skin microbiome. However, the current reference genomes are largely based on samples from adults in North America and lack representation from infants and individuals from other continents.
RESULTS Here we use deep shotgun metagenomic sequencing to profile the skin microbiota of 215 infants at age 2-3 months and 12 months who are part of the VITALITY trial in Australia as well as 67 maternally matched samples. Based on the infant samples, we present the Early-Life Skin Genomes (ELSG) catalog, comprising 9483 prokaryotic genomes from 1056 species, 206 fungal genomes from 13 species, and 39 eukaryotic viral sequences. This genome catalog substantially expands the diversity of species previously known to comprise the human skin microbiome and improves the classification rate of sequenced data by 21%. The protein catalog derived from these genomes provides insights into the functional elements, such as defense mechanisms, that distinguish the early-life skin microbiome. We also find evidence for microbial sharing at the community, bacterial species, and strain levels between mothers and infants.
CONCLUSIONS Overall, the ELSG catalog uncovers the skin microbiome of a previously underrepresented age group and population and provides a comprehensive view of human skin microbiome diversity, function, and development in early life.
8
|
Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton. mBio 2023; 14:e0167623. [PMID: 37947402 PMCID: PMC10746220 DOI: 10.1128/mbio.01676-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 09/27/2023] [Indexed: 11/12/2023] Open
Abstract
Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor are they built for an increasing amount of sequence data. EukHeist is an automated pipeline to retrieve eukaryotic and prokaryotic metagenome-assembled genomes (MAGs) from large-scale metagenomic sequence data sets. We developed the EukHeist workflow to specifically process large amounts of both metagenomic and/or metatranscriptomic sequence data in an automated and reproducible fashion. Here, we applied EukHeist to the large-size fraction data (0.8-2,000 µm) from Tara Oceans to recover both eukaryotic and prokaryotic MAGs, which we refer to as TOPAZ (Tara Oceans Particle-Associated MAGs). The TOPAZ MAGs consisted of >900 environmentally relevant eukaryotic MAGs and >4,000 bacterial and archaeal MAGs. The bacterial and archaeal TOPAZ MAGs expand upon the phylogenetic diversity of likely particle- and host-associated taxa. We use these MAGs to demonstrate an approach to infer the putative trophic mode of the recovered eukaryotic MAGs. We also identify ecological cohorts of co-occurring MAGs, which are driven by specific environmental factors and putative host-microbe associations. These data together add to a number of growing resources of environmentally relevant eukaryotic genomic information. Complementary and expanded databases of MAGs, such as those provided through scalable pipelines like EukHeist, stand to advance our understanding of eukaryotic diversity through increased coverage of genomic representatives across the tree of life.
IMPORTANCE Single-celled eukaryotes play ecologically significant roles in the marine environment, yet fundamental questions about their biodiversity, ecological function, and interactions remain. Environmental sequencing enables researchers to document naturally occurring protistan communities, without culturing bias, yet metagenomic and metatranscriptomic sequencing approaches cannot separate individual species from communities. To more completely capture the genomic content of mixed protistan populations, we can create bins of sequences that represent the same organism (metagenome-assembled genomes [MAGs]). We developed the EukHeist pipeline, which automates the binning of population-level eukaryotic and prokaryotic genomes from metagenomic reads. We show exciting insight into what protistan communities are present and their trophic roles in the ocean. Scalable computational tools, like EukHeist, may accelerate the identification of meaningful genetic signatures from large data sets and complement researchers' efforts to leverage MAG databases for addressing ecological questions, resolving evolutionary relationships, and discovering potentially novel biodiversity.
9
|
A k-mer-Based Approach for Phylogenetic Classification of Taxa in Environmental Genomic Data. Syst Biol 2023; 72:1101-1118. [PMID: 37314057 DOI: 10.1093/sysbio/syad037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 03/20/2023] [Accepted: 06/12/2023] [Indexed: 06/15/2023] Open
Abstract
In the age of genome sequencing, whole-genome data is readily and frequently generated, leading to a wealth of new information that can be used to advance various fields of research. New approaches, such as alignment-free phylogenetic methods that utilize k-mer-based distance scoring, are becoming increasingly popular given their ability to rapidly generate phylogenetic information from whole-genome data. However, these methods have not yet been tested using environmental data, which tends to be highly fragmented and incomplete. Here, we compare the results of one alignment-free approach (which utilizes the D2 statistic) to traditional multi-gene maximum likelihood trees in 3 algal groups that have high-quality genome data available. In addition, we simulate lower-quality, fragmented genome data using these algae to test method robustness to genome quality and completeness. Finally, we apply the alignment-free approach to environmental metagenome-assembled genome data of unclassified Saccharibacteria and Trebouxiophyte algae, and single-cell amplified data from uncultured marine stramenopiles to demonstrate its utility with real datasets. We find that in all instances, the alignment-free method produces phylogenies that are comparable to, and often more informative than, those created using the traditional multi-gene approach. The k-mer-based method performs well even when there are significant missing data that include marker genes traditionally used for tree reconstruction. Our results demonstrate the value of alignment-free approaches for classifying novel, often cryptic or rare, species that may not be culturable or are difficult to access using single-cell methods, but that fill important gaps in the tree of life.
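For readers unfamiliar with the D2 statistic mentioned in this abstract, the following is a minimal sketch of its basic, uncorrected form: the sum, over all words of length k, of the product of that word's counts in the two sequences. The abstract does not state which D2 variant, k value, or normalization the authors used, so the function names (`kmer_counts`, `d2`) and the choice of k are illustrative assumptions only.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (upper-cased; no ambiguity-code handling)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2: sum over shared k-mers of (count in A) * (count in B).

    Hypothetical helper for illustration; not the cited paper's implementation.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items() if word in counts_b)


if __name__ == "__main__":
    # Two short toy sequences; a larger score means more shared k-mer content.
    print(d2("ACGTACGT", "ACGTTTTT", k=3))
```

In practice the raw score is usually converted to a distance and normalized for sequence length and background composition before tree building; those steps are omitted here.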
10
|
Expanded microbiome niches of RAG-deficient patients. Cell Rep Med 2023; 4:101205. [PMID: 37757827 PMCID: PMC10591041 DOI: 10.1016/j.xcrm.2023.101205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/28/2022] [Accepted: 08/31/2023] [Indexed: 09/29/2023]
Abstract
The complex interplay between microbiota and immunity is important to human health. To explore how altered adaptive immunity influences the microbiome, we characterize skin, nares, and gut microbiota of patients with recombination-activating gene (RAG) deficiency, a rare genetically defined inborn error of immunity (IEI) that results in a broad spectrum of clinical phenotypes. Integrating de novo assembly of metagenomes from RAG-deficient patients with reference genome catalogs provides an expansive multi-kingdom view of microbial diversity. RAG-deficient patient microbiomes exhibit inter-individual variation, including expansion of opportunistic pathogens (e.g., Corynebacterium bovis, Haemophilus influenzae), and a relative loss of body site specificity. We identify 35 and 27 bacterial species derived from skin/nares and gut microbiomes, respectively, which are distinct to RAG-deficient patients compared to healthy individuals. Underscoring IEI patients as potential reservoirs for viral persistence and evolution, we further characterize the colonization of eukaryotic RNA viruses (e.g., Coronavirus 229E, Norovirus GII) in this patient population.
11
|
ACR: metagenome-assembled prokaryotic and eukaryotic genome refinement tool. Brief Bioinform 2023; 24:bbad381. [PMID: 37889119 DOI: 10.1093/bib/bbad381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 09/16/2023] [Accepted: 10/03/2023] [Indexed: 10/28/2023] Open
Abstract
Microbial genome recovery from metagenomes can further explain microbial ecosystem structures, functions and dynamics. This study therefore developed the Additional Clustering Refiner (ACR) to enhance the recovery of high-purity prokaryotic and eukaryotic metagenome-assembled genomes (MAGs). ACR refines low-quality MAGs by subjecting them to iterative k-means clustering based on contig abundance and by increasing bin purity through validated universal marker genes. Synthetic and real-world metagenomic datasets, including short- and long-read sequences, were used to evaluate ACR's effectiveness. The results demonstrated improved MAG purity and a significant increase in high- and medium-quality MAG recovery rates. In addition, ACR integrates with various binning algorithms, augmenting their strengths without modifying core features. Furthermore, its compatibility with multiple sequencing technologies expands its applicability. By efficiently recovering high-quality prokaryotic and eukaryotic genomes, ACR is a promising tool for deepening our understanding of microbial communities through genome-centric metagenomics.
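To illustrate the general idea of clustering contigs by abundance, here is a minimal sketch of one-dimensional k-means (Lloyd's algorithm) over per-contig coverage values. It is not ACR's implementation: the abstract gives no details of ACR's features, initialization, iteration scheme, or marker-gene purity checks, so the function names (`assign`, `kmeans_1d`), the quantile-based initialization, and the toy coverage values are all assumptions.

```python
def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values]


def kmeans_1d(values, k, max_iters=100):
    """Plain Lloyd's k-means on scalars (e.g. mean contig coverage).

    Hypothetical helper for illustration; centroids start at evenly spaced
    quantiles of the sorted values so the result is deterministic.
    """
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iters):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:  # converged
            break
        centroids = updated
    return assign(values, centroids), centroids


if __name__ == "__main__":
    # Toy bin mixing contigs from a ~10x genome and a ~50x genome (made-up values).
    coverage = [10.1, 9.8, 10.3, 50.2, 49.7, 51.0]
    labels, centroids = kmeans_1d(coverage, k=2)
    print(labels, centroids)
```

A refinement tool would then need some criterion, such as marker-gene completeness and duplication, to decide whether a split actually improved bin purity; that step is outside this sketch.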
12
|
Genomic Characterization of Theileria luwenshuni Strain Cheeloo. Microbiol Spectr 2023; 11:e0030123. [PMID: 37260375 PMCID: PMC10434005 DOI: 10.1128/spectrum.00301-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 05/11/2023] [Indexed: 06/02/2023] Open
Abstract
Theileria, a tick-borne intracellular protozoan, can cause infections of various livestock and wildlife around the world, posing a threat to veterinary health. Although more and more Theileria species have been identified, genomes have been available only from four Theileria species to date. Here, we assembled a whole genome of Theileria luwenshuni, an emerging Theileria, through next-generation sequencing of purified erythrocytes from the blood of a naturally infected goat. We designated it T. luwenshuni str. Cheeloo because its genome was assembled by the researchers at Cheeloo College of Medicine, Shandong University, China. The genome of T. luwenshuni str. Cheeloo was the smallest in comparison with the other four Theileria species. T. luwenshuni str. Cheeloo possessed the fewest gene gains and gene family expansions. The protein count of each category was always comparable between T. luwenshuni str. Cheeloo and T. orientalis str. Shintoku in the Eukaryote Orthologs annotation, though there were remarkable differences in genome size. T. luwenshuni str. Cheeloo had lower counts than the other four Theileria species in most categories at level 3 of Gene Ontology annotation. Kyoto Encyclopedia of Genes and Genomes annotation revealed a loss of c-Myb in T. luwenshuni str. Cheeloo. The infection rate of T. luwenshuni str. Cheeloo was up to 81.5% in a total of 54 goats from three flocks. The phylogenetic analyses based on both 18S rRNA and cox1 genes indicated that T. luwenshuni had relatively low diversity. The first characterization of the T. luwenshuni genome will promote better understanding of the emerging Theileria.
IMPORTANCE Theileria has led to substantial economic losses in animal husbandry. Whole-genome sequencing data of the genus Theileria are currently limited, which has prevented further understanding of their molecular features. This work depicted whole-genome sequences of T. luwenshuni str. Cheeloo, an emerging Theileria species, and reported a high prevalence of T. luwenshuni str. Cheeloo infection in goats. The first assembly and characterization of the T. luwenshuni genome will benefit the exploration of the infective and pathogenic mechanisms of the emerging Theileria and provide a scientific basis for future control strategies for theileriosis.
13
|
Terabase-Scale Coassembly of a Tropical Soil Microbiome. Microbiol Spectr 2023; 11:e0020023. [PMID: 37310219 PMCID: PMC10434106 DOI: 10.1128/spectrum.00200-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 05/24/2023] [Indexed: 06/14/2023] Open
Abstract
Petabases of environmental metagenomic data are publicly available, presenting an opportunity to characterize complex environments and discover novel lineages of life. Metagenome coassembly, in which many metagenomic samples from an environment are simultaneously analyzed to infer the underlying genomes' sequences, is an essential tool for achieving this goal. We applied MetaHipMer2, a distributed metagenome assembler that runs on supercomputing clusters, to coassemble 3.4 terabases (Tbp) of metagenome data from a tropical soil in the Luquillo Experimental Forest (LEF), Puerto Rico. The resulting coassembly yielded 39 high-quality (>90% complete, <5% contaminated, with predicted 23S, 16S, and 5S rRNA genes and ≥18 tRNAs) metagenome-assembled genomes (MAGs), including two from the candidate phylum Eremiobacterota. Another 268 medium-quality (≥50% complete, <10% contaminated) MAGs were extracted, including the candidate phyla Dependentiae, Dormibacterota, and Methylomirabilota. In total, 307 medium- or higher-quality MAGs were assigned to 23 phyla, compared to 294 MAGs assigned to nine phyla in the same samples individually assembled. The low-quality (<50% complete, <10% contaminated) MAGs from the coassembly revealed a 49% complete rare biosphere microbe from the candidate phylum FCPU426 among other low-abundance microbes, an 81% complete fungal genome from the phylum Ascomycota, and 30 partial eukaryotic MAGs with ≥10% completeness, possibly representing protist lineages. A total of 22,254 viruses, many of them low abundance, were identified. Estimation of metagenome coverage and diversity indicates that we may have characterized ≥87.5% of the sequence diversity in this humid tropical soil and indicates the value of future terabase-scale sequencing and coassembly of complex environments.
IMPORTANCE Petabases of reads are being produced by environmental metagenome sequencing. An essential step in analyzing these data is metagenome assembly, the computational reconstruction of genome sequences from microbial communities. "Coassembly" of metagenomic sequence data, in which multiple samples are assembled together, enables more complete detection of microbial genomes in an environment than "multiassembly," in which samples are assembled individually. To demonstrate the potential for coassembling terabases of metagenome data to drive biological discovery, we applied MetaHipMer2, a distributed metagenome assembler that runs on supercomputing clusters, to coassemble 3.4 Tbp of reads from a humid tropical soil environment. The resulting coassembly, its functional annotation, and analysis are presented here. The coassembly yielded more, and phylogenetically more diverse, microbial, eukaryotic, and viral genomes than the multiassembly of the same data. Our resource may facilitate the discovery of novel microbial biology in tropical soils and demonstrates the value of terabase-scale metagenome sequencing.
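The quality tiers quoted in this abstract can be written down as a small classifier, sketched below. The thresholds are taken directly from the abstract; the `Mag` record, its field names, the `quality_tier` function, and the "unclassified" label for bins with 10% or more contamination are assumptions made for illustration, and the completeness and contamination estimates themselves would come from a separate tool that the abstract does not name.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical per-bin summary; field names are illustrative."""
    completeness: float      # percent, 0-100
    contamination: float     # percent
    has_all_rrnas: bool      # predicted 23S, 16S, and 5S rRNA genes all present
    trna_count: int


def quality_tier(mag: Mag) -> str:
    """Apply the tiers as stated in the abstract.

    high:   >90% complete, <5% contaminated, all three rRNAs, >=18 tRNAs
    medium: >=50% complete, <10% contaminated
    low:    <50% complete, <10% contaminated
    Bins with >=10% contamination fall outside the stated tiers; the
    "unclassified" label for them is an assumption of this sketch.
    """
    if mag.contamination >= 10:
        return "unclassified"
    if (mag.completeness > 90 and mag.contamination < 5
            and mag.has_all_rrnas and mag.trna_count >= 18):
        return "high"
    if mag.completeness >= 50:
        return "medium"
    return "low"


if __name__ == "__main__":
    print(quality_tier(Mag(95.0, 2.0, True, 20)))
    print(quality_tier(Mag(95.0, 2.0, False, 20)))
    print(quality_tier(Mag(49.0, 3.0, False, 5)))
```

Note that a bin that is more than 90% complete but lacks a full rRNA set falls into the medium tier under these rules, which is one reason short-read MAGs rarely reach the high tier.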
14
|
Ultra-deep sequencing of Hadza hunter-gatherers recovers vanishing gut microbes. Cell 2023; 186:3111-3124.e13. [PMID: 37348505 PMCID: PMC10330870 DOI: 10.1016/j.cell.2023.05.046] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 08/08/2022] [Revised: 02/12/2023] [Accepted: 05/26/2023] [Indexed: 06/24/2023]
Abstract
The gut microbiome modulates immune and metabolic health. Human microbiome data are biased toward industrialized populations, limiting our understanding of non-industrialized microbiomes. Here, we performed ultra-deep metagenomic sequencing on 351 fecal samples from the Hadza hunter-gatherers of Tanzania and comparative populations in Nepal and California. We recovered 91,662 genomes of bacteria, archaea, bacteriophages, and eukaryotes, 44% of which are absent from existing unified datasets. We identified 124 gut-resident species vanishing in industrialized populations and highlighted distinct aspects of the Hadza gut microbiome related to in situ replication rates, signatures of selection, and strain sharing. Industrialized gut microbes were found to be enriched in genes associated with oxidative stress, possibly a result of microbiome adaptation to inflammatory processes. This unparalleled view of the Hadza gut microbiome provides a valuable resource, expands our understanding of microbes capable of colonizing the human gut, and clarifies the extensive perturbation induced by the industrialized lifestyle.
15
|
A genome catalog of the early-life human skin microbiome. bioRxiv 2023:2023.05.22.541509. [PMID: 37398010 PMCID: PMC10312837 DOI: 10.1101/2023.05.22.541509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Metagenome-assembled genomes have greatly expanded the reference genomes for the skin microbiome. However, the current reference genomes are largely based on samples from adults in North America and lack representation from infants and individuals from other continents. Here we used ultra-deep shotgun metagenomic sequencing to profile the skin microbiota of 215 infants at age 2-3 months and 12 months who were part of the VITALITY trial in Australia as well as 67 maternally matched samples. Based on the infant samples, we present the Early-Life Skin Genomes (ELSG) catalog, comprising 9,194 bacterial genomes from 1,029 species, 206 fungal genomes from 13 species, and 39 eukaryotic viral sequences. This genome catalog substantially expands the diversity of species previously known to comprise the human skin microbiome and improves the classification rate of sequenced data by 25%. The protein catalog derived from these genomes provides insights into the functional elements, such as defense mechanisms, that distinguish the early-life skin microbiome. We also found evidence for vertical transmission at the microbial community, individual skin bacterial species, and strain levels between mothers and infants. Overall, the ELSG catalog uncovers the skin microbiome of a previously underrepresented age group and population and provides a comprehensive view of human skin microbiome diversity, function, and transmission in early life.
16
|
Dataset of 143 metagenome-assembled genomes from the Arctic and Atlantic Oceans, including 21 for eukaryotic organisms. Data Brief 2023; 47:108990. [PMID: 36879606 PMCID: PMC9984783 DOI: 10.1016/j.dib.2023.108990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 01/18/2023] [Accepted: 02/09/2023] [Indexed: 02/17/2023] Open
Abstract
This article presents metagenome-assembled genomes (MAGs) for both eukaryotic and prokaryotic organisms originating from the Arctic and Atlantic oceans, along with gene prediction and functional annotation for MAGs from both domains. Eleven samples from the chlorophyll-a maximum layer of the surface ocean were collected during two cruises in 2012: six from the Arctic in June-July on ARK-XXVII/1 (PS80), and five from the Atlantic in November on ANT-XXIX/1 (PS81). Sequencing and assembly were carried out by the Joint Genome Institute (JGI), which provided annotation of the assembled sequences and 122 MAGs for prokaryotic organisms. A subsequent binning process identified 21 MAGs for eukaryotic organisms, mostly identified as Mamiellophyceae or Bacillariophyceae. The data for each MAG include sequences in FASTA format and tables of functional annotation of genes. For eukaryotic MAGs, transcript and protein sequences for predicted genes are available. A spreadsheet is provided summarising quality measures and taxonomic classifications for each MAG. These data provide draft genomes for uncultured marine microbes, including some of the first MAGs for polar eukaryotes, and can provide reference genetic data for these environments or be used in genomics-based comparisons between environments.
Collapse
|
17
|
Metagenomics Shines Light on the Evolution of "Sunscreen" Pigment Metabolism in the Teloschistales (Lichen-Forming Ascomycota). Genome Biol Evol 2023; 15:6986375. [PMID: 36634008 PMCID: PMC9907504 DOI: 10.1093/gbe/evad002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/25/2022] [Accepted: 01/09/2023] [Indexed: 01/13/2023] Open
Abstract
Fungi produce a vast number of secondary metabolites that shape their interactions with other organisms and the environment. Characterizing the genes underpinning metabolite synthesis is therefore key to understanding fungal evolution and adaptation. Lichenized fungi represent almost one-third of Ascomycota diversity and boast impressive secondary metabolite repertoires. However, most lichen biosynthetic genes have not been linked to their metabolite products. Here we used metagenomic sequencing to survey gene families associated with production of anthraquinones, UV-protectant secondary metabolites present in various fungi but especially abundant in a diverse order of lichens, the Teloschistales (class Lecanoromycetes, phylum Ascomycota). We successfully assembled 24 new, high-quality lichenized-fungal genomes de novo and combined them with publicly available Lecanoromycetes genomes from taxa with diverse secondary chemistry to produce a whole-genome tree. Secondary metabolite biosynthetic gene cluster (BGC) analysis showed that whilst lichen BGCs are numerous and highly dissimilar, core enzyme genes are generally conserved across taxa. This suggests that metabolite diversification occurs via re-shuffling of existing enzyme genes with novel accessory genes rather than through BGC gains/losses or de novo gene evolution. We identified putative anthraquinone BGCs in our lichen dataset that appear homologous to anthraquinone clusters from non-lichenized fungi, suggesting these genes were present in the common ancestor of the subphylum Pezizomycotina. Finally, we identified unique transporter genes in Teloschistales anthraquinone BGCs that may explain why these metabolites are so abundant and ubiquitous in these lichens. Our results support the importance of metagenomics for understanding the secondary metabolism of non-model fungi such as lichens.
|
18
|
The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics. Gigascience 2023; 12:giad022. [PMID: 37036103 PMCID: PMC10084500 DOI: 10.1093/gigascience/giad022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/24/2022] [Revised: 01/29/2023] [Accepted: 03/14/2023] [Indexed: 04/11/2023] Open
Abstract
BACKGROUND Microbial culture collections play a key role in taxonomy by studying the diversity of their strains and providing well-characterized biological material to the scientific community for fundamental and applied research. These microbial resource centers thus need to implement new standards in species delineation, including whole-genome sequencing and phylogenomics. In this context, the genomic needs of the Belgian Coordinated Collections of Microorganisms were studied, resulting in the GEN-ERA toolbox. The latter is a unified cluster of bioinformatic workflows dedicated to both bacteria and small eukaryotes (e.g., yeasts). FINDINGS This public toolbox allows researchers without specific training in bioinformatics to perform robust phylogenomic analyses. It facilitates all steps from genome downloading and quality assessment, including genomic contamination estimation, to tree reconstruction. It also offers workflows for average nucleotide identity comparisons and metabolic modeling. TECHNICAL DETAILS Nextflow workflows are launched by a single command and are available on the GEN-ERA GitHub repository (https://github.com/Lcornet/GENERA). All the workflows are based on Singularity containers to increase reproducibility. TESTING The toolbox was developed for a diversity of microorganisms, including bacteria and fungi. It was further tested on an empirical dataset of 18 (meta)genomes of early branching Cyanobacteria, providing the most up-to-date phylogenomic analysis of the Gloeobacterales order, the first group to diverge in the evolutionary tree of Cyanobacteria. CONCLUSION The GEN-ERA toolbox can be used to infer completely reproducible comparative genomic and metabolic analyses on prokaryotes and small eukaryotes. Although designed for routine bioinformatics of culture collections, it can also be used by all researchers interested in microbial taxonomy, as exemplified by our case study on Gloeobacterales.
|
19
|
Genome-level analyses resolve an ancient lineage of symbiotic ascomycetes. Curr Biol 2022; 32:5209-5218.e5. [PMID: 36423639 DOI: 10.1016/j.cub.2022.11.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/11/2022] [Revised: 08/30/2022] [Accepted: 11/07/2022] [Indexed: 11/24/2022]
Abstract
Ascomycota account for about two-thirds of named fungal species [1]. Over 98% of known Ascomycota belong to the Pezizomycotina, including many economically important species as well as diverse pathogens, decomposers, and mutualistic symbionts [2]. Our understanding of Pezizomycotina evolution has until now been based on sampling traditionally well-defined taxonomic classes [3,4,5]. However, considerable diversity exists in undersampled and uncultured, putatively early-diverging lineages, and the effect of these on evolutionary models has seldom been tested. We obtained genomes from 30 putative early-diverging lineages not included in recent phylogenomic analyses and analyzed these together with 451 genomes covering all available ascomycete genera. We show that 22 of these lineages, collectively representing over 600 species, trace back to a single origin that diverged from the common ancestor of Eurotiomycetes and Lecanoromycetes over 300 million years BP. The new clade, which we recognize as a more broadly defined Lichinomycetes, includes lichen and insect symbionts, endophytes, and putative mycorrhizae and encompasses a range of morphologies so disparate that they have recently been placed in six different taxonomic classes. To test for shared hidden features within this group, we analyzed genome content and compared gene repertoires to related groups in Ascomycota. Regardless of their lifestyle, Lichinomycetes have smaller genomes than most filamentous Ascomycota, with reduced arsenals of carbohydrate-degrading enzymes and secondary metabolite gene clusters. Our expanded genome sample resolves the relationships of numerous "orphan" ascomycetes and establishes the independent evolutionary origins of multiple mutualistic lifestyles within a single, morphologically hyperdiverse clade of fungi.
|
20
|
Bioprospecting uncultivable microbial diversity in tannery effluent contaminated soil using shotgun sequencing and bio-reduction of chromium by indigenous chromate reductase genes. Environmental Research 2022; 215:114338. [PMID: 36116499 DOI: 10.1016/j.envres.2022.114338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/21/2022] [Revised: 09/06/2022] [Accepted: 09/10/2022] [Indexed: 06/15/2023]
Abstract
The tannery industry poses a considerable threat to the environment by producing large amounts of waste containing potentially toxic metals. Bioremediation is a promising approach for treating potentially toxic metals, but the remediation efficiency of microbes is one of the factors limiting its application in tannery waste treatment. The motivation behind the present work was to explore the microbial diversity and chromate reductase genes present in tannery effluent-contaminated soil using a metagenomics approach. The use of shotgun sequencing enabled the identification of operational parameters that influence microbiome composition and its ability to reduce chromium (Cr) concentration. The Cr concentration in the Kanpur tannery effluent-contaminated soil sample was 700 ppm, many-fold higher than the permissible limit of 100 ppm approved by the World Health Organisation (WHO). Metagenomic deoxyribonucleic acid (DNA) was extracted to explore taxonomic community structure, phylogenetic linkages, and functional profile. A total of 45,163,604 high-quality filtered reads were obtained, with a guanine-cytosine (GC) content of 54%. Bacteria (83%), Archaebacteria (14%), and Viruses (3%) were discovered in the structural biodiversity. Bacteria were classified to phylum level, with Proteobacteria (52%) being the dominant population, followed by Bacteroidetes (15%), Chloroflexi (15%), Spirochaetes (7%), Thermotogae (5%), Actinobacteria (4%), and Firmicutes (1%). The OXR genes were cloned and checked for their efficiency in reducing Cr concentration. In situ validation of the OXR8 gene showed a reduction of Cr concentration from 700 ppm to 24 ppm in 72 h (96.51% reduction). The results of this study suggest that there is a huge reservoir of microbes and chromate reductase genes that remains unexplored.
|
21
|
Pan-cancer analyses reveal cancer-type-specific fungal ecologies and bacteriome interactions. Cell 2022; 185:3789-3806.e17. [PMID: 36179670 PMCID: PMC9567272 DOI: 10.1016/j.cell.2022.09.005] [Citation(s) in RCA: 143] [Impact Index Per Article: 71.5] [Received: 12/13/2021] [Revised: 05/13/2022] [Accepted: 08/31/2022] [Indexed: 01/26/2023]
Abstract
Cancer-microbe associations have been explored for centuries, but cancer-associated fungi have rarely been examined. Here, we comprehensively characterize the cancer mycobiome within 17,401 patient tissue, blood, and plasma samples across 35 cancer types in four independent cohorts. We report fungal DNA and cells at low abundances across many major human cancers, with community compositions that differ among cancer types, even when accounting for technical background. Fungal histological staining of tissue microarrays supported intratumoral presence and frequent spatial association with cancer cells and macrophages. Comparing intratumoral fungal communities with matched bacteriomes and immunomes revealed co-occurring bi-domain ecologies, often with permissive, rather than competitive, microenvironments and distinct immune responses. Clinically focused assessments suggested prognostic and diagnostic capacities of the tissue and plasma mycobiomes, even in stage I cancers, and synergistic predictive performance with bacteriomes.
|
22
|
Marine DNA methylation patterns are associated with microbial community composition and inform virus-host dynamics. Microbiome 2022; 10:157. [PMID: 36167684 PMCID: PMC9516812 DOI: 10.1186/s40168-022-01340-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/22/2022] [Accepted: 07/28/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND DNA methylation in prokaryotes is involved in many different cellular processes including cell cycle regulation and defense against viruses. To date, most prokaryotic methylation systems have been studied in culturable microorganisms, resulting in a limited understanding of DNA methylation from a microbial ecology perspective. Here, we analyze the distribution patterns of several microbial epigenetic marks in the ocean microbiome through genome-centric metagenomics across all domains of life. RESULTS We reconstructed 15,056 viral, 252 prokaryotic, 56 giant viral, and 6 eukaryotic metagenome-assembled genomes from northwest Pacific Ocean seawater samples using short- and long-read sequencing approaches. These metagenome-derived genomes mostly represented novel taxa and recruited a majority of reads. Thanks to single-molecule real-time (SMRT) sequencing technology, base modifications could also be detected for these genomes. This showed that DNA methylation can readily be detected across dominant oceanic bacterial, archaeal, and viral populations, and that microbial epigenetic changes correlate with population differentiation. Furthermore, our genome-wide epigenetic analysis of Pelagibacter suggests that GANTC, a DNA methyltransferase target motif, is related to the cell cycle and is affected by environmental conditions. Yet, the presence of this motif also partitions the phylogeny of the Pelagibacter phages, possibly hinting at a competitive co-evolutionary history and multiple effects of a single methylation mark. CONCLUSIONS Overall, this study shows that DNA methylation patterns are associated with ecological changes and virus-host dynamics in the ocean microbiome.
|
23
|
Metagenome-assembled genomes of phytoplankton microbiomes from the Arctic and Atlantic Oceans. Microbiome 2022; 10:67. [PMID: 35484634 PMCID: PMC9047304 DOI: 10.1186/s40168-022-01254-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/15/2021] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Phytoplankton communities significantly contribute to global biogeochemical cycles of elements and underpin marine food webs. Although their uncultured genomic diversity has been estimated by planetary-scale metagenome sequencing and subsequent reconstruction of metagenome-assembled genomes (MAGs), this approach has yet to be applied to complex phytoplankton microbiomes from polar and non-polar oceans consisting of microbial eukaryotes and their associated prokaryotes. RESULTS Here, we have assembled MAGs from chlorophyll a maximum layers in the surface of the Arctic and Atlantic Oceans enriched for species associations (microbiomes) with a focus on pico- and nanophytoplankton and their associated heterotrophic prokaryotes. From 679 Gbp and an estimated 50 million genes in total, we recovered 143 MAGs of medium to high quality. Although there was a strict demarcation between Arctic and Atlantic MAGs, adjacent sampling stations in each ocean had 51-88% of MAGs in common, with most species associations between Prasinophytes and Proteobacteria. Phylogenetic placement revealed eukaryotic MAGs to be more diverse in the Arctic whereas prokaryotic MAGs were more diverse in the Atlantic Ocean. Approximately 70% of protein families were shared between Arctic and Atlantic MAGs for both prokaryotes and eukaryotes. However, eukaryotic MAGs had more protein families unique to the Arctic whereas prokaryotic MAGs had more families unique to the Atlantic. CONCLUSION Our study provides a genomic context to complex phytoplankton microbiomes to reveal that their community structure was likely driven by significant differences in environmental conditions between the polar Arctic and warm surface waters of the tropical and subtropical Atlantic Ocean.
|
24
|
Functional repertoire convergence of distantly related eukaryotic plankton lineages abundant in the sunlit ocean. Cell Genomics 2022; 2:100123. [PMID: 36778897 PMCID: PMC9903769 DOI: 10.1016/j.xgen.2022.100123] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 02/16/2021] [Revised: 12/10/2021] [Accepted: 04/04/2022] [Indexed: 12/20/2022]
Abstract
Marine planktonic eukaryotes play critical roles in global biogeochemical cycles and climate. However, their poor representation in culture collections limits our understanding of the evolutionary history and genomic underpinnings of planktonic ecosystems. Here, we used 280 billion Tara Oceans metagenomic reads from polar, temperate, and tropical sunlit oceans to reconstruct and manually curate more than 700 abundant and widespread eukaryotic environmental genomes ranging from 10 Mbp to 1.3 Gbp. This genomic resource covers a wide range of poorly characterized eukaryotic lineages that complement long-standing contributions from culture collections while better representing plankton in the upper layer of the oceans. We performed the first, to our knowledge, comprehensive genome-wide functional classification of abundant unicellular eukaryotic plankton, revealing four major groups connecting distantly related lineages. Neither trophic modes of plankton nor its vertical evolutionary history could completely explain the functional repertoire convergence of major eukaryotic lineages that coexisted within oceanic currents for millions of years.
|
25
|
Genomic and metabolic adaptations of biofilms to ecological windows of opportunity in glacier-fed streams. Nat Commun 2022; 13:2168. [PMID: 35444202 PMCID: PMC9021309 DOI: 10.1038/s41467-022-29914-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/08/2021] [Accepted: 04/07/2022] [Indexed: 11/09/2022] Open
Abstract
In glacier-fed streams, ecological windows of opportunity allow complex microbial biofilms to develop and transiently form the basis of the food web, thereby controlling key ecosystem processes. Using metagenome-assembled genomes, we unravel strategies that allow biofilms to seize this opportunity in an ecosystem otherwise characterized by harsh environmental conditions. We observe a diverse microbiome spanning the entire tree of life including a rich virome. Various co-existing energy acquisition pathways point to diverse niches and the exploitation of available resources, likely fostering the establishment of complex biofilms during windows of opportunity. The wide occurrence of rhodopsins, besides chlorophyll, highlights the role of solar energy capture in these biofilms while internal carbon and nutrient cycling between photoautotrophs and heterotrophs may help overcome constraints imposed by oligotrophy in these habitats. Mechanisms potentially protecting bacteria against low temperatures and high UV-radiation are also revealed and the selective pressure of this environment is further highlighted by a phylogenomic analysis differentiating important components of the glacier-fed stream microbiome from other ecosystems. Our findings reveal key genomic underpinnings of adaptive traits contributing to the success of complex biofilms to exploit environmental opportunities in glacier-fed streams, which are now rapidly changing owing to global warming.
|
26
|
Merrill BD, Carter MM, Olm MR, Dahan D, Tripathi S, Spencer SP, Yu B, Jain S, Neff N, Jha AR, Sonnenburg ED, Sonnenburg JL. Ultra-deep Sequencing of Hadza Hunter-Gatherers Recovers Vanishing Microbes. [PMID: 36238714 PMCID: PMC9558438 DOI: 10.1101/2022.03.30.486478] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 01/03/2023]
Abstract
The gut microbiome is a key modulator of immune and metabolic health. Human microbiome data is biased towards industrialized populations, providing limited understanding of the distinct and diverse non-industrialized microbiomes. Here, we performed ultra-deep metagenomic sequencing and strain cultivation on 351 fecal samples from the Hadza, hunter-gatherers in Tanzania, and comparative populations in Nepal and California. We recover 94,971 total genomes of bacteria, archaea, bacteriophages, and eukaryotes, 43% of which are absent from existing unified datasets. Analysis of in situ growth rates, genetic pN/pS signatures, high-resolution strain tracking, and 124 gut-resident species vanishing in industrialized populations reveals differentiating dynamics of the Hadza gut microbiome. Industrialized gut microbes are enriched in genes associated with oxidative stress, possibly a result of microbiome adaptation to inflammatory processes. This unparalleled view of the Hadza gut microbiome provides a valuable resource that expands our understanding of microbes capable of colonizing the human gut and clarifies the extensive perturbation brought on by the industrialized lifestyle.
|
27
|
Abstract
Ciliated protists are among the oldest unicellular organisms with a heterotrophic lifestyle and share a common ancestor with Plantae. Unlike any other eukaryotes, there are two distinct nuclei in ciliates with separate germline and somatic cell functions. Here, we assembled a near-complete macronuclear genome of Fabrea salina, which belongs to one of the oldest clades of ciliates. Its extremely minimized genome (18.35 Mb) is the smallest among all free-living heterotrophic eukaryotes and exhibits typical streamlined genomic features, including high gene density, tiny introns, and shrinkage of gene paralogs. Gene families involved in hypersaline stress resistance, DNA replication proteins, and mitochondrial biogenesis are expanded, and the accumulation of phosphatidic acid may play an important role in resistance to high osmotic pressure. We further investigated the morphological and transcriptomic changes in the macronucleus during sexual reproduction and highlighted the potential contribution of macronuclear residuals to this process. We believe that the minimized genome generated in this study provides novel insights into the genome streamlining theory and will be an ideal model to study the evolution of eukaryotic heterotrophs.
|
28
|
Abstract
The decreasing cost of sequencing and concomitant augmentation of publicly available genomes have created an acute need for automated software to assess genomic contamination. During the last 6 years, 18 programs have been published, each with its own strengths and weaknesses. Deciding which tools to use becomes more and more difficult without an understanding of the underlying algorithms. We review these programs, benchmarking six of them, and present their main operating principles. This article is intended to guide researchers in the selection of appropriate tools for specific applications. Finally, we present future challenges in the developing field of contamination detection.
|
29
|
The Neglected Gut Microbiome: Fungi, Protozoa, and Bacteriophages in Inflammatory Bowel Disease. Inflamm Bowel Dis 2022; 28:1112-1122. [PMID: 35092426 PMCID: PMC9247841 DOI: 10.1093/ibd/izab343] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/04/2021] [Indexed: 12/14/2022]
Abstract
The gut microbiome has been implicated in the pathogenesis of inflammatory bowel disease (IBD). Studies suggest that the IBD gut microbiome is less diverse than that of the unaffected population, a phenomenon often referred to as dysbiosis. However, these studies have heavily focused on bacteria, while other intestinal microorganisms (fungi, protozoa, and bacteriophages) have been neglected. Of the nonbacterial microbes that have been studied in relation to IBD, most are thought to be pathogens, although there is evidence that some of these species may instead be harmless commensals. In this review, we discuss the nonbacterial gut microbiome of IBD, highlighting the current biases, limitations, and outstanding questions that can be addressed with high-throughput DNA sequencing methods. Further, we highlight the importance of studying nonbacterial microorganisms alongside bacteria for a comprehensive view of the whole IBD biome and to provide a more precise definition of dysbiosis in patients. With the rise in popularity of microbiome-altering therapies for the treatment of IBD, such as fecal microbiota transplantation, it is important that we address these knowledge gaps to ensure safe and effective treatment of patients.
|
30
|
Integrating cultivation and metagenomics for a multi-kingdom view of skin microbiome diversity and functions. Nat Microbiol 2022; 7:169-179. [PMID: 34952941 PMCID: PMC8732310 DOI: 10.1038/s41564-021-01011-w] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 10/16/2020] [Accepted: 10/28/2021] [Indexed: 12/23/2022]
Abstract
Human skin functions as a physical barrier to foreign pathogen invasion and houses numerous commensals. Shifts in the human skin microbiome have been associated with conditions ranging from acne to atopic dermatitis. Previous metagenomic investigations into the role of the skin microbiome in health or disease have found that much of the sequenced data do not match reference genomes, making it difficult to interpret metagenomic datasets. We combined bacterial cultivation and metagenomic sequencing to assemble the Skin Microbial Genome Collection (SMGC), which comprises 622 prokaryotic species derived from 7,535 metagenome-assembled genomes and 251 isolate genomes. The metagenomic datasets that we generated were combined with publicly available skin metagenomic datasets to identify members and functions of the human skin microbiome. The SMGC collection includes 174 newly identified bacterial species and 12 newly identified bacterial genera, including the abundant genus 'Candidatus Pellibacterium', which has been newly associated with the skin. The SMGC increases the characterized set of known skin bacteria by 26%. We validated the SMGC metagenome-assembled genomes by comparing them with sequenced isolates obtained from the same samples. We also recovered 12 eukaryotic species and assembled thousands of viral sequences, including newly identified clades of jumbo phages. The SMGC enables classification of a median of 85% of skin metagenomic sequences and provides a comprehensive view of skin microbiome diversity, derived primarily from samples obtained in North America.
|
31
|
PANTHER: Making genome-scale phylogenetics accessible to all. Protein Sci 2022; 31:8-22. [PMID: 34717010 PMCID: PMC8740835 DOI: 10.1002/pro.4218] [Citation(s) in RCA: 372] [Impact Index Per Article: 186.0] [Received: 08/26/2021] [Revised: 10/24/2021] [Accepted: 10/26/2021] [Indexed: 02/03/2023]
Abstract
Phylogenetics is a powerful tool for analyzing protein sequences by inferring their evolutionary relationships to other proteins. However, phylogenetic analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
|
32
|
The hidden genomic diversity of ciliated protists revealed by single-cell genome sequencing. BMC Biol 2021; 19:264. [PMID: 34903227 PMCID: PMC8670190 DOI: 10.1186/s12915-021-01202-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2021] [Accepted: 12/01/2021] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND Ciliated protists are a widely distributed, morphologically diverse, and genetically heterogeneous group of unicellular organisms, usually known for containing two types of nuclei: a transcribed polyploid macronucleus involved in gene expression and a silent diploid micronucleus responsible for transmission of genetic material during sexual reproduction and generation of the macronucleus. Although studies in a few species of culturable ciliated protists have revealed the highly dynamic nature of replicative and recombination events relating the micronucleus to the macronucleus, the broader understanding of the genomic diversity of ciliated protists, as well as their phylogenetic relationships and metabolic potential, has been hampered by the inability to culture numerous other species under laboratory conditions, as well as the presence of symbiotic bacteria and microalgae which provide a challenge for current sequencing technologies. Here, we optimized single-cell sequencing methods and associated data analyses, to effectively remove contamination by commensal bacteria, and generated high-quality genomes for a number of Euplotia species. RESULTS We obtained eight high-quality Euplotia genomes by using single-cell genome sequencing techniques. The genomes have high genomic completeness, with sizes between 68 and 125 M and gene numbers between 14K and 25K. Through comparative genomic analysis, we found that there are a large number of gene expansion events in Euplotia genomes, and these expansions are closely related to the phenotypic evolution and specific environmental adaptations of individual species. We further found four distinct subgroups in the genus Euplotes, which exhibited considerable genetic distance and relative lack of conserved genomic syntenies. Comparative genomic analyses of Uronychia and its relatives revealed significant gene expansion associated with the ciliary movement machinery, which may be related to the unique and strong swimming ability. CONCLUSIONS We employed single-cell genomics to obtain eight ciliate genomes, characterized the underestimated genomic diversity of Euplotia, and determined the divergence time of representative species in this subclass for the first time. We also further investigated the extensive duplication events associated with speciation and environmental adaptation. This study provides a unique and valuable resource for understanding the evolutionary history and genetic diversity of ciliates.
|
33
|
metaGEM: reconstruction of genome scale metabolic models directly from metagenomes. Nucleic Acids Res 2021; 49:e126. [PMID: 34614189 PMCID: PMC8643649 DOI: 10.1093/nar/gkab815] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 01/04/2021] [Revised: 08/05/2021] [Accepted: 09/28/2021] [Indexed: 01/11/2023] Open
Abstract
Metagenomic analyses of microbial communities have revealed a large degree of interspecies and intraspecies genetic diversity through the reconstruction of metagenome assembled genomes (MAGs). Yet, metabolic modeling efforts mainly rely on reference genomes as the starting point for reconstruction and simulation of genome scale metabolic models (GEMs), neglecting the immense intra- and inter-species diversity present in microbial communities. Here, we present metaGEM (https://github.com/franciscozorrilla/metaGEM), an end-to-end pipeline enabling metabolic modeling of multi-species communities directly from metagenomes. The pipeline automates all steps from the extraction of context-specific prokaryotic GEMs from MAGs to community level flux balance analysis (FBA) simulations. To demonstrate the capabilities of metaGEM, we analyzed 483 samples spanning lab culture, human gut, plant-associated, soil, and ocean metagenomes, reconstructing over 14,000 GEMs. We show that GEMs reconstructed from metagenomes have fully represented metabolism comparable to isolated genomes. We demonstrate that metagenomic GEMs capture intraspecies metabolic diversity and identify potential differences in the progression of type 2 diabetes at the level of gut bacterial metabolic exchanges. Overall, metaGEM enables FBA-ready metabolic model reconstruction directly from metagenomes, provides a resource of metabolic models, and showcases community-level modeling of microbiomes associated with disease conditions allowing generation of mechanistic hypotheses.
|
34
|
A high-quality fungal genome assembly resolved from a sample accidentally contaminated by multiple taxa. Biotechniques 2021; 72:39-50. [PMID: 34846173 DOI: 10.2144/btn-2021-0097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
Contamination in sequenced genomes is a relatively common problem and several methods to remove non-target sequences have been devised. Typically, the target and contaminating organisms reside in different kingdoms, simplifying their separation. The authors present the case of a genome for the ascomycete fungus Teratosphaeria eucalypti, contaminated by another ascomycete fungus and a bacterium. Approaching the problem as a low-complexity metagenomics project, the authors used two available software programs, BlobToolKit and anvi'o, to filter the contaminated genome. Both the de novo and reference-assisted approaches yielded a high-quality draft genome assembly for the target fungus. Incorporating reference sequences increased assembly completeness and visualization elucidated previously unknown genome features. The authors suggest that visualization should be routine in any sequencing project, regardless of suspected contamination.
|
35
|
Increased microbial expression of organic nitrogen cycling genes in long-term warmed grassland soils. ISME COMMUNICATIONS 2021; 1:69. [PMID: 36759732 PMCID: PMC9723740 DOI: 10.1038/s43705-021-00073-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/26/2021] [Revised: 10/26/2021] [Accepted: 11/05/2021] [Indexed: 11/08/2022]
Abstract
Global warming increases soil temperatures and promotes faster growth and turnover of soil microbial communities. As microbial cell walls contain a high proportion of organic nitrogen, a higher turnover rate of microbes should also be reflected in accelerated organic nitrogen cycling in soil. We used a combined metatranscriptomics and metagenomics approach to demonstrate that the relative transcription level of genes encoding enzymes involved in the extracellular depolymerization of high-molecular-weight organic nitrogen was higher in medium-term (8 years) and long-term (>50 years) warmed soils than in ambient soils. This was mainly driven by increased levels of transcripts coding for enzymes involved in the degradation of microbial cell walls and proteins. Additionally, higher transcription levels for chitin, nucleic acid, and peptidoglycan degrading enzymes were found in long-term warmed soils. We conclude that an acceleration in microbial turnover under warming is coupled to higher investments in N acquisition enzymes, particularly those involved in the breakdown and recycling of microbial residues, in comparison with ambient conditions.
|
36
|
Abstract
The medically relevant Trichophyton rubrum species complex has a variety of phenotypic presentations but shows relatively few genetic differences. Conventional barcodes, such as the internal transcribed spacer (ITS) region or the beta-tubulin gene, cannot completely resolve the relationships between these closely related taxa. T. rubrum, T. soudanense and T. violaceum are currently accepted as separate species. However, the status of certain variants, including the T. rubrum morphotypes megninii and kuryangei and the T. violaceum morphotype yaoundei, remains to be resolved. We conducted the first phylogenomic analysis of the T. rubrum species complex by studying 3105 core genes of 18 new strains from the BCCM/IHEM culture collection and nine publicly available genomes. Our analyses revealed a highly resolved phylogenomic tree with six separate clades. Trichophyton rubrum, T. violaceum and T. soudanense were confirmed as species. The morphotypes T. megninii, T. kuryangei and T. yaoundei each grouped in their own clade with high support, suggesting that these morphotypes should be reinstated at the species level. Robinson-Foulds distance analyses showed that a combination of two markers (a ubiquitin-protein transferase and a MYB DNA-binding domain-containing protein) can mirror the phylogeny obtained using genomic data, and thus represent potential new markers to accurately distinguish the species belonging to the T. rubrum complex.
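The Robinson-Foulds comparison mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not describe. It assumes unrooted trees written as nested tuples of leaf names (a hypothetical encoding chosen for illustration, with placeholder labels A to E) and counts the non-trivial bipartitions that occur in only one of the two trees.

```python
def leaf_sets(tree, clades):
    """Return the leaves under `tree`; append every internal clade to `clades`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, clades) for child in tree))
    clades.append(leaves)
    return leaves


def splits(tree):
    """Non-trivial bipartitions, each stored as the side lacking a reference leaf."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    reference = min(all_leaves)  # any fixed leaf works; it makes each split canonical
    result = set()
    for clade in clades:
        side = clade if reference not in clade else all_leaves - clade
        # Skip trivial splits (a single leaf versus the rest) and the root clade.
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def robinson_foulds(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Placeholder trees: a "genome" tree and a "marker" tree over the same five leaves.
genome_tree = ((("A", "B"), "C"), ("D", "E"))
marker_tree = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(genome_tree, marker_tree))
```

A distance of zero means the two trees imply the same set of bipartitions, which is the sense in which a marker tree can "mirror" a genome-based tree.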
|
37
|
MOSGA 2: Comparative genomics and validation tools. Comput Struct Biotechnol J 2021; 19:5504-5509. [PMID: 34712396 PMCID: PMC8517542 DOI: 10.1016/j.csbj.2021.09.024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/10/2021] [Revised: 09/23/2021] [Accepted: 09/24/2021] [Indexed: 01/06/2023] Open
Abstract
Due to the rapidly growing amount of available genomic information, the need for accessible and easy-to-use analysis tools is increasing. To facilitate eukaryotic genome annotations, we created MOSGA. In this work, we show how MOSGA 2 extends it with several advanced analyses for genomic data. Since the quality of genomic data greatly impacts annotation quality, we included multiple tools to validate user-submitted genome assemblies and help ensure their quality. Moreover, thanks to the integration of comparative genomics methods, users can benefit from a broader genomic view by analyzing multiple genomic data sets simultaneously. Further, we demonstrate the new functionalities of MOSGA 2 through different use cases and practical examples. MOSGA 2 extends the already established application to quality control of genomic data, and integrates and analyzes multiple genomes in a larger context, e.g., by phylogenetics.
|
38
|
Genome-resolved metagenomics using environmental and clinical samples. Brief Bioinform 2021; 22:bbab030. [PMID: 33758906 PMCID: PMC8425419 DOI: 10.1093/bib/bbab030] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2020] [Revised: 11/29/2020] [Accepted: 01/20/2021] [Indexed: 12/25/2022] Open
Abstract
Recent advances in high-throughput sequencing technologies and computational methods have added a new dimension to metagenomic data analysis, i.e., genome-resolved metagenomics. In general terms, it refers to the recovery of draft or high-quality microbial genomes and their taxonomic classification and functional annotation. In recent years, several studies have used the genome-resolved metagenome analysis approach and identified previously unknown microbial species from human and environmental metagenomes. In this review, we describe genome-resolved metagenome analysis as a series of four necessary steps: (i) preprocessing of the sequencing reads, (ii) de novo metagenome assembly, (iii) genome binning and (iv) taxonomic and functional analysis of the recovered genomes. For each of these four steps, we discuss the most commonly used tools and the currently available pipelines to guide the scientific community in the recovery and subsequent analyses of genomes from any metagenome sample. Furthermore, we discuss the tools required for validating assembly quality as well as for improving the quality of the recovered genomes. We also highlight the currently available pipelines that can be used to automate the whole analysis without advanced bioinformatics knowledge. Finally, we highlight the most widely adopted and actively maintained tools and pipelines, which can help the scientific community make decisions before commencing the analysis.
|
39
|
BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol 2021; 38:4647-4654. [PMID: 34320186 PMCID: PMC8476166 DOI: 10.1093/molbev/msab199] [Citation(s) in RCA: 1344] [Impact Index Per Article: 448.0] [Indexed: 12/18/2022] Open
Abstract
Methods for evaluating the quality of genomic and metagenomic data are essential to aid genome assembly and to correctly interpret the results of subsequent analyses. BUSCO estimates the completeness and redundancy of processed genomic data based on universal single-copy orthologs. Here we present new functionalities and major improvements of the BUSCO software, as well as the renewal and expansion of the underlying datasets in sync with the OrthoDB v10 release. Among the major novelties, BUSCO now enables phylogenetic placement of the input sequence to automatically select the most appropriate dataset for the assessment, allowing the analysis of metagenome-assembled genomes of unknown origin. A newly introduced genome workflow improves efficiency and runtimes, especially on large eukaryotic genomes. BUSCO is the only tool capable of assessing both eukaryotic and prokaryotic species, and can be applied to various data types, from genome assemblies and metagenomic bins, to transcriptomes and gene sets.
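The idea of scoring completeness and redundancy from single-copy orthologs can be sketched in a few lines of Python. This is a simplified tally, not BUSCO's code: the real tool finds and scores matches itself, whereas the sketch assumes each match has already been labelled "complete" or "fragmented". The marker names and the function names `tally_markers` and `summary_line` are hypothetical.

```python
from collections import Counter


def tally_markers(expected, matches):
    """Classify each expected marker as S, D, F or M.

    expected: iterable of marker IDs (hypothetical names).
    matches:  dict mapping marker ID -> list of labels, one per match,
              each "complete" or "fragmented" (assumed to be pre-computed).
    """
    tally = Counter({"S": 0, "D": 0, "F": 0, "M": 0})
    for marker in expected:
        labels = matches.get(marker, [])
        n_complete = labels.count("complete")
        if n_complete == 1:
            tally["S"] += 1   # complete and single-copy
        elif n_complete > 1:
            tally["D"] += 1   # complete but duplicated (redundancy)
        elif labels:
            tally["F"] += 1   # only fragmented matches
        else:
            tally["M"] += 1   # no match at all
    return tally


def summary_line(tally):
    """Format percentages in the C[S,D],F,M,n style used in BUSCO summaries."""
    n = sum(tally.values())
    if n == 0:
        raise ValueError("no markers were assessed")
    pct = {key: 100.0 * value / n for key, value in tally.items()}
    return "C:{:.1f}%[S:{:.1f}%,D:{:.1f}%],F:{:.1f}%,M:{:.1f}%,n:{}".format(
        pct["S"] + pct["D"], pct["S"], pct["D"], pct["F"], pct["M"], n)


markers = ["m1", "m2", "m3", "m4"]
found = {"m1": ["complete"], "m2": ["complete", "complete"], "m3": ["fragmented"]}
print(summary_line(tally_markers(markers, found)))
```

Completeness here is the share of markers found complete (single-copy plus duplicated), and redundancy is the duplicated share.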
|
40
|
Metagenome-Assembled Genomes Contribute to Unraveling of the Microbiome of Cocoa Fermentation. Appl Environ Microbiol 2021; 87:e0058421. [PMID: 34105982 DOI: 10.1128/aem.00584-21] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/11/2022] Open
Abstract
Metagenomic studies of cocoa fermentation have mainly reported on the analysis of short reads for determination of operational taxonomic units. However, it is also important to determine metagenome-assembled genomes (MAGs), which are genomes derived from the assembly of metagenomic data. For this research, all the cocoa metagenomes from public databases were downloaded, resulting in five data sets: one from Ghana and four from Brazil. In addition, in silico approaches were used to describe putative phenotypes and the metabolic potential of MAGs. A total of 17 high-quality MAGs were recovered from these microbiomes, as follows: (i) for fungi, Yamadazyma tenuis (n = 1); (ii) lactic acid bacteria, Limosilactobacillus fermentum (n = 5), Liquorilactobacillus cacaonum (n = 1), Liquorilactobacillus nagelii (n = 1), Leuconostoc pseudomesenteroides (n = 1), and Lactiplantibacillus plantarum subsp. plantarum (n = 1); (iii) acetic acid bacteria, Acetobacter senegalensis (n = 2) and Kozakia baliensis (n = 1); and (iv) Bacillus subtilis (n = 1), Brevundimonas sp. (n = 2), and Pseudomonas sp. (n = 1). Medium-quality MAGs were also recovered from cocoa microbiomes, including some that, to our knowledge, were not previously detected in this environment (Liquorilactobacillus vini, Komagataeibacter saccharivorans, and Komagataeibacter maltaceti) and others previously described (Fructobacillus pseudoficulneus and Acetobacter pasteurianus). Taken together, the MAGs were useful for providing an additional description of the microbiome of cocoa fermentation, revealing previously overlooked microorganisms, with prediction of key phenotypes and biochemical pathways.
IMPORTANCE The production of chocolate starts with the harvesting of cocoa fruits and the spontaneous fermentation of the seeds in a microbial succession that depends on yeasts, lactic acid bacteria, and acetic acid bacteria in order to eliminate bitter and astringent compounds present in the raw material, which is then roasted and ground to produce the cocoa powder that enters the food processing industry. The microbiota of cocoa fermentation is not completely known, although its study has advanced from culture-based methods to next-generation DNA sequencing, which generates a myriad of data that need bioinformatic approaches to be properly analyzed. Although the majority of metagenomic studies have been based on short reads (operational taxonomic units), it is also important to analyze entire genomes to determine more precisely the possible ecological roles of different species. Metagenome-assembled genomes (MAGs) are very useful for this purpose; here, MAGs from cocoa fermentation microbiomes are described, and the possible implications of their phenotypic and metabolic potentials are discussed.
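The distinction between high-quality and medium-quality MAGs is usually drawn from estimated completeness and contamination. The Python sketch below shows one way such a tiering could be expressed. The cutoffs (above 90% complete with under 5% contamination for "high", at least 50% complete with under 10% contamination for "medium") are an assumption based on commonly used MIMAG-style conventions, since the abstract does not state the authors' criteria. Additional requirements sometimes applied to high-quality drafts, such as rRNA and tRNA gene content, are left out. The bin names and values are invented.

```python
def mag_quality(completeness, contamination):
    """Return "high", "medium" or "low" for a bin.

    completeness, contamination: percentages estimated by a quality tool.
    Cutoffs are assumed MIMAG-style values, not taken from the abstract.
    """
    if not 0 <= completeness <= 100 or contamination < 0:
        raise ValueError("completeness must be 0-100 and contamination >= 0")
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"


# Invented example bins: (completeness %, contamination %).
bins = {
    "bin_01": (97.2, 1.1),
    "bin_02": (71.5, 4.8),
    "bin_03": (38.0, 2.0),
    "bin_04": (95.0, 12.3),
}
tiers = {name: mag_quality(*values) for name, values in bins.items()}
print(tiers)
```

Note that a nearly complete but heavily contaminated bin (bin_04 here) falls into the lowest tier under these assumed cutoffs.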
|
41
|
Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data. Nat Protoc 2021; 16:2520-2541. [PMID: 33864056 DOI: 10.1038/s41596-021-00508-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/24/2020] [Accepted: 01/12/2021] [Indexed: 02/02/2023]
Abstract
Recovering genomes from shotgun metagenomic sequence data allows detailed taxonomic and functional characterization of individual species or strains in a microbial community. Retrieving these metagenome-assembled genomes (MAGs) involves seven stages. First, low-quality bases, along with adapter and host sequences, are removed. Second, overlapping sequences are assembled to create longer contiguous fragments. Third, these fragments are clustered based on sequence composition and abundance. Fourth, these sequence clusters, or bins, undergo rounds of quality assessment and refinement to yield MAGs. The optional fifth stage is dereplication of MAGs to select representatives. Next, each MAG is taxonomically classified. The optional seventh stage is assessing the fraction of diversity that has been recovered. The output of this protocol is draft genomes, which can provide invaluable clues about uncultured organisms. This protocol takes ~1 week to run, depending on computational resources available, and requires prior experience with high-performance computing, shell script programming and Python.
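The third stage, clustering fragments by sequence composition, relies on a numeric composition profile for each contig. A common choice is the k-mer frequency vector, often with k = 4. The abstract does not say which features the protocol uses, so the Python sketch below is only an illustration of the general idea. The function name `kmer_frequencies` is hypothetical. For simplicity the sketch does not merge reverse-complement k-mers, and it ignores windows containing ambiguous bases.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return the relative frequency of every A/C/G/T k-mer in `sequence`.

    Windows containing other characters (for example N) are ignored.
    """
    sequence = sequence.upper()
    kmers = ["".join(letters) for letters in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


# Toy contig; real contigs are typically thousands of bases long.
profile = kmer_frequencies("ACGTACGTTTGACCA", k=4)
print(len(profile))  # one entry per possible tetranucleotide
```

Contigs from the same genome tend to have similar profiles, so these vectors, combined with per-sample abundance, give a binning tool something to cluster on.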
|
42
|
Predicted Input of Uncultured Fungal Symbionts to a Lichen Symbiosis from Metagenome-Assembled Genomes. Genome Biol Evol 2021; 13:6163286. [PMID: 33693712 PMCID: PMC8355462 DOI: 10.1093/gbe/evab047] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 03/03/2021] [Indexed: 12/15/2022] Open
Abstract
Basidiomycete yeasts have recently been reported as stably associated secondary fungal symbionts of many lichens, but their role in the symbiosis remains unknown. Attempts to sequence their genomes have been hampered both by the inability to culture them and their low abundance in the lichen thallus alongside two dominant eukaryotes (an ascomycete fungus and chlorophyte alga). Using the lichen Alectoria sarmentosa, we selectively dissolved the cortex layer in which secondary fungal symbionts are embedded to enrich yeast cell abundance and sequenced DNA from the resulting slurries as well as bulk lichen thallus. In addition to yielding a near-complete genome of the filamentous ascomycete using both methods, metagenomes from cortex slurries yielded a 36- to 84-fold increase in coverage and near-complete genomes for two basidiomycete species, members of the classes Cystobasidiomycetes and Tremellomycetes. The ascomycete possesses the largest gene repertoire of the three. It is enriched in proteases often associated with pathogenicity and harbors the majority of predicted secondary metabolite clusters. The basidiomycete genomes possess ∼35% fewer predicted genes than the ascomycete and have reduced secretomes even compared with close relatives, while exhibiting signs of nutrient limitation and scavenging. Furthermore, both basidiomycetes are enriched in genes coding for enzymes producing secreted acidic polysaccharides, representing a potential contribution to the shared extracellular matrix. All three fungi retain genes involved in dimorphic switching, despite the ascomycete not being known to possess a yeast stage. The basidiomycete genomes are an important new resource for exploration of lifestyle and function in fungal–fungal interactions in lichen symbioses.
|
43
|
Integrating pan-genome with metagenome for microbial community profiling. Comput Struct Biotechnol J 2021; 19:1458-1466. [PMID: 33841754 PMCID: PMC8010324 DOI: 10.1016/j.csbj.2021.02.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/02/2020] [Revised: 02/24/2021] [Accepted: 02/27/2021] [Indexed: 02/07/2023] Open
Abstract
Advances in sequencing technology have led to the increased availability of genomes and metagenomes, which has greatly facilitated microbial pan-genome and metagenome analysis. In line with this trend, studies on microbial genomes and phenotypes have gradually shifted from individuals to environmental communities. Pan-genomics and metagenomics are powerful strategies for in-depth profiling of microbial communities. Pan-genomics focuses on genetic diversity, dynamics, and phylogeny at the multi-genome level, while metagenomics profiles the distribution and function of culture-free microbial communities in specific environments. Combining pan-genome and metagenome analysis can reveal the complicated connections among microbes, from an individual complete genome to a mixture of genomes, thereby extending the traditional individual genomic profile to a community-level microbial profile. Therefore, the combination of pan-genome and metagenome approaches has become a promising method to track the sources of various microbes and decipher population-level evolution and ecosystem functions. This review summarizes the pan-genome and metagenome approaches, the strategies that combine them, and the applications of these combined strategies in studies of microbial dynamics, evolution, and function in communities. We discuss emerging strategies for the study of microbial communities that integrate information from both the pan-genome and the metagenome. We emphasize studies in which the integrated pan-genome and metagenome approach improved the understanding of models of microbial community profiles, both structural and functional. Finally, we outline future perspectives for microbial community profiling: more advanced analytical techniques, including big-data-based artificial intelligence, will lead to an even better understanding of the patterns of microbial communities.
|
44
|
Transcriptomic analysis provides insights into candidate genes and molecular pathways involved in growth of Manila clam Ruditapes philippinarum. Funct Integr Genomics 2021; 21:341-353. [PMID: 33660117 DOI: 10.1007/s10142-021-00780-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2020] [Revised: 02/14/2021] [Accepted: 02/19/2021] [Indexed: 10/22/2022]
Abstract
Growth is one of the most important traits in aquaculture breeding programs. Understanding the mechanisms underlying growth differences between individuals can contribute to improving growth rates through more efficient breeding schemes. Ruditapes philippinarum is an economically important marine bivalve. To gain insights into the molecular mechanisms underlying growth variability in marine shellfish, we conducted transcriptome sequencing and examined expression differences in growth-related genes and molecular pathways involved in the growth trait of R. philippinarum. In this study, we investigated the molecular and gene expression differences between fast-growing and slow-growing Manila clams and focused on the differential expression patterns of specific growth-associated genes using RNA-seq and qPCR analysis. A total of 61 differentially expressed genes (DEGs) were identified and categorized into the Ras signaling pathway, hedgehog signaling pathway, AMPK signaling pathway, p53 signaling pathway, regulation of actin cytoskeleton, focal adhesion, mTOR signaling pathway, VEGF signaling pathway, and TGF-beta signaling pathway. A total of 34 growth-related genes were validated as significantly up- or downregulated between fast-growing and slow-growing R. philippinarum. Functional enrichment analysis revealed that the insulin signaling pathway, PI3K-Akt signaling pathway, and mTOR signaling pathway play pivotal roles in the molecular function and regulation of the growth trait in R. philippinarum. The growth-related genes and pathways obtained here provide important insights into the molecular basis of physiological acclimation, metabolic activity, and growth variability in marine bivalves.
|
45
|
Accurate and sensitive detection of microbial eukaryotes from whole metagenome shotgun sequencing. MICROBIOME 2021; 9:58. [PMID: 33658077 PMCID: PMC7931531 DOI: 10.1186/s40168-021-01015-y] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 01/24/2021] [Accepted: 02/02/2021] [Indexed: 05/08/2023]
Abstract
BACKGROUND Microbial eukaryotes are found alongside bacteria and archaea in natural microbial systems, including host-associated microbiomes. While microbial eukaryotes are critical to these communities, they are challenging to study with shotgun sequencing techniques and are therefore often excluded. RESULTS Here, we present EukDetect, a bioinformatics method to identify eukaryotes in shotgun metagenomic sequencing data. Our approach uses a database of 521,824 universal marker genes from 241 conserved gene families, which we curated from 3713 fungal, protist, non-vertebrate metazoan, and non-streptophyte archaeplastida genomes and transcriptomes. EukDetect has a broad taxonomic coverage of microbial eukaryotes, performs well on low-abundance and closely related species, and is resilient against bacterial contamination in eukaryotic genomes. Using EukDetect, we describe the spatial distribution of eukaryotes along the human gastrointestinal tract, showing that fungi and protists are present in the lumen and mucosa throughout the large intestine. We discover that there is a succession of eukaryotes that colonize the human gut during the first years of life, mirroring patterns of developmental succession observed in gut bacteria. By comparing DNA and RNA sequencing of paired samples from human stool, we find that many eukaryotes continue active transcription after passage through the gut, though some do not, suggesting they are dormant or nonviable. We analyze metagenomic data from the Baltic Sea and find that eukaryotes differ across locations and salinity gradients. Finally, we observe eukaryotes in Arabidopsis leaf samples, many of which are not identifiable from public protein databases. CONCLUSIONS EukDetect provides an automated and reliable way to characterize eukaryotes in shotgun sequencing datasets from diverse microbiomes. We demonstrate that it enables discoveries that would be missed or clouded by false positives with standard shotgun sequence analysis. 
EukDetect will greatly advance our understanding of how microbial eukaryotes contribute to microbiomes.
|