1
Zhang M, Yang Q, Lou J, Hu Y, Shi Y. A new strategy to HER2-specific antibody discovery through artificial intelligence-powered phage display screening based on the Trastuzumab framework. Biochim Biophys Acta Mol Basis Dis 2025; 1871:167772. [PMID: 40056877 DOI: 10.1016/j.bbadis.2025.167772] [Received: 11/21/2024] [Revised: 02/23/2025] [Accepted: 02/28/2025] [Indexed: 03/10/2025]
Abstract
Human epidermal growth factor receptor 2 (HER2) is a recognized drug target for various cancer treatments, necessitating the discovery of more antibodies for therapeutic and detection purposes. Here, we have developed an innovative workflow for antibody generation through Artificial Intelligence-powered Phage Display Screening (AIPDS). This workflow integrates artificial intelligence-driven antibody CDRH3 sequence design, high-throughput DNA synthesis, and phage display screening. We applied the AIPDS workflow to generate promising antibodies against HER2, offering a template for streamlined antibody generation. Seven novel antibodies stood out, demonstrating promising efficacy in various functional assays. Notably, DYHER2-02 demonstrated strong performance across all experimental tests. In summary, our study introduces a novel methodology to generate new variants of an existing antibody using an AI-assisted phage display approach. These new antibody variants hold potential for research, diagnostic, and therapeutic applications.
Affiliation(s)
- Mancang Zhang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Qiangzhen Yang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Jiangrong Lou
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Yang Hu
  - United Research Center for Next Generation DNA Synthesis of SJTU-Dynegene, Shanghai 201108, People's Republic of China
- Yongyong Shi
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China; Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, People's Republic of China

2
Zhang C, He Y, Wang J, Chen T, Baltar F, Hu M, Liao J, Xiao X, Li ZR, Dong X. LucaPCycle: Illuminating microbial phosphorus cycling in deep-sea cold seep sediments using protein language models. Nat Commun 2025; 16:4862. [PMID: 40419512 DOI: 10.1038/s41467-025-60142-4] [Received: 08/03/2024] [Accepted: 05/16/2025] [Indexed: 05/28/2025] Open
Abstract
Phosphorus is essential for life and critically influences marine productivity. Despite geochemical evidence of active phosphorus cycling in deep-sea cold seeps, the microbial processes involved remain poorly understood. Traditional sequence-based searches often fail to detect proteins with remote homology. To address this, we developed a deep learning model, LucaPCycle, integrating raw sequences and contextual embeddings based on the protein language model ESM2-3B. LucaPCycle identified 5241 phosphorus-cycling protein families from global cold seep gene and genome catalogs, substantially enhancing our understanding of their diversity, ecology, and function. Among previously unannotated sequences, we discovered three alkaline phosphatase families that feature unique domain organizations and preserved enzymatic capabilities. These results highlight the previously overlooked ecological importance of phosphorus cycling within cold seeps, corroborated by data from porewater geochemistry, metatranscriptomics, and metabolomics. We revealed a previously unrecognized diversity of archaea, including Asgardarchaeota, anaerobic methanotrophic archaea, and Thermoproteota, which contribute to organic phosphorus mineralization and inorganic phosphorus solubilization through various mechanisms. Additionally, auxiliary metabolic genes of cold seep viruses primarily encode the PhoR-PhoB regulatory system and PhnCDE transporter, potentially enhancing their hosts' phosphorus utilization. Overall, LucaPCycle is capable of accessing previously 'hidden' sequence spaces for microbial phosphorus cycling and can be applied to various ecosystems.
Affiliation(s)
- Chuwen Zhang
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
- Yong He
  - Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Jieni Wang
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
- Tengkai Chen
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
- Federico Baltar
  - Fungal and Biogeochemical Oceanography Group, College of Oceanography and Ecological Science, Shanghai Ocean University, Shanghai, China
  - Fungal and Biogeochemical Oceanography Group, Department of Functional and Evolutionary Ecology, University of Vienna, Vienna, Austria
- Minjie Hu
  - Key Laboratory of Humid Sub-tropical Eco-geographical Process of Ministry of Education, Fujian Normal University, Fuzhou, China
  - School of Geographical Sciences, Fujian Normal University, Fuzhou, China
- Jing Liao
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
- Xi Xiao
  - Key Laboratory of Marine Mineral Resources, Ministry of Natural Resources, Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou, China
- Zhao-Rong Li
  - Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Xiyang Dong
  - Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China

3
Tang D, Zhou X, Qian H, Jiao Y, Wang Y. Streptomyces flavusporus sp. nov., a Novel Actinomycete Isolated from Naidong, Xizang (Tibet), China. Microorganisms 2025; 13:1001. [PMID: 40431174 PMCID: PMC12113709 DOI: 10.3390/microorganisms13051001] [Received: 03/02/2025] [Revised: 04/22/2025] [Accepted: 04/23/2025] [Indexed: 05/29/2025] Open
Abstract
The exploration of Streptomyces from extreme environments presents a particularly compelling avenue for novel compound discovery. A Gram-positive, pink-pigmented Streptomyces strain designated HC307T was isolated from a soil sample collected in Xizang (Tibet), China. In this study, the 16S rRNA sequence of strain HC307T exhibited the highest similarity with Streptomyces prasinosporus NRRL B-12431T (97.5%) and Streptomyces chromofuscus DSM 40273T (97.3%), both below 98.7%. The draft genome of the strain was 10.0 Mb, with a G+C content of 70.0 mol%. The average nucleotide identity (ANI) values between strain HC307T and similar type strains ranged from 78.3% to 87.5% (<95%). The digital DNA-DNA hybridization (dDDH) values ranged from 22.6% to 33.9% (<70%), consistent with the results of phylogenetic tree analysis. Phenotypically, the strain grew at 25-40 °C, at pH 5 to 9, and in NaCl concentrations from 0% to 6% (w/v). The polar lipid profile of strain HC307T comprised diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, and unidentified lipids. The analysis of 32 biosynthetic gene clusters (BGCs) indicated the strain's capacity to synthesize diverse compounds. Phylogenetic and phenotypic analyses demonstrated that strain HC307T represents a novel species within the genus Streptomyces, for which the name Streptomyces flavusporus sp. nov. is proposed, with strain HC307T (=DSM 35222T=CGMCC 32047T) as the type strain. The strain was deposited in the Deutsche Sammlung von Mikroorganismen und Zellkulturen and the China General Microbiological Culture Collection Center for patent procedures under the Budapest Treaty.
Affiliation(s)
- Dan Tang
  - School of Life Science and Engineering, Lanzhou University of Technology, Lanzhou 730050, China; (X.Z.); (H.Q.); (Y.J.); (Y.W.)

4
Haft DH, Tolstoy I. Novel selenoprotein neighborhoods suggest specialized biochemical processes. mSystems 2025; 10:e0141724. [PMID: 40162776 PMCID: PMC12013261 DOI: 10.1128/msystems.01417-24] [Received: 10/23/2024] [Accepted: 02/27/2025] [Indexed: 04/02/2025] Open
Abstract
Prokaryotic genomes encode selenoproteins sparsely, roughly one protein per 5,000. Finding novel selenoprotein families can expose unknown biological processes that are enabled, or at least enhanced, by having a selenium atom replace a sulfur atom in some cysteine residues. Here, we report the discovery of 18 novel selenoprotein families or second selenocysteine sites in previously unrecognized extensions of protein translations. Most of these families had some confounding factors (too small a family, too few selenoproteins in the family, selenocysteine (U) too close to one end, a skew toward understudied or uncultured lineages) and consequently were missed previously. Discoveries were triggered by observations during the ongoing construction of protein family models for the National Center for Biotechnology Information's RefSeq and Prokaryotic Gene Annotation Pipeline or made by targeted searches for novel selenoproteins in the vicinity of known ones, rather than by any broadly applied genome mining method. Unrelated families TsoA, TsoB, TsoC, and TsoX are adjacent in tso (three selenoprotein operon) loci in the bacterial phylum Thermodesulfobacteriota. TrsS (third radical SAM selenoprotein) occurs strictly in the context of a molybdopterin-dependent aldehyde oxidoreductase. A short carboxy-terminal motif, U-X-X-stop (UXX-star), occurs in selenoproteins with various architectures, usually providing the second U in the protein. The multiple new selenocysteine insertion sites, selenoprotein families, and selenium-dependent operons we curated manually suggest that many more proteins and pathways remain to be discovered; once improved computational methods are applied comprehensively to the latest collections of microbial genomes and metagenomes, they may reveal surprising new biochemical processes.
IMPORTANCE Next-generation DNA sequencing and assembly of metagenome-assembled genomes (MAGs) for uncultured species of various microbiomes add a vast "dark matter" of hard-to-decipher protein sequences. Selenoproteins, optimized by natural selection to encode selenocysteine where cysteine might have been encoded much more easily, carry a strong clue to their function: some specialized aspect of binding or catalysis. Operons with multiple adjacent, but otherwise unrelated, selenoproteins should provide even more vivid information. In this study, efforts in protein family construction and curation, aimed at improving the PGAP genome annotation pipeline, generated multiple novel selenoprotein-containing genomic contexts that may lead to the future characterization of several systems of proteins. Past observations suggest roles in the metabolic handling of trace elements (mercury, tungsten, arsenic, etc.) or of organic compounds refractory to simpler enzymatic pathways. In addition, the work significantly expands the truth set of validated selenoproteins, which should aid future, more automated genome mining efforts.
Affiliation(s)
- Daniel H. Haft
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA

5
Dakal TC, Xu C, Kumar A. Advanced computational tools, artificial intelligence and machine-learning approaches in gut microbiota and biomarker identification. Front Med Technol 2025; 6:1434799. [PMID: 40303946 PMCID: PMC12037385 DOI: 10.3389/fmedt.2024.1434799] [Received: 07/23/2024] [Accepted: 10/16/2024] [Indexed: 05/02/2025] Open
Abstract
The microbiome of the gut is a complex ecosystem that contains a wide variety of microbial species and functional capabilities. The microbiome has a significant impact on health and disease by affecting endocrinology, physiology, and neurology. It can change the progression of certain diseases and enhance treatment responses and tolerance. The gut microbiota plays a pivotal role in human health, influencing a wide range of physiological processes. Recent advances in computational tools and artificial intelligence (AI) have revolutionized the study of gut microbiota, enabling the identification of biomarkers that are critical for diagnosing and treating various diseases. This review surveys the cutting-edge computational methodologies that integrate multi-omics data (such as metagenomics, metaproteomics, and metabolomics), providing a comprehensive understanding of the gut microbiome's composition and function. Additionally, machine learning (ML) approaches, including deep learning and network-based methods, are explored for their ability to uncover complex patterns within microbiome data, offering unprecedented insights into microbial interactions and their link to host health. By highlighting the synergy between traditional bioinformatics tools and advanced AI techniques, this review underscores the potential of these approaches in enhancing biomarker discovery and developing personalized therapeutic strategies. The convergence of computational advancements and microbiome research marks a significant step forward in precision medicine, paving the way for novel diagnostics and treatments tailored to individual microbiome profiles. Investigators can discover connections between the composition of microorganisms, the expression of genes, and the profiles of metabolites. Individual reactions to medicines that target gut microbes can be predicted by models driven by artificial intelligence.
It is possible to obtain personalized and precision medicine by first gaining an understanding of the impact that the gut microbiota has on the development of disease. The application of machine learning allows for the customization of treatments to the specific microbial environment of an individual.
Affiliation(s)
- Tikam Chand Dakal
  - Genome and Computational Biology Lab, Department of Biotechnology, Mohanlal Sukhadia University, Udaipur, India
- Caiming Xu
  - Beckman Research Institute of City of Hope, Monrovia, CA, United States
  - Department of General Surgery, The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Abhishek Kumar
  - Manipal Academy of Higher Education (MAHE), Manipal, India
  - Institute of Bioinformatics, International Technology Park, Bangalore, India

6
Blaskowski S, Roald M, Berube PM, Braakman R, Armbrust EV. Simultaneous acclimation to nitrogen and iron scarcity in open ocean cyanobacteria revealed by sparse tensor decomposition of metatranscriptomes. Sci Adv 2025; 11:eadr4310. [PMID: 40184465 PMCID: PMC11970481 DOI: 10.1126/sciadv.adr4310] [Received: 06/30/2024] [Accepted: 02/28/2025] [Indexed: 04/06/2025]
Abstract
Microbes respond to changes in their environment by adapting their physiology through coordinated adjustments to the expression levels of functionally related genes. To detect these shifts in situ, we developed a sparse tensor decomposition method that derives gene co-expression patterns from inherently complex whole community RNA sequencing data. Application of the method to metatranscriptomes of the abundant marine cyanobacteria Prochlorococcus and Synechococcus identified responses to scarcity of two essential nutrients, nitrogen and iron, including increased transporter expression, restructured photosynthesis and carbon metabolism, and mitigation of oxidative stress. Further, expression profiles of the identified gene clusters suggest that both cyanobacteria populations experience simultaneous nitrogen and iron stresses in a transition zone between North Pacific oceanic gyres. The results demonstrate the power of our approach to infer organism responses to environmental pressures, hypothesize functions of uncharacterized genes, and extrapolate ramifications for biogeochemical cycles in a changing ecosystem.
Affiliation(s)
- Stephen Blaskowski
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
  - School of Oceanography, University of Washington, Seattle, WA, USA
- Marie Roald
  - Simula Metropolitan Center for Digital Engineering, Oslo, Norway
  - Faculty of Technology, Art and Design, Oslo Metropolitan University, Oslo, Norway
- Paul M. Berube
  - Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Rogier Braakman
  - Department of Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA

7
Cheskis S, Akerman A, Levy A. Deciphering bacterial protein functions with innovative computational methods. Trends Microbiol 2025; 33:434-446. [PMID: 39736484 DOI: 10.1016/j.tim.2024.11.013] [Received: 08/26/2024] [Revised: 11/28/2024] [Accepted: 11/29/2024] [Indexed: 01/01/2025]
Abstract
Bacteria colonize every niche on Earth and play key roles in many environmental and host-associated processes. The sequencing revolution revealed the remarkable genetic and proteomic diversity of bacteria and the genomic content of cultured and uncultured bacteria. However, deciphering the functions of novel proteins remains a high barrier, often preventing a deep understanding of microbial life and its interaction with the surrounding environment. In recent years, exciting new bioinformatic tools, many of which are based on machine learning, have facilitated the challenging task of gene and protein function discovery in the era of big genomics data, leading to the generation of testable hypotheses for bacterial protein functions. The new tools allow prediction of protein structures and interactions and allow sensitive and efficient sequence- and structure-based searching and clustering. Here, we summarize some of these recent tools that are revolutionizing modern microbiology research, along with examples of their usage, emphasizing the user-friendly, web-based ones. Adoption of these capabilities by experimentalists and computational biologists could save resources and accelerate microbiology research.
Affiliation(s)
- Shani Cheskis
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Avital Akerman
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Asaf Levy
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel

8
Lin JD, Bhatt AS. Mind the gap: Intergenic regions in bacteria encode numerous small proteins. Mol Cell 2025; 85:1046-1048. [PMID: 40118035 DOI: 10.1016/j.molcel.2025.02.018] [Received: 02/21/2025] [Revised: 02/21/2025] [Accepted: 02/21/2025] [Indexed: 03/23/2025]
Abstract
In a recent issue of Molecular Cell, Fesenko et al.1 report a systematic investigation of intergenic regions within Enterobacteriaceae, shedding light on a vast, unexplored microprotein landscape that has been overlooked in well-characterized bacterial genomes.
Affiliation(s)
- Jordan D Lin
  - Department of Medicine (Division of Hematology), Stanford University, Stanford, CA, USA
- Ami S Bhatt
  - Department of Medicine (Division of Hematology), Stanford University, Stanford, CA, USA; Department of Genetics, Stanford University, Stanford, CA, USA

9
Secaira-Morocho H, Jiang X, Zhu Q. Augmenting microbial phylogenomic signal with tailored marker gene sets. bioRxiv 2025:2025.03.13.643052. [PMID: 40161675 PMCID: PMC11952537 DOI: 10.1101/2025.03.13.643052] [Indexed: 04/02/2025]
Abstract
Phylogenetic marker genes are traditionally selected from a fixed collection of whole genomes evenly distributed across major microbial phyla, covering only a small fraction of gene families. And yet, most microbial diversity is found in metagenome-assembled genomes that are unevenly distributed and harbor gene families that do not fit the criteria of universal orthologous genes. To address these limitations, we systematically evaluate the phylogenetic signal of gene families annotated from the KEGG and EggNOG functional databases for deep microbial phylogenomics. We show that markers selected from an expanded pool of gene families and tailored to the input genomes improve the accuracy of phylogenetic trees across simulated and real-world datasets of whole genomes and metagenome-assembled genomes. The improved accuracy of trees compared to previous markers persists even when metagenome-assembled genomes lack a fraction of open reading frames. The selected markers have functional annotations related to metabolism, cellular processes, and environmental information processing, in addition to replication, translation, and transcription. We introduce TMarSel, a software tool for automated, systematic, tailored marker selection that is free from expert opinion and provides flexibility in the number of markers and annotation databases while remaining robust against uneven taxon sampling and incomplete genomic data.
Affiliation(s)
- Henry Secaira-Morocho
  - Center for Fundamental and Applied Microbiomics and School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Xiaofang Jiang
  - National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Qiyun Zhu
  - Center for Fundamental and Applied Microbiomics and School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA

10
Fesenko I, Sahakyan H, Dhyani R, Shabalina SA, Storz G, Koonin EV. The hidden bacterial microproteome. Mol Cell 2025; 85:1024-1041.e6. [PMID: 39978337 PMCID: PMC11890958 DOI: 10.1016/j.molcel.2025.01.025] [Received: 06/10/2024] [Revised: 11/05/2024] [Accepted: 01/22/2025] [Indexed: 02/22/2025]
Abstract
Microproteins encoded by small open reading frames comprise the "dark matter" of proteomes. Although microproteins have been detected in diverse organisms from all three domains of life, many more remain to be identified, and only a few have been functionally characterized. In this comprehensive study of intergenic small open reading frames (ismORFs, 15-70 codons) in 5,668 bacterial genomes of the family Enterobacteriaceae, we identify 67,297 clusters of ismORFs subject to purifying selection. Expression of tagged Escherichia coli microproteins is detected for 11 of the 16 tested, validating the predictions. Although the ismORFs mainly code for hydrophobic, potentially transmembrane, unstructured, or minimally structured microproteins, some globular folds, oligomeric structures, and possible interactions with proteins encoded by neighboring genes are predicted. Complete information on the predicted microprotein families, including evidence of transcription and translation, and structure predictions are available as an easily searchable resource for investigation of microprotein functions.
Affiliation(s)
- Igor Fesenko
  - Computational Biology Branch, Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Harutyun Sahakyan
  - Computational Biology Branch, Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Rajat Dhyani
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA
- Svetlana A Shabalina
  - Computational Biology Branch, Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Gisela Storz
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA
- Eugene V Koonin
  - Computational Biology Branch, Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

11
Duan C, Zang Z, Xu Y, He H, Li S, Liu Z, Lei Z, Zheng JS, Li SZ. FGeneBERT: function-driven pre-trained gene language model for metagenomics. Brief Bioinform 2025; 26:bbaf149. [PMID: 40211978 PMCID: PMC11986344 DOI: 10.1093/bib/bbaf149] [Received: 12/25/2024] [Revised: 02/22/2025] [Accepted: 03/14/2025] [Indexed: 04/14/2025] Open
Abstract
Metagenomic data, comprising mixed multi-species genomes, are prevalent in diverse environments like oceans and soils, significantly impacting human health and ecological functions. However, current research relies on k-mer representations, which limit the capture of structurally and functionally relevant gene contexts. Moreover, these approaches struggle with encoding biologically meaningful genes and fail to address the one-to-many and many-to-one relationships inherent in metagenomic data. To overcome these challenges, we introduce FGeneBERT, a novel metagenomic pre-trained model that employs a protein-based gene representation as a context-aware and structure-relevant tokenizer. FGeneBERT incorporates masked gene modeling to enhance the understanding of inter-gene contextual relationships and triplet enhanced metagenomic contrastive learning to elucidate gene sequence-function relationships. Pre-trained on over 100 million metagenomic sequences, FGeneBERT demonstrates superior performance on metagenomic datasets at four levels, spanning gene, functional, bacterial, and environmental levels and ranging from 1 to 213 k input sequences. Case studies of ATP synthase and gene operons highlight FGeneBERT's capability for functional recognition and its biological relevance in metagenomic research.
Affiliation(s)
- Chenrui Duan
  - College of Computer Science and Technology, Zhejiang University, No. 866, Yuhangtang Road, 310058 Zhejiang, P. R. China
  - School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Zelin Zang
  - Centre for Artificial Intelligence and Robotics (CAIR), HKISI-CAS Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, Hong Kong 310000, China
- Yongjie Xu
  - College of Computer Science and Technology, Zhejiang University, No. 866, Yuhangtang Road, 310058 Zhejiang, P. R. China
  - School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Hang He
  - School of Medicine and School of Life Sciences, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Siyuan Li
  - College of Computer Science and Technology, Zhejiang University, No. 866, Yuhangtang Road, 310058 Zhejiang, P. R. China
  - School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Zihan Liu
  - College of Computer Science and Technology, Zhejiang University, No. 866, Yuhangtang Road, 310058 Zhejiang, P. R. China
  - School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Zhen Lei
  - Centre for Artificial Intelligence and Robotics (CAIR), HKISI-CAS Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, Hong Kong 310000, China
  - State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing 100190, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences (UCAS), Beijing 100049, China
- Ju-Sheng Zheng
  - School of Medicine and School of Life Sciences, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China
- Stan Z Li
  - School of Engineering, Westlake University, No. 600 Dunyu Road, 310030 Zhejiang, P. R. China

12
Mahajna A, Geurkink B, Gacesa R, Keesman KJ, Euverink GJW, Jayawardhana B. Metatranscriptomes of activated sludge microbiomes from saline wastewater treatment plant. Sci Data 2025; 12:348. [PMID: 40011462 PMCID: PMC11865439 DOI: 10.1038/s41597-025-04682-w] [Received: 10/07/2024] [Accepted: 02/19/2025] [Indexed: 02/28/2025] Open
Abstract
The activated sludge microbiome (ASM) drives the biological wastewater treatment process in wastewater treatment plants. It has been established in the literature that the ASM is characterized by a high degree of taxonomic and metabolic diversity. However, meta-omics datasets have been derived from domestic wastewater treatment plants with little attention to saline wastewater treatment plants (SWWTP). Existing knowledge of how activated sludge microorganisms impact water quality, interrelate within habitat networks, and respond to environmental perturbations remains limited. Here we present datasets of the metatranscriptomes of SWWTP in The Netherlands, coupled with process data. The dataset represents a two-year and four-month time series of data collected from 2014 to 2017, with samples taken at approximately monthly intervals from the facultative zone in the activated sludge process of an SWWTP. In total, 32 activated sludge samples were analyzed. This dataset can be used to enhance understanding of the unique microbiome composition in SWWTPs, its dynamic responses to environmental variables, and the metabolic functions within the ASM.
Affiliation(s)
- Asala Mahajna
- Wetsus - European Centre of Excellence for Sustainable Water Technology, Oostergoweg 9, 8911 MA, Leeuwarden, The Netherlands.
- Engineering and Technology Institute Groningen, Faculty of Science and Engineering, University of Groningen, Nijenborgh 4, 9747 AG, Groningen, The Netherlands.
| | - Bert Geurkink
- Wetsus - European Centre of Excellence for Sustainable Water Technology, Oostergoweg 9, 8911 MA, Leeuwarden, The Netherlands
| | - Ranko Gacesa
- Department of Genetics, University of Groningen and University Medical Center Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
- Department of Gastroenterology and Hepatology, University of Groningen and University Medical Center Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
| | - Karel J Keesman
- Mathematical and Statistical Methods - Biometris, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
| | - Gert-Jan W Euverink
- Engineering and Technology Institute Groningen, Faculty of Science and Engineering, University of Groningen, Nijenborgh 4, 9747 AG, Groningen, The Netherlands
| | - Bayu Jayawardhana
- Engineering and Technology Institute Groningen, Faculty of Science and Engineering, University of Groningen, Nijenborgh 4, 9747 AG, Groningen, The Netherlands
| |
|
13
|
De Martinis ECP, Alves VF, Pereira MG, Andrade LN, Abichabki N, Abramova A, Dannborg M, Bengtsson-Palme J. Applying 3D cultures and high-throughput technologies to study host-pathogen interactions. Front Immunol 2025; 16:1488699. [PMID: 40051624 PMCID: PMC11882522 DOI: 10.3389/fimmu.2025.1488699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Accepted: 02/04/2025] [Indexed: 03/09/2025] Open
Abstract
Recent advances in cell culturing and DNA sequencing have dramatically altered the field of human microbiome research. Three-dimensional (3D) cell culture is an important tool in cell biology, in cancer research, and for studying host-microbe interactions, as it mimics the in vivo characteristics of the host environment in an in vitro system, providing reliable and reproducible models. This work provides an overview of the main 3D culture techniques applied to study interactions between host cells and pathogenic microorganisms, how these systems can be integrated with high-throughput molecular methods, and how multi-species model systems may pave the way forward to pinpoint interactions among host, beneficial microbes and pathogens.
Affiliation(s)
| | | | - Marita Gimenez Pereira
- Ribeirão Preto School of Pharmaceutical Sciences, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Leonardo Neves Andrade
- Ribeirão Preto School of Pharmaceutical Sciences, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Nathália Abichabki
- Ribeirão Preto School of Pharmaceutical Sciences, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, Sweden
| | - Anna Abramova
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), Gothenburg, Sweden
| | - Mirjam Dannborg
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
| | - Johan Bengtsson-Palme
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
| |
|
14
|
Foucault P, Halary S, Duval C, Goto M, Marie B, Hamlaoui S, Jardillier L, Lamy D, Lance E, Raimbault E, Allouti F, Troussellier M, Bernard C, Leloup J, Duperron S. A summer in the greater Paris: trophic status of peri-urban lakes shapes prokaryotic community structure and functional potential. Environmental Microbiome 2025; 20:24. [PMID: 39962619 PMCID: PMC11834611 DOI: 10.1186/s40793-025-00681-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Accepted: 02/02/2025] [Indexed: 02/20/2025]
Abstract
With more than 12 million inhabitants, Greater Paris offers a "natural laboratory" to explore the effects of eutrophication on freshwater lake microbiomes within a relatively restricted area (~70 km radius). Here, a 4-month survey was carried out during summertime to monitor the planktonic microbial communities of nine lakes located around Paris (Île-de-France, France) that have comparable morphologies yet distinct trophic statuses, from mesotrophic to hypereutrophic. Having thus minimized confounding factors, we investigated how trophic status could influence prokaryotic community structures (16S rRNA gene sequencing) and functions (shotgun metagenomics). These freshwater lakes harbored highly distinct and diverse prokaryotic communities, and their trophic status appears to be the main driver explaining differences in both community structure and functional potential. Although their gene pool was quite stable and shared among lakes, taxonomic and functional changes were correlated. Among the relevant functions involved in biogeochemical cycles, genes related to phosphorus metabolism differed according to trophic status. Overall, hypereutrophic lake microbiomes displayed the highest contrast and heterogeneity over time, suggesting a specific microbial regime shift compared to eutrophic and mesotrophic lakes.
Affiliation(s)
- Pierre Foucault
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
- Institut d'Écologie et des Sciences de l'Environnement de Paris (iEES-Paris), Sorbonne Université, UMR 7618 CNRS-INRA-IRD-Univ. Paris Cité-UPEC, Paris, France
| | - Sébastien Halary
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
| | - Charlotte Duval
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
| | - Midoli Goto
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
- Marine Biodiversity, Exploitation & Conservation (MARBEC), Univ. Montpellier-CNRS- Ifremer-IRD, Montpellier, France
| | - Benjamin Marie
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
| | - Sahima Hamlaoui
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
| | - Ludwig Jardillier
- Université Paris-Saclay, UMR 8079 Univ. Paris-Saclay-CNRS-AgroParisTech, Unité d'Écologie Systématique et Évolution (ESE), Gif-sur-Yvette, France
| | - Dominique Lamy
- Institut d'Écologie et des Sciences de l'Environnement de Paris (iEES-Paris), Sorbonne Université, UMR 7618 CNRS-INRA-IRD-Univ. Paris Cité-UPEC, Paris, France
| | - Emilie Lance
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
- Université de Reims, UMR-I 02, Stress environnementaux et biosurveillance des milieux aquatiques (SEBIO), Reims, France
| | - Emmanuelle Raimbault
- Institut de Physique du Globe de Paris, UMR 7154, Univ. Paris Cité-CNRS, Paris, France
| | - Fayçal Allouti
- Muséum National d'Histoire Naturelle, UAR 7200 MNHN, Acquisition et Analyses de Données pour l'Histoire naturelle (2AD), Paris, France
| | - Marc Troussellier
- Marine Biodiversity, Exploitation & Conservation (MARBEC), Univ. Montpellier-CNRS- Ifremer-IRD, Montpellier, France
| | - Cécile Bernard
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France
| | - Julie Leloup
- Institut d'Écologie et des Sciences de l'Environnement de Paris (iEES-Paris), Sorbonne Université, UMR 7618 CNRS-INRA-IRD-Univ. Paris Cité-UPEC, Paris, France.
| | - Sébastien Duperron
- Muséum National d'Histoire Naturelle, UMR 7245 CNRS-MNHN, Molécules de Communication et Adaptation des Microorganismes (MCAM), Paris, France.
| |
|
15
|
Tauriello G, Waterhouse AM, Haas J, Behringer D, Bienert S, Garello T, Schwede T. ModelArchive: A Deposition Database for Computational Macromolecular Structural Models. J Mol Biol 2025:168996. [PMID: 39947281 DOI: 10.1016/j.jmb.2025.168996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2024] [Revised: 02/04/2025] [Accepted: 02/07/2025] [Indexed: 02/27/2025]
Abstract
A wide range of applications in life science research benefit from the availability of three-dimensional structures of biological macromolecules as they provide valuable insights into their molecular function. Recent advances in structure prediction techniques have made it possible to generate high quality computational macromolecular structural models for almost all known proteins. In this context, ModelArchive (https://modelarchive.org/) serves as a deposition database for computational models, complementing the Protein Data Bank (PDB) and PDB-IHM, which require experimental data, and specialised databases such as the AlphaFold DB. ModelArchive contains over 600,000 models contributed by researchers using a variety of modelling techniques. It supports single biological macromolecules and complexes, including any combination of polymers and small molecules. Each deposited model can be referenced in manuscripts using an immutable accession code provided by ModelArchive. Depositors are required to provide a minimal set of information about the modelling process and the expected accuracy of the resulting model, enabling scientific reproducibility and maximising the potential reuse of the models. The vast majority of models in ModelArchive use the ModelCIF format which includes coordinates and metadata, allows for programmatic validation of the models, and makes the models interoperable with structures obtained from other sources such as the PDB. The ModelArchive web service provides access to the models and search queries. Model findability is also provided in external services either through APIs or by importing data from ModelArchive.
Affiliation(s)
- Gerardo Tauriello
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Andrew M Waterhouse
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Juergen Haas
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Dario Behringer
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Thomas Garello
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| |
|
16
|
Zhou Z, Riley R, Kautsar S, Wu W, Egan R, Hofmeyr S, Goldhaber-Gordon S, Yu M, Ho H, Liu F, Chen F, Morgan-Kiss R, Shi L, Liu H, Wang Z. GenomeOcean: An Efficient Genome Foundation Model Trained on Large-Scale Metagenomic Assemblies. bioRxiv 2025:2025.01.30.635558. [PMID: 39975405 PMCID: PMC11838515 DOI: 10.1101/2025.01.30.635558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Genome foundation models hold transformative potential for precision medicine, drug discovery, and understanding complex biological systems. However, existing models are often inefficient, constrained by suboptimal tokenization and architectural design, and biased toward reference genomes, limiting their representation of low-abundance, uncultured microbes in the rare biosphere. To address these challenges, we developed GenomeOcean, a 4-billion-parameter generative genome foundation model trained on over 600 Gbp of high-quality contigs derived from 220 TB of metagenomic datasets collected from diverse habitats across Earth's ecosystems. A key innovation of GenomeOcean is training directly on large-scale co-assemblies of metagenomic samples, enabling enhanced representation of rare microbial species and improving generalizability beyond genome-centric approaches. We implemented a byte-pair encoding (BPE) tokenization strategy for genome sequence generation, alongside architectural optimizations, achieving up to 150× faster sequence generation while maintaining high biological fidelity. GenomeOcean excels in representing microbial species and generating protein-coding genes constrained by evolutionary principles. Additionally, its fine-tuned model demonstrates the ability to discover novel biosynthetic gene clusters (BGCs) in natural genomes and perform zero-shot synthesis of biochemically plausible, complete BGCs. GenomeOcean sets a new benchmark for metagenomic research, natural product discovery, and synthetic biology, offering a robust foundation for advancing these fields.
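As a rough illustration of the byte-pair encoding (BPE) idea mentioned in the abstract above, the following toy sketch learns merges over nucleotide strings. It is not GenomeOcean's tokenizer: the function names (`learn_bpe`, `apply_merge`, `tokenize`), the tie-breaking rule, and the example sequence are assumptions chosen only to show how frequent adjacent tokens are merged into longer ones.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one joined token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe(sequences, num_merges):
    """Learn up to `num_merges` merges, starting from single-base tokens."""
    corpus = [list(seq) for seq in sequences]
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        # Hypothetical tie-break: highest count, then lexicographically largest pair.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def tokenize(seq, merges):
    """Tokenize a sequence by replaying the learned merges in order."""
    tokens = list(seq)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


merges = learn_bpe(["ATGATGATG"], num_merges=2)
print(tokenize("ATGATGC", merges))
```

Because each merged token covers several bases, a sequence is represented with fewer tokens than base-level tokenization would need, which is one plausible reason a BPE vocabulary can speed up generation.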
Affiliation(s)
| | - Robert Riley
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Satria Kautsar
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Weimin Wu
- Northwestern University, Evanston, IL, USA
| | - Rob Egan
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Steven Hofmeyr
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | | | - Mutian Yu
- Northwestern University, Evanston, IL, USA
| | - Harrison Ho
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- University of California at Merced, Merced, CA, USA
| | - Fengchen Liu
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- University of California at Berkeley, Berkeley, CA, USA
| | | | | | - Lizhen Shi
- Northwestern University, Evanston, IL, USA
| | - Han Liu
- Northwestern University, Evanston, IL, USA
| | - Zhong Wang
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- University of California at Merced, Merced, CA, USA
| |
|
17
|
Hahnfeld JM, Schwengers O, Jelonek L, Diedrich S, Cemič F, Goesmann A. sORFdb - a database for sORFs, small proteins, and small protein families in bacteria. BMC Genomics 2025; 26:110. [PMID: 39910485 PMCID: PMC11796252 DOI: 10.1186/s12864-025-11301-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Accepted: 01/29/2025] [Indexed: 02/07/2025] Open
Abstract
Small proteins with fewer than 100, particularly fewer than 50, amino acids are still largely unexplored. Nonetheless, they represent an essential part of bacteria's often neglected genetic repertoire. In recent years, the development of ribosome profiling protocols has led to the detection of an increasing number of previously unknown small proteins. Despite this, they are overlooked in many cases by automated genome annotation pipelines, and often, no functional descriptions can be assigned due to a lack of known homologs. To understand and overcome these limitations, the current abundance of small proteins in existing databases was evaluated, and a new dedicated database for small proteins and their potential functions, called 'sORFdb', was created. To this end, small proteins were extracted from annotated bacterial genomes in the GenBank database. Subsequently, they were quality-filtered, compared, and complemented with proteins from Swiss-Prot, UniProt, and SmProt to ensure reliable identification and characterization of small proteins. Families of similar small proteins were created using bidirectional best BLAST hits followed by Markov clustering. Analysis of small proteins in public databases revealed that their number is still limited due to historical and technical constraints. Additionally, functional descriptions were often missing despite the presence of potential homologs. As expected, a taxonomic bias was evident in over-represented clinically relevant bacteria. This new and comprehensive database is accessible via a feature-rich website providing specialized search features for sORFs and small proteins of high quality. Additionally, small protein families with Hidden Markov Models and information on taxonomic distribution and other physicochemical properties are available. 
In conclusion, the novel small protein database sORFdb is a specialized, taxonomy-independent database that improves the findability and classification of sORFs, small proteins, and their functions in bacteria, thereby supporting their future detection and consistent annotation. All sORFdb data is freely accessible via https://sorfdb.computational.bio .
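The abstract above mentions building families from bidirectional best BLAST hits followed by Markov clustering. A minimal sketch of the first step only is given here, assuming hits are already available as `(query, subject, bitscore)` tuples; the function names and the example data are hypothetical, and the Markov clustering step is not implemented.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject, ignoring self-hits."""
    best = {}
    for query, subject, score in hits:
        if query == subject:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(hits):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best = best_hits(hits)
    pairs = set()
    for a, b in best.items():
        if best.get(b) == a:
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Made-up example hits: (query, subject, bitscore).
example_hits = [
    ("p1", "p2", 90.0),
    ("p2", "p1", 88.0),
    ("p3", "p1", 50.0),
    ("p1", "p3", 40.0),
    ("p3", "p3", 120.0),
]
print(bidirectional_best_hits(example_hits))
```

In a pipeline like the one described, the resulting pairs would presumably serve as edges of the graph handed to the clustering step.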
Affiliation(s)
- Julian M Hahnfeld
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring, Giessen, 35392, Hesse, Germany.
| | - Oliver Schwengers
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring, Giessen, 35392, Hesse, Germany
| | - Lukas Jelonek
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring, Giessen, 35392, Hesse, Germany
| | - Sonja Diedrich
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring, Giessen, 35392, Hesse, Germany
| | - Franz Cemič
- Department of Computer Science, University of Applied Sciences Giessen, Gutfleischstrasse, Giessen, 35390, Hesse, Germany
| | - Alexander Goesmann
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring, Giessen, 35392, Hesse, Germany
| |
|
18
|
Salamzade R, Tran P, Martin C, Manson A, Gilmore M, Earl A, Anantharaman K, Kalan L. zol and fai: large-scale targeted detection and evolutionary investigation of gene clusters. Nucleic Acids Res 2025; 53:gkaf045. [PMID: 39907107 PMCID: PMC11795205 DOI: 10.1093/nar/gkaf045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2024] [Revised: 12/06/2024] [Accepted: 01/24/2025] [Indexed: 02/06/2025] Open
Abstract
Many universally and conditionally important genes are genomically aggregated within clusters. Here, we introduce fai and zol, which together enable large-scale comparative analysis of different types of gene clusters and mobile-genetic elements, such as biosynthetic gene clusters (BGCs) or viruses. Fundamentally, they overcome a current bottleneck to reliably perform comprehensive orthology inference at large scale across broad taxonomic contexts and thousands of genomes. First, fai allows the identification of orthologous instances of a query gene cluster of interest amongst a database of target genomes. Subsequently, zol enables reliable, context-specific inference of ortholog groups for individual protein-encoding genes across gene cluster instances. In addition, zol performs functional annotation and computes a variety of evolutionary statistics for each inferred ortholog group. Importantly, in comparison to tools for visual exploration of homologous relationships between gene clusters, zol can scale to handle thousands of gene cluster instances and produce detailed reports that are easy to digest. To showcase fai and zol, we apply them for: (i) longitudinal tracking of a virus in metagenomes, (ii) performing population genetic investigations of BGCs for a fungal species, and (iii) uncovering evolutionary trends for a virulence-associated gene cluster across thousands of genomes from a diverse bacterial genus.
Affiliation(s)
- Rauf Salamzade
- Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, 53706, United States
| | - Patricia Q Tran
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Freshwater and Marine Science Doctoral Program, University of Wisconsin-Madison, Madison, WI, 53706, United States
| | - Cody Martin
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, United States
| | - Abigail L Manson
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, United States
| | - Michael S Gilmore
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, United States
- Department of Ophthalmology, Harvard Medical School and Massachusetts Eye and Ear, Boston, MA, 02114, United States
- Department of Microbiology, Harvard Medical School and Massachusetts Eye and Ear, Boston, MA, 02115, United States
| | - Ashlee M Earl
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, United States
| | - Karthik Anantharaman
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, United States
| | - Lindsay R Kalan
- Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Department of Medicine, Division of Infectious Disease, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, 53705, United States
- M.G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, L8S 4L8, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
| |
|
19
|
Uchiyama I, Mihara M, Nishide H, Chiba H, Takayanagi M, Kawai M, Takami H. MBGD: Microbial Genome Database for Comparative Analysis Featuring Enhanced Functionality to Characterize Gene and Genome Functions Through Large-scale Orthology Analysis. J Mol Biol 2025:168957. [PMID: 39826711 DOI: 10.1016/j.jmb.2025.168957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2024] [Revised: 01/14/2025] [Accepted: 01/14/2025] [Indexed: 01/22/2025]
Abstract
Microbial Genome Database for Comparative Analysis (MBGD) is a comprehensive ortholog database encompassing published complete microbial genomes. The ortholog tables in MBGD are constructed in a hierarchical manner. The top-level ortholog table is now constructed from 1,812 genus-level pan-genomes, 6,268 species-level pan-genomes, and 34,079 genomes in total. To support analyses of newly sequenced genomes, MBGD updates MyMBGD functionality, which offers two analysis modes: assignment mode and clustering mode. Assignment mode rapidly classifies genes in the query genomes into existing MBGD ortholog groups, while clustering mode performs de novo clustering of query genomes using the DomClust program. In assignment mode, users can evaluate the presence of genomic functions, as defined in the KEGG Module database, in each query genome using the Genomaple software and compare the results across multiple genomes. To enhance this analysis, we developed a method to subdivide MBGD ortholog groups as needed to improve cross-references to the KEGG Orthology groups. Another notable feature is the phylogenetic profile search interface, which enables users to specify a set of organisms in which orthologs are present or absent (i.e., a phylogenetic profile), and search for ortholog groups with similar phylogenetic profiles. To construct a phylogenetic profile, users can search organisms by specifying phenotype, environment, taxonomy, or a particular ortholog group. MBGD is available at https://mbgd.nibb.ac.jp/.
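The phylogenetic profile search described in the abstract above can be pictured with a small sketch. It assumes each ortholog group is represented by the set of organisms in which it is present and ranks groups by Jaccard similarity to a query profile. The abstract does not state which similarity measure MBGD uses, so the measure, the function names, and the organism codes below are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of organism identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def search_profiles(query, profiles, top_n=3):
    """Rank ortholog groups by similarity of their presence profile to `query`."""
    scored = [(jaccard(query, organisms), name) for name, organisms in profiles.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score first, then by name
    return [(name, round(score, 3)) for score, name in scored[:top_n]]


# Hypothetical ortholog groups with made-up organism codes.
profiles = {
    "OG1": {"eco", "bsu", "sau"},
    "OG2": {"eco", "bsu"},
    "OG3": {"mtu"},
}
print(search_profiles({"eco", "bsu"}, profiles))
```

Absence constraints (organisms in which an ortholog must be missing) could be handled in the same spirit, for example by filtering out groups that contain any excluded organism before ranking.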
Affiliation(s)
- Ikuo Uchiyama
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan.
| | | | - Hiroyo Nishide
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
| | - Hirokazu Chiba
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Japan
| | | | - Mikihiko Kawai
- National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
| | - Hideto Takami
- Tokyo University of Agriculture and Technology, Fuchu, Japan; Center for Mathematical Science and Advanced Technology, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Yokohama, Japan
| |
|
20
|
Chen Y. Beyond Meta-Omics: Functional Genomics in Future Marine Microbiome Research. Annual Review of Marine Science 2025; 17:577-592. [PMID: 38950441 DOI: 10.1146/annurev-marine-020123-100931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
When President Bill Clinton and Francis Collins, then the director of the National Human Genome Research Institute, celebrated the near completion of the human genome sequence at the White House in the summer of 2000, it is unlikely that they or anyone else could have predicted the blossoming of meta-omics in the following two decades and their applications in modern human microbiome and environmental microbiome research. This transformation was enabled by the development of high-throughput sequencing technologies and sophisticated computational biology tools and bioinformatics software packages. Today, environmental meta-omics has undoubtedly revolutionized our understanding of ocean ecosystems, providing the genetic blueprint of oceanic microscopic organisms. In this review, I discuss the importance of functional genomics in future marine microbiome research and advocate a position for a gene-centric, bottom-up approach in modern oceanography. I propose that a synthesis of multidimensional approaches is required for a better understanding of the true functionality of the marine microbiome.
Affiliation(s)
- Yin Chen
- School of Life Sciences, University of Warwick, Coventry, United Kingdom
- College of Marine Life Sciences, Ocean University of China, Qingdao, China
| |
|
21
|
Zhang Y, Xue B, Mao Y, Chen X, Yan W, Wang Y, Wang Y, Liu L, Yu J, Zhang X, Chao S, Topp E, Zheng W, Zhang T. High-throughput single-cell sequencing of activated sludge microbiome. Environmental Science and Ecotechnology 2025; 23:100493. [PMID: 39430728 PMCID: PMC11490935 DOI: 10.1016/j.ese.2024.100493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 09/11/2024] [Accepted: 09/11/2024] [Indexed: 10/22/2024]
Abstract
Wastewater treatment plants (WWTPs) represent one of biotechnology's largest and most critical applications, playing a pivotal role in environmental protection and public health. In WWTPs, activated sludge (AS) plays a major role in removing contaminants and pathogens from wastewater. While metagenomics has advanced our understanding of microbial communities, as a bulk method it still faces challenges in revealing the genomic heterogeneity of cells, uncovering the microbial dark matter, and establishing precise links between genetic elements and their host cells. These issues could be largely resolved by single-cell sequencing, which offers unprecedented resolution for revealing the genetic information unique to individual cells. Here we apply high-throughput single-cell sequencing to the AS microbiome. The single-amplified genomes (SAGs) of 15,110 individual cells were clustered into 2,454 SAG bins. We find that 27.5% of the genomes in the AS microbial community represent potential novel species, highlighting the presence of microbial dark matter. Furthermore, we identified 1,137 antibiotic resistance genes (ARGs), 10,450 plasmid fragments, and 1,343 phage contigs, with shared plasmid and phage groups broadly distributed among hosts, indicating a high frequency of horizontal gene transfer (HGT) within the AS microbiome. Complementary analysis using 1,529 metagenome-assembled genomes from the AS samples allowed for the taxonomic classification of 98 SAG bins, which were previously unclassified. Our study establishes the feasibility of single-cell sequencing in characterizing the AS microbiome, providing novel insights into its ecological dynamics, and deepening our understanding of HGT processes, particularly those involving ARGs.
Additionally, this valuable tool could monitor the distribution, spread, and pathogenic hosts of ARGs both within AS environments and between AS and other environments, which will ultimately contribute to developing a health risk evaluation system for diverse environments within a One Health framework.
Affiliation(s)
- Yulin Zhang
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Bingjie Xue
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
- School of Public Health, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
- College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, 518071, Guangdong, China
| | - Yanping Mao
- College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, 518071, Guangdong, China
| | - Xi Chen
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Weifu Yan
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Yanren Wang
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Yulin Wang
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Lei Liu
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
| | - Jiale Yu
- MobiDrop (Zhejiang) Company Limited, Jiaxing, 314000, Zhejiang, China
| | - Xiaojin Zhang
- MobiDrop (Zhejiang) Company Limited, Jiaxing, 314000, Zhejiang, China
| | - Shan Chao
- MobiDrop (Zhejiang) Company Limited, Jiaxing, 314000, Zhejiang, China
| | - Edward Topp
- Agroecology Research Unit, Bourgogne Franche-Comté Research Centre, National Research Institute for Agriculture, Food and the Environment, 35000, France
| | - Wenshan Zheng
- MobiDrop (Zhejiang) Company Limited, Jiaxing, 314000, Zhejiang, China
| | - Tong Zhang
- Environmental Microbiome Engineering and Biotechnology Lab, Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
- School of Public Health, The University of Hong Kong, Pokfulam Road, Hong Kong, 999077, China
|
22
|
Ma Q, Xiang X, Ma Y, Li G, Liu X, Jia B, Yang W, Yin H, Zhang B. Identification and Bioactivity Analysis of a Novel Bacillus Species, B. maqinnsis sp. nov. Bos-x6-28, Isolated from Feces of the Yak (Bos grunniens). Antibiotics (Basel) 2024; 13:1238. [PMID: 39766628 PMCID: PMC11672612 DOI: 10.3390/antibiotics13121238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 12/18/2024] [Accepted: 12/21/2024] [Indexed: 01/11/2025] Open
Abstract
Background: The identification of novel bacterial species from the intestines of yaks residing on the Qinghai-Tibet Plateau is pivotal in advancing our understanding of host-microbiome interactions and represents a promising avenue for microbial drug discovery. Methods: In this study, we conducted a polyphasic taxonomic analysis and bioactive assays on a Bacillus strain, designated Bos-x6-28, isolated from yak feces. Results: The findings revealed that strain Bos-x6-28 shares a high 16S rRNA gene sequence similarity (98.91%) with B. xiamenensis HYC-10T and B. zhangzhouensis DW5-4T, suggesting close phylogenetic affinity. Physiological and biochemical characterizations demonstrated that Bos-x6-28 could utilize nine carbon sources, including D-galactose, inositol, and fructose, alongside nine nitrogen sources, such as threonine, alanine, and proline. Analysis of biochemical markers indicated that Bos-x6-28's cell wall hydrolysates contained mannose, glucose, and meso-2,6-diaminopimelic acid, while menaquinone-7 (MK-7), phosphatidylethanolamine (PE), phosphatidylcholine (PC), and phosphatidylglycerol (DPG) were found in the cell membrane. The primary cellular fatty acids included C16:0 (28.00%), cyclo-C17:0 (19.97%), C14:0 (8.75%), cyclo-C19:0 (8.52%), iso-C15:0 (5.49%), anteiso-C15:0 (4.61%), and C12:0 (3.15%). Whole-genome sequencing identified a genome size of 3.33 Mbp with 3353 coding genes. Digital DNA-DNA hybridization (dDDH) and average nucleotide identity (ANI) analyses confirmed Bos-x6-28 as a novel species, hereby named B. maqinnsis Bos-x6-28 (MCCC 1K09379). Further genomic analysis unveiled biosynthetic gene clusters encoding bioactive natural compounds, including β-lactones, sactipeptides, fengycin, and lichenysin analogs. Additionally, in vitro assays demonstrated that this strain exhibits antibacterial and cytotoxic activities. Conclusions: These findings collectively indicate the novel Bacillus species B. maqinnsis Bos-x6-28 as a promising source for novel antibiotic and antitumor agents.
Affiliation(s)
- Qiang Ma
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
| | - Xin Xiang
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
- State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining 810016, China
| | - Yan Ma
- College of Eco-Environmental Engineering, Qinghai Vocational and Technical University, Xining 810016, China
| | - Guangzhi Li
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
| | - Xingyu Liu
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
| | - Boai Jia
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
| | - Wenlin Yang
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
| | - Hengxia Yin
- State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining 810016, China
| | - Benyin Zhang
- College of Eco-Environmental Engineering, Qinghai University, Xining 810016, China
|
23
|
Han Y, He J, Li M, Peng Y, Jiang H, Zhao J, Li Y, Deng F. Unlocking the Potential of Metagenomics with the PacBio High-Fidelity Sequencing Technology. Microorganisms 2024; 12:2482. [PMID: 39770685 PMCID: PMC11728442 DOI: 10.3390/microorganisms12122482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 11/28/2024] [Accepted: 11/29/2024] [Indexed: 01/16/2025] Open
Abstract
Traditional methods for studying microbial communities have been limited due to difficulties in culturing and sequencing all microbial species. Recent advances in third-generation sequencing technologies, particularly PacBio's high-fidelity (HiFi) sequencing, have significantly advanced metagenomics by providing accurate long-read sequences. This review explores the role of HiFi sequencing in overcoming the limitations of previous sequencing methods, including high error rates and fragmented assemblies. We discuss the benefits and applications of HiFi sequencing across various environments, such as the human gut and soil, which provides broader context for further exploration. Key studies are discussed to highlight HiFi sequencing's ability to recover complete and coherent microbial genomes from complex microbiomes, showcasing its superior accuracy and continuity compared to other sequencing technologies. Additionally, we explore the potential applications of HiFi sequencing in quantitative microbial analysis, as well as the detection of single nucleotide variations (SNVs) and structural variations (SVs). PacBio HiFi sequencing is establishing a new benchmark in metagenomics, with the potential to significantly enhance our understanding of microbial ecology and drive forward advancements in both environmental and clinical applications.
Affiliation(s)
- Yanhua Han
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - Jinling He
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - Minghui Li
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - Yunjuan Peng
- College of Animal Science, South China Agricultural University, Guangzhou 510642, China
| | - Hui Jiang
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - Jiangchao Zhao
- College of Animal Science, South China Agricultural University, Guangzhou 510642, China
| | - Ying Li
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - Feilong Deng
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
- School of Life Science and Engineering, Foshan University, Foshan 528225, China
|
24
|
Aplakidou E, Vergoulidis N, Chasapi M, Venetsianou NK, Kokoli M, Panagiotopoulou E, Iliopoulos I, Karatzas E, Pafilis E, Georgakopoulos-Soares I, Kyrpides NC, Pavlopoulos GA, Baltoumas FA. Visualizing metagenomic and metatranscriptomic data: A comprehensive review. Comput Struct Biotechnol J 2024; 23:2011-2033. [PMID: 38765606 PMCID: PMC11101950 DOI: 10.1016/j.csbj.2024.04.060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 04/25/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024] Open
Abstract
The fields of metagenomics and metatranscriptomics involve the examination of complete nucleotide sequences, gene identification, and analysis of potential biological functions within diverse organisms or environmental samples. Despite the vast opportunities for discovery in metagenomics, the sheer volume and complexity of sequence data often present challenges in processing, analysis, and visualization. This article highlights the critical role of advanced visualization tools in enabling effective exploration, querying, and analysis of these complex datasets. Emphasizing the importance of accessibility, the article categorizes various visualizers based on their intended applications and highlights their utility in empowering bioinformaticians and non-bioinformaticians to interpret and derive insights from meta-omics data effectively.
Affiliation(s)
- Eleni Aplakidou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Nikolaos Vergoulidis
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Maria Chasapi
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Nefeli K. Venetsianou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Maria Kokoli
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
| | - Eleni Panagiotopoulou
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Department of Informatics and Telecommunications, Data Science and Information Technologies program, University of Athens, 15784 Athens, Greece
| | - Ioannis Iliopoulos
- Department of Basic Sciences, School of Medicine, University of Crete, 71003 Heraklion, Greece
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Nikos C. Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Georgios A. Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Center of New Biotechnologies & Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Greece
- Hellenic Army Academy, 16673 Vari, Greece
| | - Fotis A. Baltoumas
- Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece
|
25
|
Nickols WA, McIver LJ, Walsh A, Zhang Y, Nearing JT, Asnicar F, Punčochář M, Segata N, Nguyen LH, Hartmann EM, Franzosa EA, Huttenhower C, Thompson KN. Evaluating metagenomic analyses for undercharacterized environments: what's needed to light up the microbial dark matter? bioRxiv 2024:2024.11.08.622677. [PMID: 39574575 PMCID: PMC11580994 DOI: 10.1101/2024.11.08.622677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Non-human-associated microbial communities play important biological roles, but they remain less understood than human-associated communities. Here, we assess the impact of key environmental sample properties on a variety of state-of-the-art metagenomic analysis methods. In simulated datasets, all methods performed similarly at high taxonomic ranks, but newer marker-based methods incorporating metagenomic assembled genomes outperformed others at lower taxonomic levels. In real environmental data, taxonomic profiles assigned to the same sample by different methods showed little agreement at lower taxonomic levels, but the methods agreed better on community diversity estimates and estimates of the relationships between environmental parameters and microbial profiles.
Affiliation(s)
- William A. Nickols
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Lauren J. McIver
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Aaron Walsh
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Yancong Zhang
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jacob T. Nearing
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Francesco Asnicar
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Michal Punčochář
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Long H. Nguyen
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Erica M. Hartmann
- Department of Civil and Environmental Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Department of Medicine/Division of Pulmonary Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Eric A. Franzosa
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Curtis Huttenhower
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Immunology and Infectious Diseases, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Kelsey N. Thompson
- Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Harvard Chan Microbiome in Public Health Center, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
26
|
Probul N, Huang Z, Saak CC, Baumbach J, List M. AI in microbiome-related healthcare. Microb Biotechnol 2024; 17:e70027. [PMID: 39487766 PMCID: PMC11530995 DOI: 10.1111/1751-7915.70027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 09/23/2024] [Indexed: 11/04/2024] Open
Abstract
Artificial intelligence (AI) has the potential to transform clinical practice and healthcare. Following impressive advancements in fields such as computer vision and medical imaging, AI is poised to drive changes in microbiome-based healthcare while facing challenges specific to the field. This review describes the state-of-the-art use of AI in microbiome-related healthcare. It points out limitations across topics such as data handling, AI modelling and safeguarding patient privacy. Furthermore, we indicate how these current shortcomings could be overcome in the future and discuss the influence and opportunities of increasingly complex data on microbiome-based healthcare.
Affiliation(s)
- Niklas Probul
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Zihua Huang
- Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | | | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Markus List
- Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Munich Data Science Institute, Technical University of Munich, Garching, Germany
|
27
|
Schrago CG, Mello B. Challenges in Assembling the Dated Tree of Life. Genome Biol Evol 2024; 16:evae229. [PMID: 39475308 PMCID: PMC11523137 DOI: 10.1093/gbe/evae229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/15/2024] [Indexed: 11/02/2024] Open
Abstract
The assembly of a comprehensive and dated Tree of Life (ToL) remains one of the most formidable challenges in evolutionary biology. The complexity of life's history, involving both vertical and horizontal transmission of genetic information, defies its representation by a simple bifurcating phylogeny. With the advent of genome and metagenome sequencing, vast amounts of data have become available. However, employing this information for phylogeny and divergence time inference has introduced significant theoretical and computational hurdles. This perspective addresses some key methodological challenges in assembling the dated ToL, namely, the identification and classification of homologous genes, accounting for gene tree-species tree mismatch due to population-level processes along with duplication, loss, and horizontal gene transfer, and the accurate dating of evolutionary events. Ultimately, the success of this endeavor requires new approaches that integrate knowledge databases with optimized phylogenetic algorithms capable of managing complex evolutionary models.
Affiliation(s)
- Carlos G Schrago
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Beatriz Mello
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
|
28
|
Litchman E, Villéger S, Zinger L, Auguet JC, Thuiller W, Munoz F, Kraft NJB, Philippot L, Violle C. Refocusing the microbial rare biosphere concept through a functional lens. Trends Ecol Evol 2024; 39:923-936. [PMID: 38987022 DOI: 10.1016/j.tree.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 06/04/2024] [Accepted: 06/11/2024] [Indexed: 07/12/2024]
Abstract
The influential concept of the rare biosphere in microbial ecology has underscored the importance of taxa occurring at low abundances yet potentially playing key roles in communities and ecosystems. Here, we refocus the concept of rare biosphere through a functional trait-based lens and provide a framework to characterize microbial functional rarity, a combination of numerical scarcity across space or time and trait distinctiveness. We demonstrate how this novel interpretation of the rare biosphere, rooted in microbial functions, can enhance our mechanistic understanding of microbial community structure. It also sheds light on functionally distinct microbes, directing conservation efforts towards taxa harboring rare yet ecologically crucial functions.
Affiliation(s)
- Elena Litchman
- Department of Global Ecology, Carnegie Institution for Science, Stanford, CA, USA; Kellogg Biological Station, Michigan State University, Hickory Corners, MI, USA.
| | | | - Lucie Zinger
- Institut de Biologie de l'École Normale Supérieure (IBENS), École Normale Supérieure, CNRS, INSERM, PSL Université Paris, Paris, France; Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), UMR 5300, CNRS, Institut de Recherche pour le Développement (IRD), Toulouse INP, Université Toulouse 3 Paul Sabatier, Toulouse, France
| | | | - Wilfried Thuiller
- Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
| | - François Munoz
- Université Grenoble Alpes, CNRS, LIPhy, F-38000 Grenoble, France
| | - Nathan J B Kraft
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Laurent Philippot
- Université Bourgogne Franche-Comté, INRAE, Institut Agro Dijon, Agroecology, Dijon, France
| | - Cyrille Violle
- CEFE, Université Montpellier, CNRS, IRD, EPHE, Montpellier, France
|
29
|
Schmartz GP, Rehner J, Gund MP, Keller V, Molano LAG, Rupf S, Hannig M, Berger T, Flockerzi E, Seitz B, Fleser S, Schmitt-Grohé S, Kalefack S, Zemlin M, Kunz M, Götzinger F, Gevaerd C, Vogt T, Reichrath J, Diehl L, Hecksteden A, Meyer T, Herr C, Gurevich A, Krug D, Hegemann J, Bozhueyuek K, Gulder TAM, Fu C, Beemelmanns C, Schattenberg JM, Kalinina OV, Becker A, Unger M, Ludwig N, Seibert M, Stein ML, Hanna NL, Martin MC, Mahfoud F, Krawczyk M, Becker SL, Müller R, Bals R, Keller A. Decoding the diagnostic and therapeutic potential of microbiota using pan-body pan-disease microbiomics. Nat Commun 2024; 15:8261. [PMID: 39327438 PMCID: PMC11427559 DOI: 10.1038/s41467-024-52598-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 09/13/2024] [Indexed: 09/28/2024] Open
Abstract
The human microbiome emerges as a promising reservoir for diagnostic markers and therapeutics. Since host-associated microbiomes at various body sites differ and diseases do not occur in isolation, a comprehensive analysis strategy highlighting the full potential of microbiomes should include diverse specimen types and various diseases. To ensure robust data quality and comparability across specimen types and diseases, we employ standardized protocols to generate sequencing data from 1931 prospectively collected specimens, including from saliva, plaque, skin, throat, eye, and stool, with an average sequencing depth of 5.3 gigabases. Collected from 515 patients, these samples yield an average of 3.7 metagenomes per patient. Our results suggest significant microbial variations across diseases and specimen types, including unexpected anatomical sites. We identify 583 unexplored species-level genome bins (SGBs) of which 189 are significantly disease-associated. Of note, the existence of microbial resistance genes in one specimen was indicative of the same resistance genes in other specimens of the same patient. Annotated and previously undescribed SGBs collectively harbor 28,315 potential biosynthetic gene clusters (BGCs), with 1050 significant correlations to diseases. Our combinatorial approach identifies distinct SGBs and BGCs, emphasizing the value of pan-body pan-disease microbiomics as a source for diagnostic and therapeutic strategies.
Affiliation(s)
- Georges P Schmartz
- Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany
| | - Jacqueline Rehner
- Institute of Medical Microbiology and Hygiene, Saarland University, 66421, Homburg, Germany
| | - Madline P Gund
- Clinic of Operative Dentistry, Periodontology and Preventive Dentistry, Saarland University, 66421, Homburg, Germany
| | - Verena Keller
- Department of Medicine II, Saarland University Medical Center, 66421, Homburg, Germany
| | | | - Stefan Rupf
- Clinic of Operative Dentistry, Periodontology and Preventive Dentistry, Saarland University, 66421, Homburg, Germany
- Synoptic Dentistry, Saarland University, 66421, Homburg, Germany
| | - Matthias Hannig
- Clinic of Operative Dentistry, Periodontology and Preventive Dentistry, Saarland University, 66421, Homburg, Germany
| | - Tim Berger
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Elias Flockerzi
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Sara Fleser
- Department of General Pediatrics and Neonatology, Saarland University, 66421, Homburg, Germany
| | - Sabina Schmitt-Grohé
- Department of General Pediatrics and Neonatology, Saarland University, 66421, Homburg, Germany
| | - Sandra Kalefack
- Department of General Pediatrics and Neonatology, Saarland University, 66421, Homburg, Germany
| | - Michael Zemlin
- Department of General Pediatrics and Neonatology, Saarland University, 66421, Homburg, Germany
| | - Michael Kunz
- Department of Internal Medicine III, Cardiology, Angiology, Intensive Care Medicine, Saarland University Hospital, 66421, Homburg, Germany
| | - Felix Götzinger
- Department of Internal Medicine III, Cardiology, Angiology, Intensive Care Medicine, Saarland University Hospital, 66421, Homburg, Germany
| | - Caroline Gevaerd
- Clinic for Dermatology, Venereology, and Allergology, 66421, Homburg, Germany
| | - Thomas Vogt
- Clinic for Dermatology, Venereology, and Allergology, 66421, Homburg, Germany
| | - Jörg Reichrath
- Clinic for Dermatology, Venereology, and Allergology, 66421, Homburg, Germany
| | - Lisa Diehl
- Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany
| | - Anne Hecksteden
- Institute for Sport and Preventive Medicine, Saarland University, 66123, Saarbrücken, Germany
- Chair of Sports Medicine, Institute of Physiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Tim Meyer
- Institute for Sport and Preventive Medicine, Saarland University, 66123, Saarbrücken, Germany
| | - Christian Herr
- Department of Internal Medicine V - Pulmonology, Allergology, Intensive Care Medicine, Saarland University, Saarbrücken, Germany
| | - Alexey Gurevich
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
- Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, 66123, Saarbrücken, Germany
| | - Daniel Krug
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
| | - Julian Hegemann
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
- Department of Pharmacy, Saarland University, 66123, Saarbrücken, Germany
| | - Kenan Bozhueyuek
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
| | - Tobias A M Gulder
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
- Department of Pharmacy, Saarland University, 66123, Saarbrücken, Germany
| | - Chengzhang Fu
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
| | - Christine Beemelmanns
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
| | - Jörn M Schattenberg
- Department of Medicine II, Saarland University Medical Center, 66421, Homburg, Germany
| | - Olga V Kalinina
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
| | - Anouck Becker
- Department for Neurology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Marcus Unger
- Department for Neurology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Nicole Ludwig
- Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany
| | - Martina Seibert
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Marie-Louise Stein
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Nikolas Loka Hanna
- Department of Internal Medicine V - Pulmonology, Allergology, Intensive Care Medicine, Saarland University, Saarbrücken, Germany
| | - Marie-Christin Martin
- Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany
| | - Felix Mahfoud
- Department of Internal Medicine III, Cardiology, Angiology, Intensive Care Medicine, Saarland University Hospital, 66421, Homburg, Germany
| | - Marcin Krawczyk
- Department of Medicine II, Saarland University Medical Center, 66421, Homburg, Germany
| | - Sören L Becker
- Institute of Medical Microbiology and Hygiene, Saarland University, 66421, Homburg, Germany
| | - Rolf Müller
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
- PharmaScienceHub, 66123, Saarbrücken, Germany
| | - Robert Bals
- Department of Internal Medicine V - Pulmonology, Allergology, Intensive Care Medicine, Saarland University, Saarbrücken, Germany
- PharmaScienceHub, 66123, Saarbrücken, Germany
| | - Andreas Keller
- Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany
- Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany
- PharmaScienceHub, 66123, Saarbrücken, Germany
|
30
|
MacGregor H, Fukai I, Ash K, Arkin AP, Hazen TC. Potential applications of microbial genomics in nuclear non-proliferation. Front Microbiol 2024; 15:1410820. [PMID: 39360321 PMCID: PMC11445143 DOI: 10.3389/fmicb.2024.1410820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2024] [Accepted: 09/04/2024] [Indexed: 10/04/2024] Open
Abstract
As nuclear technology evolves in response to increased demand for diversification and decarbonization of the energy sector, new and innovative approaches are needed to effectively identify and deter the proliferation of nuclear arms, while ensuring safe development of global nuclear energy resources. Preventing the use of nuclear material and technology for unsanctioned development of nuclear weapons has been a long-standing challenge for the International Atomic Energy Agency and signatories of the Treaty on the Non-Proliferation of Nuclear Weapons. Environmental swipe sampling has proven to be an effective technique for characterizing clandestine proliferation activities within and around known locations of nuclear facilities and sites. However, limited tools and techniques exist for detecting nuclear proliferation in unknown locations beyond the boundaries of declared nuclear fuel cycle facilities, representing a critical gap in non-proliferation safeguards. Microbiomes, defined as "characteristic communities of microorganisms" found in specific habitats with distinct physical and chemical properties, can provide valuable information about the conditions and activities occurring in the surrounding environment. Microorganisms are known to inhabit radionuclide-contaminated sites, spent nuclear fuel storage pools, and cooling systems of water-cooled nuclear reactors, where they can cause radionuclide migration and corrosion of critical structures. Microbial transformation of radionuclides is a well-established process that has been documented in numerous field and laboratory studies. These studies helped to identify key bacterial taxa and microbially-mediated processes that directly and indirectly control the transformation, mobility, and fate of radionuclides in the environment. 
Expanding on this work, other studies have used microbial genomics integrated with machine learning models to successfully monitor and predict the occurrence of heavy metals, radionuclides, and other process wastes in the environment, indicating the potential role of nuclear activities in shaping microbial community structure and function. Results of this previous body of work suggest fundamental geochemical-microbial interactions occurring at nuclear fuel cycle facilities could give rise to microbiomes that are characteristic of nuclear activities. These microbiomes could provide valuable information for monitoring nuclear fuel cycle facilities, planning environmental sampling campaigns, and developing biosensor technology for the detection of undisclosed fuel cycle activities and proliferation concerns.
Affiliation(s)
| | - Isis Fukai
- Bredesen Center, University of Tennessee, Knoxville, TN, United States
| | - Kurt Ash
- Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, TN, United States
| | - Adam Paul Arkin
- University of California, Berkeley, Berkeley, CA, United States
| | - Terry C. Hazen
- Bredesen Center, University of Tennessee, Knoxville, TN, United States
- Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, TN, United States
- Department of Microbiology, University of Tennessee, Knoxville, TN, United States
- Department of Earth and Planetary Sciences, University of Tennessee, Knoxville, TN, United States
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
|
31
|
Mazur-Marzec H, Andersson AF, Błaszczyk A, Dąbek P, Górecka E, Grabski M, Jankowska K, Jurczak-Kurek A, Kaczorowska AK, Kaczorowski T, Karlson B, Kataržytė M, Kobos J, Kotlarska E, Krawczyk B, Łuczkiewicz A, Piwosz K, Rybak B, Rychert K, Sjöqvist C, Surosz W, Szymczycha B, Toruńska-Sitarz A, Węgrzyn G, Witkowski A, Węgrzyn A. Biodiversity of microorganisms in the Baltic Sea: the power of novel methods in the identification of marine microbes. FEMS Microbiol Rev 2024; 48:fuae024. [PMID: 39366767 PMCID: PMC11500664 DOI: 10.1093/femsre/fuae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2024] [Revised: 09/21/2024] [Accepted: 10/03/2024] [Indexed: 10/06/2024] Open
Abstract
Until recently, data on the diversity of the entire microbial community from the Baltic Sea were scarce. However, modern molecular methods have provided new insights into this field with interesting results. They can be summarized as follows. (i) Although low salinity causes a reduction in the biodiversity of multicellular species relative to the populations of the North-East Atlantic, no such reduction occurs in bacterial diversity. (ii) Among cyanobacteria, the picocyanobacterial group dominates when considering gene abundance, while filamentous cyanobacteria dominate in terms of biomass. (iii) The diversity of diatoms and dinoflagellates is significantly larger than described a few decades ago; however, molecular studies on these groups are still scarce. (iv) Knowledge gaps in other protistan communities are evident. (v) Salinity is the main limiting parameter of pelagic fungal community composition, while the benthic fungal diversity is shaped by water depth, salinity, and sediment C and N availability. (vi) Bacteriophages are the predominant group of viruses, while among viruses infecting eukaryotic hosts, Phycodnaviridae are the most abundant; the Baltic Sea virome is contaminated with viruses originating from urban and/or industrial habitats. These features make the Baltic Sea microbiome specific and unique among other marine environments.
Affiliation(s)
- Hanna Mazur-Marzec
- Department of Marine Biology and Biotechnology, University of Gdansk, Al. Piłsudskiego 46, PL-81-378 Gdynia, Poland
| | - Anders F Andersson
- Department of Gene Technology, KTH Royal Institute of Technology, Science for Life Laboratory, Tomtebodavägen 23A, SE-171 65 Solna, Stockholm, Sweden
| | - Agata Błaszczyk
- Department of Marine Biology and Biotechnology, University of Gdansk, Al. Piłsudskiego 46, PL-81-378 Gdynia, Poland
| | - Przemysław Dąbek
- Institute of Marine and Environmental Sciences, University of Szczecin, Mickiewicza 16a, PL-70-383 Szczecin, Poland
| | - Ewa Górecka
- Institute of Marine and Environmental Sciences, University of Szczecin, Mickiewicza 16a, PL-70-383 Szczecin, Poland
| | - Michał Grabski
- International Centre for Cancer Vaccine Science, University of Gdansk, Kładki 24, 80-822 Gdansk, Poland
| | - Katarzyna Jankowska
- Department of Environmental Engineering Technology, Gdansk University of Technology, Narutowicza 11/12, PL-80-233 Gdansk, Poland
| | - Agata Jurczak-Kurek
- Department of Evolutionary Genetics and Biosystematics, University of Gdansk, Wita Stwosza 59, PL-80-308 Gdansk, Poland
| | - Anna K Kaczorowska
- Collection of Plasmids and Microorganisms, University of Gdansk, Wita Stwosza 59, PL-80-308 Gdansk, Poland
| | - Tadeusz Kaczorowski
- Laboratory of Extremophiles Biology, Department of Microbiology, University of Gdansk, Wita Stwosza 59, PL-80-308 Gdansk, Poland
| | - Bengt Karlson
- Swedish Meteorological and Hydrological Institute, Research and Development, Oceanography, Göteborgseskaderns plats 3, Västra Frölunda SE-426 71, Sweden
| | - Marija Kataržytė
- Marine Research Institute, Klaipėda University, Universiteto ave. 17, LT-92294 Klaipeda, Lithuania
| | - Justyna Kobos
- Department of Marine Biology and Biotechnology, University of Gdansk, Al. Piłsudskiego 46, PL-81-378 Gdynia, Poland
| | - Ewa Kotlarska
- Institute of Oceanology, Polish Academy of Sciences, Powstańców Warszawy 55, PL-81-712 Sopot, Poland
| | - Beata Krawczyk
- Department of Biotechnology and Microbiology, Gdansk University of Technology, Narutowicza 11/12, PL-80-233 Gdansk, Poland
| | - Aneta Łuczkiewicz
- Department of Environmental Engineering Technology, Gdansk University of Technology, Narutowicza 11/12, PL-80-233 Gdansk, Poland
| | - Kasia Piwosz
- National Marine Fisheries Research Institute, Kołłątaja 1, PL-81-332 Gdynia, Poland
| | - Bartosz Rybak
- Department of Environmental Toxicology, Faculty of Health Sciences with Institute of Maritime and Tropical Medicine, Medical University of Gdansk, Dębowa 23A, PL-80-204 Gdansk, Poland
| | - Krzysztof Rychert
- Pomeranian University in Słupsk, Arciszewskiego 22a, PL-76-200 Słupsk, Poland
| | - Conny Sjöqvist
- Environmental and Marine Biology, Åbo Akademi University, Henriksgatan 2, FI-20500 Åbo, Finland
| | - Waldemar Surosz
- Department of Marine Biology and Biotechnology, University of Gdansk, Al. Piłsudskiego 46, PL-81-378 Gdynia, Poland
| | - Beata Szymczycha
- Institute of Oceanology, Polish Academy of Sciences, Powstańców Warszawy 55, PL-81-712 Sopot, Poland
| | - Anna Toruńska-Sitarz
- Department of Marine Biology and Biotechnology, University of Gdansk, Al. Piłsudskiego 46, PL-81-378 Gdynia, Poland
| | - Grzegorz Węgrzyn
- Department of Molecular Biology, University of Gdansk, Wita Stwosza 59, PL-80-308 Gdansk, Poland
| | - Andrzej Witkowski
- Institute of Marine and Environmental Sciences, University of Szczecin, Mickiewicza 16a, PL-70-383 Szczecin, Poland
| | - Alicja Węgrzyn
- University Center for Applied and Interdisciplinary Research, University of Gdansk, Kładki 24, 80-822 Gdansk, Poland
|
32
|
Sardar P, Almeida A, Pedicord VA. Integrating functional metagenomics to decipher microbiome-immune interactions. Immunol Cell Biol 2024; 102:680-691. [PMID: 38952337 DOI: 10.1111/imcb.12798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2024] [Revised: 06/04/2024] [Accepted: 06/13/2024] [Indexed: 07/03/2024]
Abstract
Microbial metabolites can be viewed as the cytokines of the microbiome, transmitting information about the microbial and metabolic environment of the gut to orchestrate and modulate local and systemic immune responses. Still, many immunology studies focus solely on the taxonomy and community structure of the gut microbiota rather than its functions. Early sequencing-based microbiota profiling approaches relied on PCR amplification of small regions of bacterial and fungal genomes to facilitate identification of the microbes present. However, recent microbiome analysis methods, particularly shotgun metagenomic sequencing, now enable culture-independent profiling of microbiome functions and metabolites in addition to taxonomic characterization. In this review, we showcase recent advances in functional metagenomics methods and applications and discuss the current limitations and potential avenues for future development. Importantly, we highlight a few examples of key areas of opportunity in immunology research where integrating functional metagenomic analyses of the microbiome can substantially enhance a mechanistic understanding of microbiome-immune interactions and their contributions to health and disease states.
Affiliation(s)
- Puspendu Sardar
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge, UK
| | - Alexandre Almeida
- Department of Veterinary Medicine, University of Cambridge School of Biological Sciences, Cambridge, UK
| | - Virginia A Pedicord
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge, UK
|
33
|
Chung HC, Friedberg I, Bromberg Y. Assembling bacterial puzzles: piecing together functions into microbial pathways. NAR Genom Bioinform 2024; 6:lqae109. [PMID: 39184378 PMCID: PMC11344244 DOI: 10.1093/nargab/lqae109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2024] [Revised: 07/24/2024] [Accepted: 08/07/2024] [Indexed: 08/27/2024] Open
Abstract
Functional metagenomics enables the study of unexplored bacterial diversity, gene families, and pathways essential to microbial communities. However, discovering biological insights with these data is impeded by the scarcity of quality annotations. Here, we use a co-occurrence-based analysis of predicted microbial protein functions to uncover pathways in genomic and metagenomic biological systems. Our approach, based on phylogenetic profiles, improves the identification of functional relationships, or participation in the same biochemical pathway, between enzymes over a comparable homology-based approach. We optimized the design of our profiles to identify potential pathways using minimal data, clustered functionally related enzyme pairs into multi-enzymatic pathways, and evaluated our predictions against reference pathways in the KEGG database. We then demonstrated a novel extension of this approach to predict inter-bacterial protein interactions amongst members of a marine microbiome. Most significantly, we show our method predicts emergent biochemical pathways between known and unknown functions. Thus, our work establishes a basis for identifying the potential functional capacities of the entire metagenome, capturing previously unknown and abstract functions into discrete putative pathways.
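The co-occurrence idea in this abstract can be illustrated with a minimal sketch. It assumes each enzyme function is represented by a presence/absence profile (the set of genomes in which it is predicted) and scores pairs with Jaccard similarity. The abstract does not state the profile design or scoring measure the authors used, so the measure, the threshold, the function names, and the toy data below are all hypothetical.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence profiles (sets of genome IDs)."""
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


def cooccurring_pairs(profiles, threshold=0.6):
    """Return (function_a, function_b, score) for pairs scoring at or above threshold.

    `threshold` is an arbitrary illustrative cutoff, not a value from the paper.
    """
    pairs = []
    for (fa, pa), (fb, pb) in combinations(sorted(profiles.items()), 2):
        score = jaccard(pa, pb)
        if score >= threshold:
            pairs.append((fa, fb, score))
    return pairs


# Hypothetical toy data: function label -> genomes where it is predicted.
profiles = {
    "EC:1.1.1.1": {"g1", "g2", "g3", "g4"},
    "EC:1.2.1.3": {"g1", "g2", "g3"},
    "EC:2.7.1.1": {"g5", "g6"},
}

pairs = cooccurring_pairs(profiles)
for fa, fb, score in pairs:
    print(f"{fa} ~ {fb}: {score:.2f}")
```

Pairs that pass the cutoff could then be clustered into candidate multi-enzyme pathways, which is the step the abstract describes evaluating against KEGG reference pathways.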
Affiliation(s)
- Henri C Chung
- Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - Yana Bromberg
- Department of Computer Science, Emory University, Atlanta, GA 30307, USA
- Department of Biology, Emory University, Atlanta, GA 30322, USA
|
34
|
Tian F, Wainaina JM, Howard-Varona C, Domínguez-Huerta G, Bolduc B, Gazitúa MC, Smith G, Gittrich MR, Zablocki O, Cronin DR, Eveillard D, Hallam SJ, Sullivan MB. Prokaryotic-virus-encoded auxiliary metabolic genes throughout the global oceans. Microbiome 2024; 12:159. [PMID: 39198891 PMCID: PMC11360552 DOI: 10.1186/s40168-024-01876-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/06/2024] [Accepted: 07/16/2024] [Indexed: 09/01/2024]
Abstract
BACKGROUND Prokaryotic microbes have impacted marine biogeochemical cycles for billions of years. Viruses also impact these cycles, through lysis, horizontal gene transfer, and encoding and expressing genes that contribute to metabolic reprogramming of prokaryotic cells. While this impact is difficult to quantify in nature, we hypothesized that it can be examined by surveying virus-encoded auxiliary metabolic genes (AMGs) and assessing their ecological context. RESULTS We systematically developed a global ocean AMG catalog by integrating previously described and newly identified AMGs and then placed this catalog into ecological and metabolic contexts relevant to ocean biogeochemistry. From 7.6 terabases of Tara Oceans paired prokaryote- and virus-enriched metagenomic sequence data, we increased known ocean virus populations to 579,904 (up 16%). From these virus populations, we then conservatively identified 86,913 AMGs that grouped into 22,779 sequence-based gene clusters, 7248 (~ 32%) of which were not previously reported. Using our catalog and modeled data from mock communities, we estimate that ~ 19% of ocean virus populations carry at least one AMG. To understand AMGs in their metabolic context, we identified 340 metabolic pathways encoded by ocean microbes and showed that AMGs map to 128 of them. Furthermore, we identified metabolic "hot spots" targeted by virus AMGs, including nine pathways where most steps (≥ 0.75) were AMG-targeted (involved in carbohydrate, amino acid, fatty acid, and nucleotide metabolism), as well as other pathways where virus-encoded AMGs outnumbered cellular homologs (involved in lipid A phosphates, phosphatidylethanolamine, creatine biosynthesis, phosphoribosylamine-glycine ligase, and carbamoyl-phosphate synthase pathways). 
CONCLUSIONS Together, this systematically curated, global ocean AMG catalog and analyses provide a valuable resource and foundational observations to understand the role of viruses in modulating global ocean metabolisms and their biogeochemical implications. Video Abstract.
Affiliation(s)
- Funing Tian
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- Department of Medicine, The University of Chicago, Chicago, IL, USA
| | - James M Wainaina
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA, USA
| | - Cristina Howard-Varona
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
| | - Guillermo Domínguez-Huerta
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- EMERGE Biology Integration Institute, Ohio State University, Columbus, OH, 43210, USA
- Centro Oceanográfico de Málaga (IEO-CSIC), Puerto Pesquero S/N, 29640, Fuengirola (Málaga), Spain
| | - Benjamin Bolduc
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- EMERGE Biology Integration Institute, Ohio State University, Columbus, OH, 43210, USA
| | | | - Garrett Smith
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
| | - Marissa R Gittrich
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
| | - Olivier Zablocki
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
| | - Dylan R Cronin
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- EMERGE Biology Integration Institute, Ohio State University, Columbus, OH, 43210, USA
| | - Damien Eveillard
- Université de Nantes, CNRS, LS2N, Nantes, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, R2022/Tara GO-SEE, Paris, France
| | - Steven J Hallam
- Department of Microbiology & Immunology, University of British Columbia, Vancouver, BC, V6T 1Z1, Canada
- Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Genome Science and Technology Program, University of British Columbia, 2329 West Mall, Vancouver, BC, V6T 1Z4, Canada
- Life Sciences Institute, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- ECOSCOPE Training Program, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
| | - Matthew B Sullivan
- Department of Microbiology, Ohio State University, Columbus, OH, 43210, USA
- Center of Microbiome Science, Ohio State University, Columbus, OH, 43210, USA
- EMERGE Biology Integration Institute, Ohio State University, Columbus, OH, 43210, USA
- Department of Civil, Environmental, and Geodetic Engineering, Ohio State University, Columbus, OH, 43210, USA
|
35
|
Margelevičius M. GTalign: spatial index-driven protein structure alignment, superposition, and search. Nat Commun 2024; 15:7305. [PMID: 39181863 PMCID: PMC11344802 DOI: 10.1038/s41467-024-51669-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2024] [Accepted: 08/14/2024] [Indexed: 08/27/2024] Open
Abstract
With protein databases growing rapidly due to advances in structural and computational biology, the ability to accurately align and rapidly search protein structures has become essential for biological research. In response to the challenge posed by vast protein structure repositories, GTalign offers an innovative solution to protein structure alignment and search: an algorithm that achieves optimal superposition at high speeds. Through the design and implementation of spatial structure indexing, GTalign parallelizes all stages of superposition search across residues and protein structure pairs, yielding rapid identification of optimal superpositions. Rigorous evaluation across diverse datasets reveals GTalign as the most accurate among structure aligners while presenting orders of magnitude in speedup at state-of-the-art accuracy. GTalign's high speed and accuracy make it useful for numerous applications, including functional inference, evolutionary analyses, protein design, and drug discovery, contributing to advancing understanding of protein structure and function.
|
36
|
Chakravarty D, Schafer JW, Chen EA, Thole JF, Ronish LA, Lee M, Porter LL. AlphaFold predictions of fold-switched conformations are driven by structure memorization. Nat Commun 2024; 15:7296. [PMID: 39181864 PMCID: PMC11344769 DOI: 10.1038/s41467-024-51801-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 08/19/2024] [Indexed: 08/27/2024] Open
Abstract
Recent work suggests that AlphaFold (AF), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. We find that (1) AF is a weak predictor of fold switching and (2) some of its successes result from memorization of training-set structures rather than learned protein energetics. Combining >280,000 models from several implementations of AF2 and AF3, a 35% success rate was achieved for fold switchers likely in AF's training sets. AF2's confidence metrics selected against models consistent with experimentally determined fold-switching structures and failed to discriminate between low and high energy conformations. Further, AF captured only one out of seven experimentally confirmed fold switchers outside of its training sets despite extensive sampling of an additional ~280,000 models. Several observations indicate that AF2 has memorized structural information during training, and AF3 misassigns coevolutionary restraints. These limitations constrain the scope of successful predictions, highlighting the need for physically based methods that readily predict multiple protein conformations.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Joseph W Schafer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Ethan A Chen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Joseph F Thole
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Leslie A Ronish
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Myeongsang Lee
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Lauren L Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
|
37
|
Flinkstrom Z, Bryson S, Candry P, Winkler MKH. Metagenomic clustering links specific metabolic functions to globally relevant ecosystems. mSystems 2024; 9:e0057324. [PMID: 38980052 PMCID: PMC11334424 DOI: 10.1128/msystems.00573-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Accepted: 06/12/2024] [Indexed: 07/10/2024] Open
Abstract
Metagenomic sequencing has advanced our understanding of biogeochemical processes by providing an unprecedented view into the microbial composition of different ecosystems. While the amount of metagenomic data has grown rapidly, simple-to-use methods to analyze and compare across studies have lagged behind. Thus, tools expressing the metabolic traits of a community are needed to broaden the utility of existing data. Gene abundance profiles are a relatively low-dimensional embedding of a metagenome's functional potential and are, thus, tractable for comparison across many samples. Here, we compare the abundance of KEGG Ortholog Groups (KOs) from 6,539 metagenomes from the Joint Genome Institute's Integrated Microbial Genomes and Metagenomes (JGI IMG/M) database. We find that samples cluster into terrestrial, aquatic, and anaerobic ecosystems with marker KOs reflecting adaptations to these environments. For instance, functional clusters were differentiated by the metabolism of antibiotics, photosynthesis, methanogenesis, and surprisingly GC content. Using this functional gene approach, we reveal the broad-scale patterns shaping microbial communities and demonstrate the utility of ortholog abundance profiles for representing a rapidly expanding body of metagenomic data. IMPORTANCE Metagenomics, or the sequencing of DNA from complex microbiomes, provides a view into the microbial composition of different environments. Metagenome databases were created to compile sequencing data across studies, but it remains challenging to compare and gain insight from these large data sets. Consequently, there is a need to develop accessible approaches to extract knowledge across metagenomes. The abundance of different orthologs (i.e., genes that perform a similar function across species) provides a simplified representation of a metagenome's metabolic potential that can easily be compared with others. 
In this study, we cluster the ortholog abundance profiles of thousands of metagenomes from diverse environments and uncover the traits that distinguish them. This work provides a simple-to-use framework for functional comparison and advances our understanding of how the environment shapes microbial communities.
Affiliation(s)
- Zachary Flinkstrom
- Department of Civil and Environmental Engineering, University of Washington, Seattle, Washington, USA
| | | | - Pieter Candry
- Department of Civil and Environmental Engineering, University of Washington, Seattle, Washington, USA
- Laboratory of Systems & Synthetic Biology, Wageningen University & Research, Wageningen, Netherlands
| | - Mari-Karoliina H. Winkler
- Department of Civil and Environmental Engineering, University of Washington, Seattle, Washington, USA
|
38
|
Espinoza JL, Phillips A, Prentice MB, Tan GS, Kamath PL, Lloyd KG, Dupont CL. Unveiling the microbial realm with VEBA 2.0: a modular bioinformatics suite for end-to-end genome-resolved prokaryotic, (micro)eukaryotic and viral multi-omics from either short- or long-read sequencing. Nucleic Acids Res 2024; 52:e63. [PMID: 38909293 DOI: 10.1093/nar/gkae528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2024] [Revised: 05/21/2024] [Accepted: 06/10/2024] [Indexed: 06/24/2024] Open
Abstract
The microbiome is a complex community of microorganisms, encompassing prokaryotic (bacterial and archaeal), eukaryotic, and viral entities. This microbial ensemble plays a pivotal role in influencing the health and productivity of diverse ecosystems while shaping the web of life. However, many software suites developed to study microbiomes analyze only the prokaryotic community and provide limited to no support for viruses and microeukaryotes. Previously, we introduced the Viral Eukaryotic Bacterial Archaeal (VEBA) open-source software suite to address this critical gap in microbiome research by extending genome-resolved analysis beyond prokaryotes to encompass the understudied realms of eukaryotes and viruses. Here we present VEBA 2.0 with key updates including a comprehensive clustered microeukaryotic protein database, rapid genome/protein-level clustering, bioprospecting, non-coding/organelle gene modeling, genome-resolved taxonomic/pathway profiling, long-read support, and containerization. We demonstrate VEBA's versatile application through the analysis of diverse case studies including marine water, Siberian permafrost, and white-tailed deer lung tissues with the latter showcasing how to identify integrated viruses. VEBA represents a crucial advancement in microbiome research, offering a powerful and accessible software suite that bridges the gap between genomics and biotechnological solutions.
Affiliation(s)
- Josh L Espinoza
- Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
| | - Allan Phillips
- Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
| | - Melanie B Prentice
- School of Food and Agriculture, University of Maine, Orono, ME 04469, USA
| | - Gene S Tan
- Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
| | - Pauline L Kamath
- School of Food and Agriculture, University of Maine, Orono, ME 04469, USA
- Maine Center for Genetics in the Environment, University of Maine, Orono, ME 04469, USA
| | - Karen G Lloyd
- Microbiology Department, University of Tennessee, Knoxville, TN 37917, USA
| | - Chris L Dupont
- Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
| |
|
39
|
Mugabe D, Yoosefzadeh-Najafabadi M, Rajcan I. Genetic diversity and genome-wide association study of partial resistance to Sclerotinia stem rot in a Canadian soybean germplasm panel. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:201. [PMID: 39127987 DOI: 10.1007/s00122-024-04708-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 07/31/2024] [Indexed: 08/12/2024]
Abstract
KEY MESSAGE: Developing genetically resistant soybean cultivars is key in controlling the destructive Sclerotinia stem rot (SSR) disease. Here, a GWAS in Canadian soybeans identified potential marker-trait associations and candidate genes, paving the way for more efficient breeding methods for SSR. Sclerotinia stem rot (SSR), caused by the fungal pathogen Sclerotinia sclerotiorum, is one of the most important diseases leading to significant soybean yield losses in Canada and worldwide. Developing soybean cultivars that are genetically resistant to the disease is the most inexpensive and reliable method to control it. However, breeding for resistance is hampered by the highly complex nature of genetic resistance to SSR in soybean. This study sought to understand the genetic basis underlying SSR resistance, particularly in soybean grown in Canada. Consequently, a panel of 193 genotypes was assembled based on maturity group and genetic diversity as representative of Canadian soybean cultivars. Plants were inoculated and screened for SSR resistance in controlled environments, where variation for SSR phenotypic response was observed. The panel was also genotyped via genotyping-by-sequencing, and the resulting genotypic data were imputed using BEAGLE v5, leading to a catalogue of 417 K SNPs. Through genome-wide association analyses (GWAS) using the FarmCPU method with a threshold of FDR-adjusted p-values < 0.1, we identified significant SNPs on chromosomes 2 and 9 with allele effects of 16.1 and 14.3, respectively. Further analysis identified three potential candidate genes linked to SSR disease resistance within a 100 Kb window surrounding each of the peak SNPs. Our results will be important in developing molecular markers that can speed up the breeding for SSR resistance in Canadian-grown soybean.
Affiliation(s)
- Deus Mugabe
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada
| | | | - Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada.
| |
|
40
|
Dickson A, Mofrad MRK. Fine-tuning protein embeddings for functional similarity evaluation. Bioinformatics 2024; 40:btae445. [PMID: 38985218 PMCID: PMC11299545 DOI: 10.1093/bioinformatics/btae445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 06/25/2024] [Accepted: 07/09/2024] [Indexed: 07/11/2024] Open
Abstract
MOTIVATION: Proteins with unknown function are frequently compared to better characterized relatives, either using sequence similarity, or recently through similarity in a learned embedding space. Through comparison, protein sequence embeddings allow for interpretable and accurate annotation of proteins, as well as for downstream tasks such as clustering for unsupervised discovery of protein families. However, it is unclear whether embeddings can be deliberately designed to improve their use in these downstream tasks. RESULTS: We find that for functional annotation of proteins, as represented by Gene Ontology (GO) terms, direct fine-tuning of language models on a simple classification loss has an immediate positive impact on protein embedding quality. Fine-tuned embeddings show stronger performance as representations for K-nearest neighbor classifiers, reaching stronger performance for GO annotation than even directly comparable fine-tuned classifiers, while maintaining interpretability through protein similarity comparisons. They also maintain their quality in related tasks, such as rediscovering protein families with clustering. AVAILABILITY AND IMPLEMENTATION: github.com/mofradlab/go_metric.
Affiliation(s)
- Andrew Dickson
- Departments of Bioengineering and Mechanical Engineering, Molecular Cell Biomechanics Laboratory, University of California, Berkeley, CA 94720, United States
| | - Mohammad R K Mofrad
- Departments of Bioengineering and Mechanical Engineering, Molecular Cell Biomechanics Laboratory, University of California, Berkeley, CA 94720, United States
| |
|
41
|
Chen L, Chen A, Zhang XD, Saenz Robles MT, Han HS, Xiao Y, Xiao G, Pipas JM, Weitz DA. Targeted whole-genome recovery of single viral species in a complex environmental sample. Proc Natl Acad Sci U S A 2024; 121:e2404727121. [PMID: 39052829 PMCID: PMC11295033 DOI: 10.1073/pnas.2404727121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 06/07/2024] [Indexed: 07/27/2024] Open
Abstract
Characterizing unknown viruses is essential for understanding viral ecology and preparing against viral outbreaks. Recovering complete genome sequences from environmental samples remains computationally challenging using metagenomics, especially for low-abundance species with uneven coverage. We present an experimental method for reliably recovering complete viral genomes from complex environmental samples. Individual genomes are encapsulated into droplets and amplified using multiple displacement amplification. A unique gene detection assay, which employs an RNA-based probe and an exonuclease, selectively identifies droplets containing the target viral genome. Labeled droplets are sorted using a microfluidic sorter, and genomes are extracted for sequencing. We demonstrate this method's efficacy by spiking two known viral genomes, Simian virus 40 (SV40, 5,243 bp) and Human Adenovirus 5 (HAd5, 35,938 bp), into a sewage sample with a final abundance in the droplets of around 0.1% and 0.015%, respectively. We achieve 100% recovery of the complete sequence of the spiked-in SV40 genome with uniform coverage distribution. For the larger HAd5 genome, we cover approximately 99.4% of its sequence. Notably, genome recovery is achieved with as few as one sorted droplet, which enables the recovery of any desired genomes in complex environmental samples, regardless of their abundance. This method enables single-genome whole-genome amplification and targeted characterization of rare viral species and will facilitate our ability to access the mutational profile in single-virus genomes and contribute to an improved understanding of viral ecology.
Affiliation(s)
- Liyin Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Anqi Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Xinge Diana Zhang
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | | | - Hee-Sun Han
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801
| | - Yi Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Gao Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - James M. Pipas
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
| | - David A. Weitz
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
- Department of Physics, Harvard University, Cambridge, MA 02138
| |
|
42
|
Vakirlis N, Kupczok A. Large-scale investigation of species-specific orphan genes in the human gut microbiome elucidates their evolutionary origins. Genome Res 2024; 34:888-903. [PMID: 38977308 PMCID: PMC11293555 DOI: 10.1101/gr.278977.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 06/12/2024] [Indexed: 07/10/2024]
Abstract
Species-specific genes, also known as orphans, are ubiquitous across life's domains. In prokaryotes, species-specific orphan genes (SSOGs) are mostly thought to originate in external elements such as viruses followed by horizontal gene transfer, whereas the scenario of native origination, through rapid divergence or de novo, is mostly dismissed. However, quantitative evidence supporting either scenario is lacking. Here, we systematically analyzed genomes from 4644 human gut microbiome species and identified more than 600,000 unique SSOGs, representing an average of 2.6% of a given species' pangenome. These sequences are mostly rare within each species yet show signs of purifying selection. Overall, SSOGs use optimal codons less frequently, and their proteins are more disordered than those of conserved genes (i.e., non-SSOGs). Importantly, across species, the GC content of SSOGs closely matches that of conserved ones. In contrast, the ∼5% of SSOGs that share similarity to known viral sequences have distinct characteristics, including lower GC content. Thus, SSOGs with similarity to viruses differ from the remaining SSOGs, contrasting an external origination scenario for most of them. By examining the orthologous genomic region in closely related species, we show that a small subset of SSOGs likely evolved natively de novo and find that these genes also differ in their properties from the remaining SSOGs. Our results challenge the notion that external elements are the dominant source of prokaryotic genetic novelty and will enable future studies into the biological role and relevance of species-specific genes in the human gut.
Affiliation(s)
- Nikolaos Vakirlis
- Institute for Fundamental Biomedical Research, B.S.R.C. "Alexander Fleming", Vari 166 72, Greece
- Institute for General Microbiology, Kiel University, 24118 Kiel, Germany
| | - Anne Kupczok
- Bioinformatics Group, Wageningen University, 6700 PB Wageningen, The Netherlands
| |
|
43
|
Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Genome-resolved metagenomics: a game changer for microbiome medicine. Exp Mol Med 2024; 56:1501-1512. [PMID: 38945961 PMCID: PMC11297344 DOI: 10.1038/s12276-024-01262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/06/2024] [Accepted: 03/25/2024] [Indexed: 07/02/2024] Open
Abstract
Recent substantial evidence implicating commensal bacteria in human diseases has given rise to a new domain in biomedical research: microbiome medicine. This emerging field aims to understand and leverage the human microbiota and derivative molecules for disease prevention and treatment. Despite the complex and hierarchical organization of this ecosystem, most research over the years has relied on 16S amplicon sequencing, a legacy of bacterial phylogeny and taxonomy. Although advanced sequencing technologies have enabled cost-effective analysis of entire microbiota, translating the relatively short nucleotide information into the functional and taxonomic organization of the microbiome has posed challenges until recently. In the last decade, genome-resolved metagenomics, which aims to reconstruct microbial genomes directly from whole-metagenome sequencing data, has made significant strides and continues to unveil the mysteries of various human-associated microbial communities. There has been a rapid increase in the volume of whole metagenome sequencing data and in the compilation of novel metagenome-assembled genomes and protein sequences in public depositories. This review provides an overview of the capabilities and methods of genome-resolved metagenomics for studying the human microbiome, with a focus on investigating the prokaryotic microbiota of the human gut. Just as decoding the human genome and its variations marked the beginning of the genomic medicine era, unraveling the genomes of commensal microbes and their sequence variations is ushering us into the era of microbiome medicine. Genome-resolved metagenomics stands as a pivotal tool in this transition and can accelerate our journey toward achieving these scientific and medical milestones.
Affiliation(s)
- Nayeon Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
| | - Junyeong Ma
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
| | - Wonjong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
| | - Jungyeon Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
| | - Peter Belenky
- Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA.
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea.
- POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea.
| |
|
44
|
Pechlivanis N, Karakatsoulis G, Kyritsis K, Tsagiopoulou M, Sgardelis S, Kappas I, Psomopoulos F. Microbial co-occurrence network demonstrates spatial and climatic trends for global soil diversity. Sci Data 2024; 11:672. [PMID: 38909071 PMCID: PMC11193810 DOI: 10.1038/s41597-024-03528-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 06/14/2024] [Indexed: 06/24/2024] Open
Abstract
Despite recent research efforts to explore the co-occurrence patterns of diverse microbes within soil microbial communities, a substantial knowledge gap persists regarding global climate influences on soil microbiota behaviour. Comprehending co-occurrence patterns within distinct geoclimatic groups is pivotal for unravelling the ecological structure of microbial communities, which are crucial for preserving ecosystem functions and services. Our study addresses this gap by examining global climatic patterns of microbial diversity. Using data from the Earth Microbiome Project, we analyse a meta-community co-occurrence network for bacterial communities. This method unveils substantial shifts in topological features, highlighting regional and climatic trends. Arid, Polar, and Tropical zones show lower diversity but maintain denser networks, whereas Temperate and Cold zones display higher diversity alongside more modular networks. Furthermore, it identifies significant co-occurrence patterns across diverse climatic regions. Central taxa associated with different climates are pinpointed, highlighting climate's pivotal role in community structure. In conclusion, our study identifies significant correlations between microbial interactions in diverse climatic regions, contributing valuable insights into the intricate dynamics of soil microbiota.
Affiliation(s)
- Nikos Pechlivanis
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
- Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
| | - Georgios Karakatsoulis
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
| | - Konstantinos Kyritsis
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
| | - Maria Tsagiopoulou
- Centro Nacional de Analisis Genomico (CNAG), C/Baldiri Reixac 4, 08028, Barcelona, Spain
| | - Stefanos Sgardelis
- Department of Ecology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
| | - Ilias Kappas
- Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
| | - Fotis Psomopoulos
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece.
| |
|
45
|
Ninck S, Klaus T, Kochetkova TV, Esser SP, Sewald L, Kaschani F, Bräsen C, Probst AJ, Kublanov IV, Siebers B, Kaiser M. Environmental activity-based protein profiling for function-driven enzyme discovery from natural communities. ENVIRONMENTAL MICROBIOME 2024; 19:36. [PMID: 38831353 PMCID: PMC11145796 DOI: 10.1186/s40793-024-00577-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Accepted: 05/06/2024] [Indexed: 06/05/2024]
Abstract
BACKGROUND: Microbial communities are important drivers of global biogeochemical cycles, xenobiotic detoxification, and organic matter decomposition. Their major metabolic role in ecosystem functioning is ensured by a unique set of enzymes, providing a tremendous yet mostly hidden enzymatic potential. Exploring this enzymatic repertoire is therefore not only relevant for a better understanding of how microorganisms function in their natural environment, and thus for ecological research, but further turns microbial communities, in particular from extreme habitats, into a valuable resource for the discovery of novel enzymes with potential applications in biotechnology. Different strategies for uncovering them, such as bioprospecting, which relies mainly on metagenomic approaches in combination with sequence-based bioinformatic analyses, have emerged; yet accurate function prediction of their proteomes and deciphering the in vivo activity of an enzyme remain challenging. RESULTS: Here, we present environmental activity-based protein profiling (eABPP), a multi-omics approach that extends genome-resolved metagenomics with mass spectrometry-based ABPP. This combination allows direct profiling of environmental community samples in their native habitat and the identification of active enzymes based on their function, even without sequence or structural homologies to annotated enzyme families. eABPP thus bridges the gap between environmental genomics, correct function annotation, and in vivo enzyme activity. As a showcase, we report the successful identification of active thermostable serine hydrolases from eABPP of natural microbial communities from two independent hot springs in Kamchatka, Russia.
CONCLUSIONS: By reporting enzyme activities within an ecosystem in their native state, we anticipate that eABPP will not only advance current methodological approaches to sequence homology-guided enzyme discovery from environmental ecosystems for subsequent biocatalyst development but also contribute to the ecological investigation of microbial community interactions by dissecting their underlying molecular mechanisms.
Affiliation(s)
- Sabrina Ninck
- Chemical Biology, Centre of Medical Biotechnology (ZMB), Faculty of Biology, University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany.
| | - Thomas Klaus
- Molecular Enzyme Technology and Biochemistry, Environmental Microbiology and Biotechnology (EMB), Centre for Water and Environmental Research (CWE), Faculty of Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45117, Essen, Germany
| | - Tatiana V Kochetkova
- Winogradsky Institute of Microbiology, Research Center of Biotechnology, Russian Academy of Sciences, Prospekt 60-Let Oktyabrya 7-2, Moscow, 117312, Russia
| | - Sarah P Esser
- Environmental Metagenomics, Research Centre One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45117, Essen, Germany
| | - Leonard Sewald
- Chemical Biology, Centre of Medical Biotechnology (ZMB), Faculty of Biology, University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany
| | - Farnusch Kaschani
- Chemical Biology, Centre of Medical Biotechnology (ZMB), Faculty of Biology, University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany
| | - Christopher Bräsen
- Molecular Enzyme Technology and Biochemistry, Environmental Microbiology and Biotechnology (EMB), Centre for Water and Environmental Research (CWE), Faculty of Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45117, Essen, Germany
| | - Alexander J Probst
- Environmental Metagenomics, Research Centre One Health Ruhr of the University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45117, Essen, Germany
- Centre for Water and Environmental Research (CWE), University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany
- Centre of Medical Biotechnology (ZMB), University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany
| | - Ilya V Kublanov
- Winogradsky Institute of Microbiology, Research Center of Biotechnology, Russian Academy of Sciences, Prospekt 60-Let Oktyabrya 7-2, Moscow, 117312, Russia
| | - Bettina Siebers
- Molecular Enzyme Technology and Biochemistry, Environmental Microbiology and Biotechnology (EMB), Centre for Water and Environmental Research (CWE), Faculty of Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45117, Essen, Germany.
| | - Markus Kaiser
- Chemical Biology, Centre of Medical Biotechnology (ZMB), Faculty of Biology, University of Duisburg-Essen, Universitätsstr. 2, 45117, Essen, Germany.
| |
|
46
|
Piton G, Allison SD, Bahram M, Hildebrand F, Martiny JBH, Treseder KK, Martiny AC. Reply to: Microbial dark matter could add uncertainties to metagenomic trait estimations. Nat Microbiol 2024; 9:1431-1433. [PMID: 38740930 DOI: 10.1038/s41564-024-01688-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 03/25/2024] [Indexed: 05/16/2024]
Affiliation(s)
- Gabin Piton
- Department of Earth System Science, University of California, Irvine, Irvine, CA, USA.
- Eco&Sols, INRAE-IRD-CIRAD-SupAgro, University of Montpellier, Montpellier, France.
| | - Steven D Allison
- Department of Earth System Science, University of California, Irvine, Irvine, CA, USA
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| | - Mohammad Bahram
- Department of Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Estonia
| | - Falk Hildebrand
- Gut Microbes and Health, Quadram Institute Bioscience, Norwich, UK
- Digital Biology, Earlham Institute, Norwich, UK
| | - Jennifer B H Martiny
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| | - Kathleen K Treseder
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| | - Adam C Martiny
- Department of Earth System Science, University of California, Irvine, Irvine, CA, USA
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| |
|
47
|
Hogg BN, Schnepel C, Finnigan JD, Charnock SJ, Hayes MA, Turner NJ. The Impact of Metagenomics on Biocatalysis. Angew Chem Int Ed Engl 2024; 63:e202402316. [PMID: 38494442 PMCID: PMC11497237 DOI: 10.1002/anie.202402316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 03/19/2024]
Abstract
In the ever-growing demand for sustainable ways to produce high-value small molecules, biocatalysis has come to the forefront of greener routes to these chemicals. As such, the need to constantly find and optimise suitable biocatalysts for specific transformations has never been greater. Metagenome mining has been shown to rapidly expand the toolkit of promiscuous enzymes needed for new transformations, without requiring protein engineering steps. If protein engineering is needed, the metagenomic candidate can often provide a better starting point for engineering than a previously discovered enzyme on the open database or from literature, for instance. In this review, we highlight where metagenomics has made substantial impact on the area of biocatalysis in recent years. We review the discovery of enzymes in previously unexplored or 'hidden' sequence space, leading to the characterisation of enzymes with enhanced properties that originate from natural selection pressures in native environments.
Affiliation(s)
- Bethany N. Hogg
- Department of Chemistry, University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester, M1 7DN, UK
| | - Christian Schnepel
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Department of Industrial Biotechnology, KTH Royal Institute of Technology, AlbaNova University Center, 11421 Stockholm, SE
| | | | | | - Martin A. Hayes
- Compound Synthesis and Management, Discovery Sciences, Biopharmaceuticals R&D, AstraZeneca, Mölndal 431 50, Gothenburg, SE
| | - Nicholas J. Turner
- Department of Chemistry, University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester, M1 7DN, UK
| |
|
48
|
Drost HG. Unveiling the expanding protein universe of life. Nat Rev Genet 2024; 25:306. [PMID: 38424236 DOI: 10.1038/s41576-024-00716-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Affiliation(s)
- Hajk-Georg Drost
- Computational Biology Group, Max Planck Institute for Biology Tübingen, Tübingen, Germany.
| |
|
49
|
Wang Y, Qu M, Bi Y, Liu WJ, Ma S, Wan B, Hu Y, Zhu B, Zhang G, Gao GF. The multi-kingdom microbiome catalog of the chicken gastrointestinal tract. BIOSAFETY AND HEALTH 2024; 6:101-115. [PMID: 40078943 PMCID: PMC11894977 DOI: 10.1016/j.bsheal.2024.02.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/10/2023] [Revised: 02/22/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2025] Open
Abstract
Chicken is an important food animal worldwide and plays an important role in human life by providing meat and eggs. Despite recent significant advances in gut microbiome studies, a comprehensive study of chicken gut bacterial, archaeal, and viral genomes remains unavailable. In this study, we constructed a chicken multi-kingdom microbiome catalog (CMKMC), including 18,201 bacterial, 225 archaeal, and 33,411 viral genomes, and annotated over 6,076,006 protein-coding genes by integrating 135 chicken gut metagenomes and publicly available metagenome-assembled genomes (MAGs) from ten countries. We found that 812 and 240 MAGs in our dataset were putative novel species and genera, respectively, far beyond what was previously reported. The newly unclassified MAGs were predominant in the phylum Firmicutes_A (n = 263), followed by Firmicutes (n = 126), Bacteroidota (n = 121), and Proteobacteria (n = 87). Most of the classified species-level viral operational taxonomic units belong to Caudovirales. Approximately 63.24% of chicken gut viromes are predicted to infect two or more hosts, including complete circular viruses. Moreover, we found that diverse auxiliary metabolic genes and antibiotic resistance genes were carried by viruses. Together, our CMKMC provides the largest integrated set of MAGs and viral genomes from the chicken gut to date, offers functional insights into the chicken gastrointestinal tract microbiota, and paves the way for microbial interventions for better chicken health and productivity.
Affiliation(s)
- Yanan Wang
- International Joint Research Center of National Animal Immunology, College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Longhu Laboratory, Zhengzhou 450046, China
| | - Mengqi Qu
- International Joint Research Center of National Animal Immunology, College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
| | - Yuhai Bi
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - William J. Liu
- NHC Key Laboratory of Biosafety, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China
| | - Sufang Ma
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
| | - Bo Wan
- International Joint Research Center of National Animal Immunology, College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
- Longhu Laboratory, Zhengzhou 450046, China
| | - Yongfei Hu
- State Key Laboratory of Animal Nutrition and Feeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
| | - Baoli Zhu
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Gaiping Zhang
- International Joint Research Center of National Animal Immunology, College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
- Longhu Laboratory, Zhengzhou 450046, China
| | - George F. Gao
- CAS Key Laboratory of Pathogen Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences (CAS), Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- NHC Key Laboratory of Biosafety, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China
| |
|
50
|
Van Goethem MW, Marasco R, Hong P, Daffonchio D. The antibiotic crisis: On the search for novel antibiotics and resistance mechanisms. Microb Biotechnol 2024; 17:e14430. [PMID: 38465465 PMCID: PMC10926060 DOI: 10.1111/1751-7915.14430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 02/07/2024] [Accepted: 02/13/2024] [Indexed: 03/12/2024]
Abstract
In the relentless battle for human health, the proliferation of antibiotic-resistant bacteria has emerged as an impending catastrophe of unprecedented magnitude, potentially driving humanity towards the brink of an unparalleled healthcare crisis. The unyielding advance of antibiotic resistance looms as the foremost threat of the 21st century in clinical, agricultural and environmental arenas. Antibiotic resistance is projected to be the genesis of the next global pandemic, with grim estimations of tens of millions of lives lost annually by 2050. Amidst this impending calamity, our capacity to unearth novel antibiotics has languished, with the past four decades marred by a disheartening 'antibiotic discovery void'. With nearly 80% of our current antibiotics originating from natural or semi-synthetic sources, our responsibility is to cast our investigative nets into uncharted ecological niches teeming with microbial strife, the so-called 'microbial oases of interactions'. Within these oases of interactions, where microorganisms intensively compete for space and nutrients, a dynamic and ever-evolving microbial 'arms race' is constantly in place. Such a continuous cycle of adaptation and counter-adaptation is a fundamental aspect of microbial ecology and evolution, and it holds the secrets to unique, undiscovered antibiotics, our last bastion against the relentless tide of resistance. In this context, it is imperative to invest in research to explore competitive realms, such as the plant rhizosphere, biological soil crusts, deep sea hydrothermal vents, marine snow and the more recent plastisphere, in which competitive interactions underlie the microorganisms' struggle for survival and dominance in their ecosystems. Identifying novel antibiotics by targeting microbial oases of interactions could represent a 'missing piece of the puzzle' in our fight against antibiotic resistance.
Affiliation(s)
- Marc W. Van Goethem
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Ramona Marasco
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Pei‐Ying Hong
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Water Desalination and Reuse Center, Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Daniele Daffonchio
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| |
|