1
Hone H, Li T, Kaur J, Wood JL, Sawbridge T. Often in silico, rarely in vivo: characterizing endemic plant-associated microbes for system-appropriate biofertilizers. Front Microbiol 2025; 16:1568162. [PMID: 40356655 PMCID: PMC12066602 DOI: 10.3389/fmicb.2025.1568162] [Received: 01/28/2025] [Accepted: 04/07/2025] [Indexed: 05/15/2025]
Abstract
The potential of phosphate-solubilizing microbes (PSMs) to enhance plant phosphorus uptake and reduce fertilizer dependency remains underutilized. This is partially attributable to frequent biofertilizer-farming system misalignments that reduce efficacy, and an incomplete understanding of underlying mechanisms. This study explored the seed microbiomes of nine Australian lucerne cultivars to identify and characterize high-efficiency PSMs. From a library of 223 isolates, 94 (42%) exhibited phosphate solubilization activity on Pikovskaya agar, with 15 showing high efficiency (PSI > 1.5). Genomic analysis revealed that the "high-efficiency" phosphate-solubilizing microbes belonged to four genera (Curtobacterium, Pseudomonas, Paenibacillus, Pantoea), including novel strains and species. However, key canonical genes, such as pqq operon and gcd, did not reliably predict phenotype, highlighting the limitations of in silico predictions. Mutagenesis of the high-efficiency isolate Pantoea rara Lu_Sq_004 generated mutants with enhanced and null solubilization phenotypes, revealing the potential role of "auxiliary" genes in downstream function of solubilization pathways. Inoculation studies with lucerne seedlings demonstrated a significant increase in shoot length (p < 0.05) following treatment with the enhanced-solubilization mutant, indicating a promising plant growth-promotion effect. These findings highlight the potential of more personalized "system-appropriate" biofertilizers and underscore the importance of integrating genomic, phenotypic, and in planta analyses to validate function. Further research is required to investigate links between genomic markers and functional outcomes to optimize the development of sustainable agricultural inputs.
Affiliation(s)
- Holly Hone
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - DairyBio, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Tongda Li
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Jatinder Kaur
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Jennifer L. Wood
  - Department of Microbiology, Anatomy, Physiology and Pharmacology, La Trobe University, Bundoora, VIC, Australia
- Timothy Sawbridge
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - DairyBio, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
2
Lakshman AH, Wright ES. EvoWeaver: large-scale prediction of gene functional associations from coevolutionary signals. Nat Commun 2025; 16:3878. [PMID: 40274827 PMCID: PMC12022180 DOI: 10.1038/s41467-025-59175-6] [Received: 02/05/2025] [Accepted: 04/09/2025] [Indexed: 04/26/2025]
Abstract
The known universe of uncharacterized proteins is expanding far faster than our ability to annotate their functions through laboratory study. Computational annotation approaches rely on similarity to previously studied proteins, thereby ignoring unstudied proteins. Coevolutionary approaches hold promise for injecting new information into our knowledge of the protein universe by linking proteins through 'guilt-by-association'. However, existing coevolutionary algorithms have insufficient accuracy and scalability to connect the entire universe of proteins. We present EvoWeaver, a method that weaves together 12 signals of coevolution to quantify the degree of shared evolution between genes. EvoWeaver accurately identifies proteins involved in protein complexes or separate steps of a biochemical pathway. We show the merits of EvoWeaver by partly reconstructing known biochemical pathways without any prior knowledge other than that available from genomic sequences. Applying EvoWeaver to 1545 gene groups from 8564 genomes reveals missing connections in popular databases and potentially undiscovered links between proteins.
Affiliation(s)
- Aidan H Lakshman
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Erik S Wright
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
  - Center for Evolutionary Biology and Medicine, Pittsburgh, PA, USA
3
Cross K, Beckman N, Jahnes B, Sabree ZL. Microbiome metabolic capacity is buffered against phylotype losses by functional redundancy. Appl Environ Microbiol 2025; 91:e0236824. [PMID: 39882875 PMCID: PMC11837509 DOI: 10.1128/aem.02368-24] [Received: 11/25/2024] [Accepted: 01/01/2025] [Indexed: 01/31/2025]
Abstract
Many animals contain a species-rich and diverse gut microbiota that likely contributes to several host-supportive services that include diet processing and nutrient provisioning. Loss of microbiome taxa and their associated metabolic functions as a result of perturbations may result in loss of microbiome-level services and reduction of metabolic capacity. If metabolic functions are shared by multiple taxa (i.e., functional redundancy), including deeply divergent lineages, then the impact of taxon/function losses may be dampened. We examined to what degree alterations in phylotype diversity impact microbiome-level metabolic capacity. Feeding two nutritionally imbalanced diets to omnivorous Periplaneta americana over 8 weeks reduced the diversity of their phylotype-rich gut microbiomes by ~25% based on 16S rRNA gene amplicon sequencing, yet PICRUSt2-inferred metabolic pathway richness was largely unaffected because these pathways are polyphyletic. We concluded that the nonlinearity between taxon and metabolic functional losses is due to microbiome members sharing many well-characterized metabolic functions, with lineages remaining after perturbation potentially being capable of preventing microbiome "service outages" due to functional redundancy. IMPORTANCE Diet can affect gut microbiome taxonomic composition and diversity, but its impacts on community-level functional capabilities are less clear. Host health and fitness are increasingly being linked to microbiome composition, and further modeling of the relationship between microbiome taxonomic and metabolic functional capability is needed to inform these linkages. Invertebrate animal models like the omnivorous American cockroach are ideal for this inquiry because they are amenable to various diets and provide many replicates per treatment at low cost, thus enabling rigorous statistical analyses and hypothesis testing. 
Microbiome taxonomic composition is diet-labile and diversity was reduced after feeding on unbalanced diets (i.e., post-treatment), but the predicted functional capacities of the post-treatment microbiomes were less affected likely due to the resilience of several abundant taxa surviving the perturbation as well as many metabolic functions being shared by several taxa. These results suggest that both taxonomic and functional profiles should be considered when attempting to infer how perturbations are altering gut microbiome services and possible host outcomes.
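The idea of functional redundancy in the abstract above can be illustrated with a small sketch. It is not the authors' analysis and does not use PICRUSt2; the taxon names, pathway names, and the helper `pathway_richness` are hypothetical, chosen only to show how taxon richness can drop while the union of community-level functions stays the same.

```python
# Illustrative sketch only: toy community, hypothetical taxa and pathways.

def pathway_richness(community: dict[str, set[str]]) -> int:
    """Count distinct pathways encoded by at least one taxon in the community."""
    return len(set().union(*community.values()))

# Each taxon maps to the set of pathways it is assumed to encode.
before = {
    "taxonA": {"glycolysis", "TCA"},
    "taxonB": {"glycolysis", "folate"},
    "taxonC": {"TCA", "folate"},
    "taxonD": {"glycolysis"},
}

# Perturbation: one of four taxa (25%) is lost.
after = {name: paths for name, paths in before.items() if name != "taxonD"}

taxon_loss = 1 - len(after) / len(before)
richness_before = pathway_richness(before)
richness_after = pathway_richness(after)

print(f"taxon loss: {taxon_loss:.0%}")
print(f"pathway richness: {richness_before} -> {richness_after}")
```

Because every pathway carried by the lost taxon is also carried by a surviving taxon, pathway richness is unchanged in this toy case. A pathway held by a single taxon would instead disappear with it.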
Affiliation(s)
- Kayla Cross
  - Department of Microbiology, Ohio State University, Columbus, Ohio, USA
- Benjamin Jahnes
  - Department of Evolution, Ecology and Organismal Biology, Ohio State University, Columbus, Ohio, USA
- Zakee L. Sabree
  - Department of Microbiology, Ohio State University, Columbus, Ohio, USA
  - Department of Evolution, Ecology and Organismal Biology, Ohio State University, Columbus, Ohio, USA
4
Kumar RKR, Haddad I, Ndiaye MM, Marbouty M, Vinh J, Verdier Y. A single microfluidic device for multi-omics analysis sample preparation. Lab on a Chip 2025; 25:590-599. [PMID: 39820672 DOI: 10.1039/d4lc00919c] [Indexed: 01/19/2025]
Abstract
Combining different "omics" approaches, such as genomics and proteomics, is necessary to gain detailed and complete insight into microbiomes. Proper sample collection and processing and accurate analytical methods are crucial in generating reliable data. We previously developed the ChipFilter device for proteomic analysis of microbial samples. We have shown that this device coupled to LC-MS/MS can successfully be used to identify microbial proteins. In the present work, we have developed our workflow to concomitantly analyze proteins and nucleic acids from the same sample. We performed lysis and proteolysis in the device using cultures of E. coli, B. subtilis, and S. cerevisiae. After peptide recovery for LC-MS/MS analysis, DNA from the same samples was recovered and successfully amplified by PCR for the 3 species. This workflow was further extended to a complex microbial mixture of known composition. Protein analysis was carried out, enabling the identification of more than 5000 proteins. The recovered DNA was sequenced, performing comparably to DNA extracted with a commercial kit without proteolysis. Our results show that the ChipFilter device is suited to prepare samples for parallel proteomic and genomic analyses, which is particularly relevant in the case of low-abundance samples and drastically reduces sampling bias.
Affiliation(s)
- Ranjith Kumar Ravi Kumar
  - Spectrométrie de Masse Biologique et Protéomique SMBP, ESPCI Paris, LPC CNRS UMR 8249, PSL University, 10 Rue Vauquelin, F-75005 Paris, France
- Iman Haddad
  - Spectrométrie de Masse Biologique et Protéomique SMBP, ESPCI Paris, LPC CNRS UMR 8249, PSL University, 10 Rue Vauquelin, F-75005 Paris, France
- Massamba Mbacké Ndiaye
  - Spectrométrie de Masse Biologique et Protéomique SMBP, ESPCI Paris, LPC CNRS UMR 8249, PSL University, 10 Rue Vauquelin, F-75005 Paris, France
- Martial Marbouty
  - Institut Pasteur, Spatial Regulation of Genome Group, Université Paris Cité, CNRS 3525 - 25-28 Rue du Dr Roux, F-75015 Paris, France
- Joëlle Vinh
  - Spectrométrie de Masse Biologique et Protéomique SMBP, ESPCI Paris, LPC CNRS UMR 8249, PSL University, 10 Rue Vauquelin, F-75005 Paris, France
- Yann Verdier
  - Spectrométrie de Masse Biologique et Protéomique SMBP, ESPCI Paris, LPC CNRS UMR 8249, PSL University, 10 Rue Vauquelin, F-75005 Paris, France
5
Boer MD, Melkonian C, Zafeiropoulos H, Haas AF, Garza DR, Dutilh BE. Improving genome-scale metabolic models of incomplete genomes with deep learning. iScience 2024; 27:111349. [PMID: 39660058 PMCID: PMC11629236 DOI: 10.1016/j.isci.2024.111349] [Received: 12/27/2023] [Revised: 06/10/2024] [Accepted: 11/05/2024] [Indexed: 12/12/2024]
Abstract
Deciphering microbial metabolism is essential for understanding ecosystem functions. Genome-scale metabolic models (GSMMs) predict metabolic traits from genomic data, but constructing GSMMs for uncultured bacteria is challenging due to incomplete metagenome-assembled genomes, resulting in many gaps. We introduce the deep neural network guided imputation of reactomes (DNNGIOR), which uses AI to improve gap-filling by learning from the presence and absence of metabolic reactions across diverse bacterial genomes. Key factors for prediction accuracy are: (1) reaction frequency across all bacteria and (2) phylogenetic distance of the query to the training genomes. DNNGIOR predictions achieve an average F1 score of 0.85 for reactions present in over 30% of training genomes. DNNGIOR guided gap-filling was 14 times more accurate for draft reconstructions and 2-9 times for curated models than unweighted gap-filling.
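For readers unfamiliar with the F1 score reported above, the following minimal sketch shows how it is computed for presence/absence predictions of metabolic reactions. It uses the standard definition (harmonic mean of precision and recall); the reaction identifiers and the function name `f1_score` are hypothetical and are not taken from DNNGIOR.

```python
# Illustrative sketch only: standard F1 definition on hypothetical reaction sets.

def f1_score(predicted: set[str], actual: set[str]) -> float:
    """F1 for predicted-present reactions against the truly present reactions."""
    true_pos = len(predicted & actual)   # predicted present and truly present
    false_pos = len(predicted - actual)  # predicted present but truly absent
    false_neg = len(actual - predicted)  # truly present but not predicted
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reactome: three reactions recovered, one spurious, one missed.
predicted = {"R1", "R2", "R3", "R4"}
actual = {"R1", "R2", "R3", "R5"}
score = f1_score(predicted, actual)
print(f"F1 = {score:.2f}")
```

With three true positives, one false positive, and one false negative, precision and recall are both 0.75, so the F1 score is also 0.75.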
Affiliation(s)
- Meine D. Boer
  - Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
  - Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, PO Box 59, Den Burg 1790 AB, Texel, The Netherlands
- Chrats Melkonian
  - Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
  - Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
- Haris Zafeiropoulos
  - Laboratory of Molecular Bacteriology, Rega Institute for Medical Research, Department of Microbiology, Immunology and Transplantation, KU Leuven, 3000 Leuven, Belgium
- Andreas F. Haas
  - Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, PO Box 59, Den Burg 1790 AB, Texel, The Netherlands
- Bas E. Dutilh
  - Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
  - Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743 Jena, Germany
6
Hourigan D, Miceli de Farias F, O’Connor PM, Hill C, Ross RP. Discovery and synthesis of leaderless bacteriocins from the Actinomycetota. J Bacteriol 2024; 206:e0029824. [PMID: 39404462 PMCID: PMC11580447 DOI: 10.1128/jb.00298-24] [Received: 07/30/2024] [Accepted: 09/24/2024] [Indexed: 11/22/2024]
Abstract
Leaderless bacteriocins are a unique class of bacteriocins that possess antimicrobial activity after translation and have few cases of documented resistance. Aureocin A53 and lacticin Q are considered two of the most well-studied leaderless bacteriocins. Here, we used in silico genome mining to search for novel aureocin A53-like leaderless bacteriocins in GenBank and MGnify. We identified 757 core peptides across 430 genomes, with 75 species currently lacking characterized leaderless bacteriocin production. These include putative novel species containing bacteriocin gene clusters (BGCs) from the genera Streptomyces (sp. NBC_00237) and Agrococcus (sp. SL85). To date, all characterized leaderless bacteriocins have been found within the phylum Bacillota, but this study identified 97 core peptides within the phylum Actinomycetota. Members of this phylum are traditionally associated with the production of antibiotics, as is the case with the genus Streptomyces. Actinomycetota is an underexplored phylum in terms of bacteriocin production with no characterized leaderless bacteriocin production to date. The two novel leaderless bacteriocins arcanocin and arachnicin from Actinomycetota members Arcanobacterium sp. and Arachnia sp., respectively, were chemically synthesized and antimicrobial activity was verified. These peptides were encoded in human gut (PRJNA485056) and oral (PRJEB43277) microbiomes, respectively. This research highlights the biosynthetic potential of Actinomycetota in terms of leaderless bacteriocin production and describes the first antimicrobial peptides encoded in the genera Arcanobacterium and Arachnia. IMPORTANCE Bacteriocins are gathering attention as alternatives to current antibiotics given the increasing incidence of antimicrobial resistance. Leaderless bacteriocins are considered a commercially attractive subclass of bacteriocins due to the ability to synthesize active peptide and low levels of documented resistance. 
Therefore, in this work, we mined publicly available data to determine how widespread and diverse leaderless bacteriocins are within the domain of bacteria. Actinomycetota, known for its antibiotic producers but lacking described and characterized bacteriocins, proved to be a rich source of leaderless bacteriocins, 97 in total. Two such peptides, arcanocin and arachnicin, were chemically synthesized and have antimicrobial activity. These bacteriocins may aid the development of future alternative antimicrobials and highlight that the Actinomycetota are an underexplored resource of bacteriocin peptides.
Affiliation(s)
- David Hourigan
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
- Paula M. O’Connor
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
  - Teagasc Food Research Centre, Cork, Ireland
- Colin Hill
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
- R. Paul Ross
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
  - Teagasc Food Research Centre, Cork, Ireland
7
de Crécy-Lagard V, Dias R, Friedberg I, Yuan Y, Swairjo MA. Limitations of Current Machine-Learning Models in Predicting Enzymatic Functions for Uncharacterized Proteins. bioRxiv 2024:2024.07.01.601547. [PMID: 39005379 PMCID: PMC11244979 DOI: 10.1101/2024.07.01.601547] [Indexed: 07/16/2024]
Abstract
Thirty to seventy percent of proteins in any given genome have no assigned function and have been labeled as the protein "unknome". This large knowledge gap prevents the biological community from fully leveraging the plethora of genomic data that is now available. Machine-learning approaches are showing some promise in propagating functional knowledge from experimentally characterized proteins to the correct set of isofunctional orthologs. However, they largely fail to predict enzymatic functions unseen in the training set, as shown by dissecting the predictions made for over 450 enzymes of unknown function from the model bacterium Escherichia coli using the DeepECTransformer platform. Lessons from these failures can help the community develop machine-learning methods that assist domain experts in making testable functional predictions for more members of the uncharacterized proteome. Article Summary Many proteins in any genome, ranging from 30 to 70%, lack an assigned function. This knowledge gap limits the full use of the vast available genomic data. Machine learning has shown promise in transferring functional knowledge from proteins of known functions to similar ones, but largely fails to predict novel functions not seen in its training data. Understanding these failures can guide the development of better machine-learning methods to help experts make accurate functional predictions for uncharacterized proteins.
8
Kim JI, Manuele A, Maguire F, Zaheer R, McAllister TA, Beiko RG. Identification of key drivers of antimicrobial resistance in Enterococcus using machine learning. Can J Microbiol 2024; 70:446-460. [PMID: 39079170 DOI: 10.1139/cjm-2024-0049] [Indexed: 10/03/2024]
Abstract
With antimicrobial resistance (AMR) rapidly evolving in pathogens, quick and accurate identification of genetic determinants of phenotypic resistance is essential for improving surveillance, stewardship, and clinical mitigation. Machine learning (ML) models show promise for AMR prediction in diagnostics but require a deep understanding of internal processes to use effectively. Our study utilised AMR gene, pangenomic, and predicted plasmid features from 647 Enterococcus faecium and Enterococcus faecalis genomes across the One Health continuum, along with corresponding resistance phenotypes, to develop interpretive ML classifiers. Vancomycin resistance could be predicted with 99% accuracy with AMR gene features, 98% with pangenome features, and 96% with plasmid clusters. Top pangenome features overlapped with the resistance genes of the vanA operon, which are often laterally transmitted via plasmids. Doxycycline resistance prediction achieved approximately 92% accuracy with pangenome features, with the top feature being elements of Tn916 conjugative transposon, a tet(M) carrier. Erythromycin resistance prediction models achieved about 90% accuracy, but top features were negatively correlated with resistance due to the confounding effect of population structure. This work demonstrates the importance of reviewing ML models' features to discern biological relevance even when achieving high-performance metrics. Our workflow offers the potential to propose hypotheses for experimental testing, enhancing the understanding of AMR mechanisms, which are crucial for combating the AMR crisis.
Affiliation(s)
- Jee In Kim
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
  - Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Alexander Manuele
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
- Finlay Maguire
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
  - Department of Community Health and Epidemiology, Dalhousie University, Faculty of Medicine, Halifax, NS, Canada
- Rahat Zaheer
  - Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Robert G Beiko
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
9
Bonnici V, Chicco D. Seven quick tips for gene-focused computational pangenomic analysis. BioData Min 2024; 17:28. [PMID: 39227987 PMCID: PMC11370085 DOI: 10.1186/s13040-024-00380-2] [Received: 03/27/2024] [Accepted: 08/12/2024] [Indexed: 09/05/2024]
Abstract
Pangenomics is a relatively new scientific field which investigates the union of all the genomes of a clade. The word pan means everything in ancient Greek; the term pangenomics originally regarded genomes of bacteria and was later intended to refer to human genomes as well. Modern bioinformatics offers several tools to analyze pangenomics data, paving the way to an emerging field that we can call computational pangenomics. Current computational power available for the bioinformatics community has made computational pangenomic analyses easy to perform, but this higher accessibility to pangenomics analysis also increases the chance of making mistakes and producing misleading or inflated results, especially for beginners. To handle this problem, we present here a few quick tips for efficient and correct computational pangenomic analyses with a focus on bacterial pangenomics, by describing common mistakes to avoid and best practices to follow in this field. We believe our recommendations can help readers perform more robust and sound pangenomic analyses and generate more reliable results.
Affiliation(s)
- Vincenzo Bonnici
  - Dipartimento di Scienze Matematiche Fisiche e Informatiche, Università di Parma, Parma, Italy
- Davide Chicco
  - Dipartimento di Informatica Sistemistica e Comunicazione, Università di Milano-Bicocca, Milan, Italy
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
10
Jung JM, Rahman A, Schiffer AM, Weisberg AJ. Beav: a bacterial genome and mobile element annotation pipeline. mSphere 2024; 9:e0020924. [PMID: 39037262 PMCID: PMC11351099 DOI: 10.1128/msphere.00209-24] [Received: 03/12/2024] [Accepted: 06/28/2024] [Indexed: 07/23/2024]
Abstract
Comprehensive and accurate genome annotation is crucial for inferring the predicted functions of an organism. Numerous tools exist to annotate genes, gene clusters, mobile genetic elements, and other diverse features. However, these tools and pipelines can be difficult to install and run, be specialized for a particular element or feature, or lack annotations for larger elements that provide important genomic context. Integrating results across analyses is also important for understanding gene function. To address these challenges, we present the Beav annotation pipeline. Beav is a command-line tool that automates the annotation of bacterial genome sequences, mobile genetic elements, molecular systems and gene clusters, key regulatory features, and other elements. Beav uses existing tools in addition to custom models, scripts, and databases to annotate diverse elements, systems, and sequence features. Custom databases for plant-associated microbes are incorporated to improve annotation of key virulence and symbiosis genes in agriculturally important pathogens and mutualists. Beav includes an optional Agrobacterium-specific pipeline that identifies and classifies oncogenic plasmids and annotates plasmid-specific features. Following the completion of all analyses, annotations are consolidated to produce a single comprehensive output. Finally, Beav generates publication-quality genome and plasmid maps. Beav is on Bioconda and is available for download at https://github.com/weisberglab/beav. IMPORTANCE Annotation of genome features, such as the presence of genes and their predicted function, or larger loci encoding secretion systems or biosynthetic gene clusters, is necessary for understanding the functions encoded by an organism. Genomes can also host diverse mobile genetic elements, such as integrative and conjugative elements and/or phages, that are often not annotated by existing pipelines. 
These elements can horizontally mobilize genes encoding for virulence, antimicrobial resistance, or other adaptive functions and alter the phenotype of an organism. We developed a software pipeline, called Beav, that combines new and existing tools for the comprehensive annotation of these and other major features. Existing pipelines often misannotate loci important for virulence or mutualism in plant-associated bacteria. Beav includes custom databases and optional workflows for the improved annotation of plant-associated bacteria. Beav is designed to be easy to install and run, making comprehensive genome annotation broadly available to the research community.
Affiliation(s)
- Jewell M. Jung
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
- Arafat Rahman
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
- Andrea M. Schiffer
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
- Alexandra J. Weisberg
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
11
Jiang J, Czuchry D, Ru Y, Peng H, Shen J, Wang T, Zhao W, Chen W, Sui SF, Li Y, Li N. Activity-based metaproteomics driven discovery and enzymological characterization of potential α-galactosidases in the mouse gut microbiome. Commun Chem 2024; 7:184. [PMID: 39152233 PMCID: PMC11329505 DOI: 10.1038/s42004-024-01273-5] [Received: 07/18/2024] [Accepted: 08/08/2024] [Indexed: 08/19/2024]
Abstract
The gut microbiota offers an extensive resource of enzymes, but many remain uncharacterized. To distinguish the activities of similarly annotated proteins and mine the potentially applicable ones in the microbiome, we applied an effective Activity-Based Metaproteomics (ABMP) strategy using a specific activity-based probe (ABP) to screen the entire gut microbiome, with the aim of directly discovering active enzymes and their potential applications rather than exploring host-microbiome interactions. By using an activity-based cyclophellitol aziridine probe specific to α-galactosidases (AGAL), we successfully identified and characterized several gut microbiota enzymes possessing AGAL activities. Cryo-electron microscopy analysis of a newly characterized enzyme (AGAL5) revealed the covalent binding conformations between the AGAL5 active site and the cyclophellitol aziridine ABP, which could provide insights into the enzyme's catalytic mechanism. The four newly characterized AGALs have diverse potential activities, including raffinose family oligosaccharides (RFOs) hydrolysis and enzymatic blood group transformation. Collectively, we present an ABMP platform that facilitates gut microbiota AGALs discovery, biochemical activity annotations and potential industrial or biopharmaceutical applications.
Collapse
Affiliation(s)
- Jianbing Jiang
- Institute for Inheritance-Based Innovation of Chinese Medicine, School of Pharmacy, Shenzhen University Medical School, Shenzhen University, Shenzhen, 518055, China
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Diana Czuchry
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Yanxia Ru
- School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China
| | - Huipai Peng
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Junfeng Shen
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Teng Wang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
| | - Wenjuan Zhao
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Weihua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
| | - Sen-Fang Sui
- School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China
- State Key Laboratory of Membrane Biology, Beijing Advanced Innovation Center for Structural Biology, Beijing Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Cryo-EM Center, Southern University of Science and Technology, Shenzhen, 518055, China
| | - Yaowang Li
- School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China.
| | - Nan Li
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Shenzhen Key Laboratory of Genome Manipulation and Biosynthesis, Shenzhen, China.
| |
|
12
|
Bizzotto E, Fraulini S, Zampieri G, Orellana E, Treu L, Campanaro S. MICROPHERRET: MICRObial PHEnotypic tRait ClassifieR using Machine lEarning Techniques. ENVIRONMENTAL MICROBIOME 2024; 19:58. [PMID: 39113074 PMCID: PMC11308548 DOI: 10.1186/s40793-024-00600-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 07/24/2024] [Indexed: 08/10/2024]
Abstract
BACKGROUND In recent years, there has been a rapid increase in the number of microbial genomes reconstructed through shotgun sequencing, and obtained by newly developed approaches including metagenomic binning and single-cell sequencing. However, our ability to functionally characterize these genomes by experimental assays is orders of magnitude less efficient. Consequently, there is a pressing need for the development of swift and automated strategies for the functional classification of microbial genomes. RESULTS The present work leverages a suite of supervised machine learning algorithms to establish a range of 86 metabolic and other ecological functions, such as methanotrophy and plastic degradation, starting from widely obtainable microbial genome annotations. Tests performed on independent datasets demonstrated robust performance across complete, fragmented, and incomplete genomes above a 70% completeness level for most of the considered functions. Application of the algorithms to the Biogas Microbiome database yielded predictions that were broadly consistent with current biological knowledge and correctly detected functionally-related nuances of archaeal genomes. Finally, a case study focused on acetoclastic methanogenesis demonstrated how the developed machine learning models can be refined or expanded with models describing novel functions of interest. CONCLUSIONS The resulting tool, MICROPHERRET, incorporates a total of 86 models, one for each tested functional class, and can be applied to high-quality microbial genomes as well as to low-quality genomes derived from metagenomics and single-cell sequencing. MICROPHERRET can thus aid in understanding the functional role of newly generated genomes within their micro-ecological context.
Affiliation(s)
- Edoardo Bizzotto
- Department of Biology, University of Padova, Padova, 35131, Italy
| | - Sofia Fraulini
- Department of Biology, University of Padova, Padova, 35131, Italy
| | - Guido Zampieri
- Department of Biology, University of Padova, Padova, 35131, Italy.
| | - Esteban Orellana
- Department of Biology, University of Padova, Padova, 35131, Italy
| | - Laura Treu
- Department of Biology, University of Padova, Padova, 35131, Italy
| | | |
|
13
|
Huang YY, Price MN, Hung A, Gal-Oz O, Tripathi S, Smith CW, Ho D, Carion H, Deutschbauer AM, Arkin AP. Barcoded overexpression screens in gut Bacteroidales identify genes with roles in carbon utilization and stress resistance. Nat Commun 2024; 15:6618. [PMID: 39103350 DOI: 10.1038/s41467-024-50124-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 06/28/2024] [Indexed: 08/07/2024] Open
Abstract
A mechanistic understanding of host-microbe interactions in the gut microbiome is hindered by poorly annotated bacterial genomes. While functional genomics can generate large gene-to-phenotype datasets to accelerate functional discovery, their applications to study gut anaerobes have been limited. For instance, most gain-of-function screens of gut-derived genes have been performed in Escherichia coli and assayed in a small number of conditions. To address these challenges, we develop Barcoded Overexpression BActerial shotgun library sequencing (Boba-seq). We demonstrate the power of this approach by assaying genes from diverse gut Bacteroidales overexpressed in Bacteroides thetaiotaomicron. From hundreds of experiments, we identify new functions and phenotypes for 29 genes important for carbohydrate metabolism or tolerance to antibiotics or bile salts. Highlights include the discovery of a D-glucosamine kinase, a raffinose transporter, and several routes that increase tolerance to ceftriaxone and bile salts through lipid biosynthesis. This approach can be readily applied to develop screens in other strains and additional phenotypic assays.
Affiliation(s)
- Yolanda Y Huang
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, NY, USA.
| | - Morgan N Price
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Allison Hung
- Department of Molecular and Cell Biology, University of California-Berkeley, Berkeley, CA, USA
| | - Omree Gal-Oz
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Surya Tripathi
- Department of Plant and Microbial Biology, University of California-Berkeley, Berkeley, CA, USA
| | - Christopher W Smith
- Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, NY, USA
| | - Davian Ho
- Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA
| | - Héloïse Carion
- Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA
| | - Adam M Deutschbauer
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Plant and Microbial Biology, University of California-Berkeley, Berkeley, CA, USA
| | - Adam P Arkin
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA.
| |
|
14
|
Qiu Z, Zhu Y, Zhang Q, Qiao X, Mu R, Xu Z, Yan Y, Wang F, Zhang T, Zhuang WQ, Yu K. Unravelling biosynthesis and biodegradation potentials of microbial dark matters in hypersaline lakes. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2024; 20:100359. [PMID: 39221074 PMCID: PMC11361885 DOI: 10.1016/j.ese.2023.100359] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/12/2023] [Revised: 11/26/2023] [Accepted: 11/26/2023] [Indexed: 09/04/2024]
Abstract
Biosynthesis and biodegradation of microorganisms critically underpin the development of biotechnology, new drugs and therapies, and environmental remediation. However, most uncultured microbial species along with their metabolic capacities in extreme environments, remain obscured. Here we unravel the metabolic potential of microbial dark matters (MDMs) in four deep-inland hypersaline lakes in Xinjiang, China. Utilizing metagenomic binning, we uncovered a rich diversity of 3030 metagenome-assembled genomes (MAGs) across 82 phyla, revealing a substantial portion, 2363 MAGs, as previously unclassified at the genus level. These unknown MAGs displayed unique distribution patterns across different lakes, indicating a strong correlation with varied physicochemical conditions. Our analysis revealed an extensive array of 9635 biosynthesis gene clusters (BGCs), with a remarkable 9403 being novel, suggesting untapped biotechnological potential. Notably, some MAGs from potentially new phyla exhibited a high density of these BGCs. Beyond biosynthesis, our study also identified novel biodegradation pathways, including dehalogenation, anaerobic ammonium oxidation (Anammox), and degradation of polycyclic aromatic hydrocarbons (PAHs) and plastics, in previously unknown microbial clades. These findings significantly enrich our understanding of biosynthesis and biodegradation processes and open new avenues for biotechnological innovation, emphasizing the untapped potential of microbial diversity in hypersaline environments.
Affiliation(s)
- Zhiguang Qiu
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, 518055, China
| | - Yuanyuan Zhu
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| | - Qing Zhang
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| | - Xuejiao Qiao
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| | - Rong Mu
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| | - Zheng Xu
- Southern University of Sciences and Technology Yantian Hospital, Shenzhen, 518081, China
- Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Yan Yan
- State Key Laboratory of Isotope Geochemistry, CAS Center for Excellence in Deep Earth Science, Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, Guangzhou, 510640, China
| | - Fan Wang
- School of Atmospheric Sciences, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, 519082, China
| | - Tong Zhang
- Department of Civil Engineering, University of Hong Kong, 999077, Hong Kong, China
| | - Wei-Qin Zhuang
- Department of Civil and Environmental Engineering, Faculty of Engineering, University of Auckland, New Zealand
| | - Ke Yu
- School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, 518055, China
| |
|
15
|
Hsieh YE, Tandon K, Verbruggen H, Nikoloski Z. Comparative analysis of metabolic models of microbial communities reconstructed from automated tools and consensus approaches. NPJ Syst Biol Appl 2024; 10:54. [PMID: 38783065 PMCID: PMC11116368 DOI: 10.1038/s41540-024-00384-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 05/13/2024] [Indexed: 05/25/2024] Open
Abstract
Genome-scale metabolic models (GEMs) of microbial communities offer valuable insights into the functional capabilities of their members and facilitate the exploration of microbial interactions. These models are generated using different automated reconstruction tools, each relying on different biochemical databases that may affect the conclusions drawn from the in silico analysis. One way to address this problem is to employ a consensus reconstruction method that combines the outcomes of different reconstruction tools. Here, we conducted a comparative analysis of community models reconstructed from three automated tools, i.e. CarveMe, gapseq, and KBase, alongside a consensus approach, utilizing metagenomics data from two marine bacterial communities. Our analysis revealed that these reconstruction approaches, while based on the same genomes, resulted in GEMs with varying numbers of genes and reactions as well as metabolic functionalities, attributed to the different databases employed. Further, our results indicated that the set of exchanged metabolites was influenced more by the reconstruction approach than by the specific bacterial community investigated. This observation suggests a potential bias in predicting metabolite interactions using community GEMs. We also showed that consensus models encompassed a larger number of reactions and metabolites while concurrently reducing the presence of dead-end metabolites. Therefore, consensus models allow full and unbiased use of the genes aggregated from the different reconstructions when assessing the functional potential of microbial communities.
Affiliation(s)
- Yunli Eric Hsieh
- Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- School of BioSciences, The University of Melbourne, Parkville, VIC, Australia
| | - Kshitij Tandon
- School of BioSciences, The University of Melbourne, Parkville, VIC, Australia
| | - Heroen Verbruggen
- School of BioSciences, The University of Melbourne, Parkville, VIC, Australia
| | - Zoran Nikoloski
- Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany.
- Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.
| |
|
16
|
Sun Z, Ning Z, Figeys D. The Landscape and Perspectives of the Human Gut Metaproteomics. Mol Cell Proteomics 2024; 23:100763. [PMID: 38608842 PMCID: PMC11098955 DOI: 10.1016/j.mcpro.2024.100763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 02/26/2024] [Accepted: 04/09/2024] [Indexed: 04/14/2024] Open
Abstract
The human gut microbiome is closely associated with human health and diseases. Metaproteomics has emerged as a valuable tool for studying the functionality of the gut microbiome by analyzing the entire proteins present in microbial communities. Recent advancements in liquid chromatography and tandem mass spectrometry (LC-MS/MS) techniques have expanded the detection range of metaproteomics. However, the overall coverage of the proteome in metaproteomics is still limited. While metagenomics studies have revealed substantial microbial diversity and functional potential of the human gut microbiome, few studies have summarized and studied the human gut microbiome landscape revealed with metaproteomics. In this article, we present the current landscape of human gut metaproteomics studies by re-analyzing the identification results from 15 published studies. We quantified the limited proteome coverage in metaproteomics and revealed a high proportion of annotation coverage of metaproteomics-identified proteins. We conducted a preliminary comparison between the metaproteomics view and the metagenomics view of the human gut microbiome, identifying key areas of consistency and divergence. Based on the current landscape of human gut metaproteomics, we discuss the feasibility of using metaproteomics to study functionally unknown proteins and propose a whole workflow peptide-centric analysis. Additionally, we suggest enhancing metaproteomics analysis by refining taxonomic classification and calculating confidence scores, as well as developing tools for analyzing the interaction between taxonomy and function.
Affiliation(s)
- Zhongzhi Sun
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
| | - Zhibin Ning
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
| | - Daniel Figeys
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada.
| |
|
17
|
Wei X, Tan H, Lobb B, Zhen W, Wu Z, Parks DH, Neufeld JD, Moreno-Hagelsieb G, Doxey AC. AnnoView enables large-scale analysis, comparison, and visualization of microbial gene neighborhoods. Brief Bioinform 2024; 25:bbae229. [PMID: 38747283 PMCID: PMC11094555 DOI: 10.1093/bib/bbae229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 04/02/2024] [Accepted: 04/26/2024] [Indexed: 05/19/2024] Open
Abstract
The analysis and comparison of gene neighborhoods is a powerful approach for exploring microbial genome structure, function, and evolution. Although numerous tools exist for genome visualization and comparison, genome exploration across large genomic databases or user-generated datasets remains a challenge. Here, we introduce AnnoView, a web server designed for interactive exploration of gene neighborhoods across the bacterial and archaeal tree of life. Our server offers users the ability to identify, compare, and visualize gene neighborhoods of interest from 30 238 bacterial genomes and 1672 archaeal genomes, through integration with the comprehensive Genome Taxonomy Database and AnnoTree databases. Identified gene neighborhoods can be visualized using pre-computed functional annotations from different sources such as KEGG, Pfam and TIGRFAM, or clustered based on similarity. Alternatively, users can upload and explore their own custom genomic datasets in GBK, GFF or CSV format, or use AnnoView as a genome browser for relatively small genomes (e.g. viruses and plasmids). Ultimately, we anticipate that AnnoView will catalyze biological discovery by enabling user-friendly search, comparison, and visualization of genomic data. AnnoView is available at http://annoview.uwaterloo.ca.
Affiliation(s)
- Xin Wei
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - Huagang Tan
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - Briallen Lobb
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - William Zhen
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - Zijing Wu
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - Donovan H Parks
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Brisbane, Australia
| | - Josh D Neufeld
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| | - Gabriel Moreno-Hagelsieb
- Department of Biology, Wilfrid Laurier University, 75 University Avenue West, Waterloo, ON, Canada
| | - Andrew C Doxey
- Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
| |
|
18
|
Sengupta P, Muthamilselvi Sivabalan SK, Singh NK, Raman K, Venkateswaran K. Genomic, functional, and metabolic enhancements in multidrug-resistant Enterobacter bugandensis facilitating its persistence and succession in the International Space Station. MICROBIOME 2024; 12:62. [PMID: 38521963 PMCID: PMC10960378 DOI: 10.1186/s40168-024-01777-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 02/08/2024] [Indexed: 03/25/2024]
Abstract
BACKGROUND The International Space Station (ISS) stands as a testament to human achievement in space exploration. Despite its highly controlled environment, characterised by microgravity, increased CO2 levels, and elevated solar radiation, microorganisms occupy a unique niche. These microbial inhabitants play a significant role in influencing the health and well-being of astronauts on board. One microorganism of particular interest in our study is Enterobacter bugandensis, primarily found in clinical specimens including the human gastrointestinal tract, and also reported to possess pathogenic traits, leading to a plethora of infections. RESULTS Distinct from their Earth counterparts, ISS E. bugandensis strains have exhibited resistance mechanisms that categorise them within the ESKAPE pathogen group, a collection of pathogens recognised for their formidable resistance to antimicrobial treatments. During the 2-year Microbial Tracking 1 mission, 13 strains of multidrug-resistant E. bugandensis were isolated from various locations within the ISS. We have carried out a comprehensive study to understand the genomic intricacies of ISS-derived E. bugandensis in comparison to terrestrial strains, with a keen focus on those associated with clinical infections. We unravel the evolutionary trajectories of pivotal genes, especially those contributing to functional adaptations and potential antimicrobial resistance. A hypothesis central to our study was that the singular nature of the stresses of the space environment, distinct from any on Earth, could be driving these genomic adaptations. Extending our investigation, we meticulously mapped the prevalence and distribution of E. bugandensis across the ISS over time. This temporal analysis provided insights into the persistence, succession, and potential patterns of colonisation of E. bugandensis in space. 
Furthermore, by leveraging advanced analytical techniques, including metabolic modelling, we delved into the coexisting microbial communities alongside E. bugandensis in the ISS across multiple missions and spatial locations. This exploration revealed intricate microbial interactions, offering a window into the microbial ecosystem dynamics within the ISS. CONCLUSIONS Our comprehensive analysis illuminated not only the ways these interactions sculpt microbial diversity but also the factors that might contribute to the potential dominance and succession of E. bugandensis within the ISS environment. The implications of these findings are twofold. Firstly, they shed light on microbial behaviour, adaptation, and evolution in extreme, isolated environments. Secondly, they underscore the need for robust preventive measures, ensuring the health and safety of astronauts by mitigating risks associated with potential pathogenic threats.
Affiliation(s)
- Pratyay Sengupta
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
- Center for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
| | | | - Nitin Kumar Singh
- NASA Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr, Pasadena, 91109, CA, USA
| | - Karthik Raman
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India.
- Center for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India.
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India.
- Wadhwani School of Data Science and AI, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India.
| | - Kasthuri Venkateswaran
- NASA Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr, Pasadena, 91109, CA, USA.
| |
|
19
|
Tavis S, Hettich RL. Multi-Omics integration can be used to rescue metabolic information for some of the dark region of the Pseudomonas putida proteome. BMC Genomics 2024; 25:267. [PMID: 38468234 PMCID: PMC10926591 DOI: 10.1186/s12864-024-10082-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 02/02/2024] [Indexed: 03/13/2024] Open
Abstract
In every omics experiment, genes or their products are identified for which even state of the art tools are unable to assign a function. In the biotechnology chassis organism Pseudomonas putida, these proteins of unknown function make up 14% of the proteome. This missing information can bias analyses since these proteins can carry out functions which impact the engineering of organisms. As a consequence of predicting protein function across all organisms, function prediction tools generally fail to use all of the types of data available for any specific organism, including protein and transcript expression information. Additionally, the release of Alphafold predictions for all Uniprot proteins provides a novel opportunity for leveraging structural information. We constructed a bespoke machine learning model to predict the function of recalcitrant proteins of unknown function in Pseudomonas putida based on these sources of data, which annotated 1079 terms to 213 proteins. Among the predicted functions supplied by the model, we found evidence for a significant overrepresentation of nitrogen metabolism and macromolecule processing proteins. These findings were corroborated by manual analyses of selected proteins which identified, among others, a functionally unannotated operon that likely encodes a branch of the shikimate pathway.
Affiliation(s)
- Steven Tavis
- Genome Science and Technology Graduate Program, University of Tennessee Knoxville, Knoxville, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Robert L Hettich
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
| |
|
20
|
Koutsandreas T, Felden B, Chevet E, Chatziioannou A. Protein homeostasis imprinting across evolution. NAR Genom Bioinform 2024; 6:lqae014. [PMID: 38486886 PMCID: PMC10939379 DOI: 10.1093/nargab/lqae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 10/07/2023] [Accepted: 01/24/2024] [Indexed: 03/17/2024] Open
Abstract
Protein homeostasis (a.k.a. proteostasis) is associated with the primary functions of life, and therefore with evolution. However, it is unclear how cellular proteostasis machines have evolved to adjust protein biogenesis needs to environmental constraints. Herein, we describe a novel computational approach, based on semantic network analysis, to evaluate proteostasis plasticity during evolution. We show that the molecular components of the proteostasis network (PN) are reliable metrics to deconvolute the life forms into Archaea, Bacteria and Eukarya and to assess the evolution rates among species. Semantic graphs were used as new criteria to evaluate PN complexity in 93 Eukarya, 250 Bacteria and 62 Archaea, thus representing a novel strategy for taxonomic classification, which provided information about species divergence. Kingdom-specific PN components were identified, suggesting that PN complexity may correlate with evolution. We found that the gains that occurred throughout PN evolution revealed a dichotomy within both the PN conserved modules and within kingdom-specific modules. Additionally, many of these components contribute to the evolutionary imprinting of other conserved mechanisms. Finally, the current study suggests a new way to exploit the genomic annotation of biomedical ontologies, deriving new knowledge from the semantic comparison of different biological systems.
Affiliation(s)
- Thodoris Koutsandreas
- Center of Systems Biology, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- e-NIOS Applications PC, Kallithea-Athens, Greece
| | - Brice Felden
- University of Rennes, INSERM U1230, Rennes, France
| | - Eric Chevet
- INSERM U1242, University of Rennes, Rennes, France
- Centre de Lutte Contre le Cancer Eugène Marquis, Rennes, France
| | - Aristotelis Chatziioannou
- Center of Systems Biology, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- e-NIOS Applications PC, Kallithea-Athens, Greece
| |
|
21
|
Rodríguez Del Río Á, Giner-Lamia J, Cantalapiedra CP, Botas J, Deng Z, Hernández-Plaza A, Munar-Palmer M, Santamaría-Hernando S, Rodríguez-Herva JJ, Ruscheweyh HJ, Paoli L, Schmidt TSB, Sunagawa S, Bork P, López-Solanilla E, Coelho LP, Huerta-Cepas J. Functional and evolutionary significance of unknown genes from uncultivated taxa. Nature 2024; 626:377-384. [PMID: 38109938 PMCID: PMC10849945 DOI: 10.1038/s41586-023-06955-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 02/01/2022] [Accepted: 12/08/2023] [Indexed: 12/20/2023]
Abstract
Many of the Earth's microbes remain uncultured and understudied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies1. Here we analysed 149,842 environmental genomes from multiple habitats2-6 and compiled a curated catalogue of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalogue is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes and orders, probably representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework that we used for experimental validation of a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.
Affiliation(s)
- Álvaro Rodríguez Del Río
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Joaquín Giner-Lamia
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
  - Departamento de Bioquímica Vegetal y Biología Molecular, Facultad de Biología, Instituto de Bioquímica Vegetal y Fotosíntesis (IBVF), Universidad de Sevilla-CSIC, Seville, Spain
- Carlos P Cantalapiedra
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Jorge Botas
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Ziqi Deng
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Ana Hernández-Plaza
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Martí Munar-Palmer
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Saray Santamaría-Hernando
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- José J Rodríguez-Herva
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
- Hans-Joachim Ruscheweyh
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
- Lucas Paoli
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
- Thomas S B Schmidt
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Shinichi Sunagawa
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
- Peer Bork
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Max Delbrück Centre for Molecular Medicine, Berlin, Germany
  - Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
- Emilia López-Solanilla
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
- Luis Pedro Coelho
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Shanghai, China
  - Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, Queensland, Australia
- Jaime Huerta-Cepas
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
22
Noirungsee N, Changkhong S, Phinyo K, Suwannajak C, Tanakul N, Inwongwan S. Genome-scale metabolic modelling of extremophiles and its applications in astrobiological environments. Environ Microbiol Rep 2024; 16:e13231. [PMID: 38192220 PMCID: PMC10866088 DOI: 10.1111/1758-2229.13231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 12/19/2023] [Indexed: 01/10/2024]
Abstract
Metabolic modelling approaches have become powerful tools in modern biology. These mathematical models are widely used to predict metabolic phenotypes of the organisms or communities of interest, and to identify metabolic targets in metabolic engineering. Apart from a broad range of industrial applications, the possibility of using metabolic modelling in the context of astrobiology is poorly explored. In this mini-review, we consolidate the concepts and applications of metabolic modelling in studying organisms in space-related environments, specifically extremophilic microbes. We recapitulate the current state of the art in metabolic modelling approaches and their advantages in the astrobiological context. Our review encompasses the applications of metabolic modelling in the theoretical investigation of the origin of life within prebiotic environments, as well as a compilation of existing uses of genome-scale metabolic models of extremophiles. Furthermore, we emphasize the current challenges associated with applying this technique in extreme environments, and conclude this review by discussing the potential implementation of metabolic models to explore theoretically optimal metabolic networks under various space conditions. Through this mini-review, our aim is to highlight the potential of metabolic modelling in advancing the study of astrobiology.
Affiliation(s)
- Nuttapol Noirungsee
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
  - Research Center of Microbial Diversity and Sustainable Utilizations, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
- Sakunthip Changkhong
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
  - Department of Thoracic Surgery, University Hospital Zurich, Zurich, Switzerland
- Kittiya Phinyo
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
  - Research group on Earth-Space Ecology (ESE), Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
  - Office of Research Administration, Chiang Mai University, Chiang Mai, Thailand
- Nahathai Tanakul
  - National Astronomical Research Institute of Thailand, Chiang Mai, Thailand
- Sahutchai Inwongwan
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
  - Research Center of Microbial Diversity and Sustainable Utilizations, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
23
Verhoeven MD, Nielsen PH, Dueholm MKD. Amplicon-guided isolation and cultivation of previously uncultured microbial species from activated sludge. Appl Environ Microbiol 2023; 89:e0115123. [PMID: 38051071 PMCID: PMC10734543 DOI: 10.1128/aem.01151-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 10/23/2023] [Indexed: 12/07/2023] Open
Abstract
IMPORTANCE Biological wastewater treatment relies on complex microbial communities that assimilate nutrients and break down pollutants in the wastewater. Knowledge about the physiology and metabolism of bacteria in wastewater treatment plants (WWTPs) may therefore be used to improve the efficacy and economy of wastewater treatment. Our current knowledge is largely based on 16S rRNA gene amplicon profiling, fluorescence in situ hybridization studies, and predictions based on metagenome-assembled genomes. Bacterial isolates are often required to validate genome-based predictions as they allow researchers to analyze a specific species without interference from other bacteria and with simple bulk measurements. Unfortunately, there are currently very few pure cultures representing the microbes commonly found in WWTPs. To address this, we introduce an isolation strategy that takes advantage of state-of-the-art microbial profiling techniques to uncover suitable growth conditions for key WWTP microbes. We furthermore demonstrate that this information can be used to isolate key organisms representing global WWTPs.
Affiliation(s)
- Maarten D. Verhoeven
  - Department of Chemistry and Bioscience, Center for Microbial Communities, Aalborg University, Aalborg, Denmark
- Per H. Nielsen
  - Department of Chemistry and Bioscience, Center for Microbial Communities, Aalborg University, Aalborg, Denmark
- Morten K. D. Dueholm
  - Department of Chemistry and Bioscience, Center for Microbial Communities, Aalborg University, Aalborg, Denmark
24
Piton G, Allison SD, Bahram M, Hildebrand F, Martiny JBH, Treseder KK, Martiny AC. Life history strategies of soil bacterial communities across global terrestrial biomes. Nat Microbiol 2023; 8:2093-2102. [PMID: 37798477 DOI: 10.1038/s41564-023-01465-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 07/22/2022] [Accepted: 08/08/2023] [Indexed: 10/07/2023]
Abstract
The life history strategies of soil microbes determine their metabolic potential and their response to environmental changes. Yet these strategies remain poorly understood. Here we use shotgun metagenomes from terrestrial biomes to characterize overarching covariations of the genomic traits that capture dominant life history strategies in bacterial communities. The emerging patterns show a triangle of life history strategies shaped by two trait dimensions, supporting previous theoretical and isolate-based studies. The first dimension ranges from streamlined genomes with simple metabolisms to larger genomes and expanded metabolic capacities. As metabolic capacities expand, bacterial communities increasingly differentiate along a second dimension that reflects a trade-off between increasing capacities for environmental responsiveness or for nutrient recycling. Random forest analyses show that soil pH, C:N ratio and precipitation patterns together drive the dominant life history strategy of soil bacterial communities and their biogeographic distribution. Our findings provide a trait-based framework to compare life history strategies of soil bacteria.
Affiliation(s)
- Gabin Piton
  - Department of Earth System Science, University of California, Irvine, Irvine, CA, USA
  - Eco&Sols, University Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
- Steven D Allison
  - Department of Earth System Science, University of California, Irvine, Irvine, CA, USA
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Mohammad Bahram
  - Department of Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden
  - Institute of Ecology and Earth Sciences, University of Tartu, Tartu, Estonia
- Falk Hildebrand
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich Research Park, Norwich, Norfolk, UK
  - Digital Biology, Earlham Institute, Norwich Research Park, Norwich, Norfolk, UK
- Jennifer B H Martiny
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Kathleen K Treseder
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Adam C Martiny
  - Department of Earth System Science, University of California, Irvine, Irvine, CA, USA
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
25
Awori RM, Hendre P, Amugune NO. The genome of a steinernematid-associated Pseudomonas piscis bacterium encodes the biosynthesis of insect toxins. Access Microbiol 2023; 5:000659.v3. [PMID: 37970093 PMCID: PMC10634486 DOI: 10.1099/acmi.0.000659.v3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 09/15/2023] [Indexed: 11/17/2023] Open
Abstract
Several species of soil-dwelling Steinernema nematodes are used in the biocontrol of crop pests, due to their natural capacity to kill diverse lepidopteran species. Although this insect-killing trait is known to be augmented by the nematodes' Xenorhabdus endosymbionts, the role of other steinernematid-associated bacterial genera in the nematode lifecycle remains unclear. This genomic study aimed to determine the potential of Pseudomonas piscis to contribute to the entomopathogenicity of its Steinernema host. Insect larvae were infected with three separate Steinernema cultures. From each of the three treatments, the prevalent bacteria in the haemocoel of cadavers, four days post-infection, were isolated. These three bacterial isolates were morphologically characterised. DNA was extracted from each of the three bacterial isolates and used for long-read genome sequencing and assembly. Assemblies were used to delineate species and identify genes that encode insect toxins, antimicrobials, and confer antibiotic resistance. We assembled three complete genomes. Through digital DNA-DNA hybridisation analyses, we ascertained that the haemocoels of insect cadavers previously infected with Steinernema sp. Kalro, Steinernema sp. 75, and Steinernema sp. 97 were dominated by Xenorhabdus griffiniae Kalro, Pseudomonas piscis 75, and X. griffiniae 97, respectively. X. griffiniae Kalro and X. griffiniae 97 formed a subspecies with other X. griffiniae symbionts of steinernematids from Kenya. P. piscis 75 phylogenetically clustered with pseudomonads that are characterised by high insecticidal activity. The P. piscis 75 genome encoded the production pathway of insect toxins such as orfamides and rhizoxins, antifungals such as pyrrolnitrin and pyoluteorin, and the broad-spectrum antimicrobial 2,4-diacetylphloroglucinol. The P. piscis 75 genome encoded resistance to over ten classes of antibiotics, including cationic lipopeptides. Steinernematid-associated P. piscis bacteria hence have the biosynthetic potential to contribute to nematode entomopathogenicity.
Affiliation(s)
- Ryan Musumba Awori
  - Elakistos Biosciences, P. O. Box 19301-00100, Nairobi, Kenya
  - International Centre for Research on Agroforestry, P. O. Box 30677-00100, Nairobi, Kenya
- Prasad Hendre
  - International Centre for Research on Agroforestry, P. O. Box 30677-00100, Nairobi, Kenya
- Nelson O. Amugune
  - Department of Biology, University of Nairobi, P. O. Box 30197-00100, Nairobi, Kenya
26
Xue B, Rhee SY. Status of genome function annotation in model organisms and crops. Plant Direct 2023; 7:e499. [PMID: 37426891 PMCID: PMC10326244 DOI: 10.1002/pld3.499] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/14/2023] [Revised: 04/21/2023] [Accepted: 05/08/2023] [Indexed: 07/11/2023]
Abstract
Since the entry into genome-enabled biology several decades ago, much progress has been made in determining, describing, and disseminating the functions of genes and their products. Yet, this information is still difficult to access for many scientists and for most genomes. To provide easy access and a graphical summary of the status of genome function annotation for model organisms and bioenergy and food crop species, we created a web application (https://genomeannotation.rheelab.org) to visualize, search, and download genome annotation data for 28 species. The summary graphics and data tables will be updated semi-annually, and snapshots will be archived to provide a historical record of the progress of genome function annotation efforts. Clear and simple visualization of up-to-date genome function annotation status, including the extent of what is unknown, will help address the grand challenge of elucidating the functions of all genes in organisms.
Affiliation(s)
- Bo Xue
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, California, USA
  - Present address: Plant Resilience Institute, Michigan State University, East Lansing, MI 4882
- Seung Y. Rhee
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, California, USA
  - Present address: Plant Resilience Institute, Michigan State University, East Lansing, MI 4882
27
Shan X, Goyal A, Gregor R, Cordero OX. Annotation-free discovery of functional groups in microbial communities. Nat Ecol Evol 2023; 7:716-724. [PMID: 36997739 DOI: 10.1038/s41559-023-02021-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/09/2022] [Accepted: 02/16/2023] [Indexed: 04/01/2023]
Abstract
Recent studies have shown that microbial communities are composed of groups of functionally cohesive taxa whose abundance is more stable and better-associated with metabolic fluxes than that of any individual taxon. However, identifying these functional groups in a manner that is independent of error-prone functional gene annotations remains a major open problem. Here we tackle this structure-function problem by developing a novel unsupervised approach that coarse-grains taxa into functional groups, solely on the basis of the patterns of statistical variation in species abundances and functional read-outs. We demonstrate the power of this approach on three distinct datasets. On data of replicate microcosms with heterotrophic soil bacteria, our unsupervised algorithm recovered experimentally validated functional groups that divide metabolic labour and remain stable despite large variation in species composition. When leveraged against the ocean microbiome data, our approach discovered a functional group that combines aerobic and anaerobic ammonia oxidizers whose summed abundance tracks closely with nitrate concentrations in the water column. Finally, we show that our framework can enable the detection of species groups that are probably responsible for the production or consumption of metabolites abundant in animal gut microbiomes, serving as a hypothesis-generating tool for mechanistic studies. Overall, this work advances our understanding of structure-function relationships in complex microbiomes and provides a powerful approach to discover functional groups in an objective and systematic manner.
Affiliation(s)
- Xiaoyu Shan
  - Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Akshit Goyal
  - Physics of Living Systems, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Rachel Gregor
  - Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Otto X Cordero
  - Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
28
A Genome-Scale Metabolic Model of Marine Heterotroph Vibrio splendidus Strain 1A01. mSystems 2023; 8:e0037722. [PMID: 36853050 PMCID: PMC10134806 DOI: 10.1128/msystems.00377-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023] Open
Abstract
While Vibrio splendidus is best known as an opportunistic pathogen in oysters, Vibrio splendidus strain 1A01 was first identified as an early colonizer of synthetic chitin particles incubated in seawater. To gain a better understanding of its metabolism, a genome-scale metabolic model (GSMM) of V. splendidus 1A01 was reconstructed. GSMMs enable us to simulate all metabolic reactions in a bacterial cell using flux balance analysis. A draft model was built using an automated pipeline from BioCyc. Manual curation was then performed based on experimental data, in part by gap-filling metabolic pathways and tailoring the model's biomass reaction to V. splendidus 1A01. The challenges of building a metabolic model for a marine microorganism like V. splendidus 1A01 are described. IMPORTANCE A genome-scale metabolic model of V. splendidus 1A01 was reconstructed in this work. We offer solutions to the technical problems associated with model reconstruction for a marine bacterial strain like V. splendidus 1A01, which arise largely from the high salt concentration found in both seawater and culture media that simulate seawater.
29
Ding G, Mugume Y, Dueñas ME, Lee YJ, Liu M, Nettleton DS, Zhao X, Li L, Bassham DC, Nikolau BJ. Biological insights from multi-omics analysis strategies: Complex pleotropic effects associated with autophagy. Front Plant Sci 2023; 14:1093358. [PMID: 36875559 PMCID: PMC9978356 DOI: 10.3389/fpls.2023.1093358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 01/30/2023] [Indexed: 06/18/2023]
Abstract
Research strategies that combine molecular data from multiple levels of genome expression (i.e., multi-omics data), often referred to as a systems biology strategy, have been advocated as a route to discovering gene functions. In this study we conducted an evaluation of this strategy by combining lipidomics, metabolite mass-spectral imaging and transcriptomics data from leaves and roots in response to mutations in two AuTophaGy-related (ATG) genes of Arabidopsis. Autophagy is an essential cellular process that degrades and recycles macromolecules and organelles, and this process is blocked in the atg7 and atg9 mutants that were the focus of this study. Specifically, we quantified abundances of ~100 lipids, imaged the cellular locations of ~15 lipid molecular species and measured the relative abundance of ~26,000 transcripts from leaf and root tissues of WT, atg7 and atg9 mutant plants, grown in either normal (nitrogen-replete) or autophagy-inducing (nitrogen-deficient) conditions. The multi-omics data enabled detailed molecular depiction of the effect of each mutation, and a comprehensive physiological model to explain the consequence of these genetic and environmental changes in autophagy is greatly facilitated by the a priori knowledge of the exact biochemical function of the ATG7 and ATG9 proteins.
Affiliation(s)
- Geng Ding
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, United States
- Yosia Mugume
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA, United States
- Young Jin Lee
  - Department of Chemistry, Iowa State University, Ames, IA, United States
- Meiling Liu
  - Department of Statistics, Iowa State University, Ames, IA, United States
- Xuefeng Zhao
  - Research Information Technology, College of Liberal Arts & Sciences, Iowa State University, Ames, IA, United States
- Ling Li
  - Department of Biological Sciences, Mississippi State University, Mississippi State, MS, United States
- Diane C. Bassham
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA, United States
- Basil J. Nikolau
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, United States
30
Abstract
The intestinal lining is protected by a mucous barrier composed predominantly of complex carbohydrates. Gut microbes employ diverse glycoside hydrolases (GHs) to liberate mucosal sugars as a nutrient source to facilitate host colonization. Intensive catabolism of mucosal glycans, however, may contribute to barrier erosion, pathogen encroachment, and inflammation. Sialic acid is an acidic sugar featured at terminal positions of host glycans. Characterized sialidases from the microbiome belong to the GH33 family, according to CAZy (Carbohydrate-Active enZYmes Database). In 2018 a functional metagenomics screen using thermal spring DNA uncovered the founding member of the GH156 sialidase family, the presence of which has yet to be reported in the context of the human microbiome. A subset of GH156 sequences from the CAZy database containing key sialidase residues was used to build a hidden Markov model. HMMsearch against public databases revealed ~10× more putative GH156 sialidases than currently cataloged by CAZy. Represented phyla include Bacteroidota, Verrucomicrobiota, and Firmicutes_A from human microbiomes, all of which play notable roles in carbohydrate fermentation. Analyses of metagenomic data sets revealed that GH156s are frequently encoded in metagenomes, with a greater variety and abundance of GH156 genes observed in traditional hunter-gatherer or agriculturalist societies than in industrialized societies, particularly relative to individuals with inflammatory bowel disease (IBD). Nineteen GH156s were recombinantly expressed and assayed for sialidase activity. The five GH156 sialidases identified here share limited sequence identity to each other or the founding GH156 family member and are representative of a large subset of the family. IMPORTANCE Sialic acids occupy terminal positions of human glycans where they act as receptors for microbes, toxins, and immune signaling molecules. 
Microbial enzymes that remove sialic acids, sialidases, are abundant in the human microbiome where they may contribute to shaping the microbiota community structure or contribute to pathology. Furthermore, sialidases have proven to hold therapeutic potential for cancer therapy. Here, we examined the sequence space of a sialidase family of enzymes, GH156, previously unknown in the human gut environment. Our analyses suggest that human populations with disparate dietary practices harbor distinct varieties and abundances of GH156-encoding genes. Furthermore, we demonstrate the sialidase activity of 5 gut-derived GH156s. These results expand the diversity of sialidases that may contribute to host glycan degradation, and these sequences may have biotechnological or clinical utility.
31
Exploring Bacterial Attributes That Underpin Symbiont Life in the Monogastric Gut. Appl Environ Microbiol 2022; 88:e0112822. [PMID: 36036591 PMCID: PMC9499014 DOI: 10.1128/aem.01128-22] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
The large bowel of monogastric animals, such as that of humans, is home to a microbial community (microbiota) composed of a diversity of mostly bacterial species. Interrelationships between the microbiota as an entity and the host are complex and lifelong and are characteristic of a symbiosis. The relationships may be disrupted in association with disease, resulting in dysbiosis. Modifications to the microbiota to correct dysbiosis require knowledge of the fundamental mechanisms by which symbionts inhabit the gut. This review aims to summarize aspects of niche fitness of bacterial species that inhabit the monogastric gut, especially of humans, and to indicate the research path by which progress can be made in exploring bacterial attributes that underpin symbiont life in the gut.
32
Impact of Clarified Apple Juices with Different Processing Methods on Gut Microbiota and Metabolomics of Rats. Nutrients 2022; 14:nu14173488. [PMID: 36079746 PMCID: PMC9460580 DOI: 10.3390/nu14173488] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 11/16/2022] Open
Abstract
The consumption of processed foods has increased compared to that of fresh foods in recent years, especially due to the coronavirus disease 2019 pandemic. Here, we evaluated the health effects of clarified apple juices (CAJs, devoid of pectin and additives) processed to different degrees, including not-from-concentrate (NFC) and from-concentrate (FC) CAJs. A 56-day experiment including a juice-switch after 28 days was designed. An integrated analysis of 16S rRNA sequencing and untargeted metabolomics of cecal content was performed. In addition, differences in the CAJs tested with respect to nutritional indices and composition of small-molecule compounds were analyzed. The NFC CAJ, which showed a higher phenolic content resulting from the lower processing degree, could improve microbiota diversity and influence its structure. It also reduced bile acid and bilirubin contents and inhibited the microbial metabolism of tryptophan in the gut. However, when we extended the experiment and switched juices, we found that these effects diminished with time. Our study provides evidence regarding the health effects of processed foods that can potentially be applied to public health policy decision making. We believe that NFC juices with a lower processing degree could potentially be healthier than FC juice.
33
de Crécy-lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022; 2022:baac062. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]
Abstract
Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Affiliation(s)
- Valérie de Crécy-lagard
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Cecilia Arighi
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
| | - Jill Babor
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Ian Blaby
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Crysten Blaby-Haas
- Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
| | - Alan J Bridge
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Stephen K Burley
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Stacey Cleveland
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Lucy J Colwell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Ana Conesa
- Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
| | - Christian Dallago
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
| | - Antoine Danchin
- School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
| | - Anita de Waard
- Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
| | - Adam Deutschbauer
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Raquel Dias
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Yousong Ding
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
| | - Gang Fang
- NYU-Shanghai, Shanghai 200120, China
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - John Gerlt
- Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Joshua Goldford
- Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Mark Gorelik
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Benjamin M Gyori
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Christopher Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
| | - Geoffrey Hutinet
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Marshall Jaroch
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
| | | | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Aron Marchler-Bauer
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Maria-Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Claire McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
| | - Gaurav D Moghe
- Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Paul Monaghan
- Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
| | - Anne Morgat
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Christopher J Mungall
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Darren A Natale
- Georgetown University Medical Center, Washington, DC 20007, USA
| | - William C Nelson
- Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA
| | - Seán O’Donoghue
- School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia
| | - Christine Orengo
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | | | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Colbie Reed
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Dmitri Rodionov
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
| | - Irina A Rodionova
- Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA
| | - Jeffrey D Rudolf
- Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
| | - Lana Saleh
- New England Biolabs, Ipswich, MA 01938, USA
| | - Gloria Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
| | - Peter Uetz
- Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - David Vallenet
- LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France
| | - Erica Watson Carter
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
| | | | - Valerie Wood
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Elisha M Wood-Charlson
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jin Xu
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
|
34
|
Hoarfrost A, Aptekmann A, Farfañuk G, Bromberg Y. Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter. Nat Commun 2022; 13:2606. [PMID: 35545619 PMCID: PMC9095714 DOI: 10.1038/s41467-022-30070-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/20/2021] [Accepted: 03/30/2022] [Indexed: 12/22/2022] Open
Abstract
The majority of microbial genomes have yet to be cultured, and most proteins identified in microbial genomes or environmental sequences cannot be functionally annotated. As a result, current computational approaches to describe microbial systems rely on incomplete reference databases that cannot adequately capture the functional diversity of the microbial tree of life, limiting our ability to model high-level features of biological sequences. Here we present LookingGlass, a deep learning model encoding contextually-aware, functionally and evolutionarily relevant representations of short DNA reads, that distinguishes reads of disparate function, homology, and environmental origin. We demonstrate the ability of LookingGlass to be fine-tuned via transfer learning to perform a range of diverse tasks: to identify novel oxidoreductases, to predict enzyme optimal temperature, and to recognize the reading frames of DNA sequence fragments. LookingGlass enables functionally relevant representations of otherwise unknown and unannotated sequences, shedding light on the microbial dark matter that dominates life on Earth. Computational methods to analyse microbial systems rely on reference databases which do not capture their full functional diversity. Here the authors develop a deep learning model and apply it using transfer learning, creating biologically useful models for multiple different tasks.
Affiliation(s)
- A Hoarfrost
- Department of Marine and Coastal Sciences, Rutgers University, 71 Dudley Road, New Brunswick, NJ, 08873, USA
- NASA Ames Research Center, Moffett Field, CA, 94035, USA
| | - A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ, 08901, USA
| | - G Farfañuk
- Department of Biological Chemistry, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ, 08901, USA.
|
35
|
Yang MR, Wu YW. Enhancing predictions of antimicrobial resistance of pathogens by expanding the potential resistance gene repertoire using a pan-genome-based feature selection approach. BMC Bioinformatics 2022; 23:131. [PMID: 35428201 PMCID: PMC9011928 DOI: 10.1186/s12859-022-04666-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 04/04/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Predicting which pathogens might exhibit antimicrobial resistance (AMR) based on genomics data is one of the promising ways to swiftly and precisely identify AMR pathogens. Currently, the most widely used genomics approach is through identifying known AMR genes from genomic information in order to predict whether a pathogen might be resistant to certain antibiotic drugs. The list of known AMR genes, however, is still far from comprehensive and may result in inaccurate AMR pathogen predictions. We thus felt the need to expand the AMR gene set and proposed a pan-genome-based feature selection method to identify potential gene sets for AMR prediction purposes.
Results: By building pan-genome datasets and extracting gene presence/absence patterns from four bacterial species, each with more than 2000 strains, we showed that machine learning models built from pan-genome data can be very promising for predicting AMR pathogens. The gene set selected by the eXtreme Gradient Boosting (XGBoost) feature selection approach further improved prediction outcomes, and an incremental approach selecting subsets of XGBoost-selected features brought the machine learning model performance to the next level. Investigating selected gene sets revealed that on average about 50% of genes had no known function and very few of them were known AMR genes, indicating the potential of the selected gene sets to expand resistance gene repertoires.
Conclusions: We demonstrated that a pan-genome-based feature selection approach is suitable for building machine learning models for predicting AMR pathogens. The extracted gene sets may provide future clues to expand our knowledge of known AMR genes and provide novel hypotheses for inferring bacterial AMR mechanisms.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04666-2.
|
36
|
Schwengers O, Jelonek L, Dieckmann MA, Beyvers S, Blom J, Goesmann A. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom 2021; 7:000685. [PMID: 34739369 PMCID: PMC8743544 DOI: 10.1099/mgen.0.000685] [Citation(s) in RCA: 323] [Impact Index Per Article: 80.8] [Received: 07/01/2021] [Accepted: 09/08/2021] [Indexed: 12/21/2022] Open
Abstract
Command-line annotation software tools have continuously gained popularity compared to centralized online services due to the worldwide increase of sequenced bacterial genomes. However, results of existing command-line software pipelines heavily depend on taxon-specific databases or sufficiently well annotated reference genomes. Here, we introduce Bakta, a new command-line software tool for the robust, taxon-independent, thorough and, nonetheless, fast annotation of bacterial genomes. Bakta conducts a comprehensive annotation workflow including the detection of small proteins taking into account replicon metadata. The annotation of coding sequences is accelerated via an alignment-free sequence identification approach that in addition facilitates the precise assignment of public database cross-references. Annotation results are exported in GFF3 and International Nucleotide Sequence Database Collaboration (INSDC)-compliant flat files, as well as comprehensive JSON files, facilitating automated downstream analysis. We compared Bakta to other rapid contemporary command-line annotation software tools in both targeted and taxonomically broad benchmarks including isolates and metagenomic-assembled genomes. We demonstrated that Bakta outperforms other tools in terms of functional annotations, the assignment of functional categories and database cross-references, whilst providing comparable wall-clock runtimes. Bakta is implemented in Python 3 and runs on MacOS and Linux systems. It is freely available under a GPLv3 license at https://github.com/oschwengers/bakta. An accompanying web version is available at https://bakta.computational.bio.
Affiliation(s)
- Oliver Schwengers
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Lukas Jelonek
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Marius Alfred Dieckmann
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Sebastian Beyvers
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Jochen Blom
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Alexander Goesmann
- Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
|
37
|
Abstract
Cyclic diguanylate (c-di-GMP) signal transduction systems provide bacteria with the ability to sense changing cell status or environmental conditions and then execute suitable physiological and social behaviors in response. In this review, we provide a comprehensive census of the stimuli and receptors that are linked to the modulation of intracellular c-di-GMP. Emerging evidence indicates that c-di-GMP networks sense light, surfaces, energy, redox potential, respiratory electron acceptors, temperature, and structurally diverse biotic and abiotic chemicals. Bioinformatic analysis of sensory domains in diguanylate cyclases and c-di-GMP-specific phosphodiesterases as well as the receptor complexes associated with them reveals that these functions are linked to a diverse repertoire of protein domain families. We describe the principles of stimulus perception learned from studying these modular sensory devices, illustrate how they are assembled in varied combinations with output domains, and summarize a system for classifying these sensor proteins based on their complexity. Biological information processing via c-di-GMP signal transduction not only is fundamental to bacterial survival in dynamic environments but also is being used to engineer gene expression circuitry and synthetic proteins with à la carte biochemical functionalities.
|
38
|
Cooley NP, Wright ES. Accurate annotation of protein coding sequences with IDTAXA. NAR Genom Bioinform 2021; 3:lqab080. [PMID: 34541527 PMCID: PMC8445202 DOI: 10.1093/nargab/lqab080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Revised: 07/07/2021] [Accepted: 08/25/2021] [Indexed: 11/12/2022] Open
Abstract
The observed diversity of protein coding sequences continues to increase far more rapidly than knowledge of their functions, making classification algorithms essential for assigning a function to proteins using only their sequence. Most pipelines for annotating proteins rely on searches for homologous sequences in databases of previously annotated proteins using BLAST or HMMER. Here, we develop a new approach for classifying proteins into a taxonomy of functions and demonstrate its utility for genome annotation. Our algorithm, IDTAXA, was more accurate than BLAST or HMMER at assigning sequences to KEGG ortholog groups. Moreover, IDTAXA correctly avoided classifying sequences with novel functions to existing groups, which is a common error mode for classification approaches that rely on E-values as a proxy for confidence. We demonstrate IDTAXA's utility for annotating eukaryotic and prokaryotic genomes by assigning functions to proteins within a multi-level ontology and applied IDTAXA to detect genome contamination in eukaryotic genomes. Finally, we re-annotated 8604 microbial genomes with known antibiotic resistance phenotypes to discover two novel associations between proteins and antibiotic resistance. IDTAXA is available as a web tool (http://DECIPHER.codes/Classification.html) or as part of the open source DECIPHER R package from Bioconductor.
Affiliation(s)
- Nicholas P Cooley
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
| | - Erik S Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Center for Evolutionary Biology and Medicine, Pittsburgh, PA 15219, USA
|
39
|
Kinetic, metabolic, and statistical analytics: addressing metabolic transport limitations among organelles and microbial communities. Curr Opin Biotechnol 2021; 71:91-97. [PMID: 34293631 DOI: 10.1016/j.copbio.2021.06.024] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/03/2021] [Revised: 05/24/2021] [Accepted: 06/28/2021] [Indexed: 11/23/2022]
Abstract
Microbial organisms engage in a variety of metabolic interactions. A crucial part of these interactions is the exchange of molecules between different organelles, cells, and the environment. The main forces mediating this metabolic exchange are transporters. This transport can be difficult to measure experimentally because several transport mechanisms remain opaque. However, theoretical calculations about the inputs and outputs of cells via metabolic exchanges have enabled the successful inference of the workings of intra-organismal and inter-organismal systems. Kinetic, metabolic, and statistical modeling approaches in combination with omics data are enhancing our knowledge and understanding about metabolic exchange and mass resource allocation. This model-driven analytics approach can guide effective experimental design and yield new insights into biological function and control.
|
40
|
Westoby M, Gillings MR, Madin JS, Nielsen DA, Paulsen IT, Tetu SG. Trait dimensions in bacteria and archaea compared to vascular plants. Ecol Lett 2021; 24:1487-1504. [PMID: 33896087 DOI: 10.1111/ele.13742] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 10/18/2020] [Revised: 02/25/2021] [Accepted: 03/04/2021] [Indexed: 01/04/2023]
Abstract
Bacteria and archaea have very different ecology compared to plants. One similarity, though, is that much discussion of their ecological strategies has invoked concepts such as oligotrophy or stress tolerance. For plants, so-called 'trait ecology'-strategy description reframed along measurable trait dimensions-has made global syntheses possible. Among widely measured trait dimensions for bacteria and archaea three main axes are evident. Maximum growth rate in association with rRNA operon copy number expresses a rate-yield trade-off that is analogous to the acquisitive-conservative spectrum in plants, though underpinned by different trade-offs. Genome size in association with signal transduction expresses versatility. Cell size has influence on diffusive uptake and on relative wall costs. These trait dimensions, and potentially others, offer promise for interpreting ecology. At the same time, there are very substantial differences from plant trait ecology. Traits and their underpinning trade-offs are different. Also, bacteria and archaea use a variety of different substrates. Bacterial strategies can be viewed both through the facet of substrate-use pathways, and also through the facet of quantitative traits such as maximum growth rate. Preliminary evidence shows the quantitative traits vary widely within substrate-use pathways. This indicates they convey information complementary to substrate use.
Affiliation(s)
- Mark Westoby
- Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia
| | - Michael R Gillings
- Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia
| | - Joshua S Madin
- Hawaii Institute of Marine Biology, University of Hawaii, Kaneohe, HI, USA
| | - Daniel A Nielsen
- Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia
| | - Ian T Paulsen
- Dept of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
| | - Sasha G Tetu
- Dept of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
|
41
|
Klassen L, Xing X, Tingley JP, Low KE, King ML, Reintjes G, Abbott DW. Approaches to Investigate Selective Dietary Polysaccharide Utilization by Human Gut Microbiota at a Functional Level. Front Microbiol 2021; 12:632684. [PMID: 33679661 PMCID: PMC7933471 DOI: 10.3389/fmicb.2021.632684] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/23/2020] [Accepted: 02/01/2021] [Indexed: 12/18/2022] Open
Abstract
The human diet is temporally and spatially dynamic, and influenced by culture, regional food systems, socioeconomics, and consumer preference. Such factors result in enormous structural diversity of ingested glycans that are refractory to digestion by human enzymes. To convert these glycans into metabolizable nutrients and energy, humans rely upon the catalytic potential encoded within the gut microbiome, a rich collective of microorganisms residing in the gastrointestinal tract. The development of high-throughput sequencing methods has enabled microbial communities to be studied with more coverage and depth, and as a result, cataloging the taxonomic structure of the gut microbiome has become routine. Efforts to unravel the microbial processes governing glycan digestion by the gut microbiome, however, are still in their infancy and will benefit from retooling our approaches to study glycan structure at high resolution and adopting next-generation functional methods. Also, new bioinformatic tools specialized for annotating carbohydrate-active enzymes and predicting their functions with high accuracy will be required for deciphering the catalytic potential of sequence datasets. Furthermore, physiological approaches to enable genotype-phenotype assignments within the gut microbiome, such as fluorescent polysaccharides, have enabled rapid identification of carbohydrate interactions at the single cell level. In this review, we summarize the current state of knowledge of these methods and discuss how their continued development will advance our understanding of gut microbiome function.
Affiliation(s)
- Leeann Klassen
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
| | - Xiaohui Xing
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
| | - Jeffrey P. Tingley
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, AB, Canada
| | - Kristin E. Low
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
| | - Marissa L. King
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, AB, Canada
| | - Greta Reintjes
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Max Planck Institute for Marine Microbiology, Bremen, Germany
| | - D. Wade Abbott
- Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB, Canada
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, AB, Canada
|
42
|
Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023] Open
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
| | - Snorre Sulheim
- Bioinformatics Program, Boston University, Boston, MA, USA
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
| | - Eivind Almaas
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
| | - Daniel Segrè
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA.
- Bioinformatics Program, Boston University, Boston, MA, USA.
- Department of Biology and Department of Physics, Boston University, Boston, MA, USA.
|
43
|
Tremblay BJM, Lobb B, Doxey AC. PhyloCorrelate: inferring bacterial gene-gene functional associations through large-scale phylogenetic profiling. Bioinformatics 2021; 37:17-22. [PMID: 33416870 DOI: 10.1093/bioinformatics/btaa1105] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/23/2020] [Revised: 12/26/2020] [Accepted: 12/29/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Statistical detection of co-occurring genes across genomes, known as "phylogenetic profiling", is a powerful bioinformatic technique for inferring gene-gene functional associations. However, this can be a challenging task given the size and complexity of phylogenomic databases, difficulty in accounting for phylogenetic structure, inconsistencies in genome annotation, and substantial computational requirements.
RESULTS: We introduce PhyloCorrelate, a computational framework for gene co-occurrence analysis across large phylogenomic datasets. PhyloCorrelate implements a variety of co-occurrence metrics including standard correlation metrics and model-based metrics that account for phylogenetic history. By combining multiple metrics, we developed an optimized score that exhibits a superior ability to link genes with overlapping GO terms and KEGG pathways, enabling gene function prediction. Using genomic and functional annotation data from the Genome Taxonomy Database and AnnoTree, we performed all-by-all comparisons of gene occurrence profiles across the bacterial tree of life, totaling 154,217,052 comparisons for 28,315 genes across 27,372 bacterial genomes. All predictions are available in an online database, which instantaneously returns the top correlated genes for any PFAM, TIGRFAM, or KEGG query. In total, PhyloCorrelate detected 29,762 high confidence associations between bacterial gene/protein pairs, and generated functional predictions for 834 DUFs and proteins of unknown function.
AVAILABILITY: PhyloCorrelate is available as a web-server at phylocorrelate.uwaterloo.ca as well as an R package for analysis of custom datasets. We anticipate that PhyloCorrelate will be broadly useful as a tool for predicting function and interactions for gene families.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
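The idea of phylogenetic profiling described in this abstract can be illustrated with a minimal sketch. It assumes each gene is represented by the set of genomes in which it occurs and scores gene pairs with the Jaccard index. This is a simple stand-in, not PhyloCorrelate's optimized score, and it ignores phylogenetic structure entirely. The function names, the `min_score` threshold, and the toy gene and genome labels are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of genome IDs (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_cooccurring(profiles, min_score=0.5):
    """Return (gene1, gene2, score) for gene pairs whose profiles overlap strongly.

    `profiles` maps a gene name to the set of genomes in which it is present.
    This is an illustrative all-by-all comparison, not PhyloCorrelate's method.
    """
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        score = jaccard(p1, p2)
        if score >= min_score:
            pairs.append((g1, g2, score))
    # Highest-scoring pairs first; ties keep alphabetical order (stable sort).
    return sorted(pairs, key=lambda t: -t[2])


# Toy presence/absence data (hypothetical labels).
profiles = {
    "geneA": {"genome1", "genome2", "genome3"},
    "geneB": {"genome1", "genome2", "genome3"},
    "geneC": {"genome4"},
    "geneD": {"genome2", "genome3", "genome4"},
}
result = top_cooccurring(profiles)
for g1, g2, score in result:
    print(g1, g2, round(score, 2))
```

In this toy data, geneA and geneB share an identical profile and so rank first. A method that accounts for phylogenetic history would additionally discount co-occurrence that merely reflects shared ancestry among closely related genomes.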
Affiliation(s)
| | - Briallen Lobb
- Department of Biology, 200 University Ave. West, Waterloo, ON, N2L 3G1, Canada
| | - Andrew C Doxey
- Department of Biology, 200 University Ave. West, Waterloo, ON, N2L 3G1, Canada
|
44
|
Gaultney RA, Vincent AT, Lorioux C, Coppée JY, Sismeiro O, Varet H, Legendre R, Cockram CA, Veyrier F, Picardeau M. 4-Methylcytosine DNA modification is critical for global epigenetic regulation and virulence in the human pathogen Leptospira interrogans. Nucleic Acids Res 2020; 48:12102-12115. [PMID: 33301041 PMCID: PMC7708080 DOI: 10.1093/nar/gkaa966] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/21/2020] [Revised: 10/01/2020] [Accepted: 10/13/2020] [Indexed: 12/25/2022] Open
Abstract
In bacteria, DNA methylation can be facilitated by 'orphan' DNA methyltransferases lacking cognate restriction endonucleases, but whether and how these enzymes control key cellular processes are poorly understood. The effects of a specific modification, 4-methylcytosine (4mC), are even less clear, as this epigenetic marker is unique to bacteria and archaea, whereas the bulk of epigenetic research is currently performed on eukaryotes. Here, we characterize a 4mC methyltransferase from the understudied pathogen Leptospira spp. Inactivating this enzyme resulted in complete abrogation of CTAG motif methylation, leading to genome-wide dysregulation of gene expression. Mutants exhibited growth defects, decreased adhesion to host cells, higher susceptibility to LPS-targeting antibiotics, and, importantly, were no longer virulent in an acute infection model. Further investigation resulted in the discovery of at least one gene, that of an ECF sigma factor, whose transcription was altered in the methylase mutant and, subsequently, by mutation of the CTAG motifs in the promoter of the gene. The genes that comprise the regulon of this sigma factor were, accordingly, dysregulated in the methylase mutant and in a strain overexpressing the sigma factor. Our results highlight the importance of 4mC in Leptospira physiology, and suggest the same of other understudied species.
Affiliation(s)
- Antony T Vincent
  - Bacterial Symbionts Evolution, INRS-Centre Armand-Frappier, Laval, Quebec, Canada
- Céline Lorioux
  - Unité Biologie des Spirochètes, Institut Pasteur, Paris, France
- Jean-Yves Coppée
  - Transcriptome and Epigenome Platform, Biomics, Center for Technological Resources and Research (C2RT), Institut Pasteur, Paris, France
- Odile Sismeiro
  - Transcriptome and Epigenome Platform, Biomics, Center for Technological Resources and Research (C2RT), Institut Pasteur, Paris, France
- Hugo Varet
  - Transcriptome and Epigenome Platform, Biomics, Center for Technological Resources and Research (C2RT), Institut Pasteur, Paris, France
  - Bioinformatics and Biostatistics Hub, Department of Computational Biology, USR 3756 CNRS, Institut Pasteur, Paris, France
- Rachel Legendre
  - Transcriptome and Epigenome Platform, Biomics, Center for Technological Resources and Research (C2RT), Institut Pasteur, Paris, France
  - Bioinformatics and Biostatistics Hub, Department of Computational Biology, USR 3756 CNRS, Institut Pasteur, Paris, France
- Frédéric J Veyrier
  - Bacterial Symbionts Evolution, INRS-Centre Armand-Frappier, Laval, Quebec, Canada
|
45
|
Grasso S, van Rij T, van Dijl JM. GP4: an integrated Gram-Positive Protein Prediction Pipeline for subcellular localization mimicking bacterial sorting. Brief Bioinform 2020; 22:5998864. [PMID: 33227814 PMCID: PMC8294519 DOI: 10.1093/bib/bbaa302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/29/2020] [Revised: 10/08/2020] [Accepted: 10/09/2020] [Indexed: 11/17/2022] Open
Abstract
Subcellular localization is a critical aspect of protein function and the potential application of proteins either as drugs or drug targets, or in industrial and domestic applications. However, the experimental determination of protein localization is time consuming and expensive. Therefore, various localization predictors have been developed for particular groups of species. Intriguingly, despite their major representation amongst biotechnological cell factories and pathogens, a meta-predictor based on sorting signals and specific for Gram-positive bacteria was still lacking. Here we present GP4, a protein subcellular localization meta-predictor mainly for Firmicutes, but also Actinobacteria, based on the combination of multiple tools, each specific for different sorting signals and compartments. Novelty elements include improved cell-wall protein prediction, including differentiation of the type of interaction, prediction of non-canonical secretion pathway target proteins, separate prediction of lipoproteins and better user experience in terms of parsability and interpretability of the results. GP4 aims at mimicking protein sorting as it would happen in a bacterial cell. As GP4 is not homology based, it has a broad applicability and does not depend on annotated databases with homologous proteins. Non-canonical usage may include little studied or novel species, synthetic and engineered organisms, and even re-use of the prediction data to develop custom prediction algorithms. Our benchmark analysis highlights the improved performance of GP4 compared to other widely used subcellular protein localization predictors. A webserver running GP4 is available at http://gp4.hpc.rug.nl/
Affiliation(s)
- Jan Maarten van Dijl
  - University of Groningen and the University Medical Center Groningen, the Netherlands
|