1. Li C, Stražar M, Mohamed AMT, Pacheco JA, Walker RL, Lebar T, Zhao S, Lockart J, Dame A, Thurimella K, Jeanfavre S, Brown EM, Ang QY, Berdy B, Sergio D, Invernizzi R, Tinoco A, Pishchany G, Vasan RS, Balskus E, Huttenhower C, Vlamakis H, Clish C, Shaw SY, Plichta DR, Xavier RJ. Gut microbiome and metabolome profiling in Framingham heart study reveals cholesterol-metabolizing bacteria. Cell 2024; 187:1834-1852.e19. [PMID: 38569543 DOI: 10.1016/j.cell.2024.03.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/06/2023] [Revised: 01/23/2024] [Accepted: 03/11/2024] [Indexed: 04/05/2024]
Abstract
Accumulating evidence suggests that cardiovascular disease (CVD) is associated with an altered gut microbiome. Our understanding of the underlying mechanisms has been hindered by lack of matched multi-omic data with diagnostic biomarkers. To comprehensively profile gut microbiome contributions to CVD, we generated stool metagenomics and metabolomics from 1,429 Framingham Heart Study participants. We identified blood lipids and cardiovascular health measurements associated with microbiome and metabolome composition. Integrated analysis revealed microbial pathways implicated in CVD, including flavonoid, γ-butyrobetaine, and cholesterol metabolism. Species from the Oscillibacter genus were associated with decreased fecal and plasma cholesterol levels. Using functional prediction and in vitro characterization of multiple representative human gut Oscillibacter isolates, we uncovered conserved cholesterol-metabolizing capabilities, including glycosylation and dehydrogenation. These findings suggest that cholesterol metabolism is a broad property of phylogenetically diverse Oscillibacter spp., with potential benefits for lipid homeostasis and cardiovascular health.
Affiliation(s)
- Chenhao Li
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Ahmed M T Mohamed
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Tina Lebar
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Shijie Zhao
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Julia Lockart
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrea Dame
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric M Brown
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Qi Yan Ang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Dallis Sergio
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Rachele Invernizzi
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Antonio Tinoco
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Ramachandran S Vasan
- Boston University and NHLBI's Framingham Heart Study, Framingham, MA, USA; Sections of Preventive Medicine and Epidemiology and Cardiology, Department of Medicine, Boston University School of Medicine, Boston, MA, USA; University of Texas School of Public Health, San Antonio, TX, USA
- Emily Balskus
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA; Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- Curtis Huttenhower
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Hera Vlamakis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Clary Clish
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Y Shaw
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Ramnik J Xavier
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
2. Thurimella K, Mohamed AMT, Graham DB, Owens RM, La Rosa SL, Plichta DR, Bacallado S, Xavier RJ. Protein Language Models Uncover Carbohydrate-Active Enzyme Function in Metagenomics. bioRxiv 2023:2023.10.23.563620. [PMID: 37961379 PMCID: PMC10634757 DOI: 10.1101/2023.10.23.563620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
In metagenomics, the pool of uncharacterized microbial enzymes presents a challenge for functional annotation. Among these, carbohydrate-active enzymes (CAZymes) stand out due to their pivotal roles in various biological processes related to host health and nutrition. Here, we present CAZyLingua, the first tool that harnesses protein language model embeddings to build a deep learning framework that facilitates the annotation of CAZymes in metagenomic datasets. Our benchmarking results showed on average a higher F1 score (reflecting an average of precision and recall) on the annotated genomes of Bacteroides thetaiotaomicron, Eggerthella lenta and Ruminococcus gnavus compared to the traditional sequence homology-based method in dbCAN2. We applied our tool to a paired mother/infant longitudinal dataset and revealed unannotated CAZymes linked to microbial development during infancy. When applied to metagenomic datasets derived from patients affected by fibrosis-prone diseases such as Crohn's disease and IgG4-related disease, CAZyLingua uncovered CAZymes associated with disease and healthy states. In each of these metagenomic catalogs, CAZyLingua discovered new annotations that were previously overlooked by traditional sequence homology tools. Overall, the deep learning model CAZyLingua can be applied in combination with existing tools to unravel intricate CAZyme evolutionary profiles and patterns, contributing to a more comprehensive understanding of microbial metabolic dynamics.
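The abstract describes the F1 score as reflecting an average of precision and recall; by its standard definition, F1 is the harmonic mean of the two. The sketch below shows how such a score could be computed when predicted annotations are compared against a reference set. It is an illustration only, not the benchmarking code used for CAZyLingua: the function name, the gene identifiers, and the representation of annotations as (gene, family) pairs are all assumptions made for this example.

```python
def precision_recall_f1(predicted, reference):
    """Compare two collections of (gene_id, family) annotation pairs.

    Hypothetical helper for illustration; not part of CAZyLingua or dbCAN2.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data with made-up gene identifiers.
reference = {("geneA", "GH13"), ("geneB", "GT2"), ("geneD", "CE4"), ("geneE", "PL1")}
predicted = {("geneA", "GH13"), ("geneB", "GT2"), ("geneC", "GH5")}

precision, recall, f1 = precision_recall_f1(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, a tool cannot reach a high F1 by maximizing only precision or only recall.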
Affiliation(s)
- Kumar Thurimella
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
- School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Ahmed M. T. Mohamed
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Daniel B. Graham
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Róisín M. Owens
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
- Sabina Leanti La Rosa
- Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Damian R. Plichta
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Sergio Bacallado
- Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
- Ramnik J. Xavier
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
3. Shaffer M, Thurimella K, Sterrett JD, Lozupone CA. SCNIC: Sparse correlation network investigation for compositional data. Mol Ecol Resour 2023; 23:312-325. [PMID: 36001047 PMCID: PMC9744196 DOI: 10.1111/1755-0998.13704] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2020] [Revised: 08/17/2022] [Accepted: 08/18/2022] [Indexed: 12/14/2022]
Abstract
Microbiome studies are often limited by a lack of statistical power due to small sample sizes and a large number of features. This problem is exacerbated in correlative studies of multi-omic datasets. Statistical power can be increased by finding and summarizing modules of correlated observations, which is one dimensionality reduction method. Additionally, modules provide biological insight as correlated groups of microbes can have relationships among themselves. To address these challenges, we developed SCNIC: Sparse Cooccurrence Network Investigation for compositional data. SCNIC is open-source software that can generate correlation networks and detect and summarize modules of highly correlated features. Modules can be formed using either the Louvain Modularity Maximization (LMM) algorithm or a Shared Minimum Distance algorithm (SMD) that we newly describe here and relate to LMM using simulated data. We applied SCNIC to two published datasets and we achieved increased statistical power and identified microbes that not only differed across groups, but also correlated strongly with each other, suggesting shared environmental drivers or cooperative relationships among them. SCNIC provides an easy way to generate correlation networks, identify modules of correlated features and summarize them for downstream statistical analysis. Although SCNIC was designed considering properties of microbiome data, such as compositionality and sparsity, it can be applied to a variety of data types including metabolomics data and used to integrate multiple data types. SCNIC allows for the identification of functional microbial relationships at scale while increasing statistical power through feature reduction.
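The general idea of building a correlation network, grouping strongly correlated features into modules, and collapsing each module into a single summary feature can be sketched as follows. This is a simplified illustration and not SCNIC's implementation: it uses plain Pearson correlation and connected components of a thresholded network, whereas the abstract names LMM and SMD as the module-detection algorithms, and it ignores compositionality. The function names and the `min_r` threshold are assumptions made for this example.

```python
import numpy as np


def find_modules(table, min_r=0.8):
    """Group features (columns) whose pairwise Pearson r is >= min_r.

    Simplification: modules are connected components of the thresholded
    correlation network, not the LMM or SMD modules described for SCNIC.
    Constant columns produce NaN correlations and are never grouped.
    """
    corr = np.corrcoef(table, rowvar=False)
    n = corr.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] >= min_r:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)


def summarize(table, modules):
    """Replace each module by the sum of its members; keep other features."""
    in_module = {i for module in modules for i in module}
    singles = [i for i in range(table.shape[1]) if i not in in_module]
    columns = [table[:, m].sum(axis=1) for m in modules]
    columns += [table[:, i] for i in singles]
    return np.column_stack(columns)


# Toy table: 5 samples (rows) x 4 features (columns), made-up counts.
table = np.array([
    [1, 2, 5, 3],
    [2, 4, 1, 3],
    [3, 6, 4, 1],
    [4, 8, 2, 5],
    [5, 11, 3, 2],
])
modules = find_modules(table)
reduced = summarize(table, modules)
print(modules, reduced.shape)
```

Collapsing modules reduces the number of features that enter downstream tests, which is the route to increased statistical power that the abstract describes.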
Affiliation(s)
- Michael Shaffer
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Kumar Thurimella
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
- John D. Sterrett
- Department of Integrative Physiology, University of Colorado, Boulder, Colorado, USA
- Catherine A. Lozupone
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
4. Shaffer M, Thurimella K, Quinn K, Doenges K, Zhang X, Bokatzian S, Reisdorph N, Lozupone CA. AMON: annotation of metabolite origins via networks to integrate microbiome and metabolome data. BMC Bioinformatics 2019; 20:614. [PMID: 31779604 PMCID: PMC6883642 DOI: 10.1186/s12859-019-3176-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 12/14/2018] [Accepted: 10/28/2019] [Indexed: 12/26/2022]
Abstract
Background: Untargeted metabolomics of host-associated samples has yielded insights into mechanisms by which microbes modulate health. However, data interpretation is challenged by the complexity of origins of the small molecules measured, which can come from the host, microbes that live within the host, or from other exposures such as diet or the environment. Results: We address this challenge through development of AMON: Annotation of Metabolite Origins via Networks. AMON is an open-source bioinformatics application that can be used to annotate which compounds in the metabolome could have been produced by bacteria present or the host, to evaluate pathway enrichment of host versus microbial metabolites, and to visualize which compounds may have been produced by host versus microbial enzymes in KEGG pathway maps. Conclusions: AMON empowers researchers to predict origins of metabolites via genomic information and to visualize potential host:microbe interplay. Additionally, the evaluation of enrichment of pathway metabolites of host versus microbial origin gives insight into the metabolic functionality that a microbial community adds to a host:microbe system. Through integrated analysis of microbiome and metabolome data, mechanistic relationships between microbial communities and host phenotypes can be better understood.
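The core annotation step, deciding whether a detected compound could have come from host enzymes, microbial enzymes, or both, can be sketched with plain set operations. This is a minimal illustration under stated assumptions and not AMON's code or interface: the function names, the gene-to-compound mapping, and every identifier below are placeholders invented for this example rather than real KEGG entries.

```python
def compounds_from(genes, gene_to_compounds):
    """Union of compounds that the given genes could produce."""
    produced = set()
    for gene in genes:
        produced |= set(gene_to_compounds.get(gene, ()))
    return produced


def annotate_origins(detected, host_genes, microbe_genes, gene_to_compounds):
    """Label each detected compound as host, microbe, both, or unknown.

    Hypothetical helper; a real workflow would derive the mapping from a
    reaction database rather than a hand-written dictionary.
    """
    host = compounds_from(host_genes, gene_to_compounds)
    microbe = compounds_from(microbe_genes, gene_to_compounds)
    labels = {}
    for compound in detected:
        if compound in host and compound in microbe:
            labels[compound] = "both"
        elif compound in host:
            labels[compound] = "host"
        elif compound in microbe:
            labels[compound] = "microbe"
        else:
            # Could reflect diet, environment, or gaps in the mapping.
            labels[compound] = "unknown"
    return labels


# Placeholder identifiers, not real database entries.
gene_to_compounds = {
    "gene_h1": ["cpd_1", "cpd_2"],
    "gene_m1": ["cpd_2", "cpd_3"],
}
labels = annotate_origins(
    detected=["cpd_1", "cpd_2", "cpd_3", "cpd_4"],
    host_genes=["gene_h1"],
    microbe_genes=["gene_m1"],
    gene_to_compounds=gene_to_compounds,
)
print(labels)
```

The "unknown" label matters in practice: as the abstract notes, measured compounds may also come from diet or the environment, so absence from both predicted sets does not imply a measurement error.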
Affiliation(s)
- M Shaffer
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- K Thurimella
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- K Quinn
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- K Doenges
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- X Zhang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA; Present address: BioElectron Technology Corporation, Mountain View, CA, 94043, USA
- S Bokatzian
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- N Reisdorph
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- C A Lozupone
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA