1. Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. MACAW: a method for semi-automatic detection of errors in genome-scale metabolic models. Genome Biol 2025; 26:79. [PMID: 40156030 PMCID: PMC11954327 DOI: 10.1186/s13059-025-03533-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 03/07/2025] [Indexed: 04/01/2025]
Abstract
Genome-scale metabolic models (GSMMs) are used to predict metabolic fluxes, with applications ranging from identifying novel drug targets to engineering microbial metabolism. Erroneous or missing reactions, scattered throughout densely interconnected networks, are a limiting factor in these applications. We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a suite of algorithms that helps to identify and visualize errors at the level of connected pathways, rather than individual reactions. We show how MACAW highlights inaccuracies of varying severity in manually curated and automatically generated GSMMs for humans, yeast, and bacteria and helps to identify systematic issues to be addressed in future model construction efforts.
Affiliation(s)
- Devlin C Moyer
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
- Justin Reimertz
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
- Daniel Segrè
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
  - Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
  - Department of Physics, Boston University, Boston, MA, 02215, USA
  - Bioinformatics Program, Faculty of Computing and Data Science, Boston, MA, 02215, USA
- Juan I Fuxman Bass
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA

2. Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. Semi-Automatic Detection of Errors in Genome-Scale Metabolic Models. bioRxiv 2024:2024.06.24.600481. [PMID: 38979177 PMCID: PMC11230171 DOI: 10.1101/2024.06.24.600481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: Genome-Scale Metabolic Models (GSMMs) are used for numerous tasks requiring computational estimates of metabolic fluxes, from predicting novel drug targets to engineering microbes to produce valuable compounds. A key limiting step in most applications of GSMMs is ensuring their representation of the target organism's metabolism is complete and accurate. Identifying and visualizing errors in GSMMs is complicated by the fact that they contain thousands of densely interconnected reactions. Furthermore, many errors in GSMMs only become apparent when considering pathways of connected reactions collectively, as opposed to examining reactions individually.
Results: We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a collection of algorithms for detecting errors in GSMMs. The relative frequencies of errors we detect in manually curated GSMMs appear to reflect the different approaches used to curate them. Changing the method used to automatically create a GSMM from a particular organism's genome can have a larger impact on the kinds of errors in the resulting GSMM than using the same method with a different organism's genome. Our algorithms are particularly capable of identifying errors that are only apparent at the pathway level, including loops and nontrivial cases of dead ends.
Conclusions: MACAW is capable of identifying inaccuracies of varying severity in a wide range of GSMMs. Correcting these errors can measurably improve the predictive capacity of a GSMM. The relative prevalence of each type of error we identify in a large collection of GSMMs could help shape future efforts for further automation of error correction and GSMM creation.

3. Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023]
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Snorre Sulheim
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
- Eivind Almaas
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Daniel Segrè
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biology and Department of Physics, Boston University, Boston, MA, USA

4. Hadadi N, Pandey V, Chiappino-Pepe A, Morales M, Gallart-Ayala H, Mehl F, Ivanisevic J, Sentchilo V, van der Meer JR. Mechanistic insights into bacterial metabolic reprogramming from omics-integrated genome-scale models. NPJ Syst Biol Appl 2020; 6:1. [PMID: 32001719 PMCID: PMC6946695 DOI: 10.1038/s41540-019-0121-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/08/2019] [Accepted: 11/28/2019] [Indexed: 11/18/2022]
Abstract
Understanding the adaptive responses of individual bacterial strains is crucial for microbiome engineering approaches that introduce new functionalities into complex microbiomes, such as xenobiotic compound metabolism for soil bioremediation. Adaptation requires metabolic reprogramming of the cell, which can be captured by multi-omics, but this data remains formidably challenging to interpret and predict. Here we present a new approach that combines genome-scale metabolic modeling with transcriptomics and exometabolomics, both of which are common tools for studying dynamic population behavior. As a realistic demonstration, we developed a genome-scale model of Pseudomonas veronii 1YdBTEX2, a candidate bioaugmentation agent for accelerated metabolism of mono-aromatic compounds in soil microbiomes, while simultaneously collecting experimental data of P. veronii metabolism during growth phase transitions. Predictions of the P. veronii growth rates and specific metabolic processes from the integrated model closely matched experimental observations. We conclude that integrative and network-based analysis can help build predictive models that accurately capture bacterial adaptation responses. Further development and testing of such models may considerably improve the successful establishment of bacterial inoculants in more complex systems.
Affiliation(s)
- Noushin Hadadi
  - Department of Fundamental Microbiology, University of Lausanne, 1015, Lausanne, Switzerland
- Vikash Pandey
  - Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Anush Chiappino-Pepe
  - Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Marian Morales
  - Department of Fundamental Microbiology, University of Lausanne, 1015, Lausanne, Switzerland
- Florence Mehl
  - Metabolomics Platform, University of Lausanne, 1015, Lausanne, Switzerland
- Vladimir Sentchilo
  - Department of Fundamental Microbiology, University of Lausanne, 1015, Lausanne, Switzerland
- Jan R van der Meer
  - Department of Fundamental Microbiology, University of Lausanne, 1015, Lausanne, Switzerland

5. Griesemer M, Kimbrel JA, Zhou CE, Navid A, D'haeseleer P. Combining multiple functional annotation tools increases coverage of metabolic annotation. BMC Genomics 2018; 19:948. [PMID: 30567498 PMCID: PMC6299973 DOI: 10.1186/s12864-018-5221-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 08/20/2018] [Accepted: 11/05/2018] [Indexed: 12/15/2022]
Abstract
Background: Genome-scale metabolic modeling is a cornerstone of systems biology analysis of microbial organisms and communities, yet these genome-scale modeling efforts are invariably based on incomplete functional annotations. Annotated genomes typically contain 30–50% of genes without functional annotation, severely limiting our knowledge of the “parts lists” that the organisms have at their disposal. These incomplete annotations may be sufficient to derive a model of a core set of well-studied metabolic pathways that support growth in pure culture. However, pathways important for growth on unusual metabolites exchanged in complex microbial communities are often less understood, resulting in missing functional annotations in newly sequenced genomes.
Results: Here, we present results on a comprehensive reannotation of 27 bacterial reference genomes, focusing on enzymes with EC numbers annotated by KEGG, RAST, EFICAz, and the BRENDA enzyme database, and on membrane transport annotations by TransportDB, KEGG and RAST. Our analysis shows that annotation using multiple tools can result in a drastically larger metabolic network reconstruction, adding on average 40% more EC numbers, 3–8 times more substrate-specific transporters, and 37% more metabolic genes. These results are even more pronounced for bacterial species that are phylogenetically distant from well-studied model organisms such as E. coli.
Conclusions: Metabolic annotations are often incomplete and inconsistent. Combining multiple functional annotation tools can greatly improve genome coverage and metabolic network size, especially for non-model organisms and non-core pathways.
Affiliation(s)
- Marc Griesemer
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Jeffrey A Kimbrel
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Carol E Zhou
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Ali Navid
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Patrik D'haeseleer
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
  - Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA

6. Klobucar K, Brown ED. Use of genetic and chemical synthetic lethality as probes of complexity in bacterial cell systems. FEMS Microbiol Rev 2018; 42:4563584. [PMID: 29069427 DOI: 10.1093/femsre/fux054] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/31/2017] [Accepted: 10/23/2017] [Indexed: 12/22/2022]
Abstract
Different conditions and genomic contexts are known to have an impact on gene essentiality and interactions. Synthetic lethal interactions occur when a combination of perturbations, either genetic or chemical, results in a more profound fitness defect than expected based on the effect of each perturbation alone. Synthetic lethality in bacterial systems has long been studied; however, during the past decade, the emerging fields of genomics and chemical genomics have led to an increase in the scale and throughput of these studies. Here, we review the concepts of genomics and chemical genomics in the context of synthetic lethality and their revolutionary roles in uncovering novel biology, such as the characterization of genes of unknown function, and in antibacterial drug discovery. We provide an overview of the methodologies, examples and challenges of both genetic and chemical synthetic lethal screening platforms. Finally, we discuss how to apply genetic and chemical synthetic lethal approaches to rationalize the synergies of drugs, screen for new and improved antibacterial therapies and predict drug mechanism of action.
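The phrase "more profound fitness defect than expected" can be made concrete with a small sketch. The abstract does not say which null model defines the expectation; the example below assumes the commonly used multiplicative model, in which the expected fitness of the double perturbation is the product of the two single-perturbation fitness values. The function names, the fitness values, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Observed minus expected double-perturbation fitness.

    Assumes a multiplicative null model (expected = w_a * w_b);
    fitness values are relative to an unperturbed strain (1.0).
    """
    expected = w_a * w_b
    return w_ab - expected


def is_synthetic_sick_or_lethal(w_a: float, w_b: float, w_ab: float,
                                threshold: float = -0.2) -> bool:
    """Flag a pair whose score falls below a hypothetical cutoff."""
    return interaction_score(w_a, w_b, w_ab) < threshold


# Hypothetical example: each single perturbation is mild,
# but the combination nearly abolishes growth.
score = interaction_score(0.9, 0.8, 0.1)
flagged = is_synthetic_sick_or_lethal(0.9, 0.8, 0.1)
```

A strongly negative score indicates that the combination is far worse than the two individual effects would predict, which is the signature the abstract describes. Other null models (for example, additive) would give different scores for the same data.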
Affiliation(s)
- Kristina Klobucar
  - Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, 1280 Main St West, Hamilton, ON L8N 3Z5, Canada
- Eric D Brown
  - Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, 1280 Main St West, Hamilton, ON L8N 3Z5, Canada

7. Latendresse M, Karp PD. Evaluation of reaction gap-filling accuracy by randomization. BMC Bioinformatics 2018; 19:53. [PMID: 29444634 PMCID: PMC5813426 DOI: 10.1186/s12859-018-2050-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/29/2017] [Accepted: 01/31/2018] [Indexed: 12/18/2022]
Abstract
Background: Completion of genome-scale flux-balance models using computational reaction gap-filling is a widely used approach, but its accuracy is not well known.
Results: We report on computational experiments of reaction gap filling in which we generated degraded versions of the EcoCyc-20.0-GEM model by randomly removing flux-carrying reactions from a growing model. We gap-filled the degraded models and compared the resulting gap-filled models with the original model. Gap-filling was performed by the Pathway Tools MetaFlux software using its General Development Mode (GenDev) and its Fast Development Mode (FastDev). We explored 12 GenDev variants including two linear solvers (SCIP and CPLEX) for solving the Mixed Integer Linear Programming (MILP) problems for gap filling; three different sets of linear constraints were applied; and two MILP methods were implemented. We compared these 13 variants according to accuracy, speed, and amount of information returned to the user.
Conclusions: We observed large variation among the performance of the 13 gap-filling variants. Although no variant was best in all dimensions, we found one variant that was fast, accurate, and returned more information to the user. Some gap-filling variants were inaccurate, producing solutions that were non-minimum or invalid (did not enable model growth). The best GenDev variant showed a best average precision of 87% and a best average recall of 61%. FastDev showed an average precision of 71% and an average recall of 59%. Thus, using the most accurate variant, approximately 13% of the gap-filled reactions were incorrect (were not the reactions removed from the model), and 39% of gap-filled reactions were not found, suggesting that curation is still an important aspect of metabolic-model development.
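The precision and recall figures in this abstract follow from comparing the set of reactions removed from the model with the set of reactions the gap-filler added back. The sketch below illustrates that arithmetic with plain Python sets. It is not the MetaFlux evaluation code: the function name and the reaction identifiers are hypothetical, and the example data are invented for illustration.

```python
def gap_fill_scores(removed, added):
    """Precision and recall of a gap-filling run.

    removed: reactions deleted from the original model (the ground truth)
    added:   reactions proposed by the gap-filler
    """
    removed, added = set(removed), set(added)
    recovered = removed & added  # correctly restored reactions
    precision = len(recovered) / len(added) if added else 0.0
    recall = len(recovered) / len(removed) if removed else 0.0
    return precision, recall


# Hypothetical run: five reactions removed, four proposed, three of them correct.
removed = {"R1", "R2", "R3", "R4", "R5"}
added = {"R1", "R2", "R3", "R9"}
precision, recall = gap_fill_scores(removed, added)

# 1 - precision: share of proposed reactions that were not among those removed
# 1 - recall:    share of removed reactions that were never proposed
incorrect_fraction = 1 - precision
missed_fraction = 1 - recall
```

Read this way, a precision of 87% corresponds to roughly 13% of proposed reactions being incorrect, and a recall of 61% corresponds to roughly 39% of the removed reactions going unrecovered.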
Affiliation(s)
- Mario Latendresse
  - SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA
- Peter D Karp
  - SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA