1
Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. MACAW: a method for semi-automatic detection of errors in genome-scale metabolic models. Genome Biol 2025; 26:79. [PMID: 40156030 PMCID: PMC11954327 DOI: 10.1186/s13059-025-03533-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 03/07/2025] [Indexed: 04/01/2025]
Abstract
Genome-scale metabolic models (GSMMs) are used to predict metabolic fluxes, with applications ranging from identifying novel drug targets to engineering microbial metabolism. Erroneous or missing reactions, scattered throughout densely interconnected networks, are a limiting factor in these applications. We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a suite of algorithms that helps to identify and visualize errors at the level of connected pathways, rather than individual reactions. We show how MACAW highlights inaccuracies of varying severity in manually curated and automatically generated GSMMs for humans, yeast, and bacteria and helps to identify systematic issues to be addressed in future model construction efforts.
Affiliation(s)
- Devlin C Moyer
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
- Justin Reimertz
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
- Daniel Segrè
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
  - Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
  - Department of Physics, Boston University, Boston, MA, 02215, USA
  - Bioinformatics Program, Faculty of Computing and Data Science, Boston, MA, 02215, USA
- Juan I Fuxman Bass
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
2
Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. Semi-Automatic Detection of Errors in Genome-Scale Metabolic Models. bioRxiv: the preprint server for biology 2024:2024.06.24.600481. [PMID: 38979177 PMCID: PMC11230171 DOI: 10.1101/2024.06.24.600481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: Genome-Scale Metabolic Models (GSMMs) are used for numerous tasks requiring computational estimates of metabolic fluxes, from predicting novel drug targets to engineering microbes to produce valuable compounds. A key limiting step in most applications of GSMMs is ensuring their representation of the target organism's metabolism is complete and accurate. Identifying and visualizing errors in GSMMs is complicated by the fact that they contain thousands of densely interconnected reactions. Furthermore, many errors in GSMMs only become apparent when considering pathways of connected reactions collectively, as opposed to examining reactions individually.

Results: We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a collection of algorithms for detecting errors in GSMMs. The relative frequencies of errors we detect in manually curated GSMMs appear to reflect the different approaches used to curate them. Changing the method used to automatically create a GSMM from a particular organism's genome can have a larger impact on the kinds of errors in the resulting GSMM than using the same method with a different organism's genome. Our algorithms are particularly capable of identifying errors that are only apparent at the pathway level, including loops and nontrivial cases of dead ends.

Conclusions: MACAW is capable of identifying inaccuracies of varying severity in a wide range of GSMMs. Correcting these errors can measurably improve the predictive capacity of a GSMM. The relative prevalence of each type of error we identify in a large collection of GSMMs could help shape future efforts for further automation of error correction and GSMM creation.
3
Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023]
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Snorre Sulheim
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
- Eivind Almaas
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Daniel Segrè
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biology and Department of Physics, Boston University, Boston, MA, USA