1
Kundu S. A mathematically rigorous algorithm to define, compute and assess relevance of the probable dissociation constants in characterizing a biochemical network. Sci Rep 2024; 14:3507. [PMID: 38347039 PMCID: PMC10861591 DOI: 10.1038/s41598-024-53231-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 01/30/2024] [Indexed: 02/15/2024]
Abstract
Metabolism results from enzymatic and non-enzymatic interactions of several molecules, is easily parameterized with the dissociation constant and occurs via biochemical networks. The dissociation constant is an empirically determined parameter and cannot be used directly to investigate in silico models of biochemical networks. Here, we develop and present an algorithm to define, compute and assess the relevance of the probable dissociation constant for every reaction of a biochemical network. The reactants and reactions of this network are modelled by a stoichiometry number matrix. The algorithm computes the null space and then serially generates subspaces by combinatorially summing the spanning vectors that are non-trivial and unique. This is done until the terms of each row either monotonically diverge or form an alternating sequence whose terms can be partitioned into subsets with almost the same number of oppositely signed terms. For a selected null space-generated subspace the algorithm utilizes several statistical and mathematical descriptors to select and bin terms from each row into distinct outcome-specific subsets. The terms of each subset are summed, mapped to the real-valued open interval (0, ∞) and used to populate a reaction-specific outcome vector. The p1-norm for this vector is then the probable dissociation constant for this reaction. These steps are continued until every reaction of a modelled network is unambiguously annotated. The assertions presented are complemented by computational studies of a biochemical network for aerobic glycolysis. The fundamental premise of this work is that every row of a null space-generated subspace is a valid reaction and can therefore be modelled as a reaction-specific sequence vector with a dimension that corresponds to the cardinality of the subspace after excluding all trivial and redundant vectors. A major finding of this study is that the row-wise sum, or the sum of the terms contained in each reaction-specific sequence vector, is mapped unambiguously to a positive real number. This means that the probable dissociation constants, for all reactions, can be directly computed from the stoichiometry number matrix and are suitable indicators of outcome for every reaction of the modelled biochemical network. Additionally, we find that the unambiguous annotation for a biochemical network will require a minimum number of iterations and will determine computational complexity.
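As a rough illustration of the first steps named in this abstract (computing the null space of the stoichiometry number matrix and summing its spanning vectors), the sketch below works on a hypothetical four-reaction toy network. The matrix orientation (metabolites as rows, reactions as columns), the function name `null_space` and the toy network itself are assumptions made here for illustration. This is not the author's implementation, and it stops well short of computing a probable dissociation constant.

```python
from fractions import Fraction
from itertools import combinations


def null_space(matrix):
    """Return a basis of the null space of `matrix` (a list of rows),
    using exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    # Each free column yields one spanning vector of the null space.
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][fc]
        basis.append(v)
    return basis


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->, R4: A ->
S = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]

basis = null_space(S)
# Pairwise sums of the spanning vectors; each sum is again in the null space.
sums = [[a + b for a, b in zip(u, v)] for u, v in combinations(basis, 2)]

for v in basis + sums:
    print([str(x) for x in v])
```

Exact rational arithmetic is used so that the check `S v = 0` holds without floating-point tolerance; a numerical library would be the more usual choice for networks of realistic size.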
Affiliation(s)
- Siddhartha Kundu
  - Department of Biochemistry, All India Institute of Medical Sciences, Ansari Nagar, New Delhi, 110029, India
2
Saa PA, Zapararte S, Drovandi CC, Nielsen LK. LooplessFluxSampler: an efficient toolbox for sampling the loopless flux solution space of metabolic models. BMC Bioinformatics 2024; 25:3. [PMID: 38166586 PMCID: PMC10763395 DOI: 10.1186/s12859-023-05616-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 12/13/2023] [Indexed: 01/04/2024]
Abstract
BACKGROUND: Uniform random sampling of mass-balanced flux solutions offers an unbiased appraisal of the capabilities of metabolic networks. Unfortunately, it is impossible to avoid thermodynamically infeasible loops in flux samples when using convex samplers on large metabolic models. Current strategies for randomly sampling the non-convex loopless flux space display limited efficiency and lack theoretical guarantees. RESULTS: Here, we present LooplessFluxSampler, an efficient algorithm for exploring the loopless mass-balanced flux solution space of metabolic models, based on an Adaptive Directions Sampling on a Box (ADSB) algorithm. ADSB is rooted in the general Adaptive Direction Sampling (ADS) framework, specifically the Parallel ADS, for which theoretical convergence and irreducibility results are available for sampling from arbitrary distributions. By sampling directions that adapt to the target distribution, ADSB traverses the sample space more efficiently, achieving faster mixing than other methods. Importantly, the presented algorithm is guaranteed to target the uniform distribution over convex regions, and it provably converges on the latter distribution over more general (non-convex) regions provided the sample can have full support. CONCLUSIONS: LooplessFluxSampler enables scalable statistical inference of the loopless mass-balanced solution space of large metabolic models. Grounded in a theoretically sound framework, this toolbox provides not only efficient but also reliable results for exploring the properties of the almost surely non-convex loopless flux space. Finally, LooplessFluxSampler includes a Markov Chain diagnostics suite for assessing the quality of the final sample and the performance of the algorithm.
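To make the idea of direction-based sampling on a box concrete, here is a minimal sketch of plain hit-and-run sampling inside an axis-aligned box. It is deliberately a simpler, swapped-in technique: it is not ADSB, it does not adapt its directions, and it ignores both the mass-balance and the loopless constraints. The function name `hit_and_run_box` and its parameters are hypothetical.

```python
import math
import random


def hit_and_run_box(lower, upper, n_samples, seed=0):
    """Draw points from an axis-aligned box with plain hit-and-run.

    At each step a random direction is chosen, the chord of the box along
    that direction is computed, and the next point is drawn uniformly
    on that chord.
    """
    rng = random.Random(seed)
    dim = len(lower)
    x = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]  # start at the centre
    samples = []
    for _ in range(n_samples):
        # Random unit direction from normalised Gaussian components.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Range of step sizes t that keep x + t * d inside the box.
        t_min, t_max = -math.inf, math.inf
        for xi, di, lo, hi in zip(x, d, lower, upper):
            if di > 0:
                t_min = max(t_min, (lo - xi) / di)
                t_max = min(t_max, (hi - xi) / di)
            elif di < 0:
                t_min = max(t_min, (hi - xi) / di)
                t_max = min(t_max, (lo - xi) / di)
        t = rng.uniform(t_min, t_max)
        # Clamp to guard against floating-point round-off at the faces.
        x = [min(max(xi + t * di, lo), hi)
             for xi, di, lo, hi in zip(x, d, lower, upper)]
        samples.append(x)
    return samples


points = hit_and_run_box([0.0, -1.0], [2.0, 1.0], n_samples=5, seed=1)
for p in points:
    print(p)
```

A sampler for a real metabolic model would additionally have to stay on the subspace defined by mass balance and reject or avoid points containing thermodynamically infeasible loops, which is the part the abstract identifies as making the target region non-convex.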
Affiliation(s)
- Pedro A Saa
  - Department of Chemical and Bioprocess Engineering, School of Engineering, Pontifical Catholic University of Chile, Av. Vicuña Mackenna 4860, 7820436, Santiago, Chile
  - Institute for Mathematical and Computational Engineering, Pontifical Catholic University of Chile, Av. Vicuña Mackenna 4860, 7820436, Santiago, Chile
- Sebastian Zapararte
  - Department of Chemical and Bioprocess Engineering, School of Engineering, Pontifical Catholic University of Chile, Av. Vicuña Mackenna 4860, 7820436, Santiago, Chile
- Christopher C Drovandi
  - School of Mathematical Sciences and Centre for Data Science, Queensland University of Technology, 2 George Street, Brisbane, Australia
- Lars K Nielsen
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Building 75, Cnr College Rd and Cooper Rd, Brisbane, Australia
  - The Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building, Kemitorvet 220, 2800, Kongens Lyngby, Copenhagen, Denmark
3
Kundu S. ReDirection: an R-package to compute the probable dissociation constant for every reaction of a user-defined biochemical network. Front Mol Biosci 2023; 10:1206502. [PMID: 37942290 PMCID: PMC10628733 DOI: 10.3389/fmolb.2023.1206502] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/15/2023] [Accepted: 09/14/2023] [Indexed: 11/10/2023]
Abstract
Biochemical networks integrate enzyme-mediated substrate conversions with non-enzymatic complex formation and disassembly to accomplish complex biochemical and physiological functions. The choice of parameters and constraints used in most of these studies is numerically motivated and network-specific. Although sound in theory, the outcomes that result depart significantly from the intracellular milieu and are less likely to retain relevance in a clinical setting. There is a need for a computational tool which is biochemically relevant, mathematically rigorous, and unbiased, and can ascribe functionality to and generate potentially testable hypotheses for a user-defined biochemical network. Here, we present "ReDirection," an R-package which computes the probable dissociation constant for every reaction of a biochemical network directly from a null space-generated subspace of the stoichiometry number matrix of the modeled network. "ReDirection" delineates this subspace by excluding all trivial and redundant or duplicate occurrences of non-trivial vectors, combinatorially summing the vectors that remain and verifying that the upper or lower bounds of the sequence of terms formed by each row of this subspace belong to the open real-valued intervals (−∞, −1) or (1, ∞) or whether the numbers of differently signed terms are almost equal. "ReDirection" iterates these steps until these bounds are consistent and unambiguous for all reactions of the modeled biochemical network. Thereafter, "ReDirection" filters the terms from each row of this subspace, bins them to outcome-specific subsets, sums and maps this to an outcome-specific reaction vector, and computes the p1-norm, which is the probable dissociation constant for a reaction. "ReDirection" works on first principles, does not discriminate between enzymatic and non-enzymatic reactions, offers a biochemically relevant and mathematically rigorous environment to explore user-defined biochemical networks under baseline and perturbed conditions, and can be used to address empirically intractable biochemical problems. The utility and relevance of "ReDirection" are highlighted by numerical studies on stoichiometric number models of biochemical networks of galactose metabolism and heme and cholesterol biosynthesis. "ReDirection" is freely available and accessible from the comprehensive R archive network (CRAN) with the URL (https://cran.r-project.org/package=ReDirection).
Affiliation(s)
- Siddhartha Kundu
  - Department of Biochemistry, All India Institute of Medical Sciences, New Delhi, India
4
Li G, Liu L, Du W, Cao H. Local flux coordination and global gene expression regulation in metabolic modeling. Nat Commun 2023; 14:5700. [PMID: 37709734 PMCID: PMC10502109 DOI: 10.1038/s41467-023-41392-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Accepted: 09/04/2023] [Indexed: 09/16/2023]
Abstract
Genome-scale metabolic networks (GSMs) are fundamental systems biology representations of a cell's entire set of stoichiometrically balanced reactions. However, such static GSMs do not incorporate the functional organization of metabolic genes and their dynamic regulation (e.g., operons and regulons). Specifically, there are numerous topologically coupled local reactions through which fluxes are coordinated, and the global growth state often dynamically regulates the expression of many genes of metabolic reactions via global transcription factor regulators. Here, we develop a GSM reconstruction method, Decrem, by integrating locally coupled reactions and global transcriptional regulation of metabolism by cell state. Decrem produces predictions of flux and growth rates, which are highly correlated with those experimentally measured in both wild-type and mutants of three model microorganisms, Escherichia coli, Saccharomyces cerevisiae, and Bacillus subtilis, under various conditions. More importantly, Decrem can also explain the observed growth rates by capturing the experimentally measured flux changes between wild-types and mutants. Overall, by identifying and incorporating locally organized and regulated functional modules into GSMs, Decrem achieves accurate predictions of phenotypes and has broad applications in bioengineering, synthetic biology, and microbial pathology.
Affiliation(s)
- Gaoyang Li
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Li Liu
  - Division of Natural and Applied Sciences, Duke Kunshan University, Kunshan, 215316, China
- Wei Du
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Huansheng Cao
  - Division of Natural and Applied Sciences, Duke Kunshan University, Kunshan, 215316, China
5
The topology of genome-scale metabolic reconstructions unravels independent modules and high network flexibility. PLoS Comput Biol 2022; 18:e1010203. [PMID: 35759507 PMCID: PMC9269948 DOI: 10.1371/journal.pcbi.1010203] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/02/2021] [Revised: 07/08/2022] [Accepted: 05/14/2022] [Indexed: 11/30/2022]
Abstract
The topology of metabolic networks is recognisably modular with modules weakly connected apart from sharing a pool of currency metabolites. Here, we defined modules as sets of reversible reactions isolated from the rest of metabolism by irreversible reactions except for the exchange of currency metabolites. Our approach identifies topologically independent modules under specific conditions associated with different metabolic functions. As case studies, the E. coli iJO1366 and Human Recon 2.2 genome-scale metabolic models were split into 103 and 321 modules respectively, displaying significant correlation patterns in expression data. Finally, we addressed a fundamental question about the metabolic flexibility conferred by reversible reactions: “Of all Directed Topologies (DTs) defined by fixing directions to all reversible reactions, how many are capable of carrying flux through all reactions?”. Enumeration of the DTs for the iJO1366 model was performed using an efficient depth-first search algorithm, rejecting infeasible DTs based on mass-imbalanced and loopy flux patterns. We found the direction of 79% of reversible reactions must be defined before all directions in the network can be fixed, granting a high degree of flexibility.

Genome-scale metabolic reconstructions represent all biochemical reactions that an organism can accomplish. These reconstructions are complex and often difficult to study in great detail. A way to overcome this limitation is to focus on specific pathways or subsystems. We present a novel method to identify metabolic modules based on the network topology. The method relies on reaction directions and ignores currency metabolites, which artificially connect distant metabolic reactions. In this way, topologically independent modules are built, where inputs and outputs are controlled by irreversible reactions. The method is automatic and unbiased, and the result is a set of condition-specific modules with defined metabolic functions. As a proof-of-concept we generated biologically relevant modules for the E. coli and Human genome-scale metabolic reconstructions supported by transcriptomic data. Finally, we applied the novel approach to study the network flexibility conferred by reversible reactions. In the case of the E. coli model, we found that the direction of 79% of structurally reversible reactions (those not directionally constrained by surrounding irreversible reactions) must be fixed to determine all the reaction directions in the network. Therefore, reversible reactions operate practically independently of each other.
6
Abstract
Networks of reactions inside the cell are constrained by the laws of mass and energy balance. Constraint-based modelling (CBM) is the most widely used method to describe the mass balance of metabolic networks. The key concepts in CBM are stoichiometric analyses such as elementary flux mode analysis or flux balance analysis. Some of these methods have focused on adding thermodynamic constraints to eliminate non-physical fluxes or inconsistencies in the metabolic system. Here, we review the main approaches and how they tackle the different classes of problems.
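For orientation, flux balance analysis with an added thermodynamic condition is commonly written as the optimisation problem below. The symbols are standard notation assumed here rather than taken from the abstract: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective weights, $v^{\mathrm{lb}}$ and $v^{\mathrm{ub}}$ the flux bounds, and $\Delta_r G'_i$ the Gibbs energy of reaction $i$. The approaches covered by the review may formulate the thermodynamic part differently.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \\
& v_i \,\Delta_r G'_i < 0 \quad \text{for every internal reaction } i \text{ with } v_i \neq 0 .
\end{aligned}
```

The first two constraints express mass balance and capacity limits; the last states that a reaction can only carry net flux in the direction of negative Gibbs energy, which is what rules out closed loops of internal flux.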
7
Ullah E, Yosafshahi M, Hassoun S. Towards scaling elementary flux mode computation. Brief Bioinform 2019; 21:1875-1885. [PMID: 31745550 DOI: 10.1093/bib/bbz094] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/04/2019] [Revised: 07/04/2019] [Accepted: 07/05/2019] [Indexed: 01/05/2023]
Abstract
While elementary flux mode (EFM) analysis is now recognized as a cornerstone computational technique for cellular pathway analysis and engineering, EFM application to genome-scale models remains computationally prohibitive. This article provides a review of aspects of EFM computation that elucidates bottlenecks in scaling EFM computation. First, algorithms for computing EFMs are reviewed. Next, the impact of redundant constraints, sensitivity to constraint ordering and network compression are evaluated. Then, the advantages and limitations of recent parallelization and GPU-based efforts are highlighted. The article then reviews alternative pathway analysis approaches that aim to reduce the EFM solution space. Despite advances in EFM computation, our review concludes that continued scaling of EFM computation is necessary to apply EFM to genome-scale models. Further, our review concludes that pathway analysis methods that target specific pathway properties can provide powerful alternatives to EFM analysis.
Affiliation(s)
- Ehsan Ullah
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Mona Yosafshahi
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Soha Hassoun
  - Department of Computer Science, Tufts University, Medford MA 02155, USA
8
Fernandez-de-Cossio-Diaz J, Mulet R. Maximum entropy and population heterogeneity in continuous cell cultures. PLoS Comput Biol 2019; 15:e1006823. [PMID: 30811392 PMCID: PMC6411232 DOI: 10.1371/journal.pcbi.1006823] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/20/2018] [Revised: 03/11/2019] [Accepted: 01/28/2019] [Indexed: 12/20/2022]
Abstract
Continuous cultures of mammalian cells are complex systems displaying hallmark phenomena of nonlinear dynamics, such as multi-stability, hysteresis, as well as sharp transitions between different metabolic states. In this context mathematical models may suggest control strategies to steer the system towards desired states. Although even clonal populations are known to exhibit cell-to-cell variability, most of the currently studied models assume that the population is homogeneous. To overcome this limitation, we use the maximum entropy principle to model the phenotypic distribution of cells in a chemostat as a function of the dilution rate. We consider the coupling between cell metabolism and extracellular variables describing the state of the bioreactor and take into account the impact of toxic byproduct accumulation on cell viability. We present a formal solution for the stationary state of the chemostat and show how to apply it in two examples. First, a simplified model of cell metabolism where the exact solution is tractable, and then a genome-scale metabolic network of the Chinese hamster ovary (CHO) cell line. Along the way we discuss several consequences of heterogeneity, such as: qualitative changes in the dynamical landscape of the system, increasing concentrations of byproducts that vanish in the homogeneous case, and larger population sizes.
Affiliation(s)
- Jorge Fernandez-de-Cossio-Diaz
  - Group of Complex Systems and Statistical Physics, Department of Theoretical Physics, University of Havana, Physics Faculty, Cuba
  - Systems Biology Department, Center of Molecular Immunology, Havana, Cuba
- Roberto Mulet
  - Group of Complex Systems and Statistical Physics, Department of Theoretical Physics, University of Havana, Physics Faculty, Cuba
  - Group of Statistical Inference and Computational Biology, Italian Institute for Genomic Medicine, Italy