1
|
Hasibi R, Michoel T, Oyarzún DA. Integration of graph neural networks and genome-scale metabolic models for predicting gene essentiality. NPJ Syst Biol Appl 2024; 10:24. [PMID: 38448436 PMCID: PMC10917767 DOI: 10.1038/s41540-024-00348-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 02/08/2024] [Indexed: 03/08/2024] Open
Abstract
Genome-scale metabolic models are powerful tools for understanding cellular physiology. Flux balance analysis (FBA), in particular, is an optimization-based approach widely employed for predicting metabolic phenotypes. In model microbes such as Escherichia coli, FBA has been successful at predicting essential genes, i.e. those genes that impair survival when deleted. A central assumption in this approach is that both wild type and deletion strains optimize the same fitness objective. Although the optimality assumption may hold for the wild type metabolic network, deletion strains are not subject to the same evolutionary pressures and knock-out mutants may steer their metabolism to meet other objectives for survival. Here, we present FlowGAT, a hybrid FBA-machine learning strategy for predicting essentiality directly from wild type metabolic phenotypes. The approach is based on graph-structured representation of metabolic fluxes predicted by FBA, where nodes correspond to enzymatic reactions and edges quantify the propagation of metabolite mass flow between a reaction and its neighbours. We integrate this information into a graph neural network that can be trained on knock-out fitness assay data. Comparisons across different model architectures reveal that FlowGAT predictions for E. coli are close to those of FBA for several growth conditions. This suggests that essentiality of enzymatic genes can be predicted by exploiting the inherent network structure of metabolism. Our approach demonstrates the benefits of combining the mechanistic insights afforded by genome-scale models with the ability of deep learning to infer patterns from complex datasets.
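The graph representation described in this abstract (reactions as nodes, edges weighted by metabolite mass flow) can be illustrated with a minimal sketch. It assumes a steady-state flux vector is already available, for example from FBA, and that each metabolite is shared among its consumers in proportion to their consumption rates. The function name `mass_flow_graph`, the toy network, and the flux values are hypothetical and are not taken from the paper; the paper's exact construction may differ.

```python
from collections import defaultdict


def mass_flow_graph(stoich, fluxes):
    """Build a directed reaction-to-reaction graph weighted by mass flow.

    stoich: dict reaction -> {metabolite: coefficient}; negative = consumed.
    fluxes: dict reaction -> flux value (assumed to be a steady-state solution).
    Returns a dict mapping (producer, consumer) -> flow weight.
    """
    produced = defaultdict(dict)  # metabolite -> {reaction: production rate}
    consumed = defaultdict(dict)  # metabolite -> {reaction: consumption rate}
    for rxn, coeffs in stoich.items():
        v = fluxes.get(rxn, 0.0)
        for met, coeff in coeffs.items():
            rate = coeff * v  # net rate at which rxn produces met
            if rate > 0:
                produced[met][rxn] = rate
            elif rate < 0:
                consumed[met][rxn] = -rate

    edges = defaultdict(float)
    for met, producers in produced.items():
        total = sum(producers.values())
        for src, p in producers.items():
            for dst, q in consumed.get(met, {}).items():
                # Share of met made by src that is taken up by dst,
                # assuming proportional mixing of the metabolite pool.
                edges[(src, dst)] += p * q / total
    return dict(edges)


# Hypothetical toy network: R1 imports A, R2 and R3 split A into B and C,
# R4 and R5 export B and C. Flux values are illustrative only.
toy_stoich = {
    "R1": {"A": 1},
    "R2": {"A": -1, "B": 1},
    "R3": {"A": -1, "C": 1},
    "R4": {"B": -1},
    "R5": {"C": -1},
}
toy_fluxes = {"R1": 10.0, "R2": 6.0, "R3": 4.0, "R4": 6.0, "R5": 4.0}

edges = mass_flow_graph(toy_stoich, toy_fluxes)
for (src, dst), weight in edges.items():
    print(f"{src} -> {dst}: {weight}")
```

In a graph like this, node features and essentiality labels would be attached per reaction before training a graph neural network; that step is omitted here.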
Affiliation(s)
- Ramin Hasibi
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Tom Michoel
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Diego A Oyarzún
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK.
  - School of Informatics, University of Edinburgh, Edinburgh, UK.
|
2
|
Freischem LJ, Oyarzún DA. A Machine Learning Approach for Predicting Essentiality of Metabolic Genes. Methods Mol Biol 2024; 2760:345-369. [PMID: 38468098 DOI: 10.1007/978-1-0716-3658-9_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
The identification of essential genes is a key challenge in systems and synthetic biology, particularly for engineering metabolic pathways that convert feedstocks into valuable products. Assessment of gene essentiality at a genome scale requires large and costly growth assays of knockout strains. Here we describe a strategy to predict the essentiality of metabolic genes using binary classification algorithms. The approach combines elements from genome-scale metabolic models, directed graphs, and machine learning into a predictive model that can be trained on small knockout data. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli and various machine learning algorithms for binary classification.
Affiliation(s)
- Diego A Oyarzún
  - School of Informatics, University of Edinburgh, Edinburgh, UK.
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK.
|
3
|
Interdisciplinary Overview of Lipopeptide and Protein-Containing Biosurfactants. Genes (Basel) 2022; 14:genes14010076. [PMID: 36672817 PMCID: PMC9859011 DOI: 10.3390/genes14010076] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/31/2022] [Revised: 12/05/2022] [Accepted: 12/20/2022] [Indexed: 12/28/2022] Open
Abstract
Biosurfactants are amphipathic molecules capable of lowering interfacial and superficial tensions. Produced by living organisms, these compounds act the same as chemical surfactants but with a series of improvements, the most notable being biodegradability. Biosurfactants have a wide diversity of categories. Within these, lipopeptides are some of the more abundant and widely known. Protein-containing biosurfactants are much less studied and could be an interesting and valuable alternative. The harsh temperature, pH, and salinity conditions that target organisms can sustain need to be understood for better implementation. Here, we will explore biotechnological applications via lipopeptide and protein-containing biosurfactants. Also, we discuss their natural role and the organisms that produce them, taking a glimpse into the possibilities of research via meta-omics and machine learning.
|
4
|
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
|
5
|
Coggan JS, Keller D, Markram H, Schürmann F, Magistretti PJ. Representing Stimulus Information in an Energy Metabolism Pathway. J Theor Biol 2022; 540:111090. [PMID: 35271865 DOI: 10.1016/j.jtbi.2022.111090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2021] [Revised: 02/21/2022] [Accepted: 03/01/2022] [Indexed: 10/18/2022]
Abstract
We explored a computational model of astrocytic energy metabolism and demonstrated the theoretical plausibility that this type of pathway might be capable of coding information about stimuli in addition to its known functions in cellular energy and carbon budgets. Simulation results indicate that glycogenolytic glycolysis triggered by activation of adrenergic receptors can capture the intensity and duration features of a neuromodulator waveform and can respond in a dose-dependent manner, including non-linear state changes that are analogous to action potentials. We show how this metabolic pathway can translate information about external stimuli to production profiles of energy-carrying molecules such as lactate with a precision beyond simple signal transduction or non-linear amplification. The results suggest the operation of a metabolic state-machine from the spatially discontiguous yet interdependent metabolite elements. Such metabolic pathways might be well-positioned to code an additional level of salient information about a cell's environmental demands to impact its function. Our hypothesis has implications for the computational power and energy efficiency of the brain.
Affiliation(s)
- Jay S Coggan
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, CH-1202, Switzerland
- Daniel Keller
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, CH-1202, Switzerland
- Henry Markram
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, CH-1202, Switzerland
- Felix Schürmann
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, CH-1202, Switzerland
- Pierre J Magistretti
  - Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
|
6
|
Verma BK, Mannan AA, Zhang F, Oyarzún DA. Trade-Offs in Biosensor Optimization for Dynamic Pathway Engineering. ACS Synth Biol 2022; 11:228-240. [PMID: 34968029 DOI: 10.1021/acssynbio.1c00391] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Abstract
Recent progress in synthetic biology allows the construction of dynamic control circuits for metabolic engineering. This technology promises to overcome many challenges encountered in traditional pathway engineering, thanks to its ability to self-regulate gene expression in response to bioreactor perturbations. The central components in these control circuits are metabolite biosensors that read out pathway signals and actuate enzyme expression. However, the construction of metabolite biosensors is a major bottleneck for strain design, and a key challenge is to understand the relation between biosensor dose-response curves and pathway performance. Here we employ multiobjective optimization to quantify performance trade-offs that arise in the design of metabolite biosensors. Our approach reveals strategies for tuning dose-response curves along an optimal trade-off between production flux and the cost of an increased expression burden on the host. We explore properties of control architectures built in the literature and identify their advantages and caveats in terms of performance and robustness to growth conditions and leaky promoters. We demonstrate the optimality of a control circuit for glucaric acid production in Escherichia coli, which has been shown to increase the titer by 2.5-fold as compared to static designs. Our results lay the groundwork for the automated design of control circuits for pathway engineering, with applications in the food, energy, and pharmaceutical sectors.
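The trade-off between production flux and expression burden mentioned in this abstract can be illustrated with a small Pareto-front filter. This is only a brute-force sketch over a handful of hypothetical candidate designs, not the multiobjective optimization used in the paper; the function name `pareto_front`, the design labels, and all numbers are made up for illustration.

```python
def pareto_front(designs):
    """Return the names of non-dominated designs.

    designs: list of (name, flux, burden) tuples, where higher flux is
    better and lower burden is better. A design is dominated if another
    design is at least as good in both objectives and strictly better
    in at least one.
    """
    front = []
    for name, flux, burden in designs:
        dominated = any(
            (f >= flux and b <= burden) and (f > flux or b < burden)
            for _, f, b in designs
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical biosensor designs: (label, production flux, expression burden).
candidates = [
    ("A", 1.0, 0.2),
    ("B", 2.0, 0.5),
    ("C", 1.5, 0.6),  # dominated by B: lower flux and higher burden
    ("D", 2.5, 0.9),
    ("E", 0.8, 0.3),  # dominated by A: lower flux and higher burden
]

front = pareto_front(candidates)
print(front)
```

Designs on the front cannot increase flux without paying more burden, which is the kind of trade-off curve along which a dose-response curve would be tuned.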
Affiliation(s)
- Babita K. Verma
  - School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3BF, U.K.
- Ahmad A. Mannan
  - Warwick Integrative Synthetic Biology Centre, School of Engineering, University of Warwick, Coventry CV4 7AL, U.K.
- Fuzhong Zhang
  - Department of Energy, Environmental & Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Diego A. Oyarzún
  - School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3BF, U.K.
  - School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, U.K.
  - The Alan Turing Institute, London, NW1 2DB, U.K.
|
7
|
Vijayakumar S, Angione C. Protocol for hybrid flux balance, statistical, and machine learning analysis of multi-omic data from the cyanobacterium Synechococcus sp. PCC 7002. STAR Protoc 2021; 2:100837. [PMID: 34632416 PMCID: PMC8488602 DOI: 10.1016/j.xpro.2021.100837] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022] Open
Abstract
Combining a computational framework for flux balance analysis with machine learning improves the accuracy of predicting metabolic activity across conditions, while enabling mechanistic interpretation. This protocol presents a guide to condition-specific metabolic modeling that integrates regularized flux balance analysis with machine learning approaches to extract key features from transcriptomic and fluxomic data. We demonstrate the protocol as applied to Synechococcus sp. PCC 7002; we also outline how it can be adapted to any species or community with available multi-omic data. For complete details on the use and execution of this protocol, please refer to Vijayakumar et al. (2020).
Affiliation(s)
- Supreeta Vijayakumar
  - School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
- Claudio Angione
  - School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough TS1 3BX, UK
  - Healthcare Innovation Centre, Teesside University, Middlesbrough TS1 3BX, UK
|
8
|
Toward modeling metabolic state from single-cell transcriptomics. Mol Metab 2021; 57:101396. [PMID: 34785394 PMCID: PMC8829761 DOI: 10.1016/j.molmet.2021.101396] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/05/2021] [Revised: 10/21/2021] [Accepted: 11/09/2021] [Indexed: 12/31/2022] Open
Abstract
Background: Single-cell metabolic studies bring new insights into cellular function, which can often not be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements on single-cell level are limited by insufficient scalability and sensitivity, as well as resource intensiveness, and are currently not possible in parallel with measuring transcript state, commonly used to identify cell types. Nevertheless, because omics layers are strongly intertwined, it is possible to make metabolic predictions based on measured data of more easily measurable omics layers together with prior metabolic network knowledge.
Scope of Review: We summarize the current state of single-cell metabolic measurement and modeling approaches, motivating the use of computational techniques. We review three main classes of computational methods used for prediction of single-cell metabolism: pathway-level analysis, constraint-based modeling, and kinetic modeling. We describe the unique challenges arising when transitioning from bulk to single-cell modeling. Finally, we propose potential model extensions and computational methods that could be leveraged to achieve these goals.
Major Conclusions: Single-cell metabolic modeling is a rising field that provides a new perspective for understanding cellular functions. The presented modeling approaches vary in terms of input requirements and assumptions, scalability, modeled metabolic layers, and newly gained insights. We believe that the use of prior metabolic knowledge will lead to more robust predictions and will pave the way for mechanistic and interpretable machine-learning models.
Single-cell RNA sequencing and prior metabolic knowledge enable metabolic predictions. When compared to bulk, single-cell modeling is linked to unique insights and challenges. Computational modelling approaches differ in applicability and newly provided insights. The use of prior metabolic knowledge paves the way for mechanistic machine-learning.
|
9
|
Ibrahim M, Raajaraam L, Raman K. Modelling microbial communities: Harnessing consortia for biotechnological applications. Comput Struct Biotechnol J 2021; 19:3892-3907. [PMID: 34584635 PMCID: PMC8441623 DOI: 10.1016/j.csbj.2021.06.048] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/02/2021] [Revised: 06/29/2021] [Accepted: 06/29/2021] [Indexed: 02/06/2023] Open
Abstract
Microbes propagate and thrive in complex communities, and there are many benefits to studying and engineering microbial communities instead of single strains. Microbial communities are being increasingly leveraged in biotechnological applications, as they present significant advantages such as the division of labour and improved substrate utilisation. Nevertheless, they also present some interesting challenges to surmount for the design of efficient biotechnological processes. In this review, we discuss key principles of microbial interactions, followed by a deep dive into genome-scale metabolic models, focussing on a vast repertoire of constraint-based modelling methods that enable us to characterise and understand the metabolic capabilities of microbial communities. Complementary approaches to model microbial communities, such as those based on graph theory, are also briefly discussed. Taken together, these methods provide rich insights into the interactions between microbes and how they influence microbial community productivity. We finally overview approaches that allow us to generate and test numerous synthetic community compositions, followed by tools and methodologies that can predict effective genetic interventions to further improve the productivity of communities. With impending advancements in high-throughput omics of microbial communities, the stage is set for the rapid expansion of microbial community engineering, with a significant impact on biotechnological processes.
Affiliation(s)
- Maziya Ibrahim
  - Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
  - Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Lavanya Raajaraam
  - Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
  - Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Karthik Raman
  - Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
  - Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
|