1
Hao T, Song Z, Zhang M, Zhang L, Yang J, Li J, Sun J. Reconstruction of Metabolic-Protein Interaction Integrated Network of Eriocheir sinensis and Analysis of Ecdysone Synthesis. Genes (Basel) 2024; 15:410. [PMID: 38674345 PMCID: PMC11049885 DOI: 10.3390/genes15040410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 03/24/2024] [Accepted: 03/25/2024] [Indexed: 04/28/2024] Open
Abstract
Integrated networks have become a new interest in genome-scale network research due to their ability to comprehensively reflect and analyze the molecular processes in cells. Currently, none of the integrated networks have been reported for higher organisms. Eriocheir sinensis is a typical aquatic animal that grows through ecdysis. Ecdysone has been identified to be a crucial regulator of ecdysis, but the influence factors and regulatory mechanisms of ecdysone synthesis in E. sinensis are still unclear. In this work, the genome-scale metabolic network and protein-protein interaction network of E. sinensis were integrated to reconstruct a metabolic-protein interaction integrated network (MPIN). The MPIN was used to analyze the influence factors of ecdysone synthesis through flux variation analysis. In total, 236 integrated reactions (IRs) were found to influence the ecdysone synthesis of which 16 IRs had a significant impact. These IRs constitute three ecdysone synthesis routes. It is found that there might be alternative pathways to obtain cholesterol for ecdysone synthesis in E. sinensis instead of absorbing it directly from the feeds. The MPIN reconstructed in this work is the first integrated network for higher organisms. The analysis based on the MPIN supplies important information for the mechanism analysis of ecdysone synthesis in E. sinensis.
Affiliation(s)
- Tong Hao
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Zhentao Song
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Mingzhi Zhang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Lingrui Zhang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Jiarui Yang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Jingjing Li
  - Tianjin Fisheries Research Institute, Tianjin 300211, China
- Jinsheng Sun
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
2
Han Y, Li W, Filko A, Li J, Zhang F. Genome-wide promoter responses to CRISPR perturbations of regulators reveal regulatory networks in Escherichia coli. Nat Commun 2023; 14:5757. [PMID: 37717013 PMCID: PMC10505187 DOI: 10.1038/s41467-023-41572-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 09/08/2023] [Indexed: 09/18/2023] Open
Abstract
Elucidating genome-scale regulatory networks requires a comprehensive collection of gene expression profiles, yet measuring gene expression responses for every transcription factor (TF)-gene pair in living prokaryotic cells remains challenging. Here, we develop pooled promoter responses to TF perturbation sequencing (PPTP-seq) via CRISPR interference to address this challenge. Using PPTP-seq, we systematically measure the activity of 1372 Escherichia coli promoters under single knockdown of 183 TF genes, illustrating more than 200,000 possible TF-gene responses in one experiment. We perform PPTP-seq for E. coli growing in three different media. The PPTP-seq data reveal robust steady-state promoter activities under most single TF knockdown conditions. PPTP-seq also enables identifications of, to the best of our knowledge, previously unknown TF autoregulatory responses and complex transcriptional control on one-carbon metabolism. We further find context-dependent promoter regulation by multiple TFs whose relative binding strengths determined promoter activities. Additionally, PPTP-seq reveals different promoter responses in different growth media, suggesting condition-specific gene regulation. Overall, PPTP-seq provides a powerful method to examine genome-wide transcriptional regulatory networks and can be potentially expanded to reveal gene expression responses to other genetic elements.
Affiliation(s)
- Yichao Han
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Wanji Li
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Alden Filko
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Jingyao Li
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Fuzhong Zhang
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
  - Division of Biological and Biomedical Sciences, Washington University in St. Louis, Saint Louis, Missouri, USA
  - Institute of Materials Science and Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
3
Favate JS, Skalenko KS, Chiles E, Su X, Yadavalli SS, Shah P. Linking genotypic and phenotypic changes in the E. coli Long-Term Evolution Experiment using metabolomics. bioRxiv 2023:2023.02.15.528756. [PMID: 36874203 PMCID: PMC9985142 DOI: 10.1101/2023.02.15.528756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
Changes in an organism's environment, genome, or gene expression patterns can lead to changes in its metabolism. The metabolic phenotype can be under selection and contributes to adaptation. However, the networked and convoluted nature of an organism's metabolism makes relating mutations, metabolic changes, and effects on fitness challenging. To overcome this challenge, we use the Long-Term Evolution Experiment (LTEE) with E. coli as a model to understand how mutations can eventually affect metabolism and perhaps fitness. We used mass-spectrometry to broadly survey the metabolomes of the ancestral strains and all 12 evolved lines. We combined this metabolic data with mutation and expression data to suggest how mutations that alter specific reaction pathways, such as the biosynthesis of nicotinamide adenine dinucleotide, might increase fitness in the system. Our work provides a better understanding of how mutations might affect fitness through the metabolic changes in the LTEE and thus provides a major step in developing a complete genotype-phenotype map for this experimental system.
Affiliation(s)
- John S. Favate
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Human Genetics Institute of New Jersey, Piscataway, New Jersey, USA
- Kyle S. Skalenko
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Waksman Institute, Rutgers University, Piscataway, New Jersey, USA
- Eric Chiles
  - Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Xiaoyang Su
  - Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Srujana S. Yadavalli
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Waksman Institute, Rutgers University, Piscataway, New Jersey, USA
- Premal Shah
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Human Genetics Institute of New Jersey, Piscataway, New Jersey, USA
4
Using genome-wide expression compendia to study microorganisms. Comput Struct Biotechnol J 2022; 20:4315-4324. [PMID: 36016717 PMCID: PMC9396250 DOI: 10.1016/j.csbj.2022.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/07/2022] [Accepted: 08/07/2022] [Indexed: 11/30/2022] Open
Abstract
A gene expression compendium is a heterogeneous collection of gene expression experiments assembled from data collected for diverse purposes. The widely varied experimental conditions and genetic backgrounds across samples creates a tremendous opportunity for gaining a systems level understanding of the transcriptional responses that influence phenotypes. Variety in experimental design is particularly important for studying microbes, where the transcriptional responses integrate many signals and demonstrate plasticity across strains including response to what nutrients are available and what microbes are present. Advances in high-throughput measurement technology have made it feasible to construct compendia for many microbes. In this review we discuss how these compendia are constructed and analyzed to reveal transcriptional patterns.
5
Ahn-Horst TA, Mille LS, Sun G, Morrison JH, Covert MW. An expanded whole-cell model of E. coli links cellular physiology with mechanisms of growth rate control. NPJ Syst Biol Appl 2022; 8:30. [PMID: 35986058 PMCID: PMC9391491 DOI: 10.1038/s41540-022-00242-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/28/2022] [Accepted: 07/28/2022] [Indexed: 11/09/2022] Open
Abstract
Growth and environmental responses are essential for living organisms to survive and adapt to constantly changing environments. In order to simulate new conditions and capture dynamic responses to environmental shifts in a developing whole-cell model of E. coli, we incorporated additional regulation, including dynamics of the global regulator guanosine tetraphosphate (ppGpp), along with dynamics of amino acid biosynthesis and translation. With the model, we show that under perturbed ppGpp conditions, small molecule feedback inhibition pathways, in addition to regulation of expression, play a role in ppGpp regulation of growth. We also found that simulations with dysregulated amino acid synthesis pathways provide average amino acid concentration predictions that are comparable to experimental results but on the single-cell level, concentrations unexpectedly show regular fluctuations. Additionally, during both an upshift and downshift in nutrient availability, the simulated cell responds similarly with a transient increase in the mRNA:rRNA ratio. This additional simulation functionality should support a variety of new applications and expansions of the E. coli Whole-Cell Modeling Project.
Affiliation(s)
- Travis A Ahn-Horst
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Gwanggyu Sun
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Jerry H Morrison
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Markus W Covert
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
6
Ye C, Wei X, Shi T, Sun X, Xu N, Gao C, Zou W. Genome-scale metabolic network models: from first-generation to next-generation. Appl Microbiol Biotechnol 2022; 106:4907-4920. [PMID: 35829788 DOI: 10.1007/s00253-022-12066-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/02/2021] [Revised: 06/24/2022] [Accepted: 07/02/2022] [Indexed: 11/26/2022]
Abstract
Over the last two decades, thousands of genome-scale metabolic network models (GSMMs) have been constructed. These GSMMs have been widely applied in various fields, ranging from network interaction analysis, to cell phenotype prediction. However, due to the lack of constraints, the prediction accuracy of first-generation GSMMs was limited. To overcome these limitations, the next-generation GSMMs were developed by integrating omics data, adding constrain condition, integrating different biological models, and constructing whole-cell models. Here, we review recent advances of GSMMs from the first generation to the next generation. Then, we discuss the major application of GSMMs in industrial biotechnology, such as predicting phenotypes and guiding metabolic engineering. In addition, human health applications, including understanding biological mechanisms, discovering biomarkers and drug targets, are also summarized. Finally, we address the challenges and propose new trend of GSMMs. KEY POINTS: • This mini-review updates the literature on almost all published GSMMs since 1999. • Detailed insights into the development of the first- and next-generation GSMMs. • The application of GSMMs is summarized, and the prospects of integrating machine learning are emphasized.
Affiliation(s)
- Chao Ye
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, China
- Xinyu Wei
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, China
- Tianqiong Shi
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, China
- Xiaoman Sun
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, China
- Nan Xu
  - College of Bioscience and Biotechnology, Yangzhou University, Yangzhou, 225009, China
- Cong Gao
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, 214122, China
- Wei Zou
  - College of Bioengineering, Sichuan University of Science & Engineering, Yibin, 644005, China
7
Erdem C, Mutsuddy A, Bensman EM, Dodd WB, Saint-Antoine MM, Bouhaddou M, Blake RC, Gross SM, Heiser LM, Feltus FA, Birtwistle MR. A scalable, open-source implementation of a large-scale mechanistic model for single cell proliferation and death signaling. Nat Commun 2022; 13:3555. [PMID: 35729113 PMCID: PMC9213456 DOI: 10.1038/s41467-022-31138-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Accepted: 06/07/2022] [Indexed: 02/01/2023] Open
Abstract
Mechanistic models of how single cells respond to different perturbations can help integrate disparate big data sets or predict response to varied drug combinations. However, the construction and simulation of such models have proved challenging. Here, we developed a python-based model creation and simulation pipeline that converts a few structured text files into an SBML standard and is high-performance- and cloud-computing ready. We applied this pipeline to our large-scale, mechanistic pan-cancer signaling model (named SPARCED) and demonstrate it by adding an IFNγ pathway submodel. We then investigated whether a putative crosstalk mechanism could be consistent with experimental observations from the LINCS MCF10A Data Cube that IFNγ acts as an anti-proliferative factor. The analyses suggested this observation can be explained by IFNγ-induced SOCS1 sequestering activated EGF receptors. This work forms a foundational recipe for increased mechanistic model-based data integration on a single-cell level, an important building block for clinically-predictive mechanistic models.
Affiliation(s)
- Cemal Erdem
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Arnab Mutsuddy
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Ethan M Bensman
  - Computer Science, School of Computing, Clemson University, Clemson, SC, USA
- William B Dodd
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Michael M Saint-Antoine
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
- Mehdi Bouhaddou
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
- Robert C Blake
  - Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
- Sean M Gross
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- F Alex Feltus
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
  - Biomedical Data Science and Informatics Program, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Clemson, SC, USA
- Marc R Birtwistle
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
  - Department of Bioengineering, Clemson University, Clemson, SC, USA
8
Bi X, Liu Y, Li J, Du G, Lv X, Liu L. Construction of Multiscale Genome-Scale Metabolic Models: Frameworks and Challenges. Biomolecules 2022; 12:721. [PMID: 35625648 PMCID: PMC9139095 DOI: 10.3390/biom12050721] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/23/2022] [Revised: 05/15/2022] [Accepted: 05/16/2022] [Indexed: 12/04/2022] Open
Abstract
Genome-scale metabolic models (GEMs) are effective tools for metabolic engineering and have been widely used to guide cell metabolic regulation. However, the single gene–protein-reaction data type in GEMs limits the understanding of biological complexity. As a result, multiscale models that add constraints or integrate omics data based on GEMs have been developed to more accurately predict phenotype from genotype. This review summarized the recent advances in the development of multiscale GEMs, including multiconstraint, multiomic, and whole-cell models, and outlined machine learning applications in GEM construction. This review focused on the frameworks, toolkits, and algorithms for constructing multiscale GEMs. The challenges and perspectives of multiscale GEM development are also discussed.
Affiliation(s)
- Xinyu Bi
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Yanfeng Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Jianghua Li
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Guocheng Du
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Xueqin Lv
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Long Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Correspondence: Tel.: +86-0510-8591-8312; Fax: +86-0510-8591-8309
9
Lv X, Hueso-Gil A, Bi X, Wu Y, Liu Y, Liu L, Ledesma-Amaro R. New synthetic biology tools for metabolic control. Curr Opin Biotechnol 2022; 76:102724. [PMID: 35489308 DOI: 10.1016/j.copbio.2022.102724] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/07/2021] [Revised: 02/28/2022] [Accepted: 03/20/2022] [Indexed: 11/29/2022]
Abstract
In industrial bioprocesses, microbial metabolism dictates the product yields, and therefore, our capacity to control it has an enormous potential to help us move towards a bio-based economy. The rapid development of multiomics data has accelerated our systematic understanding of complex metabolic regulatory mechanisms, which allow us to develop tools to manipulate them. In the last few years, machine learning-based metabolic modeling, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) derived synthetic biology tools, and synthetic genetic circuits have been widely used to control the metabolism of microorganisms, manipulate gene expression, and build synthetic pathways for bioproduction. This review describes the latest developments for metabolic control, and focuses on the trends and challenges of metabolic engineering strategies.
Affiliation(s)
- Xueqin Lv
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Angeles Hueso-Gil
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London SW72AZ, UK
- Xinyu Bi
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yaokang Wu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yanfeng Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Long Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Rodrigo Ledesma-Amaro
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London SW72AZ, UK
10
Niu P, Soto MJ, Yoon BJ, Dougherty ER, Alexander FJ, Blaby I, Qian X. Protocol for condition-dependent metabolite yield prediction using the TRIMER pipeline. STAR Protoc 2022; 3:101184. [PMID: 35243375 PMCID: PMC8866898 DOI: 10.1016/j.xpro.2022.101184] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022] Open
Abstract
This protocol explains the pipeline for condition-dependent metabolite yield prediction using Transcription Regulation Integrated with MEtabolic Regulation (TRIMER). TRIMER targets metabolic engineering applications via a hybrid model integrating transcription factor (TF)-gene regulatory network (TRN) with a Bayesian network (BN) inferred from transcriptomic expression data to effectively regulate metabolic reactions. For E. coli and yeast, TRIMER achieves reliable knockout phenotype and flux predictions from the deletion of one or more TFs at the genome scale. For complete details on the use and execution of this protocol, please refer to Niu et al. (2021). TRIMER is a package for transcription-regulated metabolic predictions. Global dependency modeling by Bayesian network enables condition-dependent prediction. We present the step-by-step TRIMER implementation for metabolic engineering. We demonstrate the analyses for E. coli and yeast mutants.
Affiliation(s)
- Puhua Niu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Maria J. Soto
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Byung-Jun Yoon
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, USA
- Edward R. Dougherty
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
- Francis J. Alexander
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, USA
- Ian Blaby
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Corresponding author
- Xiaoning Qian
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, USA
  - Corresponding author
11
|
Niu P, Soto MJ, Yoon BJ, Dougherty ER, Alexander FJ, Blaby I, Qian X. TRIMER: Transcription Regulation Integrated with Metabolic Regulation. iScience 2021; 24:103218. [PMID: 34761179 PMCID: PMC8567008 DOI: 10.1016/j.isci.2021.103218] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/19/2021] [Revised: 08/22/2021] [Accepted: 09/29/2021] [Indexed: 01/01/2023] Open
Abstract
There has been extensive research in predictive modeling of genome-scale metabolic reaction networks. Living systems involve complex stochastic processes arising from interactions among different biomolecules. For more accurate and robust prediction of target metabolic behavior under different conditions, not only metabolic reactions but also the genetic regulatory relationships involving transcription factors (TFs) affecting these metabolic reactions should be modeled. We have developed a modeling and simulation pipeline enabling the analysis of Transcription Regulation Integrated with Metabolic Regulation: TRIMER. TRIMER utilizes a Bayesian network (BN) inferred from transcriptomes to model the transcription factor regulatory network. TRIMER then infers the probabilities of the gene states relevant to the metabolism of interest, and predicts the metabolic fluxes and their changes that result from the deletion of one or more transcription factors at the genome scale. We demonstrate TRIMER’s applicability to both simulated and experimental data and provide performance comparison with other existing approaches. TRIMER models transcription-regulated metabolism using Bayesian network modeling; TRIMER integrates prior knowledge (regulatory interaction) with data (expression); TRIMER enables metabolic behavior prediction for general knockout strategies; TRIMER includes a simulator as an evaluation platform for similar hybrid models; TRIMER reliably predicts metabolite yields for both simulated and experimental data.
Affiliation(s)
- Puhua Niu
  - Texas A&M University, Department of Electrical and Computer Engineering, College Station, TX, 77843, USA
- Maria J. Soto
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Byung-Jun Yoon
  - Texas A&M University, Department of Electrical and Computer Engineering, College Station, TX, 77843, USA
  - Brookhaven National Laboratory, Computational Science Initiative, Upton, NY, 11973, USA
- Edward R. Dougherty
  - Texas A&M University, Department of Electrical and Computer Engineering, College Station, TX, 77843, USA
- Francis J. Alexander
  - Brookhaven National Laboratory, Computational Science Initiative, Upton, NY, 11973, USA
- Ian Blaby
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Corresponding author
- Xiaoning Qian
  - Texas A&M University, Department of Electrical and Computer Engineering, College Station, TX, 77843, USA
  - Brookhaven National Laboratory, Computational Science Initiative, Upton, NY, 11973, USA
  - Corresponding author
12
|
O'Leary JK, Sleator RD, Lucey B. Cryptosporidium spp. diagnosis and research in the 21st century. Food Waterborne Parasitol 2021; 24:e00131. [PMID: 34471706 PMCID: PMC8390533 DOI: 10.1016/j.fawpar.2021.e00131] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/18/2021] [Revised: 08/06/2021] [Accepted: 08/17/2021] [Indexed: 01/01/2023] Open
Abstract
The protozoan parasite Cryptosporidium has emerged as a leading cause of diarrhoeal illness worldwide, posing a significant threat to young children and immunocompromised patients. While endemic in the vast majority of developing countries, Cryptosporidium also has the potential to cause waterborne epidemics and large scale outbreaks in both developing and developed nations. Anthroponotic and zoonotic transmission routes are well defined, with the ingestion of faecally contaminated food and water supplies a common source of infection. Microscopy, the current diagnostic mainstay, is considered by many to be suboptimal. This has prompted a shift towards alternative diagnostic techniques in the advent of the molecular era. Molecular methods, particularly PCR, are gaining traction in a diagnostic capacity over microscopy in the diagnosis of cryptosporidiosis, given the laborious and often tedious nature of the latter. Until now, developments in the field of Cryptosporidium detection and research have been somewhat hampered by the intractable nature of this parasite. However, recent advances in the field have taken the tentative first steps towards bringing Cryptosporidium research into the 21st century. Herein, we provide a review of these advances.
Affiliation(s)
- Jennifer K. O'Leary
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Cork, Ireland
| | - Roy D. Sleator
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Cork, Ireland
| | - Brigid Lucey
- Department of Biological Sciences, Munster Technological University, Bishopstown Campus, Cork, Ireland
| |
|
13
|
Sahu A, Blätke MA, Szymański JJ, Töpfer N. Advances in flux balance analysis by integrating machine learning and mechanism-based models. Comput Struct Biotechnol J 2021; 19:4626-4640. [PMID: 34471504 PMCID: PMC8382995 DOI: 10.1016/j.csbj.2021.08.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 05/15/2021] [Revised: 08/03/2021] [Accepted: 08/03/2021] [Indexed: 02/08/2023] Open
Abstract
The availability of multi-omics data sets and genome-scale metabolic models for various organisms provides a platform for modeling and analyzing genotype-to-phenotype relationships. Flux balance analysis is the main tool for predicting flux distributions in genome-scale metabolic models, and various data-integrative approaches enable modeling context-specific network behavior. Due to its linear nature, this optimization framework is readily scalable to multi-tissue or -organ and even multi-organism models. However, both data and model size can hamper a straightforward biological interpretation of the estimated fluxes. Moreover, flux balance analysis simulates metabolism at steady state and thus, in its most basic form, does not consider kinetics or regulatory events. The integration of flux balance analysis with complementary data analysis and modeling techniques offers the potential to overcome these challenges. In particular, machine learning approaches have emerged as the tool of choice for data reduction and selection of the most important variables in big data sets. Kinetic models and formal languages can be used to simulate dynamic behavior. This review article provides an overview of integrative studies that combine flux balance analysis with machine learning approaches, kinetic models, such as physiology-based pharmacokinetic models, and formal graphical modeling languages, such as Petri nets. We discuss the mathematical aspects and biological applications of these integrated approaches and outline challenges and future perspectives.
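For orientation, the linear, steady-state optimization this abstract refers to is commonly written as the linear program sketched below. The notation is an assumption and does not come from the abstract: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (for example, selecting a biomass reaction), and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

The constraint S v = 0 encodes the steady-state assumption. Because the objective and all constraints are linear, the problem stays tractable as the network grows, which is consistent with the scalability the abstract mentions.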
Affiliation(s)
- Ankur Sahu
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
| | - Mary-Ann Blätke
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
| | - Jędrzej Jakub Szymański
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
| | - Nadine Töpfer
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
| |
|
14
|
Matthews ML, Marshall-Colón A. Multiscale plant modeling: from genome to phenome and beyond. Emerg Top Life Sci 2021; 5:231-237. [PMID: 33543231 PMCID: PMC8166335 DOI: 10.1042/etls20200276] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/25/2020] [Revised: 01/18/2021] [Accepted: 01/20/2021] [Indexed: 01/08/2023]
Abstract
Plants are complex organisms that adapt to changes in their environment using an array of regulatory mechanisms that span multiple levels of biological organization. Due to this complexity, it is difficult to predict emergent properties using conventional approaches that focus on single levels of biology such as the genome, transcriptome, or metabolome. Mathematical models of biological systems have emerged as useful tools for exploring pathways and identifying gaps in our current knowledge of biological processes. Identification of emergent properties, however, requires their vertical integration across biological scales through multiscale modeling. Multiscale models that capture and predict these emergent properties will allow us to predict how plants will respond to a changing climate and explore strategies for plant engineering. In this review, we (1) summarize the recent developments in plant multiscale modeling; (2) examine multiscale models of microbial systems that offer insight into potential future directions for the modeling of plant systems; (3) discuss computational tools and resources for developing multiscale models; and (4) examine future directions of the field.
Affiliation(s)
- Megan L Matthews
- Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Institute for Sustainability, Energy, and Environment, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - Amy Marshall-Colón
- Institute for Sustainability, Energy, and Environment, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| |
|
15
|
Mapping the Transcriptional and Fitness Landscapes of a Pathogenic E. coli Strain: The Effects of Organic Acid Stress under Aerobic and Anaerobic Conditions. Genes (Basel) 2020; 12:genes12010053. [PMID: 33396416 PMCID: PMC7824302 DOI: 10.3390/genes12010053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/06/2020] [Revised: 12/22/2020] [Accepted: 12/29/2020] [Indexed: 12/31/2022] Open
Abstract
Several methods are available to probe cellular responses to external stresses at the whole genome level. RNAseq can be used to measure changes in expression of all genes following exposure to stress, but gives no information about the contribution of these genes to an organism’s ability to survive the stress. The relative contribution of each non-essential gene in the genome to the fitness of the organism under stress can be obtained using methods that use sequencing to estimate the frequencies of members of a dense transposon library grown under different conditions, for example by transposon-directed insertion sequencing (TraDIS). These two methods thus probe different aspects of the underlying biology of the organism. We were interested to determine the extent to which the data from these two methods converge on related genes and pathways. To do this, we looked at a combination of biologically meaningful stresses. The human gut contains different organic short-chain fatty acids (SCFAs) produced by fermentation of carbon compounds, and Escherichia coli is exposed to these in its passage through the gut. Their effect is likely to depend on both the ambient pH and the level of oxygen present. We, therefore, generated RNAseq and TraDIS data on a uropathogenic E. coli strain grown at either pH 7 or pH 5.5 in the presence or absence of three SCFAs (acetic, propionic and butyric), either aerobically or anaerobically. Our analysis identifies both known and novel pathways as being likely to be important under these conditions. There is no simple correlation between gene expression and fitness, but we found a significant overlap in KEGG pathways that are predicted to be enriched following analysis of the data from the two methods, and the majority of these showed a fitness signature that would be predicted from the gene expression data, assuming expression to be adaptive. Genes which are not in the E. coli core genome were found to be particularly likely to show a positive correlation between level of expression and contribution to fitness.
|
16
|
Fang X, Lloyd CJ, Palsson BO. Reconstructing organisms in silico: genome-scale models and their emerging applications. Nat Rev Microbiol 2020; 18:731-743. [PMID: 32958892 PMCID: PMC7981288 DOI: 10.1038/s41579-020-00440-4] [Citation(s) in RCA: 108] [Impact Index Per Article: 27.0] [Accepted: 08/17/2020] [Indexed: 02/06/2023]
Abstract
Escherichia coli is considered to be the best-known microorganism given the large number of published studies detailing its genes, its genome and the biochemical functions of its molecular components. This vast literature has been systematically assembled into a reconstruction of the biochemical reaction networks that underlie E. coli's functions, a process which is now being applied to an increasing number of microorganisms. Genome-scale reconstructed networks are organized and systematized knowledge bases that have multiple uses, including conversion into computational models that interpret and predict phenotypic states and the consequences of environmental and genetic perturbations. These genome-scale models (GEMs) now enable us to develop pan-genome analyses that provide mechanistic insights, detail the selection pressures on proteome allocation and address stress phenotypes. In this Review, we first discuss the overall development of GEMs and their applications. Next, we review the evolution of the most complete GEM that has been developed to date: the E. coli GEM. Finally, we explore three emerging areas in genome-scale modelling of microbial phenotypes: collections of strain-specific models, metabolic and macromolecular expression models, and simulation of stress responses.
Affiliation(s)
- Xin Fang
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
| | - Colton J Lloyd
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
| | - Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| |
|
17
|
Macklin DN, Ahn-Horst TA, Choi H, Ruggero NA, Carrera J, Mason JC, Sun G, Agmon E, DeFelice MM, Maayan I, Lane K, Spangler RK, Gillies TE, Paull ML, Akhter S, Bray SR, Weaver DS, Keseler IM, Karp PD, Morrison JH, Covert MW. Simultaneous cross-evaluation of heterogeneous E. coli datasets via mechanistic simulation. Science 2020; 369:eaav3751. [PMID: 32703847 PMCID: PMC7990026 DOI: 10.1126/science.aav3751] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 09/11/2018] [Revised: 10/28/2019] [Accepted: 05/26/2020] [Indexed: 12/24/2022]
Abstract
The extensive heterogeneity of biological data poses challenges to analysis and interpretation. Construction of a large-scale mechanistic model of Escherichia coli enabled us to integrate and cross-evaluate a massive, heterogeneous dataset based on measurements reported by various groups over decades. We identified inconsistencies with functional consequences across the data, including that the total output of the ribosomes and RNA polymerases described by data is not sufficient for a cell to reproduce measured doubling times, that measured metabolic parameters are neither fully compatible with each other nor with overall growth, and that essential proteins are absent during the cell cycle, and the cell is robust to this absence. Finally, considering these data as a whole leads to successful predictions of new experimental outcomes, in this case protein half-lives.
Affiliation(s)
- Derek N Macklin
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Travis A Ahn-Horst
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Heejo Choi
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Nicholas A Ruggero
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
| | - Javier Carrera
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - John C Mason
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Gwanggyu Sun
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Eran Agmon
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Mialy M DeFelice
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Inbal Maayan
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Keara Lane
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Ryan K Spangler
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Taryn E Gillies
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Morgan L Paull
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Sajia Akhter
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Samuel R Bray
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Jerry H Morrison
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| | - Markus W Covert
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Allen Discovery Center at Stanford University, Stanford University, Stanford, CA 94305, USA
| |
|
18
|
Ye C, Luo Q, Guo L, Gao C, Xu N, Zhang L, Liu L, Chen X. Improving lysine production through construction of an Escherichia coli enzyme-constrained model. Biotechnol Bioeng 2020; 117:3533-3544. [PMID: 32648933 DOI: 10.1002/bit.27485] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 02/26/2020] [Revised: 05/28/2020] [Accepted: 07/09/2020] [Indexed: 12/28/2022]
Abstract
Microbial cell factories are widely used for the production of high-value chemicals. However, maximizing production titers is made difficult by the complicated regulatory mechanisms of these cell platforms. Here, kcat values were incorporated to construct an Escherichia coli enzyme-constrained model. The resulting ec_iML1515 model showed that the protein demand and protein synthesis rate were the key factors affecting lysine production. By optimizing the expression of the 20 top-demanded proteins, lysine titers reached 95.7 ± 0.7 g/L, with a 0.45 g/g glucose yield. Moreover, adjusting NH4+ and dissolved oxygen levels to regulate the synthesis rate of energy metabolism-related proteins caused lysine titers and glucose yields to increase to 193.6 ± 1.8 g/L and 0.74 g/g, respectively. The ec_iML1515 model provides insight into how enzymes required for the biosynthesis of certain products are distributed between and within metabolic pathways. This information can be used to accurately predict and rationally design lysine production.
Affiliation(s)
- Chao Ye
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Qiuling Luo
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Liang Guo
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Cong Gao
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Nan Xu
- College of Bioscience and Biotechnology, Yangzhou University, Yangzhou, China
| | - Li Zhang
- School of Marine and Bioengineering, Yancheng Institute of Technology, Yancheng, China
| | - Liming Liu
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Xiulai Chen
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China; Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| |
|
19
|
Ye C, Xu N, Gao C, Liu G, Xu J, Zhang W, Chen X, Nielsen J, Liu L. Comprehensive understanding of Saccharomyces cerevisiae phenotypes with whole-cell model WM_S288C. Biotechnol Bioeng 2020; 117:1562-1574. [PMID: 32022245 DOI: 10.1002/bit.27298] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/23/2019] [Revised: 01/28/2020] [Accepted: 02/03/2020] [Indexed: 02/01/2023]
Abstract
Biological network construction for Saccharomyces cerevisiae is a widely used approach for simulating phenotypes and designing cell factories. However, due to a complicated regulatory mechanism governing the translation of genotype to phenotype, precise prediction of phenotypes remains challenging. Here, we present WM_S288C, a computational whole-cell model that includes 15 cellular states and 26 cellular processes and which enables integrated analyses of physiological functions of Saccharomyces cerevisiae. Using WM_S288C to predict phenotypes of S. cerevisiae, the functions of 1140 essential genes were characterized and linked to phenotypes at five levels. During the cell cycle, the dynamic allocation of intracellular molecules could be tracked in real-time to simulate cell activities. Additionally, one-third of non-essential genes were identified to affect cell growth via regulating nucleotide concentrations. These results demonstrated the value of WM_S288C as a tool for understanding and investigating the phenotypes of S. cerevisiae.
Affiliation(s)
- Chao Ye
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu, China; Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu, China
| | - Nan Xu
- College of Bioscience and Biotechnology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Cong Gao
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu, China; Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu, China
| | - Gaoqiang Liu
- Hunan Provincial Key Laboratory for Forestry Biotechnology, Central South University of Forestry and Technology, Changsha, Hunan, China
| | - Jianzhong Xu
- Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China
| | - Weiguo Zhang
- Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China
| | - Xiulai Chen
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu, China; Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu, China
| | - Jens Nielsen
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Liming Liu
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu, China; Key Laboratory of Industrial Biotechnology, Jiangnan University, Ministry of Education, Wuxi, Jiangsu, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu, China
| |
|
20
|
Mao Z, Ma H. iMTBGO: An Algorithm for Integrating Metabolic Networks with Transcriptomes Based on Gene Ontology Analysis. Curr Genomics 2020; 20:252-259. [PMID: 32030085 PMCID: PMC6983954 DOI: 10.2174/1389202920666190626155130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/06/2019] [Revised: 05/14/2019] [Accepted: 06/12/2019] [Indexed: 11/22/2022] Open
Abstract
Background: Constraint-based metabolic network models have been widely used in phenotypic prediction and metabolic engineering design. In recent years, researchers have attempted to improve prediction accuracy by integrating regulatory information and multiple types of “omics” data into this constraint-based model. The transcriptome is the most commonly used data type in integration, and a large number of FBA (flux balance analysis)-based integrated algorithms have been developed. Methods and Results: We mapped the Kcat values to the tree structure of GO terms and found that the Kcat values under the same GO term have a higher similarity. Based on this observation, we developed a new method, called iMTBGO, to predict metabolic flux distributions by constraining reaction boundaries based on gene expression ratios normalized by marker genes under the same GO term. We applied this method to previously published data and compared the prediction results with other metabolic flux analysis methods which also utilize gene expression data. The prediction errors of iMTBGO for both growth rates and fluxes in the central metabolic pathways were smaller than those of earlier published methods. Conclusion: Considering the fact that reaction rates are not only determined by genes/expression levels, but also by the specific activities of enzymes, the iMTBGO method allows us to make more precise predictions of metabolic fluxes by using expression values normalized based on GO.
Affiliation(s)
- Zhitao Mao
- 1A Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; 2University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hongwu Ma
- 1A Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; 2University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
21
|
A Protocol for the Construction and Curation of Genome-Scale Integrated Metabolic and Regulatory Network Models. Methods Mol Biol 2019. [PMID: 30788794 DOI: 10.1007/978-1-4939-9142-6_14] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 06/01/2023]
Abstract
Genome-scale metabolic network models have been widely used over the last decade and have been shown to successfully predict the metabolic behavior of many organisms. Yet the complexity of metabolic regulation often limits the accuracy of these models. Integrative modeling approaches have recently been developed that combine metabolic and regulatory networks, thereby expanding the capabilities and accuracy of genome-scale modeling. This chapter provides a guide to reconstruct and curate such integrated network models. Specifically, this protocol describes the PROM (Probabilistic Regulation of Metabolism) and GEMINI (Gene Expression and Metabolism Integrated for Network Inference) approaches. PROM is an automated method for the construction of integrated metabolic and transcriptional regulatory network models, while the GEMINI approach curates the integrated network models using transcriptomics and phenomics data. GEMINI represents the first attempt at applying the well-established curation tools that exist for metabolic networks to the curation of regulatory networks. The integrated network models generated by these approaches enable the mechanistic integration of diverse biological data and can identify novel strategies to engineer cellular metabolism.
|
22
|
Rai N, Huynh L, Kim M, Tagkopoulos I. Population collapse and adaptive rescue during long‐term chemostat fermentation. Biotechnol Bioeng 2019; 116:693-703. [DOI: 10.1002/bit.26898] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/18/2018] [Revised: 11/02/2018] [Accepted: 12/06/2018] [Indexed: 11/09/2022]
Affiliation(s)
- Navneet Rai
- UC Davis Genome Center, University of California, Davis, California
- Department of Computer Science, University of California, Davis, California
| | - Linh Huynh
- UC Davis Genome Center, University of California, Davis, California
- Department of Computer Science, University of California, Davis, California
| | - Minseung Kim
- UC Davis Genome Center, University of California, Davis, California
- Department of Computer Science, University of California, Davis, California
| | - Ilias Tagkopoulos
- UC Davis Genome Center, University of California, Davis, California
- Department of Computer Science, University of California, Davis, California
| |
|
23
|
Huynh-Thu VA, Geurts P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. Methods Mol Biol 2019; 1883:195-215. [PMID: 30547401 DOI: 10.1007/978-1-4939-8882-2_8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/11/2022]
Abstract
In this chapter, we introduce the reader to a popular family of machine learning algorithms, called decision trees. We then review several approaches based on decision trees that have been developed for the inference of gene regulatory networks (GRNs). Decision trees have indeed several nice properties that make them well-suited for tackling this problem: they are able to detect multivariate interacting effects between variables, are non-parametric, have good scalability, and have very few parameters. In particular, we describe in detail the GENIE3 algorithm, a state-of-the-art method for GRN inference.
Affiliation(s)
- Vân Anh Huynh-Thu
- Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
| | - Pierre Geurts
- Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
| |
|
24
|
Abstract
Growth rate is one of the most important and most complex phenotypic characteristics of unicellular microorganisms, which determines the genetic mutations that dominate at the population level, and ultimately whether the population will survive. Translating changes at the genetic level to their growth-rate consequences remains a subject of intense interest, since such a mapping could rationally direct experiments to optimize antibiotic efficacy or bioreactor productivity. In this work, we directly map transcriptional profiles to growth rates by gathering published gene-expression data from Escherichia coli and Saccharomyces cerevisiae with corresponding growth-rate measurements. Using a machine-learning technique called k-nearest-neighbors regression, we build a model which predicts growth rate from gene expression. By exploiting the correlated nature of gene expression and sparsifying the model, we capture 81% of the variance in growth rate of the E. coli dataset, while reducing the number of features from >4,000 to 9. In S. cerevisiae, we account for 89% of the variance in growth rate, while reducing from >5,500 dimensions to 18. Such a model provides a basis for selecting successful strategies from among the combinatorial number of experimental possibilities when attempting to optimize complex phenotypic traits like growth rate.
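The k-nearest-neighbors regression named in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `knn_regress`, the Euclidean distance, the unweighted mean, and the toy numbers are not taken from the study, and the sparsification and feature-selection steps the abstract describes are left out.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict a target for `query` as the mean target of its k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # Euclidean distance from the query profile to every training profile
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Sort by distance only, so ties never fall back to comparing targets
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical toy data: each row is an expression profile over two genes,
# each target is a growth rate in 1/h. These values are invented.
expression = [[1.0, 0.2], [0.9, 0.3], [0.2, 1.1], [0.1, 0.9]]
growth = [0.8, 0.7, 0.2, 0.3]

prediction = knn_regress(expression, growth, [0.95, 0.25], k=2)
```

With only a handful of informative genes, as in the reduced models the abstract reports, distances of this kind are less affected by noise from uninformative features.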
|
25
|
Eetemadi A, Tagkopoulos I. Genetic Neural Networks: an artificial neural network architecture for capturing gene expression relationships. Bioinformatics 2018; 35:2226-2234. [DOI: 10.1093/bioinformatics/bty945] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/11/2018] [Revised: 10/27/2018] [Accepted: 11/16/2018] [Indexed: 01/16/2023] Open
Abstract
Motivation
Gene expression prediction is one of the grand challenges in computational biology. The availability of transcriptomics data combined with recent advances in artificial neural networks provides an unprecedented opportunity to create predictive models of gene expression with far-reaching applications.
Results
We present the Genetic Neural Network (GNN), an artificial neural network for predicting genome-wide gene expression given gene knockouts and master regulator perturbations. At its core, the GNN maps existing gene regulatory information in its architecture and it uses cell nodes that have been specifically designed to capture the dependencies and non-linear dynamics that exist in gene networks. These two key features make the GNN architecture capable of capturing complex relationships without the need for large training datasets. As a result, GNNs were 40% more accurate on average than competing architectures (MLP, RNN, BiRNN) when compared on hundreds of curated and inferred transcription modules. Our results argue that GNNs can become the architecture of choice when building predictors of gene expression from the exponentially growing corpus of genome-wide transcriptomics data.
Availability and implementation
https://github.com/IBPA/GNN
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ameen Eetemadi
- Department of Computer Science, University of California, Davis, CA, USA
- Genome Center, University of California, Davis, CA, USA
| | - Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, CA, USA
- Genome Center, University of California, Davis, CA, USA
| |
Collapse
|
26
|
Caglar MU, Hockenberry AJ, Wilke CO. Predicting bacterial growth conditions from mRNA and protein abundances. PLoS One 2018; 13:e0206634. [PMID: 30388153 PMCID: PMC6214550 DOI: 10.1371/journal.pone.0206634] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/23/2018] [Accepted: 10/16/2018] [Indexed: 01/30/2023] Open
Abstract
Cells respond to changing nutrient availability and external stresses by altering the expression of individual genes. Condition-specific gene expression patterns may thus provide a promising and low-cost route to quantifying the presence of various small molecules, toxins, or species-interactions in natural environments. However, whether gene expression signatures alone can predict individual environmental growth conditions remains an open question. Here, we used machine learning to predict 16 closely-related growth conditions using 155 datasets of E. coli transcript and protein abundances. We show that models are able to discriminate between different environmental features with a relatively high degree of accuracy. We observed a small but significant increase in model accuracy by combining transcriptome and proteome-level data, and we show that measurements from stationary phase cells typically provide less useful information for discriminating between conditions as compared to exponentially growing populations. Nevertheless, with sufficient training data, gene expression measurements from a single species are capable of distinguishing between environmental conditions that are separated by a single environmental variable.
Affiliation(s)
- M. Umut Caglar
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, United States of America
- Adam J. Hockenberry
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, United States of America
- Claus O. Wilke
  - Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, United States of America
|
27
|
Harrison OB, Schoen C, Retchless AC, Wang X, Jolley KA, Bray JE, Maiden MCJ. Neisseria genomics: current status and future perspectives. Pathog Dis 2018; 75:3861976. [PMID: 28591853 PMCID: PMC5827584 DOI: 10.1093/femspd/ftx060] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/10/2017] [Accepted: 06/05/2017] [Indexed: 12/17/2022] Open
Abstract
High-throughput whole genome sequencing has unlocked a multitude of possibilities enabling members of the Neisseria genus to be examined with unprecedented detail, including the human pathogens Neisseria meningitidis and Neisseria gonorrhoeae. To maximise the potential benefit of this for public health, it is becoming increasingly important to ensure that this plethora of data are adequately stored, disseminated and made readily accessible. Investigations facilitating cross-species comparisons as well as the analysis of global datasets will allow differences among and within species and across geographic locations and different times to be identified, improving our understanding of the distinct phenotypes observed. Recent advances in high-throughput platforms that measure the transcriptome, proteome and/or epigenome are also becoming increasingly employed to explore the complexities of Neisseria biology. An integrated approach to the analysis of these is essential to fully understand the impact these may have in the Neisseria genus. This article reviews the current status of some of the tools available for next generation sequence analysis at the dawn of the ‘post-genomic’ era.
Affiliation(s)
- Christoph Schoen
  - Institute for Hygiene and Microbiology, University of Würzburg, Würzburg 97080, Germany
- Adam C Retchless
  - Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Xin Wang
  - Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Keith A Jolley
  - Department of Zoology, University of Oxford, Oxford OX1 3SY, UK
- James E Bray
  - Department of Zoology, University of Oxford, Oxford OX1 3SY, UK
|
28
|
Wytock TP, Fiebig A, Willett JW, Herrou J, Fergin A, Motter AE, Crosson S. Experimental evolution of diverse Escherichia coli metabolic mutants identifies genetic loci for convergent adaptation of growth rate. PLoS Genet 2018; 14:e1007284. [PMID: 29584733 PMCID: PMC5892946 DOI: 10.1371/journal.pgen.1007284] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/05/2017] [Revised: 04/10/2018] [Accepted: 03/02/2018] [Indexed: 01/08/2023] Open
Abstract
Cell growth is determined by substrate availability and the cell’s metabolic capacity to assimilate substrates into building blocks. Metabolic genes that determine growth rate may interact synergistically or antagonistically, and can accelerate or slow growth, depending on genetic background and environmental conditions. We evolved a diverse set of Escherichia coli single-gene deletion mutants with a spectrum of growth rates and identified mutations that generally increase growth rate. Despite the metabolic differences between parent strains, mutations that enhanced growth largely mapped to core transcription machinery, including the β and β′ subunits of RNA polymerase (RNAP) and the transcription elongation factor, NusA. The structural segments of RNAP that determine enhanced growth have been previously implicated in antibiotic resistance and in the control of transcription elongation and pausing. We further developed a computational framework to characterize how the transcriptional changes that occur upon acquisition of these mutations affect growth rate across strains. Our experimental and computational results provide evidence for cases in which RNAP mutations shift the competitive balance between active transcription and gene silencing. This study demonstrates that mutations in specific regions of RNAP are a convergent adaptive solution that can enhance the growth rate of cells from distinct metabolic states.
The loss of a metabolic function caused by gene deletion can be compensated, in certain cases, by the concurrent mutation of a second gene. Whether such gene pairs share a local chemical or regulatory relationship or interact via a non-local mechanism has implications for the co-evolution of genetic changes, development of alternatives to gene therapy, and the design of combination antimicrobial therapies that select against resistance. Yet, we lack a comprehensive knowledge of adaptive responses to metabolic mutations, and our understanding of the mechanisms underlying genetic rescue remains limited. We present results of a laboratory evolution approach that has the potential to address both challenges, showing that mutations in specific regions of RNA polymerase enhance growth rates of distinct mutant strains of Escherichia coli with a spectrum of growth defects. Several of these adaptive mutations are deleterious when engineered directly into the original wild-type strain under alternative cultivation conditions, and thus have epistatic rescue properties when paired with the corresponding primary metabolic gene deletions. Our combination of adaptive evolution, directed genetic engineering, and mathematical analysis of transcription and growth rate distinguishes between rescue interactions that are specific or non-specific to a particular deletion. Our study further supports a model for RNA polymerase as a locus of convergent adaptive evolution from different sub-optimal metabolic starting points.
Affiliation(s)
- Thomas P. Wytock
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois, United States of America
- Aretha Fiebig
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Jonathan W. Willett
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Julien Herrou
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Aleksandra Fergin
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
- Adilson E. Motter
  - Department of Physics and Astronomy, Northwestern University, Evanston, Illinois, United States of America
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois, United States of America
- Sean Crosson
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Microbiology, University of Chicago, Chicago, Illinois, United States of America
|
29
|
Xu N, Ye C, Liu L. Genome-scale biological models for industrial microbial systems. Appl Microbiol Biotechnol 2018; 102:3439-3451. [PMID: 29497793 DOI: 10.1007/s00253-018-8803-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/29/2017] [Revised: 01/19/2018] [Accepted: 01/21/2018] [Indexed: 01/08/2023]
Abstract
The primary aims and challenges associated with microbial fermentation include achieving faster cell growth, higher productivity, and more robust production processes. Genome-scale biological models, predicting the formation of an interaction among genetic materials, enzymes, and metabolites, constitute a systematic and comprehensive platform to analyze and optimize the microbial growth and production of biological products. Genome-scale biological models can help optimize microbial growth-associated traits by simulating biomass formation, predicting growth rates, and identifying the requirements for cell growth. With regard to microbial product biosynthesis, genome-scale biological models can be used to design product biosynthetic pathways, accelerate production efficiency, and reduce metabolic side effects, leading to improved production performance. The present review discusses the development of microbial genome-scale biological models since their emergence and emphasizes their pertinent application in improving industrial microbial fermentation of biological products.
Affiliation(s)
- Nan Xu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
  - College of Bioscience and Biotechnology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi, 214122, China
- Chao Ye
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi, 214122, China
- Liming Liu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu, 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi, 214122, China
|
30
|
Abstract
The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. A cell contains several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), transcriptional regulatory network (TRN), and signal transduction network (STN). Predictions based on only a single-layer network are limited and inaccurate. Therefore, integrated networks constructed from the three network types are attracting more interest. The function of a biological process in living cells is usually performed through the interaction of biological components. It is therefore necessary to integrate and analyze all the related components at the systems level to understand the physiological functions of living organisms comprehensively and correctly. In this review, we discuss three representative genome-scale cellular networks, GMN, TRN, and STN, which represent different levels (i.e., metabolism, gene regulation, and cellular signaling) of a cell’s activities. Furthermore, we discuss the integration of the three network types. With growing understanding of the complexity of microbial cells, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.
Affiliation(s)
- Tong Hao
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Dan Wu
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Lingxuan Zhao
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Qian Wang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Edwin Wang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
  - Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Jinsheng Sun
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
  - Tianjin Bohai Fisheries Research Institute, Tianjin, China
|
31
|
Vijayakumar S, Conway M, Lió P, Angione C. Optimization of Multi-Omic Genome-Scale Models: Methodologies, Hands-on Tutorial, and Perspectives. Methods Mol Biol 2018; 1716:389-408. [PMID: 29222764 DOI: 10.1007/978-1-4939-7528-0_18] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 01/21/2023]
Abstract
Genome-scale metabolic models are valuable tools for assessing the metabolic potential of living organisms. Being downstream of gene expression, metabolism is increasingly being used as an indicator of the phenotypic outcome for drugs and therapies. We here present a review of the principal methods used for constraint-based modelling in systems biology, and explore how the integration of multi-omic data can be used to improve phenotypic predictions of genome-scale metabolic models. We believe that the large-scale comparison of the metabolic response of an organism to different environmental conditions will be an important challenge for genome-scale models. Therefore, within the context of multi-omic methods, we describe a tutorial for multi-objective optimization using the metabolic and transcriptomics adaptation estimator (METRADE), implemented in MATLAB. METRADE uses microarray and codon usage data to model bacterial metabolic response to environmental conditions (e.g., antibiotics, temperatures, heat shock). Finally, we discuss key considerations for the integration of multi-omic networks into metabolic models, towards automatically extracting knowledge from such models.
Affiliation(s)
- Supreeta Vijayakumar
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, Tees Valley TS1 3BX, UK
- Max Conway
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Pietro Lió
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, Tees Valley TS1 3BX, UK
|
32
|
Szigeti B, Roth YD, Sekar JAP, Goldberg AP, Pochiraju SC, Karr JR. A blueprint for human whole-cell modeling. Curr Opin Syst Biol 2017; 7:8-15. [PMID: 29806041 DOI: 10.1016/j.coisb.2017.10.005] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 02/07/2023]
Abstract
Whole-cell dynamical models of human cells are a central goal of systems biology. Such models could help researchers understand cell biology and help physicians treat disease. Despite significant challenges, we believe that human whole-cell models are rapidly becoming feasible. To develop a plan for achieving human whole-cell models, we analyzed the existing models of individual cellular pathways, surveyed the biomodeling community, and reflected on our experience developing whole-cell models of bacteria. Based on these analyses, we propose a plan for a project, termed the Human Whole-Cell Modeling Project, to achieve human whole-cell models. The foundations of the plan include technology development, standards development, and interdisciplinary collaboration.
Affiliation(s)
- Balázs Szigeti
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Yosef D Roth
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- John A P Sekar
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Arthur P Goldberg
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Saahith C Pochiraju
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
- Jonathan R Karr
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029, USA
|
33
|
Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities. Proc Natl Acad Sci U S A 2017; 114:10286-10291. [PMID: 28874552 DOI: 10.1073/pnas.1702581114] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Indexed: 12/24/2022] Open
Abstract
Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN (probably the best characterized TRN), several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism's TRN from disparate data types.
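The idea of tracing differentially expressed genes back to a knocked-out TF through regulatory cascades can be illustrated as a reachability search on a directed graph. The following is only a minimal sketch under stated assumptions, not the authors' workflow: the toy network, the gene names, and the helper functions `downstream_targets` and `fraction_traced` are all hypothetical.

```python
from collections import deque


def downstream_targets(trn, knocked_out_tf):
    """Return every node reachable from the knocked-out TF via regulatory edges.

    `trn` is a hypothetical adjacency mapping: regulator -> list of direct targets.
    """
    seen = set()
    queue = deque([knocked_out_tf])
    while queue:
        regulator = queue.popleft()
        for target in trn.get(regulator, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def fraction_traced(trn, knocked_out_tf, differentially_expressed):
    """Fraction of differentially expressed genes that lie downstream of the TF."""
    de_genes = set(differentially_expressed)
    if not de_genes:
        return 0.0
    reachable = downstream_targets(trn, knocked_out_tf)
    return len(de_genes & reachable) / len(de_genes)


# Toy network (assumed for illustration): tfA regulates tfB, which regulates
# further genes, so gene3 is reached only through a two-step cascade.
toy_trn = {
    "tfA": ["tfB", "gene1"],
    "tfB": ["gene2", "gene3"],
    "tfC": ["gene4"],
}
de_genes = ["gene1", "gene3", "gene4", "gene5"]
traced = fraction_traced(toy_trn, "tfA", de_genes)
print(f"{traced:.0%} of DE genes traced to the tfA knockout")
```

In this toy case two of the four differentially expressed genes sit downstream of `tfA`; the other two would need a different explanation, such as an unmodeled factor or a missing regulatory edge.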
|
34
|
Babtie AC, Stumpf MPH. How to deal with parameters for whole-cell modelling. J R Soc Interface 2017; 14:20170237. [PMID: 28768879 PMCID: PMC5582120 DOI: 10.1098/rsif.2017.0237] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 03/30/2017] [Accepted: 06/22/2017] [Indexed: 11/12/2022] Open
Abstract
Dynamical systems describing whole cells are on the verge of becoming a reality. But as models of reality, they are only useful if we have realistic parameters for the molecular reaction rates and cell physiological processes. There is currently no suitable framework to reliably estimate hundreds, let alone thousands, of reaction rate parameters. Here, we map out the relative weaknesses and promises of different approaches aimed at redressing this issue. While suitable procedures for estimation or inference of the whole (vast) set of parameters will, in all likelihood, remain elusive, some hope can be drawn from the fact that much of the cellular behaviour may be explained in terms of smaller sets of parameters. Identifying such parameter sets and assessing their behaviour is now becoming possible even for very large systems of equations, and we expect such methods to become central tools in the development and analysis of whole-cell models.
Affiliation(s)
- Ann C Babtie
  - Department of Life Sciences, Imperial College London, London, UK
|
35
|
van der Ark KCH, van Heck RGA, Martins Dos Santos VAP, Belzer C, de Vos WM. More than just a gut feeling: constraint-based genome-scale metabolic models for predicting functions of human intestinal microbes. Microbiome 2017; 5:78. [PMID: 28705224 PMCID: PMC5512848 DOI: 10.1186/s40168-017-0299-x] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 10/13/2016] [Accepted: 07/05/2017] [Indexed: 05/14/2023]
Abstract
The human gut is colonized with a myriad of microbes, with substantial interpersonal variation. This complex ecosystem is an integral part of the gastrointestinal tract and plays a major role in the maintenance of homeostasis. Its dysfunction has been correlated to a wide array of diseases, but the understanding of causal mechanisms is hampered by the limited amount of cultured microbes, poor understanding of phenotypes, and the limited knowledge about interspecies interactions. Genome-scale metabolic models (GEMs) have been used in many different fields, ranging from metabolic engineering to the prediction of interspecies interactions. We provide showcase examples for the application of GEMs for gut microbes and focus on (i) the prediction of minimal, synthetic, or defined media; (ii) the prediction of possible functions and phenotypes; and (iii) the prediction of interspecies interactions. All three applications are key in understanding the role of individual species in the gut ecosystem as well as the role of the microbiota as a whole. Using GEMs in the described fashions has led to designs of minimal growth media, an increased understanding of microbial phenotypes and their influence on the host immune system, and dietary interventions to improve human health. Ultimately, an increased understanding of the gut ecosystem will enable targeted interventions in gut microbial composition to restore homeostasis and appropriate host-microbe crosstalk.
Affiliation(s)
- Kees C H van der Ark
  - Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE, Wageningen, The Netherlands
- Ruben G A van Heck
  - Laboratory of Systems and Synthetic Biology, Wageningen University, Stippeneng 4, 6708 WE, Wageningen, The Netherlands
- Vitor A P Martins Dos Santos
  - Laboratory of Systems and Synthetic Biology, Wageningen University, Stippeneng 4, 6708 WE, Wageningen, The Netherlands
  - LifeGlimmer GmbH, Markelstrasse 38, 12163, Berlin, Germany
- Clara Belzer
  - Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE, Wageningen, The Netherlands
- Willem M de Vos
  - Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE, Wageningen, The Netherlands
  - RPU Immunobiology, Department of Bacteriology and Immunology, University of Helsinki, Haartmanikatu 4, 002940, Helsinki, Finland
|
36
|
Chen X, Gao C, Guo L, Hu G, Luo Q, Liu J, Nielsen J, Chen J, Liu L. DCEO Biotechnology: Tools To Design, Construct, Evaluate, and Optimize the Metabolic Pathway for Biosynthesis of Chemicals. Chem Rev 2017; 118:4-72. [DOI: 10.1021/acs.chemrev.6b00804] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Indexed: 12/11/2022]
Affiliation(s)
- Xiulai Chen
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Cong Gao
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Liang Guo
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Guipeng Hu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Qiuling Luo
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Jia Liu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Jens Nielsen
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg SE-412 96, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK2800 Lyngby, Denmark
- Jian Chen
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Liming Liu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg SE-412 96, Sweden
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
|
37
|
Xu N, Ye C, Chen X, Liu J, Liu L. Genome-scale metabolic modelling common cofactors metabolism in microorganisms. J Biotechnol 2017; 251:1-13. [PMID: 28385592 DOI: 10.1016/j.jbiotec.2017.04.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/21/2017] [Revised: 04/02/2017] [Accepted: 04/03/2017] [Indexed: 12/20/2022]
Abstract
The common cofactors ATP/ADP, NAD(P)(H), and acetyl-CoA/CoA are indispensable participants in biochemical reactions in industrial microbes. To systematically explore the effects of these cofactors on cell growth and metabolic phenotypes, the first genome-scale cofactor metabolic model, icmNX6434, including 6434 genes, 1782 metabolites, and 6877 reactions, was constructed from 14 genome-scale metabolic models of 14 industrial strains. The origin, consumption, and interactions of these common cofactors in microbial cells were elucidated by the icmNX6434 model, and they played important roles in cell growth. The essential cofactor modules contained 2480 genes and 2948 reactions; therefore, improving cofactor biosynthesis, directing these cofactors into essential metabolic pathways, as well as avoiding cofactor utilization during byproduct biosynthesis and futile cycles, are three ways to increase cell growth. The effects of these common cofactors on the distribution and rate of the carbon flux in four universal modes, as well as an optimized metabolic flux, could be obtained by manipulating cofactor availability and balance. Significant changes in the ATP, NAD(H), NADP(H), or acetyl-CoA concentrations triggered relevant metabolic responses to acidic, oxidative, heat, and osmotic stress. Globally, the model icmNX6434 provides a comprehensive platform to elucidate the physiological effects of these cofactors on cell growth, metabolic flux, and industrial robustness. Moreover, the results of this study are a further example of using a consensus genome-scale metabolic model to increase our understanding of key biological processes.
Affiliation(s)
- Nan Xu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - College of Bioscience and Biotechnology, Yangzhou University, Yangzhou, Jiangsu 225009, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi 214122, China
- Chao Ye
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi 214122, China
- Xiulai Chen
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi 214122, China
- Jia Liu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi 214122, China
- Liming Liu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - The Laboratory of Food Microbial-Manufacturing Engineering, Jiangnan University, Wuxi 214122, China
|
38
|
Multi-omic data integration enables discovery of hidden biological regularities. Nat Commun 2016; 7:13091. [PMID: 27782110 PMCID: PMC5095171 DOI: 10.1038/ncomms13091] [Citation(s) in RCA: 104] [Impact Index Per Article: 13.0] [Received: 03/07/2016] [Accepted: 08/31/2016] [Indexed: 01/01/2023] Open
Abstract
Rapid growth in size and complexity of biological data sets has led to the ‘Big Data to Knowledge’ challenge. We develop advanced data integration methods for multi-level analysis of genomic, transcriptomic, ribosomal profiling, proteomic and fluxomic data. First, we show that pairwise integration of primary omics data reveals regularities that tie cellular processes together in Escherichia coli: the number of protein molecules made per mRNA transcript and the number of ribosomes required per translated protein molecule. Second, we show that genome-scale models, based on genomic and bibliomic data, enable quantitative synchronization of disparate data types. Integrating omics data with models enabled the discovery of two novel regularities: condition invariant in vivo turnover rates of enzymes and the correlation of protein structural motifs and translational pausing. These regularities can be formally represented in a computable format allowing for coherent interpretation and prediction of fitness and selection that underlies cellular physiology. Translating omics data sets into biological insight is one of the great challenges of our time. Here, the authors make headway by synchronising pairs of omics data types via invariants across conditions and by integrating datasets into a genome-scale model of E. coli metabolism and gene expression.
|
39
|
Kim M, Rai N, Zorraquino V, Tagkopoulos I. Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli. Nat Commun 2016; 7:13090. [PMID: 27713404 PMCID: PMC5059772 DOI: 10.1038/ncomms13090] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 03/02/2016] [Accepted: 09/01/2016] [Indexed: 12/20/2022] Open
Abstract
A significant obstacle in training predictive cell models is the lack of integrated data sources. We develop semi-supervised normalization pipelines and perform experimental characterization (growth, transcriptional, proteome) to create Ecomics, a consistent, quality-controlled multi-omics compendium for Escherichia coli with cohesive meta-data information. We then use this resource to train a multi-scale model that integrates four omics layers to predict genome-wide concentrations and growth dynamics. The genetic and environmental ontology reconstructed from the omics data is substantially different and complementary to the genetic and chemical ontologies. The integration of different layers confers an incremental increase in the prediction performance, as does the information about the known gene regulatory and protein-protein interactions. The predictive performance of the model ranges from 0.54 to 0.87 for the various omics layers, which far exceeds various baselines. This work provides an integrative framework of omics-driven predictive modelling that is broadly applicable to guide biological discovery. Multi-omics data integration is a great challenge. Here, the authors compile a database of E. coli proteomics, transcriptomics, metabolomics and fluxomics data to train models of recurrent neural network and constrained regression, enabling prediction of bacterial responses to perturbations.
Affiliation(s)
- Minseung Kim
- Department of Computer Science, University of California, Davis, California 95616, USA; Genome Center, University of California, Davis, California 95616, USA
- Navneet Rai
- Genome Center, University of California, Davis, California 95616, USA
- Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, California 95616, USA; Genome Center, University of California, Davis, California 95616, USA
|
40
|
Machado D, Herrgård MJ, Rocha I. Stoichiometric Representation of Gene-Protein-Reaction Associations Leverages Constraint-Based Analysis from Reaction to Gene-Level Phenotype Prediction. PLoS Comput Biol 2016; 12:e1005140. [PMID: 27711110 PMCID: PMC5053500 DOI: 10.1371/journal.pcbi.1005140] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/25/2016] [Accepted: 09/13/2016] [Indexed: 12/05/2022] Open
Abstract
Genome-scale metabolic reconstructions are currently available for hundreds of organisms. Constraint-based modeling enables the analysis of the phenotypic landscape of these organisms, predicting the response to genetic and environmental perturbations. However, since constraint-based models can only describe the metabolic phenotype at the reaction level, understanding the mechanistic link between genotype and phenotype is still hampered by the complexity of gene-protein-reaction associations. We implement a model transformation that enables constraint-based methods to be applied at the gene level by explicitly accounting for the individual fluxes of enzymes (and subunits) encoded by each gene. We show how this can be applied to different kinds of constraint-based analysis: flux distribution prediction, gene essentiality analysis, random flux sampling, elementary mode analysis, transcriptomics data integration, and rational strain design. In each case we demonstrate how this approach can lead to improved phenotype predictions and a deeper understanding of the genotype-to-phenotype link. In particular, we show that a large fraction of reaction-based designs obtained by current strain design methods are not actually feasible, and show how our approach allows using the same methods to obtain feasible gene-based designs. We also show, by extensive comparison with experimental 13C-flux data, how simple reformulations of different simulation methods with gene-wise objective functions result in improved prediction accuracy. The model transformation proposed in this work enables existing constraint-based methods to be used at the gene level without modification. This automatically leverages phenotype analysis from reaction to gene level, improving the biological insight that can be obtained from genome-scale models.
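To make the term concrete, the following minimal Python sketch shows the conventional reaction-level use of gene-protein-reaction (GPR) associations that the abstract contrasts with its gene-level approach: each reaction carries a Boolean rule over genes (AND for enzyme subunits, OR for isozymes), and a knockout blocks a reaction only when its rule evaluates to false. This is not the authors' stoichiometric transformation; the rule encoding, the gene and reaction names, and the functions `reaction_active` and `blocked_reactions` are hypothetical and chosen only for illustration.

```python
# Hypothetical GPR rules as nested tuples: ("and", ...) for enzyme complexes,
# ("or", ...) for isozymes, and a plain string for a single gene.
GPR_RULES = {
    "R1": ("and", "g1", "g2"),                # complex: both subunits required
    "R2": ("or", "g3", "g4"),                 # isozymes: either gene suffices
    "R3": ("or", "g5", ("and", "g1", "g6")),  # isozyme or a two-subunit complex
}


def reaction_active(rule, knocked_out):
    """Return True if the GPR rule is still satisfied after the knockouts."""
    if isinstance(rule, str):
        return rule not in knocked_out
    operator, *operands = rule
    results = [reaction_active(operand, knocked_out) for operand in operands]
    return all(results) if operator == "and" else any(results)


def blocked_reactions(rules, knocked_out):
    """List reactions whose GPR rule evaluates to False, in sorted order."""
    return sorted(name for name, rule in rules.items()
                  if not reaction_active(rule, knocked_out))


print(blocked_reactions(GPR_RULES, {"g1"}))        # ['R1']
print(blocked_reactions(GPR_RULES, {"g3"}))        # []
print(blocked_reactions(GPR_RULES, {"g1", "g5"}))  # ['R1', 'R3']
```

In a constraint-based workflow, the blocked reactions would then have their flux bounds set to zero before simulation; the abstract's point is that this reaction-level view hides the contribution of individual genes.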
Affiliation(s)
- Daniel Machado
- Centre of Biological Engineering, University of Minho, Braga, Portugal
- Markus J. Herrgård
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Hørsholm, Denmark
- Isabel Rocha
- Centre of Biological Engineering, University of Minho, Braga, Portugal
|
41
|
Cuevas DA, Edirisinghe J, Henry CS, Overbeek R, O’Connell TG, Edwards RA. From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model. Front Microbiol 2016; 7:907. [PMID: 27379044 PMCID: PMC4911401 DOI: 10.3389/fmicb.2016.00907] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 01/29/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] Open
Abstract
Microbiological studies are increasingly relying on in silico methods to perform exploration and rapid analysis of genomic data, and functional genomics studies are supplemented by the new perspectives that genome-scale metabolic models offer. A mathematical model consisting of a microbe's entire metabolic map can be rapidly determined from whole-genome sequencing and annotating the genomic material encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict the phenotypic responses imposed by environmental elements and factors, is the leading method to simulate and manipulate cellular growth in silico. However, the process of creating an accurate model to use in FBA consists of a series of steps involving a multitude of connections between bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure to obtain a metabolic model using PyFBA, an extensible Python-based open-source software package aimed to provide a platform where functional annotations are used to build metabolic models (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe's metabolic map, run FBA upon different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques in creating accurate genome-scale metabolic models.
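For reference, FBA is conventionally posed as the linear program below. The symbols follow common textbook notation and are not taken from the abstract: $S$ is the stoichiometric matrix (metabolites by reactions), $v$ the vector of reaction fluxes, $c$ the objective weights (typically selecting the biomass reaction), and $v^{\min}$, $v^{\max}$ the flux bounds set by the medium and by reaction reversibility.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

The equality constraint encodes the steady-state assumption that every internal metabolite is produced as fast as it is consumed; changing the media conditions amounts to changing the bounds on the uptake reactions.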
Affiliation(s)
- Daniel A. Cuevas
- Computational Science Research Center, San Diego State University, San Diego, CA, USA
- Janaka Edirisinghe
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
- Chris S. Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
- Ross Overbeek
- Fellowship for Interpretation of Genomes, Burr Ridge, IL, USA
- Taylor G. O’Connell
- Biological and Medical Informatics Research Center, San Diego State University, San Diego, CA, USA
- Robert A. Edwards
- Computational Science Research Center, San Diego State University, San Diego, CA, USA
- Biological and Medical Informatics Research Center, San Diego State University, San Diego, CA, USA
- Department of Computer Science, San Diego State University, San Diego, CA, USA
- Department of Biology, San Diego State University, San Diego, CA, USA
|
42
|
Tummler K, Kühn C, Klipp E. Dynamic metabolic models in context: biomass backtracking. Integr Biol (Camb) 2016; 7:940-51. [PMID: 26189715 DOI: 10.1039/c5ib00050e] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Abstract
Mathematical modeling has proven to be a powerful tool to understand and predict functional and regulatory properties of metabolic processes. High accuracy dynamic modeling of individual pathways is thereby opposed by simplified but genome scale constraint based approaches. A method that links these two powerful techniques would greatly enhance predictive power but is so far lacking. We present biomass backtracking, a workflow that integrates the cellular context in existing dynamic metabolic models via stoichiometrically exact drain reactions based on a genome scale metabolic model. With comprehensive examples, for different species and environmental contexts, we show the importance and scope of applications and highlight the improvement compared to common boundary formulations in existing metabolic models. Our method allows for the contextualization of dynamic metabolic models based on all available information. We anticipate this to greatly increase their accuracy and predictive power for basic research and also for drug development and industrial applications.
Affiliation(s)
- Katja Tummler
- Theoretische Biophysik, Humboldt-Universität zu Berlin, Germany.
|
43
|
Yu MK, Kramer M, Dutkowski J, Srivas R, Licon K, Kreisberg J, Ng CT, Krogan N, Sharan R, Ideker T. Translation of Genotype to Phenotype by a Hierarchy of Cell Subsystems. Cell Syst 2016; 2:77-88. [PMID: 26949740 PMCID: PMC4772745 DOI: 10.1016/j.cels.2016.02.003] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Indexed: 11/26/2022]
Abstract
Accurately translating genotype to phenotype requires accounting for the functional impact of genetic variation at many biological scales. Here we present a strategy for genotype-phenotype reasoning based on existing knowledge of cellular subsystems. These subsystems and their hierarchical organization are defined by the Gene Ontology or a complementary ontology inferred directly from previously published datasets. Guided by the ontology's hierarchical structure, we organize genotype data into an "ontotype," that is, a hierarchy of perturbations representing the effects of genetic variation at multiple cellular scales. The ontotype is then interpreted using logical rules generated by machine learning to predict phenotype. This approach substantially outperforms previous, non-hierarchical methods for translating yeast genotype to cell growth phenotype, and it accurately predicts the growth outcomes of two new screens of 2,503 double gene knockouts impacting DNA repair or nuclear lumen. Ontotypes also generalize to larger knockout combinations, setting the stage for interpreting the complex genetics of disease.
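As a rough illustration of organizing genotype data along an ontology, the sketch below propagates a set of knocked-out genes up a small term hierarchy and counts, for each term, how many disrupted genes fall under it. The toy hierarchy, the gene names, the function names, and the use of a per-term count as the perturbation state are assumptions made for this example; the abstract does not specify these details.

```python
# Hypothetical toy ontology: each term lists its child terms and the genes
# annotated directly to it.
CHILDREN = {
    "cell": ["dna_repair", "nuclear_lumen"],
    "dna_repair": ["mismatch_repair"],
    "nuclear_lumen": [],
    "mismatch_repair": [],
}
DIRECT_GENES = {
    "cell": set(),
    "dna_repair": {"geneA"},
    "nuclear_lumen": {"geneB", "geneC"},
    "mismatch_repair": {"geneC", "geneD"},
}


def genes_under(term):
    """All genes annotated to a term or to any of its descendants."""
    genes = set(DIRECT_GENES[term])
    for child in CHILDREN[term]:
        genes |= genes_under(child)
    return genes


def ontotype(knocked_out):
    """Map each term to the number of knocked-out genes it contains."""
    return {term: len(genes_under(term) & knocked_out) for term in CHILDREN}


profile = ontotype({"geneC", "geneD"})
print(profile)
```

A per-term profile of this kind could then serve as the feature vector for a learned genotype-to-phenotype predictor, with broad terms summarizing the perturbations of the narrower terms beneath them.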
Affiliation(s)
- Michael Ku Yu
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla CA 92093, USA
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Michael Kramer
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Biomedical Sciences Program, University of California San Diego, La Jolla CA 92093, USA
- Janusz Dutkowski
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Data4Cure, La Jolla, CA 92037, USA
- Rohith Srivas
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
- Katherine Licon
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Jason Kreisberg
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Nevan Krogan
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco 94143, USA
- Roded Sharan
- Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel
- Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
|
44
|
Baumstark R, Hänzelmann S, Tsuru S, Schaerli Y, Francesconi M, Mancuso FM, Castelo R, Isalan M. The propagation of perturbations in rewired bacterial gene networks. Nat Commun 2015; 6:10105. [PMID: 26670742 DOI: 10.1038/ncomms10105] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/05/2015] [Accepted: 11/04/2015] [Indexed: 11/09/2022] Open
Abstract
What happens to gene expression when you add new links to a gene regulatory network? To answer this question, we profile 85 network rewirings in E. coli. Here we report that concerted patterns of differential expression propagate from reconnected hub genes. The rewirings link promoter regions to different transcription factor and σ-factor genes, resulting in perturbations that span four orders of magnitude, changing up to ∼ 70% of the transcriptome. Importantly, factor connectivity and promoter activity both associate with perturbation size. Perturbations from related rewirings have more similar transcription profiles and a statistical analysis reveals ∼ 20 underlying states of the system, associating particular gene groups with rewiring constructs. We examine two large clusters (ribosomal and flagellar genes) in detail. These represent alternative global outcomes from different rewirings because of antagonism between these major cell states. This data set of systematically related perturbations enables reverse engineering and discovery of underlying network interactions.
Affiliation(s)
- Rebecca Baumstark
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain
- Sonja Hänzelmann
- Research Program on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Dr Aiguader 88, 08003 Barcelona, Spain; Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Dr Aiguader 88, 08003 Barcelona, Spain
- Saburo Tsuru
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
- Yolanda Schaerli
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain
- Mirko Francesconi
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain
- Francesco M Mancuso
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Genomics Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Carrer Natzaret 15-17, 08035 Barcelona, Spain
- Robert Castelo
- Research Program on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Dr Aiguader 88, 08003 Barcelona, Spain; Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Dr Aiguader 88, 08003 Barcelona, Spain
- Mark Isalan
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain; Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
|
45
|
King ZA, Lloyd CJ, Feist AM, Palsson BO. Next-generation genome-scale models for metabolic engineering. Curr Opin Biotechnol 2015; 35:23-9. [DOI: 10.1016/j.copbio.2014.12.016] [Citation(s) in RCA: 130] [Impact Index Per Article: 14.4] [Received: 11/02/2014] [Revised: 12/06/2014] [Accepted: 12/17/2014] [Indexed: 11/26/2022]
|
46
|
Arrieta-Ortiz ML, Hafemeister C, Bate AR, Chu T, Greenfield A, Shuster B, Barry SN, Gallitto M, Liu B, Kacmarczyk T, Santoriello F, Chen J, Rodrigues CDA, Sato T, Rudner DZ, Driks A, Bonneau R, Eichenberger P. An experimentally supported model of the Bacillus subtilis global transcriptional regulatory network. Mol Syst Biol 2015; 11:839. [PMID: 26577401 PMCID: PMC4670728 DOI: 10.15252/msb.20156236] [Citation(s) in RCA: 134] [Impact Index Per Article: 14.9] [Indexed: 12/21/2022] Open
Abstract
Organisms from all domains of life use gene regulation networks to control cell growth, identity, function, and responses to environmental challenges. Although accurate global regulatory models would provide critical evolutionary and functional insights, they remain incomplete, even for the best studied organisms. Efforts to build comprehensive networks are confounded by challenges including network scale, degree of connectivity, complexity of organism–environment interactions, and difficulty of estimating the activity of regulatory factors. Taking advantage of the large number of known regulatory interactions in Bacillus subtilis and two transcriptomics datasets (including one with 38 separate experiments collected specifically for this study), we use a new combination of network component analysis and model selection to simultaneously estimate transcription factor activities and learn a substantially expanded transcriptional regulatory network for this bacterium. In total, we predict 2,258 novel regulatory interactions and recall 74% of the previously known interactions. We obtained experimental support for 391 (out of 635 evaluated) novel regulatory edges (62% accuracy), thus significantly increasing our understanding of various cell processes, such as spore formation.
Affiliation(s)
- Mario L Arrieta-Ortiz
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Christoph Hafemeister
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Ashley Rose Bate
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Timothy Chu
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Alex Greenfield
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Bentley Shuster
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Samantha N Barry
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Matthew Gallitto
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Brian Liu
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Thadeous Kacmarczyk
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Francis Santoriello
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Jie Chen
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Tsutomu Sato
- Department of Frontier Bioscience, Hosei University, Koganei, Tokyo, Japan
- David Z Rudner
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA, USA
- Adam Driks
- Department of Microbiology and Immunology, Stritch School of Medicine, Loyola University Chicago, Maywood, IL, USA
- Richard Bonneau
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA; Courant Institute of Mathematical Science, Computer Science Department, New York, NY, USA; Simons Foundation, Simons Center for Data Analysis, New York, NY, USA
- Patrick Eichenberger
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
|
47
|
Carrera J, Covert MW. Why Build Whole-Cell Models? Trends Cell Biol 2015; 25:719-722. [PMID: 26471224 DOI: 10.1016/j.tcb.2015.09.004] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 07/30/2015] [Revised: 09/11/2015] [Accepted: 09/14/2015] [Indexed: 10/22/2022]
Abstract
Our ability to build computational models that account for all known gene functions in a cell has increased dramatically. But why build whole-cell models, and how can they best be used? In this forum, we enumerate several areas in which whole-cell modeling can significantly impact research and technology.
Affiliation(s)
- Javier Carrera
- Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA 94305-4125, USA
- Markus W Covert
- Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA 94305-4125, USA.
|
48
|
Improving prediction fidelity of cellular metabolism with kinetic descriptions. Curr Opin Biotechnol 2015; 36:57-64. [PMID: 26318076 DOI: 10.1016/j.copbio.2015.08.011] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 07/07/2015] [Revised: 08/06/2015] [Accepted: 08/09/2015] [Indexed: 12/13/2022]
Abstract
Several modeling frameworks for describing and redirecting cellular metabolism have been developed keeping pace with the rapid development in high-throughput data generation and advances in metabolic engineering techniques. The incorporation of kinetic information within stoichiometry-only modeling techniques offers potential advantages for improved phenotype prediction and consequently more precise computational strain design. In addition to substrate-level kinetic regulatory information, the integration of a number of additional layers of regulation at the transcription, translation, and post-translation levels is sought after by many research groups. However, the practical integration of these complex biological processes into a unified framework amenable to design remains an ongoing challenge.
|
49
|
Angione C, Pratanwanich N, Lió P. A Hybrid of Metabolic Flux Analysis and Bayesian Factor Modeling for Multiomic Temporal Pathway Activation. ACS Synth Biol 2015; 4:880-9. [PMID: 25856685 DOI: 10.1021/sb5003407] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 01/19/2023]
Abstract
The growing availability of multiomic data provides a highly comprehensive view of cellular processes at the levels of mRNA, proteins, metabolites, and reaction fluxes. However, due to probabilistic interactions between components depending on the environment and on the time course, casual, sometimes rare interactions may cause important effects in the cellular physiology. To date, interactions at the pathway level cannot be measured directly, and methodologies to predict pathway cross-correlations from reaction fluxes are still missing. Here, we develop a multiomic approach of flux-balance analysis combined with Bayesian factor modeling with the aim of detecting pathway cross-correlations and predicting metabolic pathway activation profiles. Starting from gene expression profiles measured in various environmental conditions, we associate a flux rate profile with each condition. We then infer pathway cross-correlations and identify the degrees of pathway activation with respect to the conditions and time course using Bayesian factor modeling. We test our framework on the most recent metabolic reconstruction of Escherichia coli in both static and dynamic environments, thus predicting the functionality of particular groups of reactions and how it varies over time. In a dynamic environment, our method can be readily used to characterize the temporal progression of pathway activation in response to given stimuli.
Affiliation(s)
- Claudio Angione
- Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, United Kingdom
- Pietro Lió
- Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, United Kingdom
|
50
|
Huynh L, Tagkopoulos I. Fast and Accurate Circuit Design Automation through Hierarchical Model Switching. ACS Synth Biol 2015; 4:890-7. [PMID: 25916918 DOI: 10.1021/sb500339k] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.
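The two-step idea can be sketched as a coarse-then-fine search: a cheap model screens all candidates, and only the survivors are re-scored with an expensive model. The sketch below uses a simple top-k filter rather than branch-and-bound, and both scoring functions, the candidate values, and the `keep` parameter are hypothetical stand-ins. It illustrates the model-switching pattern only, not the authors' algorithm or its optimality guarantees.

```python
def cheap_score(design):
    """Low-cost surrogate: distance of the design value from a target of 5."""
    return abs(design - 5.0)


def expensive_score(design):
    """Stand-in for a costly nonlinear model (lower is better)."""
    return (design - 4.6) ** 2 + 0.1 * abs(design - 5.0)


def two_stage_search(candidates, keep):
    """Screen with the cheap model, then refine with the expensive one."""
    shortlist = sorted(candidates, key=cheap_score)[:keep]
    best = min(shortlist, key=expensive_score)
    return best, len(shortlist)


candidates = [0.5 * i for i in range(21)]   # 0.0, 0.5, ..., 10.0
best, evaluated = two_stage_search(candidates, keep=4)
print(best, evaluated)
```

Here the expensive model is evaluated on 4 of the 21 candidates. The saving comes at a cost a plain top-k filter cannot bound: if the cheap model ranks the true optimum outside the shortlist, the second stage never sees it, which is why a bounding scheme matters when guarantees are required.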
Affiliation(s)
- Linh Huynh
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
- Ilias Tagkopoulos
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
|