1
Jin H, Moseley HNB. md_harmonize: A Python Package for Atom-Level Harmonization of Public Metabolic Databases. Metabolites 2023; 13:1199. [PMID: 38132881 PMCID: PMC10744849 DOI: 10.3390/metabo13121199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 12/13/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023]
Abstract
A major challenge to integrating public metabolic resources is the use of different nomenclatures by individual databases. This paper presents md_harmonize, an open-source Python package for harmonizing compounds and metabolic reactions across various metabolic databases. The md_harmonize package utilizes a neighborhood-specific graph coloring method for generating a unique identifier for each compound via atom identifiers based on a compound's chemical structure. The resulting harmonized compounds and reactions can be used for various downstream analyses, including the construction of atom-resolved metabolic networks and models for metabolic flux analysis. Parts of the md_harmonize package have been optimized using a variety of computational techniques to allow certain NP-complete problems handled by the software to be tractable for these specific use-cases. The software is available on GitHub and through the Python Package Index, with end-user documentation hosted on GitHub Pages.
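The idea of deriving atom identifiers from each atom's chemical neighborhood can be illustrated with a generic color-refinement sketch. This is not the md_harmonize algorithm or its API; the function names `refine_colors` and `compound_identifier`, the input layout, and the fixed number of rounds are assumptions made for illustration only. Plain color refinement of this kind cannot separate every pair of distinct structures, which is one reason a production tool needs further steps.

```python
from collections import defaultdict


def refine_colors(atoms, bonds, rounds=3):
    """Return a neighborhood-derived color string for every atom.

    atoms: dict mapping an arbitrary atom id to its element symbol.
    bonds: iterable of (atom_id, atom_id) pairs.
    Hypothetical helper for illustration; not part of md_harmonize.
    """
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Round 0: an atom's color is just its element symbol.
    colors = dict(atoms)
    for _ in range(rounds):
        # New color = own color plus the sorted colors of the neighbors,
        # so the result does not depend on how atoms are numbered.
        colors = {
            atom: colors[atom]
            + "("
            + ",".join(sorted(colors[n] for n in neighbors[atom]))
            + ")"
            for atom in atoms
        }
    return colors


def compound_identifier(atoms, bonds, rounds=3):
    """Combine the sorted atom colors into one compound-level string."""
    return "|".join(sorted(refine_colors(atoms, bonds, rounds).values()))


# Heavy-atom skeletons only (hydrogens omitted for brevity).
ethanol = ({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])
ethanol_renumbered = ({"x": "O", "y": "C", "z": "C"}, [("y", "x"), ("z", "y")])
dimethyl_ether = ({1: "C", 2: "O", 3: "C"}, [(1, 2), (2, 3)])

print(compound_identifier(*ethanol) == compound_identifier(*ethanol_renumbered))
print(compound_identifier(*ethanol) == compound_identifier(*dimethyl_ether))
```

In this sketch the two numberings of ethanol receive the same identifier, while dimethyl ether, which has the same atoms but different connectivity, receives a different one.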
Affiliation(s)
- Huan Jin
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40536, USA
- Hunter N. B. Moseley
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40536, USA
  - Department of Molecular & Cellular Biochemistry, University of Kentucky, Lexington, KY 40536, USA
  - Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA
  - Superfund Research Center, University of Kentucky, Lexington, KY 40506, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY 40536, USA
2
Jin H, Moseley HNB. Hierarchical Harmonization of Atom-Resolved Metabolic Reactions across Metabolic Databases. Metabolites 2021; 11:metabo11070431. [PMID: 34209357 PMCID: PMC8307411 DOI: 10.3390/metabo11070431] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2021] [Revised: 06/26/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Metabolic models have proven to be useful tools in systems biology and have been successfully applied to various research fields in a wide range of organisms. A relatively complete metabolic network is a prerequisite for deriving reliable metabolic models. The first step in constructing a metabolic network is to harmonize compounds and reactions across different metabolic databases. However, effectively integrating data from various sources remains a major challenge. Incomplete and inconsistent atomistic details in compound representations across databases are an important limiting factor. Here, we optimized a subgraph isomorphism detection algorithm to validate generic compound pairs. Moreover, we defined a set of harmonization relationship types between compounds to deal with inconsistent chemical details while successfully capturing atom-level characteristics, enabling a more complete compound harmonization across metabolic databases. In total, 15,704 compound pairs across KEGG (Kyoto Encyclopedia of Genes and Genomes) and MetaCyc databases were detected. Furthermore, utilizing the classification of compound pairs and EC (Enzyme Commission) numbers of reactions, we established hierarchical relationships between metabolic reactions, enabling the harmonization of 3856 reaction pairs. In addition, we created and used atom-specific identifiers to evaluate the consistency of atom mappings within and between harmonized reactions, detecting some consistency issues between the reaction and compound descriptions in these metabolic databases.
Affiliation(s)
- Huan Jin
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40536, USA
- Hunter N. B. Moseley
  - Department of Molecular & Cellular Biochemistry, University of Kentucky, Lexington, KY 40536, USA
  - Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA
  - Superfund Research Center, University of Kentucky, Lexington, KY 40506, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY 40536, USA
  - Correspondence: Tel.: +1-859-218-2964
3
Gholizadeh M, Fayazi J, Asgari Y, Zali H, Kaderali L. Reconstruction and Analysis of Cattle Metabolic Networks in Normal and Acidosis Rumen Tissue. Animals (Basel) 2020; 10:ani10030469. [PMID: 32168900 PMCID: PMC7142512 DOI: 10.3390/ani10030469] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/05/2020] [Revised: 02/21/2020] [Accepted: 02/27/2020] [Indexed: 12/29/2022]
Abstract
Simple Summary: The economics of feedlot beef production dictate that beef cattle must gain weight at their maximum potential rate; this involves getting them quickly onto a full feed of a highly fermentable diet, which can induce ruminal acidosis. The molecular host mechanisms that occur in response to acidosis are mostly unknown. To address this question, the rumen epithelial transcriptomes of acidosis and control fattening steers were obtained. By RNA sequencing we found different gene expression profiles in normal and acidosis-induced steers. We then constructed two metabolic networks, for normal and acidosis tissue, based on the gene expression profiles. Our results suggest that rapid shifts to diets rich in fermentable carbohydrates cause an increased concentration of ruminal volatile fatty acids (VFA) and toxins and significant changes in the transcriptome profiles and metabolites of rumen epithelial tissue, with negative economic consequences through poor performance and animal health.

Abstract: The objective of this study was to develop a system-level understanding of acidosis biology. Therefore, the gene expression differences between the normal and acidosis rumen epithelial tissues were first examined using RNA-seq data in order to understand the molecular mechanisms involved in the disease, and then their corresponding metabolic networks were constructed. A total of 1074 genes, 978 isoforms, 1049 transcription start sites (TSS), 998 coding DNA sequences (CDS) and 2 promoters were identified as being differentially expressed in the rumen tissue between the normal and acidosis samples (p < 0.05). The functional analysis of 627 up-regulated genes revealed their involvement in ion transmembrane transport, filament organization, regulation of cell adhesion, regulation of the actin cytoskeleton, ATP binding, glucose transmembrane transporter activity, carbohydrate binding, growth factor binding and cAMP metabolic process. Additionally, 111 differentially expressed enzymes were identified between the rumen epithelial tissue of the normal and acidosis steers, with 46 up-regulated and 65 down-regulated in the acidosis group. The pathway and reaction analyses associated with the up-regulated enzymes indicate that most of these enzymes are involved in fatty acid metabolism, biosynthesis of amino acids, and pyruvate and carbon metabolism, while most of the down-regulated ones are involved in purine and pyrimidine, vitamin B6 and antibiotics metabolism. The degree distribution of both metabolic networks follows a power law, hence displaying a scale-free property. The top 15 hub metabolites were determined in the acidosis metabolic network, with most of them involved in fatty acid oxidation, VFA biosynthesis, amino acid biogenesis and glutathione metabolism, which plays an important role under stress conditions. The limitations of this study were the low number of animals and the use of only epithelial tissue (ventral sac) for RNA-seq.
Affiliation(s)
- Maryam Gholizadeh
  - Department of Animal Science, Faculty of Animal Science and Food Technology, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Ahvaz 6341773637, Iran
- Jamal Fayazi
  - Department of Animal Science, Faculty of Animal Science and Food Technology, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Ahvaz 6341773637, Iran
  - Correspondence: Tel.: +98-91-6612-4162
- Yazdan Asgari
  - Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran 1416753955, Iran
- Hakimeh Zali
  - School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran 1416753955, Iran
- Lars Kaderali
  - Institute of Bioinformatics, University Medicine Greifswald, Felix-Hausdorff-Str. 8, 17475 Greifswald, Germany
4
Mancini A, Eyassu F, Conway M, Occhipinti A, Liò P, Angione C, Pucciarelli S. CiliateGEM: an open-project and a tool for predictions of ciliate metabolic variations and experimental condition design. BMC Bioinformatics 2018; 19:442. [PMID: 30497359 PMCID: PMC6266953 DOI: 10.1186/s12859-018-2422-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The study of cell metabolism is becoming central in several fields such as biotechnology, evolution/adaptation and human disease investigations. Here we present CiliateGEM, the first metabolic network reconstruction draft of the freshwater ciliate Tetrahymena thermophila. We also provide the tools and resources to simulate different growth conditions and to predict metabolic variations. CiliateGEM can be extended to other ciliates in order to set up a meta-model, i.e. a metabolic network reconstruction valid for all ciliates. Ciliates are complex unicellular eukaryotes of presumably monophyletic origin, with a phylogenetic position that is equidistant from plants and animals. These cells represent a new concept of unicellular system with a high degree of species, population biodiversity and cell complexity. Ciliates perform in a single cell all the functions of a pluricellular organism, including locomotion, feeding, digestion, and sexual processes.

RESULTS: After generating the model, we performed an in-silico simulation with the presence and absence of glucose. The lack of this nutrient caused a 32.1% reduction rate in biomass synthesis. Despite the glucose starvation, the growth did not stop due to the use of alternative carbon sources such as amino acids.

CONCLUSIONS: The future models obtained from CiliateGEM may represent a new approach to describe the metabolism of ciliates. This tool will be a useful resource for the ciliate research community in order to extend these species as model organisms in different research fields. An improved understanding of ciliate metabolism could be relevant to elucidate the basis of biological phenomena like genotype-phenotype relationships, population genetics, and cilia-related disease mechanisms.
Affiliation(s)
- Alessio Mancini
  - School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Filmon Eyassu
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Maxwell Conway
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Pietro Liò
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Sandra Pucciarelli
  - School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
5
Shaw R, Cheung CYM. A Dynamic Multi-Tissue Flux Balance Model Captures Carbon and Nitrogen Metabolism and Optimal Resource Partitioning During Arabidopsis Growth. Frontiers in Plant Science 2018; 9:884. [PMID: 29997643 PMCID: PMC6028781 DOI: 10.3389/fpls.2018.00884] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/26/2017] [Accepted: 06/06/2018] [Indexed: 05/19/2023]
Abstract
Plant metabolism is highly adapted in response to its surrounding for acquiring limiting resources. In this study, a dynamic flux balance modeling framework with a multi-tissue (leaf and root) diel genome-scale metabolic model of Arabidopsis thaliana was developed and applied to investigate the reprogramming of plant metabolism through multiple growth stages under different nutrient availability. The framework allowed the modeling of optimal partitioning of resources and biomass in leaf and root over diel phases. A qualitative flux map of carbon and nitrogen metabolism was identified which was consistent across growth phases under both nitrogen rich and limiting conditions. Results from the model simulations suggested distinct metabolic roles in nitrogen metabolism played by enzymes with different cofactor specificities. Moreover, the dynamic model was used to predict the effect of physiological or environmental perturbation on the growth of Arabidopsis leaves and roots.
6
Chatterjee A, Huma B, Shaw R, Kundu S. Reconstruction of Oryza sativa indica Genome Scale Metabolic Model and Its Responses to Varying RuBisCO Activity, Light Intensity, and Enzymatic Cost Conditions. Frontiers in Plant Science 2017; 8:2060. [PMID: 29250098 PMCID: PMC5715477 DOI: 10.3389/fpls.2017.02060] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/23/2017] [Accepted: 11/17/2017] [Indexed: 05/12/2023]
Abstract
To combat the decrease in rice productivity under different stresses, an understanding of rice metabolism is needed. Though there are different genome scale metabolic models (GSMs) of Oryza sativa japonica, no GSM with gene-protein-reaction associations exists for Oryza sativa indica. Here, we report a GSM, OSI1136 of O.s. indica, which includes 3602 genes and 1136 metabolic reactions and transporters distributed across the cytosol, mitochondrion, peroxisome, and chloroplast compartments. Flux balance analysis of the model showed that for varying RuBisCO activity (Vc/Vo) (i) the activity of the chloroplastic malate valve increases to transport reducing equivalents out of the chloroplast under increased photorespiratory conditions and (ii) glyceraldehyde-3-phosphate dehydrogenase and phosphoglycerate kinase can act as a source of cytosolic ATP under decreased photorespiration. Under increasing light conditions we observed metabolic flexibility, involving photorespiration, chloroplastic triose phosphate and the dicarboxylate transporters of the chloroplast and mitochondrion for redox and ATP exchanges across the intracellular compartments. Simulations under different enzymatic cost conditions revealed (i) participation of the peroxisomal glutathione-ascorbate cycle in photorespiratory H2O2 metabolism, (ii) different modes of the chloroplastic triose phosphate transporters and malate valve, and (iii) two possible modes of the chloroplastic Glu-Gln transporter, which were related to the activity of the chloroplastic and cytosolic isoforms of glutamine synthetase. Altogether, our results provide new insights into plant metabolism.
Affiliation(s)
- Sudip Kundu
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata, India
7
Ahmad A, Hartman HB, Krishnakumar S, Fell DA, Poolman MG, Srivastava S. A Genome Scale Model of Geobacillus thermoglucosidasius (C56-YS93) reveals its biotechnological potential on rice straw hydrolysate. J Biotechnol 2017; 251:30-37. [DOI: 10.1016/j.jbiotec.2017.03.031] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/02/2016] [Revised: 03/27/2017] [Accepted: 03/27/2017] [Indexed: 01/29/2023]
8
Flux balance analysis of genome-scale metabolic model of rice (Oryza sativa): aiming to increase biomass. J Biosci 2016; 40:819-28. [PMID: 26564982 DOI: 10.1007/s12038-015-9563-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Abstract
Due to socio-economic reasons, it is essential to design efficient stress-tolerant, more nutritious, high yielding rice varieties. A systematic understanding of the rice cellular metabolism is essential for this purpose. Here, we analyse a genome-scale metabolic model of rice leaf using Flux Balance Analysis to investigate whether it has potential metabolic flexibility to increase the biosynthesis of any of the biomass components. We initially simulate the metabolic responses under an objective to maximize the biomass components. Using the estimated maximum value of biomass synthesis as a constraint, we further simulate the metabolic responses optimizing the cellular economy. Depending on the physiological conditions of a cell, the transport capacities of intracellular transporters (ICTs) can vary. To mimic this physiological state, we randomly vary the ICTs' transport capacities and investigate their effects. The results show that the rice leaf has the potential to increase glycine and starch in a wide range depending on the ICTs' transport capacities. The predicted biosynthesis pathways vary slightly at the two different optimization conditions. With the constraint of biomass composition, the cell also has the metabolic plasticity to fix a wide range of carbon-nitrogen ratio.
9
Kim W, Park H, Seo S. Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome. PLoS One 2016; 11:e0150974. [PMID: 26992093 PMCID: PMC4798299 DOI: 10.1371/journal.pone.0150974] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/10/2015] [Accepted: 02/22/2016] [Indexed: 11/23/2022]
Abstract
The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle.
Affiliation(s)
- Woonsu Kim
  - Department of Animal Biosystem Sciences, Chungnam National University, Daejeon, Republic of Korea
- Hyesun Park
  - Department of Animal Biosystem Sciences, Chungnam National University, Daejeon, Republic of Korea
- Seongwon Seo
  - Department of Animal Biosystem Sciences, Chungnam National University, Daejeon, Republic of Korea
  - * E-mail:
10
Salehzadeh-Yazdi A, Asgari Y, Saboury AA, Masoudi-Nejad A. Computational analysis of reciprocal association of metabolism and epigenetics in the budding yeast: a genome-scale metabolic model (GSMM) approach. PLoS One 2014; 9:e111686. [PMID: 25365344 PMCID: PMC4218804 DOI: 10.1371/journal.pone.0111686] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/24/2014] [Accepted: 10/07/2014] [Indexed: 12/13/2022]
Abstract
Metaboloepigenetics is a newly coined term in biological sciences that investigates the crosstalk between epigenetic modifications and metabolism. The reciprocal relation between biochemical transformations and gene expression regulation has been experimentally demonstrated in cancers and metabolic syndromes. In this study, we explored the metabolism-histone modifications crosstalk by topological analysis and constraint-based modeling approaches in the budding yeast. We constructed nine models through the integration of gene expression data of four mutated histone tails into a genome-scale metabolic model of yeast. Accordingly, we defined the centrality indices of the lowly expressed enzymes in the undirected enzyme-centric network of yeast by CytoHubba plug-in in Cytoscape. To determine the global effects of histone modifications on the yeast metabolism, the growth rate and the range of possible flux values of reactions, we used constraint-based modeling approach. Centrality analysis shows that the lowly expressed enzymes could affect and control the yeast metabolic network. Besides, constraint-based modeling results are in a good agreement with the experimental findings, confirming that the mutations in histone tails lead to non-lethal alterations in the yeast, but have diverse effects on the growth rate and reveal the functional redundancy.
Affiliation(s)
- Ali Salehzadeh-Yazdi
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Yazdan Asgari
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Akbar Saboury
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
  - * E-mail:
11
Najafi A, Bidkhori G, Bozorgmehr JH, Koch I, Masoudi-Nejad A. Genome scale modeling in systems biology: algorithms and resources. Curr Genomics 2014; 15:130-59. [PMID: 24822031 PMCID: PMC4009841 DOI: 10.2174/1389202915666140319002221] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 08/19/2013] [Revised: 02/16/2014] [Accepted: 03/17/2014] [Indexed: 12/18/2022]
Abstract
In recent years, in silico studies and trial simulations have complemented experimental procedures. A model is a description of a system, and a system is any collection of interrelated objects; an object, moreover, is some elemental unit upon which observations can be made but whose internal structure either does not exist or is ignored. Therefore, any network analysis approach is critical for successful quantitative modeling of biological systems. This review highlights some of most popular and important modeling algorithms, tools, and emerging standards for representing, simulating and analyzing cellular networks in five sections. Also, we try to show these concepts by means of simple example and proper images and graphs. Overall, systems biology aims for a holistic description and understanding of biological processes by an integration of analytical experimental approaches along with synthetic computational models. In fact, biological networks have been developed as a platform for integrating information from high to low-throughput experiments for the analysis of biological systems. We provide an overview of all processes used in modeling and simulating biological networks in such a way that they can become easily understandable for researchers with both biological and mathematical backgrounds. Consequently, given the complexity of generated experimental data and cellular networks, it is no surprise that researchers have turned to computer simulation and the development of more theory-based approaches to augment and assist in the development of a fully quantitative understanding of cellular dynamics.
Affiliation(s)
- Ali Najafi
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
- Gholamreza Bidkhori
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
- Joseph H. Bozorgmehr
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
- Ina Koch
  - Molecular Bioinformatics, Johann Wolfgang Goethe-University Frankfurt am Main, Germany
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
12
Liberal R, Pinney JW. Simple topological properties predict functional misannotations in a metabolic network. Bioinformatics 2013; 29:i154-61. [PMID: 23812979 PMCID: PMC3694667 DOI: 10.1093/bioinformatics/btt236] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022]
Abstract
Motivation: Misannotation in sequence databases is an important obstacle for automated tools for gene function annotation, which rely extensively on comparison with sequences with known function. To improve current annotations and prevent future propagation of errors, sequence-independent tools are, therefore, needed to assist in the identification of misannotated gene products. In the case of enzymatic functions, each functional assignment implies the existence of a reaction within the organism’s metabolic network; a first approximation to a genome-scale metabolic model can be obtained directly from an automated genome annotation. Any obvious problems in the network, such as dead end or disconnected reactions, can, therefore, be strong indications of misannotation.

Results: We demonstrate that a machine-learning approach using only network topological features can successfully predict the validity of enzyme annotations. The predictions are tested at three different levels. A random forest using topological features of the metabolic network and trained on curated sets of correct and incorrect enzyme assignments was found to have an accuracy of up to 86% in 5-fold cross-validation experiments. Further cross-validation against unseen enzyme superfamilies indicates that this classifier can successfully extrapolate beyond the classes of enzyme present in the training data. The random forest model was applied to several automated genome annotations, achieving an accuracy of in most cases when validated against recent genome-scale metabolic models. We also observe that when applied to draft metabolic networks for multiple species, a clear negative correlation is observed between predicted annotation quality and phylogenetic distance to the major model organism for biochemistry (Escherichia coli for prokaryotes and Homo sapiens for eukaryotes).

Contact: j.pinney@imperial.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rodrigo Liberal
  - Department of Life Sciences and Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London SW7 2AZ, UK
13
Poolman MG, Kundu S, Shaw R, Fell DA. Responses to light intensity in a genome-scale model of rice metabolism. Plant Physiology 2013; 162:1060-72. [PMID: 23640755 PMCID: PMC3668040 DOI: 10.1104/pp.113.216762] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 02/22/2013] [Accepted: 04/30/2013] [Indexed: 05/08/2023]
Abstract
We describe the construction and analysis of a genome-scale metabolic model representing a developing leaf cell of rice (Oryza sativa) primarily derived from the annotations in the RiceCyc database. We used flux balance analysis to determine that the model represents a network capable of producing biomass precursors (amino acids, nucleotides, lipid, starch, cellulose, and lignin) in experimentally reported proportions, using carbon dioxide as the sole carbon source. We then repeated the analysis over a range of photon flux values to examine responses in the solutions. The resulting flux distributions show that (1) redox shuttles between the chloroplast, cytosol, and mitochondrion may play a significant role at low light levels, (2) photorespiration can act to dissipate excess energy at high light levels, and (3) the role of mitochondrial metabolism is likely to vary considerably according to the balance between energy demand and availability. It is notable that these organelle interactions, consistent with many experimental observations, arise solely as a result of the need for mass and energy balancing without any explicit assumptions concerning kinetic or other regulatory mechanisms.
Affiliation(s)
- Mark G Poolman
  - Department of Biology and Medical Science, Oxford Brookes University, Headington, Oxford OX3 0BP, United Kingdom
14
Evaluating Sphingolipid Biochemistry in the Consensus Reconstruction of Yeast Metabolism. Ind Biotechnol (New Rochelle N Y) 2012. [DOI: 10.1089/ind.2012.0002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
15
Alcántara R, Axelsen KB, Morgat A, Belda E, Coudert E, Bridge A, Cao H, de Matos P, Ennis M, Turner S, Owen G, Bougueleret L, Xenarios I, Steinbeck C. Rhea--a manually curated resource of biochemical reactions. Nucleic Acids Res 2011; 40:D754-60. [PMID: 22135291 PMCID: PMC3245052 DOI: 10.1093/nar/gkr1126] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022]
Abstract
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive resource of expert-curated biochemical reactions. Rhea provides a non-redundant set of chemical transformations for use in a broad spectrum of applications, including metabolic network reconstruction and pathway inference. Rhea includes enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list), transport reactions and spontaneously occurring reactions. Rhea reactions are described using chemical species from the Chemical Entities of Biological Interest ontology (ChEBI) and are stoichiometrically balanced for mass and charge. They are extensively manually curated with links to source literature and other public resources on metabolism including enzyme and pathway databases. This cross-referencing facilitates the mapping and reconciliation of common reactions and compounds between distinct resources, which is a common first step in the reconstruction of genome scale metabolic networks and models.
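The abstract states that Rhea reactions are stoichiometrically balanced for mass and charge. A minimal sketch of such a check is given below. It is an illustration only, not Rhea's own validation code: the function names and the data layout (coefficient, element-count dictionary, charge) are assumptions made for this example.

```python
from collections import Counter


def side_totals(side):
    """Sum element counts and net charge over one side of a reaction.

    `side` is a list of (coefficient, formula, charge) tuples, where
    `formula` maps element symbols to atom counts (hypothetical layout).
    """
    atoms = Counter()
    charge = 0
    for coeff, formula, q in side:
        for element, n in formula.items():
            atoms[element] += coeff * n
        charge += coeff * q
    return atoms, charge


def is_balanced(substrates, products):
    """Return True when both sides carry the same atoms and the same charge."""
    left_atoms, left_charge = side_totals(substrates)
    right_atoms, right_charge = side_totals(products)
    return left_atoms == right_atoms and left_charge == right_charge


# CO2 + H2O -> HCO3(-) + H(+)
co2 = (1, {"C": 1, "O": 2}, 0)
h2o = (1, {"H": 2, "O": 1}, 0)
bicarbonate = (1, {"H": 1, "C": 1, "O": 3}, -1)
proton = (1, {"H": 1}, 1)

balanced = is_balanced([co2, h2o], [bicarbonate, proton])
unbalanced = is_balanced([co2, h2o], [bicarbonate])  # proton omitted
```

In this toy case the first reaction passes the check and the second fails it, because dropping the proton leaves both the hydrogen count and the charge unequal.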
Affiliation(s)
- Rafael Alcántara
- Chemoinformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.
|
16
|
Abstract
Based on our experience in kinetic modeling of coupled multiple metabolic pathways, we propose a generic rate equation for the dynamical modeling of metabolic kinetics. It is symmetric for forward and backward reactions. Its Michaelis-Menten-King-Altman form makes the kinetic parameters (or functions) easy to relate to experimental values in the database and to use in computation. In addition, such a uniform form readily extends to an arbitrary number of substrates and products with different stoichiometry. We explicitly show how to obtain such rate equations rigorously for three well-known binding mechanisms. Hence, the proposed rate equation is formally exact under the quasi-steady state condition. Various features of this generic rate equation are discussed. In particular, for irreversible reactions, the product inhibition which directly arises from enzymatic reaction is eliminated in a natural way. We also discuss how to include the effects of modifiers and cooperativity.
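To make the idea of a rate law that is symmetric in the forward and backward directions concrete, the sketch below implements the textbook reversible Michaelis-Menten equation for a single substrate and a single product. This is an assumption chosen for illustration: it is not the generic multi-substrate equation proposed in the paper, whose exact form is not given in the abstract, and the function and parameter names are hypothetical.

```python
def reversible_mm_rate(s, p, vf, vr, ks, kp):
    """Net rate of S <-> P under textbook reversible Michaelis-Menten kinetics.

    s, p   : substrate and product concentrations
    vf, vr : maximal forward and backward rates
    ks, kp : Michaelis constants for substrate and product
    """
    numerator = vf * s / ks - vr * p / kp
    denominator = 1.0 + s / ks + p / kp
    return numerator / denominator


# With no product present the expression reduces to vf * s / (ks + s).
forward_only = reversible_mm_rate(s=2.0, p=0.0, vf=10.0, vr=4.0, ks=2.0, kp=1.0)

# Swapping the roles of substrate and product flips the sign of the net rate.
net = reversible_mm_rate(s=2.0, p=3.0, vf=10.0, vr=4.0, ks=2.0, kp=1.0)
mirrored = reversible_mm_rate(s=3.0, p=2.0, vf=4.0, vr=10.0, ks=1.0, kp=2.0)
```

The sign flip under exchange of substrate and product is the symmetry property in its simplest setting.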
Affiliation(s)
- L. W. LEE
- Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, USA
| | - L. YIN
- School of Physics, Peking University, 100871 Beijing, P. R. China
| | - X. M. ZHU
- GeneMath, 5525 27th Avenue N.E., Seattle, WA 98105, USA
| | - P. AO
- Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, USA
- Department of Physics, University of Washington, Seattle, WA 98195, USA
|
17
|
|
18
|
Kaleta C, de Figueiredo LF, Heiland I, Klamt S, Schuster S. Special issue: integration of OMICs datasets into metabolic pathway analysis. Biosystems 2011; 105:107-8. [PMID: 21619911 DOI: 10.1016/j.biosystems.2011.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Affiliation(s)
- Christoph Kaleta
- Department of Bioinformatics, School of Biology and Pharmaceutics, Friedrich Schiller University Jena, Germany.
|
19
|
Zhou W, Nakhleh L. The strength of chemical linkage as a criterion for pruning metabolic graphs. Bioinformatics 2011; 27:1957-63. [PMID: 21551141 DOI: 10.1093/bioinformatics/btr271] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open
Abstract
MOTIVATION: A metabolic graph represents the connectivity patterns of a metabolic system, and provides a powerful framework within which the organization of metabolic reactions can be analyzed and elucidated. A common practice is to prune (i.e. remove nodes and edges) the metabolic graph prior to any analysis in order to eliminate confounding signals from the representation. Currently, this pruning process is carried out in an ad hoc fashion, resulting in discrepancies and ambiguities across studies. RESULTS: We propose a biochemically informative criterion, the strength of chemical linkage (SCL), for a systematic pruning of metabolic graphs. By analyzing the metabolic graph of Escherichia coli, we show that thresholding SCL is powerful in selecting the conventional pathways' connectivity out of the raw network connectivity when the network is restricted to the reactions collected from these pathways. Further, we argue that the root of ambiguity in pruning metabolic graphs is in the continuity of the amount of chemical content that can be conserved in reaction transformation patterns. Finally, we demonstrate how biochemical pathways can be inferred efficiently if the search procedure is guided by SCL.
Affiliation(s)
- Wanding Zhou
- Department of Bioengineering, Rice University, Houston, TX, USA.
|
20
|
Teusink B, Westerhoff HV, Bruggeman FJ. Comparative systems biology: from bacteria to man. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2011; 2:518-532. [PMID: 20836045 DOI: 10.1002/wsbm.74] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Comparative analyses, as carried out by comparative genomics and bioinformatics, have proven extremely powerful to obtain insight into the identity of specific genes that underlie differences and similarities across species. The central concept developed in this chapter is that important aspects of the functional differences between organisms derive not only from the differences in genetic components (which underlies comparative genomics) but also from dynamic, molecular (physical) interactions. Approaches that aim at identifying such network-based rather than component-based homologies between species we shall call Comparative Systems Biology. It will be illustrated by a number of examples from metabolic networks from prokaryotes, via yeast, to man. The potential for species comparisons, at the genome-scale using classical approaches and at the more detailed level of dynamic molecular networks will be illustrated. In our opinion, comparative systems biology, as a marriage between bioinformatics and systems biology, will offer new insights into the nature of organisms for the benefit of medicine, biotechnology, and drug design. As dynamic modeling is becoming more mainstream in cell biology, the potential of comparative systems biology will become more evident.
Affiliation(s)
- Bas Teusink
- Systems BioInformatics, Center for Integrative Bioinformatics VU (IBIVU), VU University Amsterdam, The Netherlands; Netherlands Institute Systems Biology (NISB), The Netherlands; Kluyver Center for Genomics of Industrial Fermentation, The Netherlands
| | - Hans V Westerhoff
- Netherlands Institute Systems Biology (NISB), The Netherlands; Molecular Cell Physiology, VU University Amsterdam, The Netherlands; Manchester Centre for Integrative Systems Biology, University of Manchester, UK
| | - Frank J Bruggeman
- Systems BioInformatics, Center for Integrative Bioinformatics VU (IBIVU), VU University Amsterdam, The Netherlands; Regulatory Networks Group, NISB, The Netherlands; Life Sciences, Centre for Mathematics and Computer Science (CWI) Amsterdam, The Netherlands
|
21
|
Gevorgyan A, Bushell ME, Avignone-Rossa C, Kierzek AM. SurreyFBA: a command line tool and graphics user interface for constraint-based modeling of genome-scale metabolic reaction networks. Bioinformatics 2010; 27:433-4. [PMID: 21148545 DOI: 10.1093/bioinformatics/btq679] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
Constraint-based modeling of genome-scale metabolic networks has been successfully used in numerous applications such as prediction of gene essentiality and metabolic engineering. We present SurreyFBA, which provides constraint-based simulations and network map visualization in a free, stand-alone software. In addition to basic simulation protocols, the tool also implements the analysis of minimal substrate and product sets, which is useful for metabolic engineering and prediction of nutritional requirements in complex in vivo environments, but not available in other commonly used programs. The SurreyFBA is based on a command line interface to the GLPK solver distributed as binary and source code for the three major operating systems. The command line tool, implemented in C++, is easily executed within scripting languages used in the bioinformatics community and provides efficient implementation of tasks requiring iterative calls to the linear programming solver. SurreyFBA includes JyMet, a graphics user interface allowing spreadsheet-based model presentation, visualization of numerical results on metabolic networks represented in the Petri net convention, as well as in charts and plots. AVAILABILITY: SurreyFBA is distributed under GNU GPL license and available from http://sysbio3.fhms.surrey.ac.uk/SurreyFBA.zip.
Affiliation(s)
- Albert Gevorgyan
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey GU2 7XH, UK
|
22
|
Abstract
Reconstructing a model of the metabolic network of an organism from its annotated genome sequence would seem, at first sight, to be one of the most straightforward tasks in functional genomics, even if the various data sources required were never designed with this application in mind. The number of genome-scale metabolic models is, however, lagging far behind the number of sequenced genomes and is likely to continue to do so unless the model-building process can be accelerated. Two aspects that could usefully be improved are the ability to find the sources of error in a nascent model rapidly, and the generation of tenable hypotheses concerning solutions that would improve a model. We will illustrate these issues with approaches we have developed in the course of building metabolic models of Streptococcus agalactiae and Arabidopsis thaliana.
|
23
|
Radrich K, Tsuruoka Y, Dobson P, Gevorgyan A, Swainston N, Baart G, Schwartz JM. Integration of metabolic databases for the reconstruction of genome-scale metabolic networks. BMC SYSTEMS BIOLOGY 2010; 4:114. [PMID: 20712863 PMCID: PMC2930596 DOI: 10.1186/1752-0509-4-114] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/27/2010] [Accepted: 08/16/2010] [Indexed: 01/13/2023]
Abstract
BACKGROUND: Genome-scale metabolic reconstructions have been recognised as a valuable tool for a variety of applications ranging from metabolic engineering to evolutionary studies. However, the reconstruction of such networks remains an arduous process requiring a high level of human intervention. This process is further complicated by occurrences of missing or conflicting information and the absence of common annotation standards between different data sources. RESULTS: In this article, we report a semi-automated methodology aimed at streamlining the process of metabolic network reconstruction by enabling the integration of different genome-wide databases of metabolic reactions. We present results obtained by applying this methodology to the metabolic network of the plant Arabidopsis thaliana. A systematic comparison of compounds and reactions between two genome-wide databases allowed us to obtain a high-quality core consensus reconstruction, which was validated for stoichiometric consistency. A lower level of consensus led to a larger reconstruction, which has a lower quality standard but provides a baseline for further manual curation. CONCLUSION: This semi-automated methodology may be applied to other organisms and help to streamline the process of genome-scale network reconstruction in order to accelerate the transfer of such models to applications.
Affiliation(s)
- Karin Radrich
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
|
24
|
Ruppin E, Papin JA, de Figueiredo LF, Schuster S. Metabolic reconstruction, constraint-based analysis and game theory to probe genome-scale metabolic networks. Curr Opin Biotechnol 2010; 21:502-10. [DOI: 10.1016/j.copbio.2010.07.002] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2010] [Accepted: 07/05/2010] [Indexed: 11/27/2022]
|
25
|
Smith GR, Shanley DP. Modelling the response of FOXO transcription factors to multiple post-translational modifications made by ageing-related signalling pathways. PLoS One 2010; 5:e11092. [PMID: 20567500 PMCID: PMC2886341 DOI: 10.1371/journal.pone.0011092] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2010] [Accepted: 05/01/2010] [Indexed: 01/10/2023] Open
Abstract
FOXO transcription factors are an important, conserved family of regulators of cellular processes including metabolism, cell-cycle progression, apoptosis and stress resistance. They are required for the efficacy of several of the genetic interventions that modulate lifespan. FOXO activity is regulated by multiple post-translational modifications (PTMs) that affect its subcellular localization, half-life, DNA binding and transcriptional activity. Here, we show how a mathematical modelling approach can be used to simulate the effects, singly and in combination, of these PTMs. Our model is implemented using the Systems Biology Markup Language (SBML), generated by an ancillary program and simulated in a stochastic framework. The use of the ancillary program to generate the SBML is necessary because the possibility that many regulatory PTMs may be added, each independently of the others, means that a large number of chemically distinct forms of the FOXO molecule must be taken into account, and the program is used to generate them. Although the model does not yet include detailed representations of events upstream and downstream of FOXO, we show how it can qualitatively, and in some cases quantitatively, reproduce the known effects of certain treatments that induce various single and multiple PTMs, and allows for a complex spatiotemporal interplay of effects due to the activation of multiple PTM-inducing treatments. Thus, it provides an important framework to integrate current knowledge about the behaviour of FOXO. The approach should be generally applicable to other proteins experiencing multiple regulations.
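The abstract notes that independently added PTMs produce a large number of chemically distinct FOXO forms, which is why the authors generate them programmatically. The sketch below shows only the combinatorial step, under stated assumptions: the site names and states are hypothetical placeholders, and the real ancillary program additionally emits SBML, which is not reproduced here.

```python
from itertools import product


def enumerate_forms(sites):
    """List every combination of modification states across independent sites.

    `sites` maps a site name to the list of states that site can take.
    Each returned form is a dict mapping site name to one chosen state.
    """
    names = list(sites)
    state_lists = [sites[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*state_lists)]


# Hypothetical example: three sites with two states each.
example_sites = {
    "site_A": ["unmodified", "phosphorylated"],
    "site_B": ["unmodified", "acetylated"],
    "site_C": ["unmodified", "ubiquitinated"],
}

forms = enumerate_forms(example_sites)
number_of_forms = len(forms)
```

Because the states multiply across sites, n independent two-state sites give 2^n forms, which illustrates why writing the species list by hand quickly becomes impractical.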
Affiliation(s)
- Graham R. Smith
- Henry Wellcome Laboratory for Biogerontology, Institute for Ageing and Health, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Daryl P. Shanley
- Henry Wellcome Laboratory for Biogerontology, Institute for Ageing and Health, Newcastle University, Newcastle upon Tyne, United Kingdom
|
26
|
Schellenberger J, Park JO, Conrad TM, Palsson BØ. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics 2010; 11:213. [PMID: 20426874 PMCID: PMC2874806 DOI: 10.1186/1471-2105-11-213] [Citation(s) in RCA: 356] [Impact Index Per Article: 25.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2009] [Accepted: 04/29/2010] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: Genome-scale metabolic reconstructions under the Constraint Based Reconstruction and Analysis (COBRA) framework are valuable tools for analyzing the metabolic capabilities of organisms and interpreting experimental data. As the number of such reconstructions and analysis methods increases, there is a greater need for data uniformity and ease of distribution and use. DESCRIPTION: We describe BiGG, a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG integrates several published genome-scale metabolic networks into one resource with standard nomenclature which allows components to be compared across different organisms. BiGG can be used to browse model content, visualize metabolic pathway maps, and export SBML files of the models for further analysis by external software packages. Users may follow links from BiGG to several external databases to obtain additional information on genes, proteins, reactions, metabolites and citations of interest. CONCLUSIONS: BiGG addresses a need in the systems biology community to have access to high quality curated metabolic models and reconstructions. It is freely available for academic use at http://bigg.ucsd.edu.
Affiliation(s)
- Jan Schellenberger
- Bioinformatics Program, University of California San Diego, La Jolla, California, 92093-0419, USA
|
27
|
Bourguignon PY, Samal A, Képès F, Jost J, Martin OC. Challenges in experimental data integration within genome-scale metabolic models. Algorithms Mol Biol 2010; 5:20. [PMID: 20412574 PMCID: PMC2865480 DOI: 10.1186/1748-7188-5-20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2010] [Accepted: 04/22/2010] [Indexed: 11/10/2022] Open
Abstract
A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11 2009, organized by the CNRS-MPG joint program in Systems Biology.
|
28
|
Poolman MG, Miguet L, Sweetlove LJ, Fell DA. A genome-scale metabolic model of Arabidopsis and some of its properties. PLANT PHYSIOLOGY 2009; 151:1570-81. [PMID: 19755544 PMCID: PMC2773075 DOI: 10.1104/pp.109.141267] [Citation(s) in RCA: 138] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/11/2009] [Accepted: 09/11/2009] [Indexed: 05/17/2023]
Abstract
We describe the construction and analysis of a genome-scale metabolic model of Arabidopsis (Arabidopsis thaliana) primarily derived from the annotations in the Aracyc database. We used techniques based on linear programming to demonstrate the following: (1) that the model is capable of producing biomass components (amino acids, nucleotides, lipid, starch, and cellulose) in the proportions observed experimentally in a heterotrophic suspension culture; (2) that approximately only 15% of the available reactions are needed for this purpose and that the size of this network is comparable to estimates of minimal network size for other organisms; (3) that reactions may be grouped according to the changes in flux resulting from a hypothetical stimulus (in this case demand for ATP) and that this allows the identification of potential metabolic modules; and (4) that total ATP demand for growth and maintenance can be inferred and that this is consistent with previous estimates in prokaryotes and yeast.
Affiliation(s)
- Mark G Poolman
- School of Life Science, Oxford Brookes University, Headington, Oxford OX3 0BP, United Kingdom.
|
29
|
Figueiredo LFD, Schuster S, Kaleta C, Fell DA. Response to comment on 'Can sugars be produced from fatty acids? A test case for pathway analysis tools'. Bioinformatics 2009; 25:3330-1. [DOI: 10.1093/bioinformatics/btp591] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
|
30
|
Christian N, May P, Kempa S, Handorf T, Ebenhöh O. An integrative approach towards completing genome-scale metabolic networks. MOLECULAR BIOSYSTEMS 2009; 5:1889-903. [PMID: 19763335 DOI: 10.1039/b915913b] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Genome-scale metabolic networks which have been automatically derived through sequence comparison techniques are necessarily incomplete. We propose a strategy that incorporates genomic sequence data and metabolite profiles into modeling approaches to arrive at improved gene annotations and more complete genome-scale metabolic networks. The core of our strategy is an algorithm that computes minimal sets of reactions by which a draft network has to be extended in order to be consistent with experimental observations. A particular strength of our approach is that alternative possibilities are suggested and thus experimentally testable hypotheses are produced. We carefully evaluate our strategy on the well-studied metabolic network of Escherichia coli, demonstrating how the predictions can be improved by incorporating sequence data. Subsequently, we apply our method to the recently sequenced green alga Chlamydomonas reinhardtii. We suggest specific genes in the genome of Chlamydomonas which are the strongest candidates for coding the responsible enzymes.
Affiliation(s)
- Nils Christian
- Max-Planck-Institute for Molecular Plant Physiology, Potsdam-Golm, Germany
|
31
|
Seo S, Lewin HA. Reconstruction of metabolic pathways for the cattle genome. BMC SYSTEMS BIOLOGY 2009; 3:33. [PMID: 19284618 PMCID: PMC2669051 DOI: 10.1186/1752-0509-3-33] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/29/2007] [Accepted: 03/12/2009] [Indexed: 01/21/2023]
Abstract
Background: Metabolic reconstruction of microbial, plant and animal genomes is a necessary step toward understanding the evolutionary origins of metabolism and species-specific adaptive traits. The aims of this study were to reconstruct conserved metabolic pathways in the cattle genome and to identify metabolic pathways with missing genes and proteins. The MetaCyc database and PathwayTools software suite were chosen for this work because they are widely used and easy to implement. Results: An amalgamated cattle genome database was created using the NCBI and Ensembl cattle genome databases (based on build 3.1) as data sources. PathwayTools was used to create a cattle-specific pathway genome database, which was followed by comprehensive manual curation for the reconstruction of metabolic pathways. The curated database, CattleCyc 1.0, consists of 217 metabolic pathways. A total of 64 mammalian-specific metabolic pathways were modified from the reference pathways in MetaCyc, and two pathways previously identified but missing from MetaCyc were added. Comparative analysis of metabolic pathways revealed the absence of mammalian genes for 22 metabolic enzymes whose activity was reported in the literature. We also identified six human metabolic protein-coding genes for which the cattle ortholog is missing from the sequence assembly. Conclusion: CattleCyc is a powerful tool for understanding the biology of ruminants and other cetartiodactyl species. In addition, the approach used to develop CattleCyc provides a framework for the metabolic reconstruction of other newly sequenced mammalian genomes. It is clear that metabolic pathway analysis strongly reflects the quality of the underlying genome annotations. Thus, having well-annotated genomes from many mammalian species hosted in BioCyc will facilitate the comparative analysis of metabolic pathways among different species and a systems approach to comparative physiology.
Affiliation(s)
- Seongwon Seo
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL 61801, USA.
|
32
|
Faust K, Croes D, van Helden J. Metabolic pathfinding using RPAIR annotation. J Mol Biol 2009; 388:390-414. [PMID: 19281817 DOI: 10.1016/j.jmb.2009.03.006] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2008] [Revised: 02/25/2009] [Accepted: 03/03/2009] [Indexed: 12/21/2022]
Abstract
Metabolic databases contain information about thousands of small molecules and reactions, which can be represented as networks. In the context of metabolic reconstruction, pathways can be inferred by searching optimal paths in such networks. A recurrent problem is the presence of pool metabolites (e.g., water, energy carriers, and cofactors), which are connected to hundreds of reactions, thus establishing irrelevant shortcuts between nodes of the network. One solution to this problem relies on weighted networks to penalize highly connected compounds. A more refined solution takes the chemical structure of reactants into account in order to differentiate between side and main compounds of a reaction. Thanks to an intensive annotation effort at KEGG, decompositions of reactions into reactant pairs (RPAIR) categorized by their role (main, trans, cofac, ligase, and leave) are now available. The goal of this article is to evaluate the impact of RPAIR data on pathfinding in metabolic networks. To this end, we measure the impact of different parameters concerning the construction of the metabolic network: mapping of reactions and reactant pairs onto a graph, use of selected categories of reactant pairs, weighting schemes for compounds and reactions, removal of highly connected metabolites, and reaction directionality. In total, we tested 104 combinations of parameters and identified their optimal values for pathfinding on the basis of 55 reference pathways from three organisms. The best-performing metabolic network combines the biochemical knowledge encoded by KEGG RPAIR with a weighting scheme penalizing highly connected compounds. With this network, we could recover reference pathways from Escherichia coli with an average accuracy of 93% (32 pathways), from Saccharomyces cerevisiae with an average accuracy of 66% (11 pathways), and from humans with an average accuracy of 70% (12 pathways). Our pathfinding approach is available as part of the Network Analysis Tools.
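The weighting idea described in the abstract, penalizing highly connected compounds so that pool metabolites stop acting as shortcuts, can be sketched as a lightest-path search in which each compound costs its own degree. This is a simplified illustration and not the Network Analysis Tools implementation: the toy graph, the choice of plain degree as the weight, and the function name are assumptions, and RPAIR categories are not modelled.

```python
import heapq


def lightest_path(graph, source, target):
    """Dijkstra search where visiting a compound costs its degree.

    `graph` maps each compound to the list of compounds it is linked to.
    Returns the path with the smallest summed node weight, or None.
    """
    weight = {node: len(neighbours) for node, neighbours in graph.items()}
    best = {source: weight[source]}
    previous = {}
    queue = [(weight[source], source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt in graph[node]:
            new_cost = cost + weight[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                previous[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Toy network: "ATP" is a hub offering a two-step shortcut from A to B.
toy = {
    "A": ["ATP", "X"], "B": ["ATP", "Y"],
    "X": ["A", "Y"], "Y": ["X", "B"],
    "ATP": ["A", "B", "C", "D", "E", "F"],
    "C": ["ATP"], "D": ["ATP"], "E": ["ATP"], "F": ["ATP"],
}

route = lightest_path(toy, "A", "B")
```

In this toy network an unweighted search would take the two-step route through the hub, whereas the degree-weighted search prefers the longer route through X and Y because the hub's weight of 6 outweighs the extra step.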
Affiliation(s)
- Karoline Faust
- Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Campus Plaine, CP 263, Bld du Triomphe, B-1050 Bruxelles, Belgium.
|
33
|
Kotera M, McDonald AG, Boyce S, Tipton KF. Eliciting possible reaction equations and metabolic pathways involving orphan metabolites. J Chem Inf Model 2009; 48:2335-49. [PMID: 19053521 DOI: 10.1021/ci800213g] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
The development of metabolomics has resulted in the discovery of an increasing number of orphan metabolites, which are defined as compounds that are known to be present in living organisms but whose synthetic/degradation pathways are unknown. In this paper, we describe a procedure for identifying possible products and/or precursors of such orphan metabolites and for suggesting complete reaction equations and the corresponding EC (Enzyme Commission) number simultaneously. Chemical structure comparison is performed for a pair of compounds consisting of a reported substrate and its corresponding product and also for pairs of randomly selected compounds. Possible combinations of compounds registered in the KEGG database were used for generating putative enzyme reaction equations, which resulted in 77% of the reported equations being generated, as most of the remainder represent classes of compounds, rather than specific compounds, or contain Markush structures. The quality was checked using chemical structure comparison and the random-tree method, which gave 98% accuracy in suggesting EC subsubclasses for reported equations in cross-validation tests. The equations generated in this study can be seen using the Web-based program GREP (Generator of Reaction Equations & Pathways; http://bisscat.org/GREP/ ). The usefulness of our method for constructing possible metabolic pathways was demonstrated by mapping the generated equations for several groups of compounds, such as the betalain alkaloids. The possible development of our method so that alternative substrates for reported enzymes can be found and for annotating enzyme functions in genomic research is also discussed.
Affiliation(s)
- Masaaki Kotera
- School of Biochemistry and Immunology, Trinity College, Dublin 2, Ireland.
|
34
|
de Figueiredo LF, Schuster S, Kaleta C, Fell DA. Can sugars be produced from fatty acids? A test case for pathway analysis tools. Bioinformatics 2009; 25:152-8. [PMID: 19117076 DOI: 10.1093/bioinformatics/btn621] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: In recent years, several methods have been proposed for determining metabolic pathways in an automated way based on network topology. The aim of this work is to analyse these methods by tackling a concrete example relevant in biochemistry. It concerns the question whether even-chain fatty acids, being the most important constituents of lipids, can be converted into sugars at steady state. It was proved five decades ago that this conversion using the Krebs cycle is impossible unless the enzymes of the glyoxylate shunt (or alternative bypasses) are present in the system. Using this example, we can compare the various methods in pathway analysis. RESULTS: Elementary modes analysis (EMA) of a set of enzymes corresponding to the Krebs cycle, glycolysis and gluconeogenesis supports the scientific evidence showing that there is no pathway capable of converting acetyl-CoA to glucose at steady state. This conversion is possible after the addition of isocitrate lyase and malate synthase (forming the glyoxylate shunt) to the system. Dealing with the same example, we compare EMA with two tools based on graph theory available online, PathFinding and Pathway Hunter Tool. These automated network generating tools do not succeed in predicting the conversions known from experiment. They sometimes generate unbalanced paths and reveal problems identifying side metabolites that are not responsible for the carbon net flux. This shows that, for metabolic pathway analysis, it is important to consider the topology (including bimolecular reactions) and stoichiometry of metabolic systems, as is done in EMA.
Affiliation(s)
- Luis F de Figueiredo
- Department of Bioinformatics, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, 07743 Jena, Germany.
|
35
|
A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Nat Biotechnol 2008; 26:1155-60. [PMID: 18846089 DOI: 10.1038/nbt1492] [Citation(s) in RCA: 405] [Impact Index Per Article: 25.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
Genomic data allow the large-scale manual or semi-automated assembly of metabolic network reconstructions, which provide highly curated organism-specific knowledge bases. Although several genome-scale network reconstructions describe Saccharomyces cerevisiae metabolism, they differ in scope and content, and use different terminologies to describe the same chemical entities. This makes comparisons between them difficult and underscores the desirability of a consolidated metabolic network that collects and formalizes the 'community knowledge' of yeast metabolism. We describe how we have produced a consensus metabolic network reconstruction for S. cerevisiae. In drafting it, we placed special emphasis on referencing molecules to persistent databases or using database-independent forms, such as SMILES or InChI strings, as this permits their chemical structure to be represented unambiguously and in a manner that permits automated reasoning. The reconstruction is readily available via a publicly accessible database and in the Systems Biology Markup Language (http://www.comp-sys-bio.org/yeastnet). It can be maintained as a resource that serves as a common denominator for studying the systems biology of yeast. Similar strategies should benefit communities studying genome-scale metabolic networks of other organisms.
|
36
|
Nishikawa T, Gulbahce N, Motter AE. Spontaneous reaction silencing in metabolic optimization. PLoS Comput Biol 2008; 4:e1000236. [PMID: 19057639 PMCID: PMC2582435 DOI: 10.1371/journal.pcbi.1000236] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Received: 06/30/2008] [Accepted: 10/20/2008] [Indexed: 11/18/2022] Open
Abstract
Metabolic reactions of single-cell organisms are routinely observed to become dispensable or even incapable of carrying activity under certain circumstances. Yet, the mechanisms as well as the range of conditions and phenotypes associated with this behavior remain very poorly understood. Here we predict computationally and analytically that any organism evolving to maximize growth rate, ATP production, or any other linear function of metabolic fluxes tends to significantly reduce the number of active metabolic reactions compared to typical nonoptimal states. The reduced number appears to be constant across the microbial species studied and just slightly larger than the minimum number required for the organism to grow at all. We show that this massive spontaneous reaction silencing is triggered by the irreversibility of a large fraction of the metabolic reactions and propagates through the network as a cascade of inactivity. Our results help explain existing experimental data on intracellular flux measurements and the usage of latent pathways, shedding new light on microbial evolution, robustness, and versatility for the execution of specific biochemical tasks. In particular, the identification of optimal reaction activity provides rigorous ground for an intriguing knockout-based method recently proposed for the synthetic recovery of metabolic function.
Cellular growth and other integrated metabolic functions are manifestations of the coordinated interconversion of a large number of chemical compounds. But what is the relation between such whole-cell behaviors and the activity pattern of the individual biochemical reactions? In this study, we have used flux balance-based methods and reconstructed networks of Helicobacter pylori, Staphylococcus aureus, Escherichia coli, and Saccharomyces cerevisiae to show that a cell seeking to optimize a metabolic objective, such as growth, has a tendency to spontaneously inactivate a significant number of its metabolic reactions, while all such reactions are recruited for use in typical suboptimal states. The mechanisms governing this behavior not only provide insights into why numerous genes can often be disabled without affecting optimal growth but also lay a foundation for the recently proposed synthetic rescue of metabolic function in which the performance of suboptimally operating cells can be enhanced by disabling specific metabolic reactions. Our findings also offer explanation for another experimentally observed behavior, in which some inactive reactions are temporarily activated following a genetic or environmental perturbation. The latter is of utmost importance given that nonoptimal and transient metabolic behaviors are arguably common in natural environments.
Affiliation(s)
- Takashi Nishikawa
- Division of Mathematics and Computer Science, Clarkson University, Potsdam, New York, United States of America
- Department of Physics and Astronomy and Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois, United States of America
- Natali Gulbahce
- Department of Physics and Center for Complex Network Research, Northeastern University, Boston, Massachusetts, United States of America
- Center for Cancer Systems Biology, Dana Farber Cancer Institute, Boston, Massachusetts, United States of America
- Adilson E. Motter
- Department of Physics and Astronomy and Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois, United States of America
|
37
|
Metabolic networks are NP-hard to reconstruct. J Theor Biol 2008; 254:807-16. [DOI: 10.1016/j.jtbi.2008.07.015] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 11/26/2007] [Revised: 07/14/2008] [Accepted: 07/14/2008] [Indexed: 11/22/2022]
|
38
|
de Figueiredo LF, Schuster S, Kaleta C, Fell DA. Can sugars be produced from fatty acids? A test case for pathway analysis tools. Bioinformatics 2008; 24:2615-21. [DOI: 10.1093/bioinformatics/btn500] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
|
39
|
Gevorgyan A, Poolman MG, Fell DA. Detection of stoichiometric inconsistencies in biomolecular models. Bioinformatics 2008; 24:2245-51. [DOI: 10.1093/bioinformatics/btn425] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022] Open
|
40
|
Kotera M, McDonald AG, Boyce S, Tipton KF. Functional group and substructure searching as a tool in metabolomics. PLoS One 2008; 3:e1537. [PMID: 18253485 PMCID: PMC2212108 DOI: 10.1371/journal.pone.0001537] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 10/01/2007] [Accepted: 01/06/2008] [Indexed: 01/31/2023] Open
Abstract
Background: A direct link between the names and structures of compounds and the functional groups contained within them is important, not only because biochemists frequently rely on literature that uses a free-text format to describe functional groups, but also because metabolic models depend upon the connections between enzymes and substrates being known and appropriately stored in databases. Methodology: We have developed a database named “Biochemical Substructure Search Catalogue” (BiSSCat), which contains 489 functional groups, >200,000 compounds and >1,000,000 different computationally constructed substructures, to allow identification of chemical compounds of biological interest. Conclusions: This database and its associated web-based search program (http://bisscat.org/) can be used to find compounds containing selected combinations of substructures and functional groups. It can be used to determine possible additional substrates for known enzymes and for putative enzymes found in genome projects. Its applications to enzyme inhibitor design are also discussed.
Affiliation(s)
- Masaaki Kotera
- School of Biochemistry and Immunology, Trinity College, Dublin, Ireland.
|
41
|
Abstract
Research into plant metabolism has a long history, and analytical approaches of ever-increasing breadth and sophistication have been brought to bear. We now have access to vast repositories of data concerning enzymology and regulatory features of enzymes, as well as large-scale datasets containing profiling information of transcripts, protein and metabolite levels. Nevertheless, despite this wealth of data, we remain some way off from being able to rationally engineer plant metabolism or even to predict metabolic responses. Within the past 18 months, rapid progress has been made, with several highly informative plant network interrogations being discussed in the literature. In the present review we will appraise the current state of the art regarding plant metabolic network analysis and attempt to outline what the necessary steps are in order to further our understanding of network regulation.
|
42
|
Modular decomposition of metabolic systems via null-space analysis. J Theor Biol 2007; 249:691-705. [PMID: 17949756 DOI: 10.1016/j.jtbi.2007.08.005] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 02/20/2007] [Revised: 07/02/2007] [Accepted: 08/03/2007] [Indexed: 11/23/2022]
Abstract
We describe a method by which the reactions in a metabolic system may be grouped hierarchically into sets of modules to form a metabolic reaction tree. In contrast to previous approaches, the method described here takes into account the fact that, in a viable network, reactions must be capable of sustaining a steady-state flux. In order to achieve this decomposition we introduce a new concept, the reaction correlation coefficient, phi, and show that this is a logical extension of the concept of enzyme (or reaction) subsets. In addition to their application to modular decomposition, reaction correlation coefficients have a number of other interesting properties, including a convenient means for identifying disconnected subnetworks in a system and potential applications to metabolic engineering. The method computes reaction correlation coefficients from an orthonormal basis of the null-space of the stoichiometry matrix. We show that reaction correlation coefficients are uniquely defined, even though the basis of the null-space is not. Once a complete set of reaction correlation coefficients is calculated, a metabolic reaction tree can be determined through the application of standard programming techniques. Computation of the reaction correlation coefficients and the subsequent construction of the metabolic reaction tree are readily achievable for genome-scale models using a commodity desktop PC.
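The null-space computation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula for phi, so the code below assumes that phi for reactions i and j is the cosine of the angle between rows i and j of an orthonormal null-space basis matrix; that definition, the function names, and the five-reaction toy network are all assumptions made for illustration, not the authors' implementation. The sketch uses only the Python standard library (exact row reduction followed by Gram-Schmidt) rather than the numerical routines a genome-scale model would need.

```python
from fractions import Fraction
from math import sqrt


def null_space_basis(S):
    """Basis vectors of the null space of S, found by exact row reduction."""
    rows = [[Fraction(x) for x in row] for row in S]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][fc]
        basis.append(v)
    return basis


def orthonormalize(vectors):
    """Gram-Schmidt on linearly independent vectors (float arithmetic)."""
    ortho = []
    for v in vectors:
        w = [float(x) for x in v]
        for u in ortho:
            proj = sum(a * b for a, b in zip(w, u))
            w = [a - proj * b for a, b in zip(w, u)]
        norm = sqrt(sum(a * a for a in w))
        ortho.append([a / norm for a in w])
    return ortho


def reaction_correlations(S, tol=1e-12):
    """Assumed definition: phi[i][j] = cosine between rows i and j of K.

    K has the orthonormal null-space vectors as columns. Reactions whose
    row is zero cannot carry steady-state flux; their phi is left as None.
    """
    columns = orthonormalize(null_space_basis(S))
    n = len(S[0])
    k_rows = [[col[i] for col in columns] for i in range(n)]
    norms = [sqrt(sum(a * a for a in row)) for row in k_rows]
    phi = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > tol and norms[j] > tol:
                dot = sum(a * b for a, b in zip(k_rows[i], k_rows[j]))
                phi[i][j] = dot / (norms[i] * norms[j])
    return phi


# Hypothetical toy network: R0: -> A, R1: A -> B, R2: B ->, R3: A -> C, R4: C ->
S = [
    [1, -1, 0, -1, 0],  # metabolite A
    [0, 1, -1, 0, 0],   # metabolite B
    [0, 0, 0, 1, -1],   # metabolite C
]
phi = reaction_correlations(S)
print(phi[1][2], phi[3][4], phi[1][3])
```

In this toy network R1 and R2 must always carry the same flux, as must R3 and R4, so under the assumed definition each pair has a coefficient of 1 up to rounding, while reactions on the two different branches are only partially correlated. Because the dot products of the rows of K depend only on the projector onto the null space, the result does not change if a different orthonormal basis is chosen, which is consistent with the uniqueness claim in the abstract.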
|
43
|
Hoppe A, Hoffmann S, Holzhütter HG. Including metabolite concentrations into flux balance analysis: thermodynamic realizability as a constraint on flux distributions in metabolic networks. BMC Systems Biology 2007; 1:23. [PMID: 17543097 PMCID: PMC1903363 DOI: 10.1186/1752-0509-1-23] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.9] [Received: 01/16/2007] [Accepted: 06/01/2007] [Indexed: 01/04/2023]
Abstract
BACKGROUND: In recent years, constrained optimization, usually referred to as flux balance analysis (FBA), has become a widely applied method for the computation of stationary fluxes in large-scale metabolic networks. The striking advantage of FBA as compared to kinetic modeling is that it basically requires only knowledge of the stoichiometry of the network. On the other hand, results of FBA are to a large degree hypothetical because the method relies on plausible but hardly provable optimality principles that are thought to govern metabolic flux distributions. RESULTS: To augment the reliability of FBA-based flux calculations we propose an additional side constraint which assures thermodynamic realizability, i.e. that the flux directions are consistent with the corresponding changes in Gibbs free energies. The latter depend on metabolite levels for which plausible ranges can be inferred from experimental data. Computationally, our method results in the solution of a mixed integer linear optimization problem with quadratic scoring function. An optimal flux distribution together with a metabolite profile is determined which assures thermodynamic realizability with minimal deviations of metabolite levels from their expected values. We applied our novel approach to two exemplary metabolic networks of different complexity, the metabolic core network of erythrocytes (30 reactions) and the metabolic network iJR904 of Escherichia coli (931 reactions). Our calculations show that increasing network complexity entails increasing sensitivity of predicted flux distributions to variations of standard Gibbs free energy changes and metabolite concentration ranges. We demonstrate the usefulness of our method for assessing critical concentrations of external metabolites preventing attainment of a metabolic steady state. CONCLUSION: Our method incorporates the thermodynamic link between flux directions and metabolite concentrations into a practical computational algorithm. The weakness of conventional FBA, namely its reliance on intuitive assumptions about the reversibility of biochemical reactions, is overcome. This enables the computation of reliable flux distributions even under extreme conditions of the network (e.g. enzyme inhibition, depletion of substrates or accumulation of end products) where metabolite concentrations may be drastically altered.
Affiliation(s)
- Andreas Hoppe
- Charité – University Medicine Berlin, Institute for Biochemistry, Universitätsmedizin Berlin, Monbijoustr. 2, 10117 Berlin, Germany
- Sabrina Hoffmann
- Charité – University Medicine Berlin, Institute for Biochemistry, Universitätsmedizin Berlin, Monbijoustr. 2, 10117 Berlin, Germany
- Hermann-Georg Holzhütter
- Charité – University Medicine Berlin, Institute for Biochemistry, Universitätsmedizin Berlin, Monbijoustr. 2, 10117 Berlin, Germany
|