251
Abstract
Pathway databases collect the bioreactions and molecular interactions that define the processes of life. The MetaCyc family of pathway databases consists of thousands of databases that were derived through computational inference of metabolic pathways from the MetaCyc pathway/genome database (PGDB). In some cases, these DBs underwent subsequent manual curation. Curated pathway DBs are now available for most of the major model organisms. Databases in the MetaCyc family are managed using the Pathway Tools software. This chapter presents methods for performing data mining on the MetaCyc family of pathway DBs. We discuss the major data access mechanisms for the family, which include data files in multiple formats; application programming interfaces (APIs) for the Lisp, Java, and Perl languages; and web services. We present an overview of the Pathway Tools schema, an understanding of which is needed to query the DBs. The chapter also presents several interactive data mining tools within Pathway Tools for performing omics data analysis.
Affiliation(s)
- Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA, USA.

252
Karr JR, Sanghvi JC, Macklin DN, Arora A, Covert MW. WholeCellKB: model organism databases for comprehensive whole-cell models. Nucleic Acids Res 2013; 41:D787-92. [PMID: 23175606 PMCID: PMC3531061 DOI: 10.1093/nar/gks1108]
Abstract
Whole-cell models promise to greatly facilitate the analysis of complex biological behaviors. Whole-cell model development requires comprehensive model organism databases. WholeCellKB (http://wholecellkb.stanford.edu) is an open-source web-based software program for constructing model organism databases. WholeCellKB provides an extensive and fully customizable data model that fully describes individual species including the structure and function of each gene, protein, reaction and pathway. We used WholeCellKB to create WholeCellKB-MG, a comprehensive database of the Gram-positive bacterium Mycoplasma genitalium using over 900 sources. WholeCellKB-MG is extensively cross-referenced to existing resources including BioCyc, KEGG and UniProt. WholeCellKB-MG is freely accessible through a web-based user interface as well as through a RESTful web service.
Affiliation(s)
- Jonathan R Karr
- Graduate Program in Biophysics, Stanford University, 318 Campus Drive West, Stanford, CA 94305, USA

253
Abstract
Spurred by recent innovations in genome sequencing, the reconstruction of genome-scale models has increased in recent years. Genome-scale models are now available for a wide range of organisms, and models have been successfully applied to a number of research topics including metabolic engineering, genome annotation, biofuel production, and interpretation of omics data sets. The challenge is how to manage the large amount of data in genome-scale models and perform comparative analysis to gain new biological insights. In this chapter, important standards for genome-scale modeling are outlined. Furthermore, management strategies as well as existing repository and construction tools are discussed. As the comparison of models is an important aspect during the development and analysis stages, available methods are presented and existing software solutions are reviewed.
Affiliation(s)
- Stephan Pabinger
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria.

254
Blais EM, Chavali AK, Papin JA. Linking genome-scale metabolic modeling and genome annotation. Methods Mol Biol 2013; 985:61-83. [PMID: 23417799 DOI: 10.1007/978-1-62703-299-5_4]
Abstract
Genome-scale metabolic network reconstructions, assembled from annotated genomes, serve as a platform for integrating data from heterogeneous sources and generating hypotheses for further experimental validation. Implementing constraint-based modeling techniques such as flux balance analysis (FBA) on network reconstructions allows for interrogating metabolism at a systems level, which aids in identifying and rectifying gaps in knowledge. With genome sequences for various organisms from prokaryotes to eukaryotes becoming increasingly available, a significant bottleneck lies in the structural and functional annotation of these sequences. Using topologically based and biologically inspired metabolic network refinement, we can better characterize enzymatic functions present in an organism and link annotation of these functions to candidate transcripts; both steps can be experimentally validated.
Affiliation(s)
- Edik M Blais
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA

255
Klanchui A, Vorapreeda T, Vongsangnak W, Khannapho C, Cheevadhanarak S, Meechai A. Systems biology and metabolic engineering of Arthrospira cell factories. Comput Struct Biotechnol J 2012; 3:e201210015. [PMID: 24688675 PMCID: PMC3962090 DOI: 10.5936/csbj.201210015] [Received: 08/27/2012] [Revised: 11/19/2012] [Accepted: 11/22/2012]
Abstract
Arthrospira are attractive candidates to serve as cell factories for production of many valuable compounds useful for food, feed, fuel and pharmaceutical industries. In connection with the development of sustainable bioprocessing, it is a challenge to design and develop efficient Arthrospira cell factories that ensure effective conversion of raw materials (i.e. CO2 and sunlight) into desired products. With the current availability of the genome sequences and metabolic models of Arthrospira, the development of Arthrospira factories can now be accelerated by means of systems biology and the metabolic engineering approach. Here, we review recent research involving the use of Arthrospira cell factories for industrial applications, as well as the exploitation of systems biology and the metabolic engineering approach for studying Arthrospira. The current status of genomics and proteomics through the development of the genome-scale metabolic model of Arthrospira, as well as the use of mathematical modeling to simulate the phenotypes resulting from the different metabolic engineering strategies, are discussed. Finally, the perspective and future direction of Arthrospira cell factories for industrial biotechnology are presented.
Affiliation(s)
- Amornpan Klanchui
- Microarray Laboratory, National Center for Genetic Engineering and Biotechnology (BIOTEC), KlongLuang, Pathumthani, Thailand
- Tayvich Vorapreeda
- Biochemical Engineering and Pilot Plant Research and Development Unit, National Center for Genetic Engineering and Biotechnology at King Mongkut's University of Technology Thonburi, Bangkhuntien, Bangkok, Thailand
- Wanwipa Vongsangnak
- Center for Systems Biology, Soochow University, Suzhou, Jiangsu 215006, China
- Chiraphan Khannapho
- Biochemical Engineering and Pilot Plant Research and Development Unit, National Center for Genetic Engineering and Biotechnology at King Mongkut's University of Technology Thonburi, Bangkhuntien, Bangkok, Thailand
- Supapon Cheevadhanarak
- Biochemical Engineering and Pilot Plant Research and Development Unit, National Center for Genetic Engineering and Biotechnology at King Mongkut's University of Technology Thonburi, Bangkhuntien, Bangkok, Thailand; Division of Biotechnology, School of Bioresources and Technology, King Mongkut's University of Technology Thonburi, Bangkok, Thailand
- Asawin Meechai
- Department of Chemical Engineering, Faculty of Engineering, King Mongkut's University of Technology Thonburi, Bangkok, Thailand

256
CINPER: an interactive web system for pathway prediction for prokaryotes. PLoS One 2012; 7:e51252. [PMID: 23236458 PMCID: PMC3517448 DOI: 10.1371/journal.pone.0051252] [Received: 05/03/2012] [Accepted: 10/30/2012]
Abstract
We present a web-based network-construction system, CINPER (CSBL INteractive Pathway BuildER), to help a user build a user-specified gene network for a prokaryotic organism in an intuitive manner. CINPER builds a network model based on different types of information provided by the user and stored in the system. CINPER’s prediction process has four steps: (i) collection of template networks based on (partially) known pathways of related organism(s) from the SEED or BioCyc database and the published literature; (ii) construction of an initial network model based on the template networks using the P-Map program; (iii) expansion of the initial model, based on the association information derived from operons, protein-protein interactions, co-expression modules and phylogenetic profiles; and (iv) computational validation of the predicted models based on gene expression data. For ease of use, CINPER provides an interactive visualization environment for a user to enter, search and edit relevant data and for the system to display (partial) results and prompt for additional data. Evaluation of CINPER on 17 well-studied pathways in the MetaCyc database shows that the program achieves an average recall rate of 76% and an average precision rate of 90% on the initial models, and a higher average recall rate of 87% and an average precision rate of 28% on the final models. The reduced precision rate in the final models versus the initial models reflects the fact that the final models contain large numbers of novel genes that have no experimental evidence and hence are not yet collected in the MetaCyc database. To demonstrate the usefulness of this server, we have predicted an iron homeostasis gene network of Synechocystis sp. PCC6803 using the server. The predicted models along with the server can be accessed at http://csbl.bmb.uga.edu/cinper/.
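The recall and precision rates quoted in this abstract compare a predicted gene set against a curated reference pathway. The following is a minimal sketch of that calculation, not CINPER's own evaluation code; the function name and all gene identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted gene set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted genes that are in the reference.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference genes that were predicted.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
reference = {"geneA", "geneB", "geneC", "geneD"}
initial_model = {"geneA", "geneB", "geneC"}
# Expanding the model recovers the missing gene but also adds six novel genes.
final_model = reference | {f"novel{i}" for i in range(1, 7)}

initial_scores = precision_recall(initial_model, reference)
final_scores = precision_recall(final_model, reference)
print(initial_scores, final_scores)
```

In this toy case the expanded model gains recall and loses precision, which mirrors the pattern the abstract reports: novel genes absent from the reference count against precision even if they are biologically correct.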

257
Vallenet D, Belda E, Calteau A, Cruveiller S, Engelen S, Lajus A, Le Fèvre F, Longin C, Mornico D, Roche D, Rouy Z, Salvignol G, Scarpelli C, Thil Smith AA, Weiman M, Médigue C. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data. Nucleic Acids Res 2012. [PMID: 23193269 PMCID: PMC3531135 DOI: 10.1093/nar/gks1194]
Abstract
MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest.
Affiliation(s)
- David Vallenet
- CEA, Institut de Génomique, Genoscope, 2 rue Gaston Crémieux, 91057 Evry, France.

258
Dubois A, Carrere S, Raymond O, Pouvreau B, Cottret L, Roccia A, Onesto JP, Sakr S, Atanassova R, Baudino S, Foucher F, Le Bris M, Gouzy J, Bendahmane M. Transcriptome database resource and gene expression atlas for the rose. BMC Genomics 2012; 13:638. [PMID: 23164410 PMCID: PMC3518227 DOI: 10.1186/1471-2164-13-638] [Received: 07/25/2012] [Accepted: 11/06/2012]
Abstract
BACKGROUND For centuries roses have been selected based on a number of traits. Little information exists on the genetic and molecular basis that contributes to these traits, mainly because information on expressed genes for this economically important ornamental plant is scarce. RESULTS Here, we used a combination of Illumina and 454 sequencing technologies to generate information on Rosa sp. transcripts using RNA from various tissues and in response to biotic and abiotic stresses. A total of 80714 transcript clusters were identified and 76611 peptides have been predicted among which 20997 have been clustered into 13900 protein families. BLASTp hits in closely related Rosaceae species revealed that about half of the predicted peptides in the strawberry and peach genomes have orthologs in Rosa dataset. Digital expression was obtained using RNA samples from organs at different development stages and under different stress conditions. qPCR validated the digital expression data for a selection of 23 genes with high or low expression levels. Comparative gene expression analyses between the different tissues and organs allowed the identification of clusters that are highly enriched in given tissues or under particular conditions, demonstrating the usefulness of the digital gene expression analysis. A web interface ROSAseq was created that allows data interrogation by BLAST, subsequent analysis of DNA clusters and access to thorough transcript annotation including best BLAST matches on Fragaria vesca, Prunus persica and Arabidopsis. The rose peptides dataset was used to create the ROSAcyc resource pathway database that allows access to the putative genes and enzymatic pathways. CONCLUSIONS The study provides useful information on Rosa expressed genes, with thorough annotation and an overview of expression patterns for transcripts with good accuracy.
Affiliation(s)
- Annick Dubois
- Reproduction et Développement des Plantes UMR INRA-CNRS- Université Lyon 1-ENSL, Ecole Normale Supérieure, 46 allée d'Italie, Lyon Cedex 07 69364, France

259
Sharma R, Dill BD, Chourey K, Shah M, VerBerkmoes NC, Hettich RL. Coupling a detergent lysis/cleanup methodology with intact protein fractionation for enhanced proteome characterization. J Proteome Res 2012; 11:6008-18. [PMID: 23126408 DOI: 10.1021/pr300709k]
Abstract
The expanding use of surfactants for proteome sample preparations has prompted the need to systematically optimize the application and removal of these MS-deleterious agents prior to proteome measurements. Here we compare four detergent cleanup methods (trichloroacetic acid (TCA) precipitation, chloroform/methanol/water (CMW) extraction, a commercial detergent removal spin column method (DRS) and filter-aided sample preparation (FASP)) to provide efficiency benchmarks with respect to protein, peptide, and spectral identifications in each case. Our results show that for protein-limited samples, FASP outperforms the other three cleanup methods, while at high protein amounts, all the methods are comparable. This information was used to investigate and contrast molecular weight-based fractionated with unfractionated lysates from three increasingly complex samples (Escherichia coli K-12, a five microbial isolate mixture, and a natural microbial community groundwater sample), all of which were prepared with an SDS-FASP approach. The additional fractionation step enhanced the number of protein identifications by 8% to 25% over the unfractionated approach across the three samples.
Affiliation(s)
- Ritin Sharma
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville-Tennessee 37996, United States

260
Wang H, Sivonen K, Rouhiainen L, Fewer DP, Lyra C, Rantala-Ylinen A, Vestola J, Jokela J, Rantasärkkä K, Li Z, Liu B. Genome-derived insights into the biology of the hepatotoxic bloom-forming cyanobacterium Anabaena sp. strain 90. BMC Genomics 2012; 13:613. [PMID: 23148582 PMCID: PMC3542288 DOI: 10.1186/1471-2164-13-613] [Received: 05/14/2012] [Accepted: 11/05/2012]
Abstract
Background Cyanobacteria can form massive toxic blooms in fresh and brackish bodies of water and are frequently responsible for the poisoning of animals and pose a health risk for humans. Anabaena is a genus of filamentous diazotrophic cyanobacteria commonly implicated as a toxin producer in blooms in aquatic ecosystems throughout the world. The biology of bloom-forming cyanobacteria is poorly understood at the genome level. Results Here, we report the complete sequence and comprehensive annotation of the bloom-forming Anabaena sp. strain 90 genome. It comprises two circular chromosomes and three plasmids with a total size of 5.3 Mb, encoding a total of 4,738 genes. The genome is replete with mobile genetic elements. Detailed manual annotation demonstrated that almost 5% of the gene repertoire consists of pseudogenes. A further 5% of the genome is dedicated to the synthesis of small peptides that are the products of both ribosomal and nonribosomal biosynthetic pathways. Inactivation of the hassallidin (an antifungal cyclic peptide) biosynthetic gene cluster through a deletion event and a natural mutation of the buoyancy-permitting gvpG gas vesicle gene were documented. The genome contains a large number of genes encoding restriction-modification systems. Two novel excision elements were found in the nifH gene that is required for nitrogen fixation. Conclusions Genome analysis demonstrated that this strain invests heavily in the production of bioactive compounds and restriction-modification systems. This well-annotated genome provides a platform for future studies on the ecology and biology of these important bloom-forming cyanobacteria.
Affiliation(s)
- Hao Wang
- Department of Food and Environmental Sciences, University of Helsinki, Helsinki, FIN-00014, Finland

261
Bernhardt J, Michalik S, Wollscheid B, Völker U, Schmidt F. Proteomics approaches for the analysis of enriched microbial subpopulations and visualization of complex functional information. Curr Opin Biotechnol 2012; 24:112-9. [PMID: 23141770 DOI: 10.1016/j.copbio.2012.10.009] [Received: 09/04/2012] [Revised: 09/28/2012] [Accepted: 10/09/2012]
Abstract
Advances in the separation of microbial subpopulations and in proteomics technologies have paved the way for the global molecular characterization of microbial cells that share common functional characteristics. Quantitative characterization of the dynamics of microbial proteomes enables an unprecedented view of the adaptive responses of microbes to environmental stimuli or during interaction with other species or host cells. However, the intrinsic complexity of such data requires sophisticated visualization methods for the display, mining, interpretation and efficient exploitation of these data resources. In this review, we discuss how new approaches in data visualization such as streamgraphs, network graphs or Voronoi treemaps are being used in the field to provide new insights into the functional complexity of microbial cells, populations and multispecies consortia.
Affiliation(s)
- Jörg Bernhardt
- Institute for Microbiology, Ernst-Moritz-Arndt-University Greifswald, Friedrich-Ludwig-Jahn-Strasse 15, D-17487 Greifswald, Germany

262
Sharma A, Pan A. Identification of potential drug targets in Yersinia pestis using metabolic pathway analysis: MurE ligase as a case study. Eur J Med Chem 2012; 57:185-95. [DOI: 10.1016/j.ejmech.2012.09.018] [Received: 04/19/2012] [Revised: 09/07/2012] [Accepted: 09/11/2012]

263
In silico identification of potential drug targets in Clostridium difficile R20291: modeling and virtual screening analysis of a candidate enzyme MurG. Med Chem Res 2012. [DOI: 10.1007/s00044-012-0262-0]

264
Suthers PF, Maranas CD. Orchestrating hi-fi annotations. Nat Chem Biol 2012; 8:810-1. [DOI: 10.1038/nchembio.1067]

265
Zomorrodi AR, Suthers PF, Ranganathan S, Maranas CD. Mathematical optimization applications in metabolic networks. Metab Eng 2012; 14:672-86. [PMID: 23026121 DOI: 10.1016/j.ymben.2012.09.005] [Received: 06/22/2012] [Revised: 08/31/2012] [Accepted: 09/14/2012]
Abstract
Genome-scale metabolic models are increasingly becoming available for a variety of microorganisms. This has spurred the development of a wide array of computational tools, and in particular, mathematical optimization approaches, to assist in fundamental metabolic network analyses and redesign efforts. This review highlights a number of optimization-based frameworks developed towards addressing challenges in the analysis and engineering of metabolic networks. In particular, three major types of studies are covered here including exploring model predictions, correction and improvement of models of metabolism, and redesign of metabolic networks for the targeted overproduction of a desired compound. Overall, the methods reviewed in this paper highlight the diversity of queries, breadth of questions and complexity of redesigns that are amenable to mathematical optimization strategies.
Affiliation(s)
- Ali R Zomorrodi
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA

266
Paley SM, Latendresse M, Karp PD. Regulatory network operations in the Pathway Tools software. BMC Bioinformatics 2012; 13:243. [PMID: 22998532 PMCID: PMC3473263 DOI: 10.1186/1471-2105-13-243] [Received: 04/28/2012] [Accepted: 08/31/2012]
Abstract
BACKGROUND Biologists are elucidating complex collections of genetic regulatory data for multiple organisms. Software is needed for such regulatory network data. RESULTS The Pathway Tools software supports storage and manipulation of regulatory information through a variety of strategies. The Pathway Tools regulation ontology captures transcriptional and translational regulation, substrate-level regulation of enzyme activity, post-translational modifications, and regulatory pathways. Regulatory visualizations include a novel diagram that summarizes all regulatory influences on a gene; a transcription-unit diagram, and an interactive visualization of a full transcriptional regulatory network that can be painted with gene expression data to probe correlations between gene expression and regulatory mechanisms. We introduce a novel type of enrichment analysis that asks whether a gene-expression dataset is over-represented for known regulators. We present algorithms for ranking the degree of regulatory influence of genes, and for computing the net positive and negative regulatory influences on a gene. CONCLUSIONS Pathway Tools provides a comprehensive environment for manipulating molecular regulatory interactions that integrates regulatory data with an organism's genome and metabolic network. Curated collections of regulatory data authored using Pathway Tools are available for Escherichia coli, Bacillus subtilis, and Shewanella oneidensis.
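An over-representation question of the kind described in this abstract is commonly framed as a hypergeometric test. The sketch below shows that generic test only; it is an assumption that a hypergeometric model is appropriate here, it is not the statistic implemented in Pathway Tools, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population_size, regulators_in_population,
                         sample_size, regulators_in_sample):
    """P(X >= regulators_in_sample) when sample_size genes are drawn without
    replacement from a population containing regulators_in_population regulators."""
    total = comb(population_size, sample_size)
    max_hits = min(regulators_in_population, sample_size)
    favourable = sum(
        comb(regulators_in_population, hits)
        * comb(population_size - regulators_in_population, sample_size - hits)
        for hits in range(regulators_in_sample, max_hits + 1)
    )
    return favourable / total


# Toy numbers: 10 genes in total, 5 of them regulators, 2 genes in the
# expression dataset, both of which are regulators.
p_value = hypergeom_upper_tail(10, 5, 2, 2)
print(round(p_value, 4))
```

A small p-value would suggest that regulators appear in the dataset more often than expected by chance; in practice a correction for multiple testing would also be needed.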
Affiliation(s)
- Suzanne M Paley
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025
- Mario Latendresse
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025
- Peter D Karp
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025

267
Suzuki H, MacDonald J, Syed K, Salamov A, Hori C, Aerts A, Henrissat B, Wiebenga A, VanKuyk PA, Barry K, Lindquist E, LaButti K, Lapidus A, Lucas S, Coutinho P, Gong Y, Samejima M, Mahadevan R, Abou-Zaid M, de Vries RP, Igarashi K, Yadav JS, Grigoriev IV, Master ER. Comparative genomics of the white-rot fungi, Phanerochaete carnosa and P. chrysosporium, to elucidate the genetic basis of the distinct wood types they colonize. BMC Genomics 2012; 13:444. [PMID: 22937793 PMCID: PMC3463431 DOI: 10.1186/1471-2164-13-444] [Received: 02/17/2012] [Accepted: 08/22/2012]
Abstract
Background Softwood is the predominant form of land plant biomass in the Northern hemisphere, and is among the most recalcitrant biomass resources to bioprocess technologies. The white rot fungus, Phanerochaete carnosa, has been isolated almost exclusively from softwoods, while most other known white-rot species, including Phanerochaete chrysosporium, were mainly isolated from hardwoods. Accordingly, it is anticipated that P. carnosa encodes a distinct set of enzymes and proteins that promote softwood decomposition. To elucidate the genetic basis of softwood bioconversion by a white-rot fungus, the present study reports the P. carnosa genome sequence and its comparative analysis with the previously reported P. chrysosporium genome. Results P. carnosa encodes a complete set of lignocellulose-active enzymes. Comparative genomic analysis revealed that P. carnosa is enriched with genes encoding manganese peroxidase, and that the most divergent glycoside hydrolase families were predicted to encode hemicellulases and glycoprotein degrading enzymes. Most remarkably, P. carnosa possesses one of the largest P450 contingents (266 P450s) among the sequenced and annotated wood-rotting basidiomycetes, nearly double that of P. chrysosporium. Along with metabolic pathway modeling, comparative growth studies on model compounds and chemical analyses of decomposed wood components showed greater tolerance of P. carnosa to various substrates including coniferous heartwood. Conclusions The P. carnosa genome is enriched with genes that encode P450 monooxygenases that can participate in extractives degradation, and manganese peroxidases involved in lignin degradation. The significant expansion of P450s in P. carnosa, along with differences in carbohydrate- and lignin-degrading enzymes, could be correlated to the utilization of heartwood and sapwood preparations from both coniferous and hardwood species.
Affiliation(s)
- Hitoshi Suzuki
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON, Canada

268
Krempl PM, Mairhofer J, Striedner G, Thallinger GG. A sequence comparison and gene expression data integration add-on for the Pathway Tools software. Bioinformatics 2012; 28:2283-4. [DOI: 10.1093/bioinformatics/bts431]

269
Klein CC, Cottret L, Kielbassa J, Charles H, Gautier C, Ribeiro de Vasconcelos AT, Lacroix V, Sagot MF. Exploration of the core metabolism of symbiotic bacteria. BMC Genomics 2012; 13:438. [PMID: 22938206 PMCID: PMC3543179 DOI: 10.1186/1471-2164-13-438] [Received: 06/20/2012] [Accepted: 08/18/2012]
Abstract
Background: A large number of genome-scale metabolic networks are now available for many organisms, mostly bacteria. Previous works on minimal gene sets, when analysing host-dependent bacteria, found small common sets of metabolic genes. When such analyses are restricted to bacteria with similar lifestyles, larger portions of metabolism are expected to be shared and their composition is worth investigating. Here we report a comparative analysis of the small molecule metabolism of symbiotic bacteria, exploring common and variable portions as well as the contribution of different lifestyle groups to the reduction of a common set of metabolic capabilities. Results: We found no reaction shared by all the bacteria analysed. Disregarding those with the smallest genomes, we still do not find a reaction core; however, we did find a core of biochemical capabilities. While obligate intracellular symbionts have no core of reactions within their group, extracellular and cell-associated symbionts do have a small core composed of disconnected fragments. In agreement with previous findings in Escherichia coli, their cores are enriched in biosynthetic processes whereas the variable metabolisms have similar ratios of biosynthetic and degradation reactions. Conversely, the variable metabolism of obligate intracellular symbionts is enriched in anabolism. Conclusion: Even when removing the symbionts with the most reduced genomes, there is no core of reactions common to the analysed symbiotic bacteria. The main reason is the very high specialisation of obligate intracellular symbionts; however, host-dependence alone is not an explanation for such absence. The composition of the metabolism of cell-associated and extracellular bacteria shows that while they have similar needs in terms of the building blocks of their cells, they have to adapt to very distinct environments. On the other hand, in obligate intracellular bacteria, catabolism has largely disappeared, whereas synthetic routes appear to have been selected for depending on the nature of the symbiosis. As more genomes are added, we expect, based on our simulations, that the core of cell-associated and extracellular bacteria continues to diminish, converging to approximately 60 reactions.
270
Abstract
Constraint-based models of metabolism have been used in a variety of studies on drug discovery, metabolic engineering, evolution, and multi-species interactions. These genome-scale models can be generated for any sequenced organism since their main parameters (i.e., reaction stoichiometry) are highly conserved. Their relatively low parameter requirement makes these models easy to develop; however, these models often result in a solution space with multiple possible flux distributions, making it difficult to determine the precise flux state in the cell. Recent research efforts in this modeling field have investigated how additional experimental data, including gene expression, protein expression, metabolite concentrations, and kinetic parameters, can be used to reduce the solution space. This mini-review provides a summary of the data-driven computational approaches that are available for reducing the solution space and thereby improve predictions of intracellular fluxes by constraint-based models.
271
Genomic and transcriptomic studies of an RDX (hexahydro-1,3,5-trinitro-1,3,5-triazine)-degrading actinobacterium. Appl Environ Microbiol 2012; 78:7798-800. [PMID: 22923396 DOI: 10.1128/aem.02120-12] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022]
Abstract
Whole-genome sequencing, transcriptomic analyses, and metabolic reconstruction were used to investigate Gordonia sp. strain KTR9's ability to catabolize a range of compounds, including explosives and steroids. Aspects of this mycolic acid-containing actinobacterium's catabolic potential were experimentally verified and compared with those of rhodococci and mycobacteria.
272
Abstract
Microbial metabolomics constitutes an integrated component of systems biology. By studying the complete set of metabolites within a microorganism and monitoring the global outcome of interactions between its developmental processes and the environment, metabolomics can potentially provide a more accurate snapshot of the actual physiological state of the cell. Recent advances in technology and post-genomic developments enable the study and analysis of the metabolome. This unique contribution has resulted in many scientific disciplines incorporating metabolomics as one of their “omics” platforms. This review focuses on metabolomics in microorganisms and uses selected topics to illustrate its impact on the understanding of systems microbiology.
Affiliation(s)
- Jane Tang
- Center for National Security and Intelligence, Noblis, Falls Church, Virginia, USA
273
Abstract
The decoding of the Tritryp reference genomes nearly 7 years ago provided a first peek into the biology of pathogenic trypanosomatids and a blueprint that has paved the way for genome-wide studies. Although 60-70% of the predicted protein coding genes in Trypanosoma brucei, Trypanosoma cruzi and Leishmania major remain unannotated, the functional genomics landscape is rapidly changing. Facilitated by the advent of next-generation sequencing technologies, improved structural and functional annotation of genes and their products is emerging. Information is also growing for the interactions between cellular components as transcriptomes, regulatory networks and metabolomes are characterized, ushering in a new era of systems biology. Simultaneously, the launch of comparative sequencing of multiple strains of kinetoplastids will finally lead to the investigation of a vast, yet to be explored, evolutionary and pathogenomic space.
Affiliation(s)
- J Choi
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
274
Fabris M, Matthijs M, Rombauts S, Vyverman W, Goossens A, Baart GJE. The metabolic blueprint of Phaeodactylum tricornutum reveals a eukaryotic Entner-Doudoroff glycolytic pathway. Plant J 2012; 70:1004-14. [PMID: 22332784 DOI: 10.1111/j.1365-313x.2012.04941.x] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.6] [Indexed: 05/08/2023]
Abstract
Diatoms are one of the most successful groups of unicellular eukaryotic algae. Successive endosymbiotic events contributed to their flexible metabolism, making them competitive in variable aquatic habitats. Although the recently sequenced genomes of the model diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana have provided the first insights into their metabolic organization, the current knowledge on diatom biochemistry remains fragmentary. By means of a genome-wide approach, we developed DiatomCyc, a detailed pathway/genome database of P. tricornutum. DiatomCyc contains 286 pathways with 1719 metabolic reactions and 1613 assigned enzymes, spanning both the central and parts of the secondary metabolism of P. tricornutum. Central metabolic pathways, such as those of carbohydrates, amino acids and fatty acids, were covered. Furthermore, our understanding of the carbohydrate model in P. tricornutum was extended. In particular we highlight the discovery of a functional Entner-Doudoroff pathway, an ancient alternative for the glycolytic Embden-Meyerhof-Parnas pathway, and a putative phosphoketolase pathway, both uncommon in eukaryotes. DiatomCyc is accessible online (http://www.diatomcyc.org), and offers a range of software tools for the visualization and analysis of metabolic networks and 'omics' data. We anticipate that DiatomCyc will be key to gaining further understanding of diatom metabolism and, ultimately, will feed metabolic engineering strategies for the industrial valorization of diatoms.
Affiliation(s)
- Michele Fabris
- Department of Plant Systems Biology, VIB, B-9052 Gent, Belgium
275
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes. PLoS Comput Biol 2012; 8:e1002540. [PMID: 22693442 PMCID: PMC3364942 DOI: 10.1371/journal.pcbi.1002540] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 08/23/2011] [Accepted: 04/01/2012] [Indexed: 12/17/2022]
Abstract
Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates “genomic metabolons”, i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
The discovery of the various metabolic functions catalyzed by enzymes encoded by the genes from the exponentially increasing number of sequenced genomes is one of the main focuses of bioinformatics tools today. However, most of these tools rely on already identified enzyme-coding gene or protein sequence information to predict known enzymatic activities in new genomes. Therefore, they cannot be used to reveal metabolic activities without any corresponding sequenced genes, dubbed “sequence-orphan activities”. In such cases, the best approach is the bioanalysis of target genes by human expert curators, manually integrating so-called “context-based information” (such as gene co-localization on the genome, or the presence of incomplete metabolic pathways) to infer novel functions. Few bioinformatics tools exploit such information and render accessible results in an automated way. Here, we present “CanOE”, a strategy that uses contextual information to propose and rank Candidate genes for Orphan Enzymes in Bacteria and Archaea. Beyond the merit of extending our knowledge and comprehension of prokaryote metabolism, identifying coding genes for sequence-orphan activities opens new opportunities for functional annotation (homology-based transfer made accessible), drug design (new metabolic targets), synthetic biology (new building blocks) and biotechnology applications (new biocatalysts).
276
State of the art in silico tools for the study of signaling pathways in cancer. Int J Mol Sci 2012; 13:6561-6581. [PMID: 22837650 PMCID: PMC3397482 DOI: 10.3390/ijms13066561] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/27/2012] [Revised: 05/03/2012] [Accepted: 05/10/2012] [Indexed: 12/18/2022]
Abstract
In the last several years, researchers have exhibited an intense interest in the evolutionarily conserved signaling pathways that have crucial roles during embryonic development. Interestingly, the malfunctioning of these signaling pathways leads to several human diseases, including cancer. The chemical and biophysical events that occur during cellular signaling, as well as the number of interactions within a signaling pathway, make these systems complex to study. In silico resources are tools used to aid the understanding of cellular signaling pathways. Systems approaches have provided a deeper knowledge of diverse biochemical processes, including individual metabolic pathways, signaling networks and genome-scale metabolic networks. In the future, these tools will be enormously valuable, if they continue to be developed in parallel with growing biological knowledge. In this study, an overview of the bioinformatics resources that are currently available for the analysis of biological networks is provided.
277
Copeland WB, Bartley BA, Chandran D, Galdzicki M, Kim KH, Sleight SC, Maranas CD, Sauro HM. Computational tools for metabolic engineering. Metab Eng 2012; 14:270-80. [PMID: 22629572 PMCID: PMC3361690 DOI: 10.1016/j.ymben.2012.03.001] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Indexed: 12/30/2022]
Abstract
A great variety of software applications are now employed in the metabolic engineering field. These applications have been created to support a wide range of experimental and analysis techniques. Computational tools are utilized throughout the metabolic engineering workflow to extract and interpret relevant information from large data sets, to present complex models in a more manageable form, and to propose efficient network design strategies. In this review, we present a number of tools that can assist in modifying and understanding cellular metabolic networks. The review covers seven areas of relevance to metabolic engineers. These include metabolic reconstruction efforts, network visualization, nucleic acid and protein engineering, metabolic flux analysis, pathway prospecting, post-structural network analysis and culture optimization. The list of available tools is extensive and we can only highlight a small, representative portion of the tools from each area.
Affiliation(s)
- Wilbert B Copeland
- Department of Bioengineering, University of Washington, Seattle, WA 98195-5061, USA.
278
Navid A. Applications of system-level models of metabolism for analysis of bacterial physiology and identification of new drug targets. Brief Funct Genomics 2012; 10:354-64. [PMID: 22199377 DOI: 10.1093/bfgp/elr034] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
For nearly all of the 20th century, biologists gained considerable insights into the fundamental principles of cellular dynamics by examining select modules of biochemical processes. This form of analysis provides detailed information about the workings of the examined pathways. However, any attempt to alter the normal function of bacteria (perhaps for industrial or medicinal goals) requires a detailed global understanding of cellular mechanisms. The reductionist mode of analysis cannot provide the required information for developing the needed perspective on the complex interactions of biochemical pathways. Thankfully, the increasing availability of microbial genomic, transcriptomic, proteomic and other high-throughput data permits system-level analyses of microbiology. During the past two decades, systems biologists have developed constraint-based genome-scale models (GSM) of metabolism for a variety of pathogens. These models are important tools for assessing the metabolic capabilities of various genotypes. Simultaneously, new computational methods have been developed that use these network reconstructions to answer an array of important immunological questions. The objective of this article is to briefly review some of the uses of GSMs for studying bacterial metabolism under different conditions and to discuss how the calculated solutions can be used for rational design of drugs.
Affiliation(s)
- Ali Navid
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA 94551, USA.
279
Hamilton JJ, Reed JL. Identification of functional differences in metabolic networks using comparative genomics and constraint-based models. PLoS One 2012; 7:e34670. [PMID: 22666308 PMCID: PMC3359066 DOI: 10.1371/journal.pone.0034670] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 09/17/2011] [Accepted: 03/08/2012] [Indexed: 11/20/2022]
Abstract
Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. 
Using CONGA, we can identify differences in reaction and gene content which give rise to different functional predictions. Because CONGA provides a general framework, it can be applied to find functional differences across models and biological systems beyond those presented here.
Affiliation(s)
- Jennifer L. Reed
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
280
Song L, Sudhakar P, Wang W, Conrads G, Brock A, Sun J, Wagner-Döbler I, Zeng AP. A genome-wide study of two-component signal transduction systems in eight newly sequenced mutans streptococci strains. BMC Genomics 2012; 13:128. [PMID: 22475007 PMCID: PMC3353171 DOI: 10.1186/1471-2164-13-128] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 08/10/2011] [Accepted: 04/04/2012] [Indexed: 01/08/2023]
Abstract
BACKGROUND Mutans streptococci are a group of gram-positive bacteria including the primary cariogenic dental pathogen Streptococcus mutans and closely related species. Two component systems (TCSs) composed of a signal sensing histidine kinase (HK) and a response regulator (RR) play key roles in pathogenicity, but have not been comparatively studied for these oral bacterial pathogens. RESULTS HKs and RRs of 8 newly sequenced mutans streptococci strains, including S. sobrinus DSM20742, S. ratti DSM20564 and six S. mutans strains, were identified and compared to the TCSs of S. mutans UA159 and NN2025, two previously genome sequenced S. mutans strains. Ortholog analysis revealed 18 TCS clusters (HK-RR pairs), 2 orphan HKs and 2 orphan RRs, of which 8 TCS clusters were common to all 10 strains, 6 were absent in one or more strains, and the other 4 were exclusive to individual strains. Further classification of the predicted HKs and RRs revealed interesting aspects of their putative functions. While TCS complements were comparable within the six S. mutans strains, S. sobrinus DSM20742 lacked TCSs possibly involved in acid tolerance and fructan catabolism, and S. ratti DSM20564 possessed 3 unique TCSs but lacked the quorum-sensing related TCS (ComDE). Selected computational predictions were verified by PCR experiments. CONCLUSIONS Differences in the TCS repertoires of mutans streptococci strains, especially those of S. sobrinus and S. ratti in comparison to S. mutans, imply differences in their response mechanisms for survival in the dynamic oral environment. This genomic level study of TCSs should help in understanding the pathogenicity of these mutans streptococci strains.
Affiliation(s)
- Lifu Song
- Institute of Bioprocess and Biosystems Engineering, Hamburg University of Technology, Hamburg, Germany
281
Nag A, Karpinets TV, Chang CH, Bar-Peled M. Enhancing a Pathway-Genome Database (PGDB) to capture subcellular localization of metabolites and enzymes: the nucleotide-sugar biosynthetic pathways of Populus trichocarpa. Database (Oxford) 2012; 2012:bas013. [PMID: 22465851 PMCID: PMC3316911 DOI: 10.1093/database/bas013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s). 
Database URL: The curated Populus PGDB is available in the BESC public portal at http://cricket.ornl.gov/cgi-bin/beocyc_home.cgi and the nucleotide-sugar biosynthetic pathways can be directly accessed at http://cricket.ornl.gov:1555/PTR/new-image?object=SUGAR-NUCLEOTIDES.
Affiliation(s)
- Ambarish Nag
- Computational Sciences Center, National Renewable Energy Laboratory, 1617 Cole Boulevard, Golden, CO 80401, USA
282
McGuire AM, Weiner B, Park ST, Wapinski I, Raman S, Dolganov G, Peterson M, Riley R, Zucker J, Abeel T, White J, Sisk P, Stolte C, Koehrsen M, Yamamoto RT, Iacobelli-Martinez M, Kidd MJ, Maer AM, Schoolnik GK, Regev A, Galagan J. Comparative analysis of Mycobacterium and related Actinomycetes yields insight into the evolution of Mycobacterium tuberculosis pathogenesis. BMC Genomics 2012; 13:120. [PMID: 22452820 PMCID: PMC3388012 DOI: 10.1186/1471-2164-13-120] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.5] [Received: 09/23/2011] [Accepted: 03/28/2012] [Indexed: 01/24/2023]
Abstract
BACKGROUND The sequence of the pathogen Mycobacterium tuberculosis (Mtb) strain H37Rv has been available for over a decade, but the biology of the pathogen remains poorly understood. Genome sequences from other Mtb strains and closely related bacteria present an opportunity to apply the power of comparative genomics to understand the evolution of Mtb pathogenesis. We conducted a comparative analysis using 31 genomes from the Tuberculosis Database (TBDB.org), including 8 strains of Mtb and M. bovis, 11 additional Mycobacteria, 4 Corynebacteria, 2 Streptomyces, Rhodococcus jostii RHA1, Nocardia farcinia, Acidothermus cellulolyticus, Rhodobacter sphaeroides, Propionibacterium acnes, and Bifidobacterium longum. RESULTS Our results highlight the functional importance of lipid metabolism and its regulation, and reveal variation between the evolutionary profiles of genes implicated in saturated and unsaturated fatty acid metabolism. It also suggests that DNA repair and molybdopterin cofactors are important in pathogenic Mycobacteria. By analyzing sequence conservation and gene expression data, we identify nearly 400 conserved noncoding regions. These include 37 predicted promoter regulatory motifs, of which 14 correspond to previously validated motifs, as well as 50 potential noncoding RNAs, of which we experimentally confirm the expression of four. CONCLUSIONS Our analysis of protein evolution highlights gene families that are associated with the adaptation of environmental Mycobacteria to obligate pathogenesis. These families include fatty acid metabolism, DNA repair, and molybdopterin biosynthesis. Our analysis reinforces recent findings suggesting that small noncoding RNAs are more common in Mycobacteria than previously expected. Our data provide a foundation for understanding the genome and biology of Mtb in a comparative context, and are available online and through TBDB.org.
283
Insights into the completely annotated genome of Lactobacillus buchneri CD034, a strain isolated from stable grass silage. J Biotechnol 2012; 161:153-66. [PMID: 22465289 DOI: 10.1016/j.jbiotec.2012.03.007] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Received: 01/24/2012] [Revised: 03/05/2012] [Accepted: 03/08/2012] [Indexed: 12/26/2022]
Abstract
Lactobacillus buchneri belongs to the group of heterofermentative lactic acid bacteria and is a common member of the silage microbiome. Here we report the completely annotated genomic sequence of L. buchneri CD034, a strain isolated from stable grass silage. The whole genome of L. buchneri CD034 was sequenced on the Roche Genome Sequencer FLX platform. It was found to consist of four replicons, a circular chromosome, and three plasmids. The circular chromosome was predicted to encode 2319 proteins and contains a genomic island and two prophages which significantly differ in G+C-content from the remaining chromosome. It possesses all genes for enzymes of a complete phosphoketolase pathway, whereas two enzymes necessary for glycolysis are lacking. This confirms the classification of L. buchneri CD034 as an obligate heterofermentative lactic acid bacterium. A set of genes considered to be involved in the lactate degradation pathway and genes putatively involved in the breakdown of plant cell wall polymers were identified. Moreover, several genes encoding putative S-layer proteins and two CRISPR systems, belonging to the subclasses I-E and II-A, are located on the chromosome. The largest plasmid pCD034-3 was predicted to encode 57 genes, including a putative polysaccharide synthesis gene cluster, whereas the functions of the two smaller plasmids, pCD034-1 and pCD034-2, remain cryptic. Phylogenetic analysis based on sequence comparison of the conserved marker gene rpoA reveals that L. buchneri CD034 is more closely related to Lactobacillus hilgardii strains than to Lactobacillus brevis and Lactobacillus plantarum strains. Comparison of the L. buchneri CD034 core genome to other fully sequenced and closely related members of the genus Lactobacillus disclosed a high degree of conservation between L. buchneri CD034 and the recently sequenced L. buchneri strain NRRL B-30929 and a more distant relationship to L. buchneri ATCC 11577 and L. brevis ssp. gravesensis ATCC 27305, which cluster together with L. hilgardii type strain ATCC 8290. L. buchneri CD034 genome information will certainly provide the basis for further postgenome studies with the objective to optimize application of the strain in silage production.
284
Liu X, Gao B, Novik V, Galán JE. Quantitative Proteomics of Intracellular Campylobacter jejuni Reveals Metabolic Reprogramming. PLoS Pathog 2012; 8:e1002562. [PMID: 22412372 PMCID: PMC3297583 DOI: 10.1371/journal.ppat.1002562] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Received: 07/20/2011] [Accepted: 01/19/2012] [Indexed: 01/11/2023]
Abstract
Campylobacter jejuni is the major cause of bacterial food-borne illness in the USA and Europe. An important virulence attribute of this bacterial pathogen is its ability to enter and survive within host cells. Here we show through a quantitative proteomic analysis that upon entry into host cells, C. jejuni undergoes a significant metabolic downshift. Furthermore, our results indicate that intracellular C. jejuni reprograms its respiration, favoring the respiration of fumarate. These results explain the poor ability of C. jejuni obtained from infected cells to grow under standard laboratory conditions and provide the bases for the development of novel antimicrobial strategies that would target relevant metabolic pathways. Campylobacter jejuni is one of the most common causes of food-borne illness in the United States and a major cause of diarrheal diseases in developing countries. This pathogen can invade intestinal epithelial cells, which is very important for its ability to cause disease. Once it gains access to epithelial cells, C. jejuni becomes unable to grow under standard growth conditions, although it can grow if pre-incubated under oxygen-limiting conditions. This study compares the protein compositions of C. jejuni grown under standard growth conditions and obtained from within epithelial cells. This analysis indicates that, within cells, C. jejuni undergoes a significant metabolic downshift and reprograms its respiration, favoring the respiration of fumarate. These results may provide the bases for the development of novel antimicrobial strategies that would target relevant metabolic pathways.
Affiliation(s)
- Jorge E. Galán
- Section of Microbial Pathogenesis, Yale University School of Medicine, New Haven, Connecticut, United States of America
285
Seaver SMD, Henry CS, Hanson AD. Frontiers in metabolic reconstruction and modeling of plant genomes. J Exp Bot 2012; 63:2247-58. [PMID: 22238452 DOI: 10.1093/jxb/err371] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 05/20/2023]
Abstract
A major goal of post-genomic biology is to reconstruct and model in silico the metabolic networks of entire organisms. Work on bacteria is well advanced, and is now under way for plants and other eukaryotes. Genome-scale modelling in plants is much more challenging than in bacteria. The challenges come from features characteristic of higher organisms (subcellular compartmentation, tissue differentiation) and also from the particular severity in plants of a general problem: genome content whose functions remain undiscovered. This problem results in thousands of genes for which no function is known ('undiscovered genome content') and hundreds of enzymatic and transport functions for which no gene is yet identified. The severity of the undiscovered genome content problem in plants reflects their genome size and complexity. To bring the challenges of plant genome-scale modelling into focus, we first summarize the current status of plant genome-scale models. We then highlight the challenges - and ways to address them - in three areas: identifying genes for missing processes, modelling tissues as opposed to single cells, and finding metabolic functions encoded by undiscovered genome content. We also discuss the emerging view that a significant fraction of undiscovered genome content encodes functions that counter damage to metabolites inflicted by spontaneous chemical reactions or enzymatic mistakes.
Affiliation(s)
- Samuel M D Seaver
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
|
286
|
Comparative genomics of enterococci: variation in Enterococcus faecalis, clade structure in E. faecium, and defining characteristics of E. gallinarum and E. casseliflavus. mBio 2012; 3:e00318-11. [PMID: 22354958 PMCID: PMC3374389 DOI: 10.1128/mbio.00318-11] [Citation(s) in RCA: 231] [Impact Index Per Article: 17.8] [Indexed: 02/08/2023] Open
Abstract
The enterococci are Gram-positive lactic acid bacteria that inhabit the gastrointestinal tracts of diverse hosts. However, Enterococcus faecium and E. faecalis have emerged as leading causes of multidrug-resistant hospital-acquired infections. The mechanism by which a well-adapted commensal evolved into a hospital pathogen is poorly understood. In this study, we examined high-quality draft genome data for evidence of key events in the evolution of the leading causes of enterococcal infections, including E. faecalis, E. faecium, E. casseliflavus, and E. gallinarum. We characterized two clades within what is currently classified as E. faecium and identified traits characteristic of each, including variation in operons for cell wall carbohydrate and putative capsule biosynthesis. We examined the extent of recombination between the two E. faecium clades and identified two strains with mosaic genomes. We determined the underlying genetics for the defining characteristics of the motile enterococci E. casseliflavus and E. gallinarum. Further, we identified species-specific traits that could be used to advance the detection of medically relevant enterococci and their identification to the species level. The enterococci, in particular, vancomycin-resistant enterococci, have emerged as leading causes of multidrug-resistant hospital-acquired infections. In this study, we examined genome sequence data to define traits with the potential to influence host-microbe interactions and to identify sequences and biochemical functions that could form the basis for the rapid identification of enterococcal species or lineages of importance in clinical and environmental samples.
|
287
|
Tanz SK, Kilian J, Johnsson C, Apel K, Small I, Harter K, Wanke D, Pogson B, Albrecht V. The SCO2 protein disulphide isomerase is required for thylakoid biogenesis and interacts with LHCB1 chlorophyll a/b binding proteins which affects chlorophyll biosynthesis in Arabidopsis seedlings. The Plant Journal: for Cell and Molecular Biology 2012; 69:743-54. [PMID: 22040291 DOI: 10.1111/j.1365-313x.2011.04833.x] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 05/21/2023]
Abstract
The process of chloroplast biogenesis requires a multitude of pathways and processes to establish chloroplast function. In cotyledons of seedlings, chloroplasts develop either directly from proplastids (also named eoplasts) or, if germinated in the dark, via etioplasts, whereas in leaves chloroplasts derive from proplastids in the apical meristem and are then multiplied by division. The snowy cotyledon 2, sco2, mutations specifically disrupt chloroplast biogenesis in cotyledons. SCO2 encodes a chloroplast-localized protein disulphide isomerase, hypothesized to be involved in protein folding. Analysis of co-expressed genes with SCO2 revealed that genes with similar expression patterns encode chloroplast proteins involved in protein translation and in chlorophyll biosynthesis. Indeed, sco2-1 accumulates increased levels of the chlorophyll precursor, protochlorophyllide, in both dark grown cotyledons and leaves. Yeast two-hybrid analyses demonstrated that SCO2 directly interacts with the chlorophyll-binding LHCB1 proteins, being confirmed in planta using bimolecular fluorescence complementation (BIFC). Furthermore, ultrastructural analysis of sco2-1 chloroplasts revealed that formation and movement of transport vesicles from the inner envelope to the thylakoids is perturbed. SCO2 does not interact with the signal recognition particle proteins SRP54 and FtsY, which were shown to be involved in targeting of LHCB1 to the thylakoids. We hypothesize that SCO2 provides an alternative targeting pathway for light-harvesting chlorophyll binding (LHCB) proteins to the thylakoids via transport vesicles predominantly in cotyledons, with the signal recognition particle (SRP) pathway predominant in rosette leaves. Therefore, we propose that SCO2 is involved in the integration of LHCB1 proteins into the thylakoids that feeds back on the regulation of the tetrapyrrole biosynthetic pathway and nuclear gene expression.
Affiliation(s)
- Sandra K Tanz
- ARC Centre of Excellence in Plant Energy Biology, University of Western Australia, 35 Stirling Highway, Crawley, Western Australia, Australia
|
288
|
Bains W, Seager S. A combinatorial approach to biochemical space: description and application to the redox distribution of metabolism. Astrobiology 2012; 12:271-81. [PMID: 22468888 DOI: 10.1089/ast.2011.0718] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 05/02/2023]
Abstract
Redox chemistry is central to life on Earth. It is well known that life uses redox chemistry to capture energy from environmental chemical energy gradients. Here, we propose that a second use of redox chemistry, related to building biomass from environmental carbon, is equally important to life. We apply a method based on chemical structure to evaluate the redox range of different groups of terrestrial biochemicals, and find that they are consistently of intermediate redox range. We hypothesize the common intermediate range is related to the chemical space required for the selection of a consistent set of metabolites. We apply a computational method to show that the redox range of the chemical space shows the same restricted redox range as the biochemicals that are selected from that space. By contrast, the carbon from which life is composed is available in the environment only as fully oxidized or reduced species. We therefore argue that redox chemistry is essential to life for assembling biochemicals for biomass building. This biomass-building reason for life to require redox chemistry is in addition (and in contrast) to life's use of redox chemistry to capture energy. Life's use of redox chemistry for biomass capture will generate chemical by-products (that is, biosignature gases) that are not in redox equilibrium with life's environment. These potential biosignature gases may differ from energy-capture redox biosignatures.
Affiliation(s)
- William Bains
- Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, USA.
|
289
|
Strickler SR, Bombarely A, Mueller LA. Designing a transcriptome next-generation sequencing project for a nonmodel plant species. American Journal of Botany 2012; 99:257-66. [PMID: 22268224 DOI: 10.3732/ajb.1100292] [Citation(s) in RCA: 133] [Impact Index Per Article: 10.2] [Indexed: 05/19/2023]
Abstract
The application of next-generation sequencing (NGS) to transcriptomics, commonly called RNA-seq, allows the nearly complete characterization of transcriptomic events occurring in a specific tissue. It has proven particularly useful in nonmodel species, which often lack the resources available for sequenced organisms. Mainly, RNA-seq does not require a reference genome to gain useful transcriptomic information. In this review, the application of RNA-seq to nonmodel plant species will be addressed. Important experimental considerations from presequencing issues to postsequencing analysis, including sample and platform selection, and useful bioinformatics tools for assembly and data analysis, are covered. Methods of assembling RNA-seq data and analyses commonly performed with RNA-seq data, including single nucleotide polymorphism detection and analysis of differential expression, are explored. In addition, studies that have used RNA-seq to elucidate nonmodel plant transcriptomics are highlighted.
Affiliation(s)
- Susan R Strickler
- Boyce Thompson Institute for Plant Research, Tower Road, Ithaca, New York 14853, USA
|
290
|
Latendresse M, Krummenacker M, Trupp M, Karp PD. Construction and completion of flux balance models from pathway databases. Bioinformatics 2012; 28:388-96. [PMID: 22262672 PMCID: PMC3268246 DOI: 10.1093/bioinformatics/btr681] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022]
Abstract
Motivation: Flux balance analysis (FBA) is a well-known technique for genome-scale modeling of metabolic flux. Typically, an FBA formulation requires the accurate specification of four sets: biochemical reactions, biomass metabolites, nutrients and secreted metabolites. The development of FBA models can be time consuming and tedious because of the difficulty in assembling completely accurate descriptions of these sets, and in identifying errors in the composition of these sets. For example, the presence of a single non-producible metabolite in the biomass will make the entire model infeasible. Other difficulties in FBA modeling are that model distributions, and predicted fluxes, can be cryptic and difficult to understand.
Results: We present a multiple gap-filling method to accelerate the development of FBA models using a new tool, called MetaFlux, based on mixed integer linear programming (MILP). The method suggests corrections to the sets of reactions, biomass metabolites, nutrients and secretions. The method generates FBA models directly from Pathway/Genome Databases. Thus, FBA models developed in this framework are easily queried and visualized using the Pathway Tools software. Predicted fluxes are more easily comprehended by visualizing them on diagrams of individual metabolic pathways or of metabolic maps. MetaFlux can also remove redundant high-flux loops, solve FBA models once they are generated and model the effects of gene knockouts. MetaFlux has been validated through construction of FBA models for Escherichia coli and Homo sapiens.
Availability: Pathway Tools with MetaFlux is freely available to academic users, and for a fee to commercial users. Download from: biocyc.org/download.shtml.
Contact: mario.latendresse@sri.com
Supplementary information: Supplementary data are available at Bioinformatics online.
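The abstract's remark that a single non-producible biomass metabolite makes the whole model infeasible can be illustrated with a much simpler check than FBA. The sketch below is not MetaFlux and does not use MILP; it only propagates reachability forward from the nutrients over a toy, irreversible reaction set. All names (`producible_metabolites`, `blocked_biomass`, the toy reactions and metabolites) are hypothetical and chosen for illustration.

```python
def producible_metabolites(reactions, nutrients):
    """Return the metabolites reachable from the nutrients.

    reactions maps a reaction name to (substrates, products), each a set of
    metabolite names. Every reaction is treated as irreversible (assumption).
    """
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction can fire once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def blocked_biomass(reactions, nutrients, biomass):
    """List biomass metabolites that cannot be reached from the nutrients."""
    return sorted(set(biomass) - producible_metabolites(reactions, nutrients))


# Toy network: nothing produces chorismate, so tryptophan is blocked.
toy_reactions = {
    "r1": ({"glucose"}, {"g6p"}),
    "r2": ({"g6p"}, {"pyruvate"}),
    "r3": ({"pyruvate", "nh4"}, {"alanine"}),
    "r4": ({"chorismate"}, {"tryptophan"}),
}
toy_nutrients = {"glucose", "nh4"}
toy_biomass = {"alanine", "tryptophan"}

print(blocked_biomass(toy_reactions, toy_nutrients, toy_biomass))
```

Reachability of this kind is only a necessary condition: a metabolite can be reachable and still not producible at steady state, which is why a flux-based formulation is needed for the real check.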
Affiliation(s)
- Mario Latendresse
- Bioinformatics Research Group/Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA.
|
291
|
Syed MH, Karpinets TV, Parang M, Leuze MR, Park BH, Hyatt D, Brown SD, Moulton S, Galloway MD, Uberbacher EC. BESC knowledgebase public portal. Bioinformatics 2012; 28:750-1. [PMID: 22238270 PMCID: PMC3289919 DOI: 10.1093/bioinformatics/bts016] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022] Open
Abstract
The BioEnergy Science Center (BESC) is undertaking large experimental campaigns to understand the biosynthesis and biodegradation of biomass and to develop biofuel solutions. BESC is generating large volumes of diverse data, including genome sequences, omics data and assay results. The purpose of the BESC Knowledgebase is to serve as a centralized repository for experimentally generated data and to provide an integrated, interactive and user-friendly analysis framework. The Portal makes available tools for visualization, integration and analysis of data either produced by BESC or obtained from external resources.
Availability: http://besckb.ornl.gov.
Affiliation(s)
- Mustafa H Syed
- BioEnergy Science Center, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
|
292
|
Kumar A, Suthers PF, Maranas CD. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases. BMC Bioinformatics 2012; 13:6. [PMID: 22233419 PMCID: PMC3277463 DOI: 10.1186/1471-2105-13-6] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Received: 06/27/2011] [Accepted: 01/10/2012] [Indexed: 11/10/2022] Open
Abstract
Background: Increasingly, metabolite and reaction information is organized in the form of genome-scale metabolic reconstructions that describe the reaction stoichiometry, directionality, and gene to protein to reaction associations. A key bottleneck in the pace of reconstruction of new, high-quality metabolic models is the inability to directly make use of metabolite/reaction information from biological databases or other models due to incompatibilities in content representation (i.e., metabolites with multiple names across databases and models), stoichiometric errors such as elemental or charge imbalances, and incomplete atomistic detail (e.g., use of generic R-group or non-explicit specification of stereo-specificity).
Description: MetRxn is a knowledgebase that includes standardized metabolite and reaction descriptions by integrating information from BRENDA, KEGG, MetaCyc, Reactome.org and 44 metabolic models into a single unified data set. All metabolite entries have matched synonyms, resolved protonation states, and are linked to unique structures. All reaction entries are elementally and charge balanced. This is accomplished through the use of a workflow of lexicographic, phonetic, and structural comparison algorithms. MetRxn allows for the download of standardized versions of existing genome-scale metabolic models and the use of metabolic information for the rapid reconstruction of new ones.
Conclusions: The standardization in description allows for the direct comparison of the metabolite and reaction content between metabolic models and databases and the exhaustive prospecting of pathways for biotechnological production. This ever-growing dataset currently consists of over 76,000 metabolites participating in more than 72,000 reactions (including unresolved entries). MetRxn is hosted on a web-based platform that uses relational database models (MySQL).
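To make the phrase "elementally and charge balanced" concrete, the following is a minimal sketch of such a check. It is not MetRxn's workflow; the function names, the `(coefficient, formula, charge)` tuple layout, and the simple formula grammar (element symbols followed by optional integer counts, no parentheses or R-groups) are assumptions made for illustration.

```python
import re
from collections import Counter


def element_counts(formula):
    """Parse a simple formula such as 'HCO3' into element counts."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def side_totals(species):
    """Sum elements and charge over one side of a reaction.

    species is a list of (coefficient, formula, charge) tuples.
    """
    elements = Counter()
    charge = 0
    for coefficient, formula, species_charge in species:
        for symbol, count in element_counts(formula).items():
            elements[symbol] += coefficient * count
        charge += coefficient * species_charge
    return dict(elements), charge


def is_balanced(left, right):
    """True if both sides carry the same elements and the same net charge."""
    return side_totals(left) == side_totals(right)


# CO2 + H2O -> HCO3(-) + H(+): balanced in both elements and charge.
hydration = ([(1, "CO2", 0), (1, "H2O", 0)], [(1, "HCO3", -1), (1, "H", 1)])
# Dropping the proton leaves the right side short of one H and one charge.
missing_proton = ([(1, "CO2", 0), (1, "H2O", 0)], [(1, "HCO3", -1)])

print(is_balanced(*hydration))
print(is_balanced(*missing_proton))
```

A check like this only detects imbalances; resolving them (for example by choosing a protonation state) needs the structural information the abstract describes.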
Affiliation(s)
- Akhil Kumar
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA.
|
293
|
Toya Y, Kono N, Arakawa K, Tomita M. Metabolic flux analysis and visualization. J Proteome Res 2012; 10:3313-23. [PMID: 21815690 DOI: 10.1021/pr2002885] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 12/22/2022]
Abstract
One of the ultimate goals of systems biology research is to obtain a comprehensive understanding of the control mechanisms of complex cellular metabolisms. Metabolic Flux Analysis (MFA) is an important method for the quantitative estimation of intracellular metabolic flows through metabolic pathways and the elucidation of cellular physiology. The primary challenge in the use of MFA is that many biological networks are underdetermined systems; it is therefore difficult to narrow down the solution space from the stoichiometric constraints alone. In this tutorial, we present an overview of Flux Balance Analysis (FBA) and ¹³C-Metabolic Flux Analysis (¹³C-MFA), both of which are frequently used to solve such underdetermined systems, and we demonstrate FBA and ¹³C-MFA using the genome-scale model and the central carbon metabolism model, respectively. Furthermore, because such comprehensive study of intracellular fluxes is inherently complex, we subsequently introduce various pathway mapping and visualization tools to facilitate understanding of these data in the context of the pathways. Specific visualizations of MFA results using the BioCyc Omics Viewer and Pathway Projector are shown as illustrative examples.
Affiliation(s)
- Yoshihiro Toya
- Institute for Advanced Biosciences, Keio University, Tsuruoka 997-0017, Japan
|
294
|
Saunders EC, MacRae JI, Naderer T, Ng M, McConville MJ, Likić VA. LeishCyc: a guide to building a metabolic pathway database and visualization of metabolomic data. Methods Mol Biol 2012; 881:505-529. [PMID: 22639224 DOI: 10.1007/978-1-61779-827-6_17] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
The complexity of the metabolic networks in even the simplest organisms has raised new challenges in organizing metabolic information. To address this, specialized computer frameworks have been developed to capture, manage, and visualize metabolic knowledge. The leading databases of metabolic information are those organized under the umbrella of the BioCyc project, which consists of the reference database MetaCyc, and a number of pathway/genome databases (PGDBs) each focussed on a specific organism. A number of PGDBs have been developed for bacterial, fungal, and protozoan pathogens, greatly facilitating dissection of the metabolic potential of these organisms and the identification of new drug targets. Leishmania are protozoan parasites belonging to the family Trypanosomatidae that cause a broad spectrum of diseases in humans. In this work we use the LeishCyc database, the BioCyc database for Leishmania major, to describe how to build a BioCyc database from genomic sequences and associated annotations. By using metabolomic data generated in our group, we show how such databases can be utilized to elucidate specific changes in parasite metabolism.
Affiliation(s)
- Eleanor C Saunders
- Department of Biochemistry and Molecular Biology, University of Melbourne, Parkville, VIC, Australia
|
295
|
Caspi R, Altman T, Dreher K, Fulcher CA, Subhraveti P, Keseler IM, Kothari A, Krummenacker M, Latendresse M, Mueller LA, Ong Q, Paley S, Pujar A, Shearer AG, Travers M, Weerasinghe D, Zhang P, Karp PD. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 2012; 40:D742-53. [PMID: 22102576 PMCID: PMC3245006 DOI: 10.1093/nar/gkr1014] [Citation(s) in RCA: 435] [Impact Index Per Article: 33.5] [Received: 09/29/2011] [Revised: 10/19/2011] [Accepted: 10/21/2011] [Indexed: 11/14/2022] Open
Abstract
The MetaCyc database (http://metacyc.org/) provides a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains more than 1800 pathways derived from more than 30,000 publications, and is the largest curated collection of metabolic pathways currently available. Most reactions in MetaCyc pathways are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes and literature citations. BioCyc (http://biocyc.org/) is a collection of more than 1700 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference database, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs contain additional features, including predicted operons, transport systems and pathway-hole fillers. The BioCyc website and Pathway Tools software offer many tools for querying and analysis of PGDBs, including Omics Viewers and comparative analysis. New developments include a zoomable web interface for diagrams; flux-balance analysis model generation from PGDBs; web services; and a new tool called Web Groups.
Affiliation(s)
- Ron Caspi, Tomer Altman, Kate Dreher, Carol A. Fulcher, Pallavi Subhraveti, Ingrid M. Keseler, Anamika Kothari, Markus Krummenacker, Mario Latendresse, Lukas A. Mueller, Quang Ong, Suzanne Paley, Anuradha Pujar, Alexander G. Shearer, Michael Travers, Deepika Weerasinghe, Peifen Zhang, Peter D. Karp
- SRI International, 333 Ravenswood, Menlo Park, CA 94025, Department of Plant Biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305 and Boyce Thompson Institute for Plant Research, Tower Road, Ithaca, NY 14853, USA
|
296
|
Gerdtzen ZP. Modeling metabolic networks for mammalian cell systems: general considerations, modeling strategies, and available tools. Advances in Biochemical Engineering/Biotechnology 2012; 127:71-108. [PMID: 21984615 DOI: 10.1007/10_2011_120] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/31/2023]
Abstract
Over the past decades, the availability of large amounts of information regarding cellular processes and reaction rates, along with increasing knowledge about the complex mechanisms involved in these processes, has changed the way we approach the understanding of cellular processes. We can no longer rely only on our intuition for interpreting experimental data and evaluating new hypotheses, as the information to analyze is becoming increasingly complex. The paradigm for the analysis of cellular systems has shifted from a focus on individual processes to comprehensive global mathematical descriptions that consider the interactions of metabolic, genomic, and signaling networks. Analysis and simulations are used to test our knowledge by refuting or validating new hypotheses regarding a complex system, which can result in predictive capabilities that lead to better experimental design. Different types of models can be used for this purpose, depending on the type and amount of information available for the specific system. Stoichiometric models are based on the metabolic structure of the system and allow explorations of steady state distributions in the network. Detailed kinetic models provide a description of the dynamics of the system, they involve a large number of reactions with varied kinetic characteristics and require a large number of parameters. Models based on statistical information provide a description of the system without information regarding structure and interactions of the networks involved. The development of detailed models for mammalian cell metabolism has only recently started to grow more strongly, due to the intrinsic complexities of mammalian systems, and the limited availability of experimental information and adequate modeling tools. In this work we review the strategies, tools, current advances, and recent models of mammalian cells, focusing mainly on metabolism, but discussing the methodology applied to other types of networks as well.
Affiliation(s)
- Ziomara P Gerdtzen
- Department of Chemical Engineering and Biotechnology, Millennium Institute for Cell Dynamics and Biotechnology: a Centre for Systems Biology, University of Chile, Beauchef 850, Santiago, Chile
|
297
|
Navid A. Development of constraint-based system-level models of microbial metabolism. Methods Mol Biol 2012; 881:531-549. [PMID: 22639225 DOI: 10.1007/978-1-61779-827-6_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Genome-scale models of metabolism are valuable tools for using genomic information to predict microbial phenotypes. System-level mathematical models of metabolic networks have been developed for a number of microbes and have been used to gain new insights into the biochemical conversions that occur within organisms and permit their survival and proliferation. Utilizing these models, computational biologists can (1) examine network structures, (2) predict metabolic capabilities and resolve unexplained experimental observations, (3) generate and test new hypotheses, (4) assess the nutritional requirements of the organism and approximate its environmental niche, (5) identify missing enzymatic functions in the annotated genome, and (6) engineer desired metabolic capabilities in model organisms. This chapter details the protocol for developing genome-scale models of metabolism in microbes as well as tips for accelerating the model building process.
Affiliation(s)
- Ali Navid
- Biosciences and Biotechnology Division, Physics and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA.
|
298
|
Latendresse M, Paley S, Karp PD. Browsing metabolic and regulatory networks with BioCyc. Methods Mol Biol 2012; 804:197-216. [PMID: 22144155 DOI: 10.1007/978-1-61779-361-5_11] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 01/02/2023]
Abstract
The BioCyc database collection at BioCyc.org integrates genome and cellular network information for more than 1,100 organisms. This method chapter describes Web-based tools for browsing metabolic and regulatory networks within BioCyc. These tools allow visualization of complete metabolic and regulatory networks, and allow the user to zoom-in on regions of the network of interest. The user can find objects of interest such as genes and metabolites within the networks, and can selectively examine the connectivity of the network. The EcoCyc database within the BioCyc collection has been extensively curated. The descriptions within EcoCyc of the Escherichia coli metabolic network and regulatory network were derived from thousands of publications. Other BioCyc databases received moderate levels of curation, or no curation at all. Those databases receiving no curation contain metabolic networks that were computationally inferred from the annotated genome sequences of each organism.
|
299
|
Abstract
The PathoLogic component of the Pathway Tools software performs prediction of metabolic pathways in sequenced and annotated genomes. This article provides a detailed presentation of the PathoLogic algorithm. The algorithm consists of two phases. The reactome inference phase infers the reactions catalyzed by the organism from the set of enzymes present in the annotated genome. The pathway inference phase infers the metabolic pathways present in the organism from the reactions catalyzed by the organism. Both phases draw on the MetaCyc database of metabolic reactions and pathways. MetaCyc contains two data fields to support pathway inference: the expected taxonomic range of each pathway, and a list of key reactions for pathways. These fields have significantly increased the predictive accuracy of PathoLogic.
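The two phases described above can be sketched in a few lines. This is a toy illustration, not the PathoLogic algorithm: the data layout, the function names, and in particular the `min_fraction` threshold are assumptions invented here, since the abstract does not say how the taxonomic range and key reactions are weighed.

```python
from dataclasses import dataclass, field


@dataclass
class Pathway:
    name: str
    reactions: set
    key_reactions: set = field(default_factory=set)
    taxonomic_range: set = field(default_factory=set)


def infer_reactome(annotated_enzymes, enzyme_to_reactions):
    """Phase 1: collect the reactions catalyzed by the annotated enzymes."""
    reactome = set()
    for enzyme in annotated_enzymes:
        reactome |= enzyme_to_reactions.get(enzyme, set())
    return reactome


def infer_pathways(reactome, pathways, organism_taxa, min_fraction=0.5):
    """Phase 2: keep pathways supported by the reactome.

    Hypothetical rule: the organism must fall in the expected taxonomic
    range (if one is given), every key reaction must be present, and at
    least min_fraction of the pathway's reactions must be present.
    """
    predicted = []
    for pathway in pathways:
        if pathway.taxonomic_range and not (pathway.taxonomic_range & organism_taxa):
            continue
        if not pathway.key_reactions <= reactome:
            continue
        fraction = len(pathway.reactions & reactome) / len(pathway.reactions)
        if fraction >= min_fraction:
            predicted.append(pathway.name)
    return predicted


# Toy reference data standing in for MetaCyc (all identifiers are made up).
enzyme_to_reactions = {"hexokinase": {"R1"}, "pfk": {"R2"}, "rubisco": {"R9"}}
reference_pathways = [
    Pathway("glycolysis-toy", {"R1", "R2", "R3"}, {"R2"}, {"Bacteria", "Eukaryota"}),
    Pathway("calvin-toy", {"R9", "R10"}, {"R9"}, {"Viridiplantae"}),
]

reactome = infer_reactome(["hexokinase", "pfk", "rubisco"], enzyme_to_reactions)
print(infer_pathways(reactome, reference_pathways, {"Bacteria"}))
```

In the toy run, the second pathway is rejected on taxonomic range alone even though its key reaction is present, which is the kind of false positive the two MetaCyc fields are meant to suppress.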
Affiliation(s)
- Peter D Karp
- Bioinformatics Research Group, SRI International 333 Ravenswood Ave, EK207, Menlo Park, CA 94025
|
300
|
Cattle genomics and its implications for future nutritional strategies for dairy cattle. Animal 2011; 7 Suppl 1:172-83. [PMID: 23031138 DOI: 10.1017/s1751731111002588] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/17/2022]
Abstract
The recently sequenced cattle (Bos taurus) genome revealed the unique genomic features of the species and provided the molecular basis for a systems approach that links genomic information to metabolic traits. Comparative analysis has identified a variety of evolutionary adaptive features in the cattle genome, such as an expansion of the gene families related to rumen function, a large number of chromosomal rearrangements affecting the regulation of genes for lactation, and chromosomal rearrangements that are associated with segmental duplications and copy number variations. Metabolic reconstruction of the cattle genome has revealed that core metabolic pathways are highly conserved among mammals, although five metabolic genes are deleted or highly diverged and seven metabolic genes are present in duplicate in the cattle genome compared to their human counterparts. The evolutionary loss and gain of metabolic genes in the cattle genome may reflect metabolic adaptations of cattle. Metabolic reconstruction also provides a platform for better understanding metabolic regulation in cattle and other ruminants. A substantial body of transcriptomics data from dairy and beef cattle under different nutritional management and across different stages of growth and lactation is already available and will aid in linking the genome with the metabolism and nutritional physiology of cattle. The application of cattle genomics has great potential for the future development of nutritional strategies to improve the efficiency and sustainability of beef and milk production. One of the biggest challenges is to integrate genomic and phenotypic data and interpret them in a biologically and practically meaningful way. Systems biology, a holistic and systemic approach, will be very useful in overcoming this challenge.
|