1
Hai Y, Ma J, Yang K, Wen Y. Bayesian linear mixed model with multiple random effects for prediction analysis on high-dimensional multi-omics data. Bioinformatics 2023; 39:btad647. [PMID: 37882747 PMCID: PMC10627352 DOI: 10.1093/bioinformatics/btad647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 09/24/2023] [Accepted: 10/24/2023] [Indexed: 10/27/2023] Open
Abstract
MOTIVATION Accurate disease risk prediction is an essential step in the modern quest for precision medicine. While high-dimensional multi-omics data have provided unprecedented data resources for prediction studies, their high dimensionality and complex inter/intra-relationships have posed significant analytical challenges. RESULTS We proposed a two-step Bayesian linear mixed model framework (TBLMM) for risk prediction analysis on multi-omics data. TBLMM models the predictive effects from multi-omics data using a hybrid of sparse regression and a linear mixed model with multiple random effects. It can approximate the shape of the true effect-size distributions and accounts for non-linear effects, including interactions, among multi-omics data via kernel fusion. It infers its parameters via a computationally efficient variational Bayes algorithm. Through extensive simulation studies and prediction analyses of positron emission tomography imaging outcomes using data obtained from the Alzheimer's Disease Neuroimaging Initiative, we have demonstrated that TBLMM can consistently outperform the existing method in predicting the risk of complex traits. AVAILABILITY AND IMPLEMENTATION The corresponding R package is available on GitHub (https://github.com/YaluWen/TBLMM).
Affiliation(s)
- Yang Hai
  - Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
  - Department of Statistics, University of Auckland, Auckland 1010, New Zealand
- Jixiang Ma
  - Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
- Kaixin Yang
  - Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
- Yalu Wen
  - Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
  - Department of Statistics, University of Auckland, Auckland 1010, New Zealand
2
Mori M, Cheng C, Taylor BR, Okano H, Hwa T. Functional decomposition of metabolism allows a system-level quantification of fluxes and protein allocation towards specific metabolic functions. Nat Commun 2023; 14:4161. [PMID: 37443156 PMCID: PMC10345195 DOI: 10.1038/s41467-023-39724-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/07/2022] [Accepted: 06/20/2023] [Indexed: 07/15/2023] Open
Abstract
Quantifying the contribution of individual molecular components to complex cellular processes is a grand challenge in systems biology. Here we establish a general theoretical framework (Functional Decomposition of Metabolism, FDM) to quantify the contribution of every metabolic reaction to metabolic functions, e.g. the synthesis of biomass building blocks. FDM allowed for a detailed quantification of the energy and biosynthesis budget for growing Escherichia coli cells. Surprisingly, the ATP generated during the biosynthesis of building blocks from glucose almost balances the demand from protein synthesis, the largest energy expenditure known for growing cells. This leaves the bulk of the energy generated by fermentation and respiration unaccounted for, thus challenging the common notion that energy is a key growth-limiting resource. Moreover, FDM together with proteomics enables the quantification of enzymes contributing towards each metabolic function, allowing for a first-principle formulation of a coarse-grained model of global protein allocation based on the structure of the metabolic network.
Affiliation(s)
- Matteo Mori
  - Department of Physics, University of California San Diego, 9500 Gilman Dr. La Jolla, San Diego, CA, 92093, USA
- Chuankai Cheng
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Brian R Taylor
  - Department of Physics, University of California San Diego, 9500 Gilman Dr. La Jolla, San Diego, CA, 92093, USA
- Hiroyuki Okano
  - Department of Physics, University of California San Diego, 9500 Gilman Dr. La Jolla, San Diego, CA, 92093, USA
- Terence Hwa
  - Department of Physics, University of California San Diego, 9500 Gilman Dr. La Jolla, San Diego, CA, 92093, USA
3
Thomas JP, Modos D, Korcsmaros T, Brooks-Warburton J. Network Biology Approaches to Achieve Precision Medicine in Inflammatory Bowel Disease. Front Genet 2021; 12:760501. [PMID: 34745229 PMCID: PMC8566351 DOI: 10.3389/fgene.2021.760501] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/18/2021] [Accepted: 10/08/2021] [Indexed: 12/22/2022] Open
Abstract
Inflammatory bowel disease (IBD) is a chronic immune-mediated condition arising due to complex interactions between multiple genetic and environmental factors. Despite recent advances, the pathogenesis of the condition is not fully understood and patients still experience suboptimal clinical outcomes. Over the past few years, investigators are increasingly capturing multi-omics data from patient cohorts to better characterise the disease. However, reaching clinically translatable endpoints from these complex multi-omics datasets is an arduous task. Network biology, a branch of systems biology that utilises mathematical graph theory to represent, integrate and analyse biological data through networks, will be key to addressing this challenge. In this narrative review, we provide an overview of various types of network biology approaches that have been utilised in IBD including protein-protein interaction networks, metabolic networks, gene regulatory networks and gene co-expression networks. We also include examples of multi-layered networks that have combined various network types to gain deeper insights into IBD pathogenesis. Finally, we discuss the need to incorporate other data sources including metabolomic, histopathological, and high-quality clinical meta-data. Together with more robust network data integration and analysis frameworks, such efforts have the potential to realise the key goal of precision medicine in IBD.
Affiliation(s)
- John P Thomas
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
  - Department of Gastroenterology, Norfolk and Norwich University Hospital, Norwich, United Kingdom
- Dezso Modos
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Tamas Korcsmaros
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Johanne Brooks-Warburton
  - Department of Gastroenterology, Lister Hospital, Stevenage, United Kingdom
  - Department of Clinical, Pharmaceutical and Biological Sciences, University of Hertfordshire, Hatfield, United Kingdom
4
Bardozzo F, Lió P, Tagliaferri R. Signal metrics analysis of oscillatory patterns in bacterial multi-omic networks. Bioinformatics 2021; 37:1411-1419. [PMID: 33185666 DOI: 10.1093/bioinformatics/btaa966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2020] [Revised: 09/25/2020] [Accepted: 11/03/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION One of the branches of Systems Biology is focused on a deep understanding of underlying regulatory networks through the analysis of the biomolecules oscillations and their interplay. Synthetic Biology exploits gene or/and protein regulatory networks towards the design of oscillatory networks for producing useful compounds. Therefore, at different levels of application and for different purposes, the study of biomolecular oscillations can lead to different clues about the mechanisms underlying living cells. It is known that network-level interactions involve more than one type of biomolecule as well as biological processes operating at multiple omic levels. Combining network/pathway-level information with genetic information it is possible to describe well-understood or unknown bacterial mechanisms and organism-specific dynamics. RESULTS Following the methodologies used in signal processing and communication engineering, a methodology is introduced to identify and quantify the extent of multi-omic oscillations. These are due to the process of multi-omic integration and depend on the gene positions on the chromosome. Ad hoc signal metrics are designed to allow further biotechnological explanations and provide important clues about the oscillatory nature of the pathways and their regulatory circuits. Our algorithms designed for the analysis of multi-omic signals are tested and validated on 11 different bacteria for thousands of multi-omic signals perturbed at the network level by different experimental conditions. Information on the order of genes, codon usage, gene expression and protein molecular weight is integrated at three different functional levels. Oscillations show interesting evidence that network-level multi-omic signals present a synchronized response to perturbations and evolutionary relations along taxa. 
AVAILABILITY AND IMPLEMENTATION The algorithms, the code (in language R), the tool, the pipeline and the whole dataset of multi-omic signal metrics are available at: https://github.com/lodeguns/Multi-omicSignals. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pietro Lió
  - Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK
5
An Integrated Approach to Adaptive Control and Supervisory Optimisation of HVAC Control Systems for Demand Response Applications. ENERGIES 2021. [DOI: 10.3390/en14082078] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
Heating, ventilating, and air-conditioning (HVAC) systems account for a large percentage of energy consumption in buildings. Implementation of efficient optimisation and control mechanisms has been identified as one crucial way to help reduce and shift HVAC systems’ energy consumption to both save economic costs and foster improved integration with renewables. This has led to the development of various control techniques, some of which have produced promising results. However, very few of these control mechanisms have fully considered important factors such as electricity time of use (TOU) price information, occupant thermal comfort, computational complexity, and nonlinear HVAC dynamics to design a demand response schema. In this paper, a novel two-stage integrated approach for such is proposed and evaluated. A model predictive control (MPC)-based optimiser for supervisory setpoint control is integrated with a digital parameter-adaptive controller for use in a demand response/demand management environment. The optimiser is designed to shift the heating load (and hence electrical load) to off-peak periods by minimising a trade-off between thermal comfort and electricity costs, generating a setpoint trajectory for the inner loop HVAC tracking controller. The tracking controller provides HVAC model information to the outer loop for calibration purposes. By way of calibrated simulations, it was found that significant energy saving and cost reduction could be achieved in comparison to a traditional on/off or variable HVAC control system with a fixed setpoint temperature.
6
Patra P, Das M, Kundu P, Ghosh A. Recent advances in systems and synthetic biology approaches for developing novel cell-factories in non-conventional yeasts. Biotechnol Adv 2021; 47:107695. [PMID: 33465474 DOI: 10.1016/j.biotechadv.2021.107695] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 09/13/2020] [Revised: 12/14/2020] [Accepted: 01/09/2021] [Indexed: 12/14/2022]
Abstract
Microbial bioproduction of chemicals, proteins, and primary metabolites from cheap carbon sources is currently an advancing area in industrial research. The model yeast, Saccharomyces cerevisiae, is a well-established biorefinery host that has been used extensively for commercial manufacturing of bioethanol from myriad carbon sources. However, its Crabtree-positive nature often limits the use of this organism for the biosynthesis of commercial molecules that do not belong in the fermentative pathway. To avoid extensive strain engineering of S. cerevisiae for the production of metabolites other than ethanol, non-conventional yeasts can be selected as hosts based on their natural capacity to produce desired commodity chemicals. Non-conventional yeasts like Kluyveromyces marxianus, K. lactis, Yarrowia lipolytica, Pichia pastoris, Scheffersomyces stipitis, Hansenula polymorpha, and Rhodotorula toruloides have been considered as potential industrial eukaryotic hosts owing to their desirable phenotypes such as thermotolerance, assimilation of a wide range of carbon sources, as well as ability to secrete high titers of protein and lipid. However, the advanced metabolic engineering efforts in these organisms are still lacking due to the limited availability of systems and synthetic biology methods like in silico models, well-characterised genetic parts, and optimized genome engineering tools. This review provides an insight into the recent advances and challenges of systems and synthetic biology as well as metabolic engineering endeavours towards the commercial usage of non-conventional yeasts. Particularly, the approaches in emerging non-conventional yeasts for the production of enzymes, therapeutic proteins, lipids, and metabolites for commercial applications are extensively discussed here. 
Various attempts to address current limitations in designing novel cell factories have been highlighted that include the advances in the fields of genome-scale metabolic model reconstruction, flux balance analysis, 'omics'-data integration into models, genome-editing toolkit development, and rewiring of cellular metabolisms for desired chemical production. Additionally, the understanding of metabolic networks using 13C-labelling experiments as well as the utilization of metabolomics in deciphering intracellular fluxes and reactions have also been discussed here. Application of cutting-edge nuclease-based genome editing platforms like CRISPR/Cas9, and its optimization towards efficient strain engineering in non-conventional yeasts have also been described. Additionally, the impact of the advances in promising non-conventional yeasts for efficient commercial molecule synthesis has been meticulously reviewed. In the future, a cohesive approach involving systems and synthetic biology will help in widening the horizon of the use of unexplored non-conventional yeast species towards industrial biotechnology.
Affiliation(s)
- Pradipta Patra
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Manali Das
  - School of Bioscience, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
7
A review of methods for the reconstruction and analysis of integrated genome-scale models of metabolism and regulation. Biochem Soc Trans 2020; 48:1889-1903. [DOI: 10.1042/bst20190840] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/08/2020] [Revised: 07/16/2020] [Accepted: 08/21/2020] [Indexed: 02/07/2023]
Abstract
The current survey aims to describe the main methodologies for extending the reconstruction and analysis of genome-scale metabolic models and phenotype simulation with Flux Balance Analysis mathematical frameworks, via the integration of Transcriptional Regulatory Networks and/or gene expression data. Although the surveyed methods are aimed at improving phenotype simulations obtained from these models, the perspective of reconstructing integrated genome-scale models of metabolism and gene expression for diverse prokaryotes is still an open challenge.
8
Frioux C, Singh D, Korcsmaros T, Hildebrand F. From bag-of-genes to bag-of-genomes: metabolic modelling of communities in the era of metagenome-assembled genomes. Comput Struct Biotechnol J 2020; 18:1722-1734. [PMID: 32670511 PMCID: PMC7347713 DOI: 10.1016/j.csbj.2020.06.028] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 03/19/2020] [Revised: 06/16/2020] [Accepted: 06/17/2020] [Indexed: 12/12/2022] Open
Abstract
Metagenomic sequencing of complete microbial communities has greatly enhanced our understanding of the taxonomic composition of microbiotas. This has led to breakthrough developments in bioinformatic disciplines such as assembly, gene clustering, metagenomic binning of species genomes and the discovery of an incredible, so far undiscovered, taxonomic diversity. However, functional annotations and estimating metabolic processes from single species - or communities - is still challenging. Earlier approaches relied mostly on inferring the presence of key enzymes for metabolic pathways in the whole metagenome, ignoring the genomic context of such enzymes, resulting in the 'bag-of-genes' approach to estimate functional capacities of microbiotas. Here, we review recent developments in metagenomic bioinformatics, with a special focus on emerging technologies to simulate and estimate metabolic information, that can be derived from metagenomic assembled genomes. Genome-scale metabolic models can be used to model the emergent properties of microbial consortia and whole communities, and the progress in this area is reviewed. While this subfield of metagenomics is still in its infancy, it is becoming evident that there is a dire need for further bioinformatic tools to address the complex combinatorial problems in modelling the metabolism of large communities as a 'bag-of-genomes'.
Affiliation(s)
- Clémence Frioux
  - Inria, CNRS, INRAE Bordeaux, France
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Dipali Singh
  - Microbes in the Food Chain, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Tamas Korcsmaros
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
  - Digital Biology, Earlham Institute, Norwich, Norfolk, UK
- Falk Hildebrand
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
  - Digital Biology, Earlham Institute, Norwich, Norfolk, UK
9
Ma T, Zhang A. Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE). BMC Genomics 2019; 20:944. [PMID: 31856727 PMCID: PMC6923820 DOI: 10.1186/s12864-019-6285-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of -omics data corresponds to one feature space, such as gene expression, miRNA expression, DNA methylation, etc. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial to elucidate molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promises in uncovering intricate relationships among molecular features. However, due to the "big p, small n" problem (i.e., small sample sizes with high-dimensional features), training a large-scale generalizable deep learning model with multi-omics data alone is very challenging. RESULTS We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to certain constraints specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduced a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method for predicting target clinical variables. CONCLUSIONS To alleviate the overfitting problem in deep learning on multi-omics data with the "big p, small n" problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. 
It is very promising to design machine learning models that facilitate the seamless integration of large-scale multi-omics data and biomedical domain knowledge for uncovering intricate relationships among molecular features and clinical features.
Affiliation(s)
- Tianle Ma
  - Department of Computer Science and Engineering, University at Buffalo, 338 Davis Hall, Buffalo, 14260 NY USA
- Aidong Zhang
  - Department of Computer Science, University of Virginia, 509 Rice Hall, Charlottesville, 22904 VA USA
10
Torkhovskaya TI, Zakharova TS, Korotkevich EI, Ipatova OM, Markin SS. Human Blood Plasma Lipidome: Opportunities and Prospects of Its Analysis in Medical Chemistry. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2019. [DOI: 10.1134/s106816201905011x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
11
Zampieri G, Vijayakumar S, Yaneske E, Angione C. Machine and deep learning meet genome-scale metabolic modeling. PLoS Comput Biol 2019; 15:e1007084. [PMID: 31295267 PMCID: PMC6622478 DOI: 10.1371/journal.pcbi.1007084] [Citation(s) in RCA: 174] [Impact Index Per Article: 29.0] [Indexed: 12/25/2022] Open
Abstract
Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. The development and application of these methodological frameworks have occurred independently for the most part, whereas the potential of their integration for biological, biomedical, and biotechnological research is less known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically-agnostic learning process.
Affiliation(s)
- Guido Zampieri
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Supreeta Vijayakumar
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Elisabeth Yaneske
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
  - Healthcare Innovation Centre, Teesside University, Middlesbrough, United Kingdom
12
Ulva lactuca, A Source of Troubles and Potential Riches. Mar Drugs 2019; 17:md17060357. [PMID: 31207947 PMCID: PMC6627311 DOI: 10.3390/md17060357] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 04/17/2019] [Revised: 06/04/2019] [Accepted: 06/12/2019] [Indexed: 01/15/2023] Open
Abstract
Ulva lactuca is a green macro alga involved in devastating green tides observed worldwide. These green tides or blooms are a consequence of human activities. Ulva blooms occur mainly in shallow waters and the decomposition of this alga can produce dangerous vapors. Ulva lactuca is a species usually resembling lettuce, but genetic analyses demonstrated that other green algae with tubular phenotypes were U. lactuca clades although previously described as different species or even genera. The capacity for U. lactuca to adopt different phenotypes can be due to environment parameters, such as the degree of water salinity or symbiosis with bacteria. No efficient ways have been discovered to control these green tides, but the Mediterranean seas appear to be protected from blooms, which disappear rapidly in springtime. Ulva contains commercially valuable components, such as bioactive compounds, food or biofuel. The biomass due to this alga collected on beaches every year is beginning to be valorized to produce valuable compounds. This review describes different processes and strategies developed to extract these different valuable components.
13
Human Systems Biology and Metabolic Modelling: A Review-From Disease Metabolism to Precision Medicine. BIOMED RESEARCH INTERNATIONAL 2019; 2019:8304260. [PMID: 31281846 PMCID: PMC6590590 DOI: 10.1155/2019/8304260] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/01/2018] [Revised: 02/07/2019] [Accepted: 05/20/2019] [Indexed: 01/06/2023]
Abstract
In cell and molecular biology, metabolism is the only system that can be fully simulated at genome scale. Metabolic systems biology offers powerful abstraction tools to simulate all known metabolic reactions in a cell, therefore providing a snapshot that is close to its observable phenotype. In this review, we cover the 15 years of human metabolic modelling. We show that, although the past five years have not experienced large improvements in the size of the gene and metabolite sets in human metabolic models, their accuracy is rapidly increasing. We also describe how condition-, tissue-, and patient-specific metabolic models shed light on cell-specific changes occurring in the metabolic network, therefore predicting biomarkers of disease metabolism. We finally discuss current challenges and future promising directions for this research field, including machine/deep learning and precision medicine. In the omics era, profiling patients and biological processes from a multiomic point of view is becoming more common and less expensive. Starting from multiomic data collected from patients and N-of-1 trials where individual patients constitute different case studies, methods for model-building and data integration are being used to generate patient-specific models. Coupled with state-of-the-art machine learning methods, this will allow characterizing each patient's disease phenotype and delivering precision medicine solutions, therefore leading to preventative medicine, reduced treatment, and in silico clinical trials.
14
Vignani R, Liò P, Scali M. How to integrate wet lab and bioinformatics procedures for wine DNA admixture analysis and compositional profiling: Case studies and perspectives. PLoS One 2019; 14:e0211962. [PMID: 30753217 PMCID: PMC6376920 DOI: 10.1371/journal.pone.0211962] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/21/2018] [Accepted: 01/24/2019] [Indexed: 01/03/2023] Open
Abstract
The varietal authentication of wines is fundamental for assessing wine quality, and it is part of its compositional profiling. The availability of historical, cultural and chemical composition information is extremely important for quality evaluation. DNA-based techniques are a powerful tool for proving the varietal composition of a wine. SSR-amplification of genomic residual Vitis vinifera DNA, namely Wine DNA Fingerprinting (WDF) is able to produce strong, analytical evidence concerning the monovarietal nature of a wine, and for blended wines by generating the probability of the presence/absence of a certain variety, all in association with a dedicated bioinformatics elaboration of genotypes associated with possible varietal candidates. Together with WDF we could exploit Bioinformatics techniques, due to the number of grape genomes grown. In this paper, the use of WDF and the development of a bioinformatics tool for allelic data validation, retrieved from the amplification of 7 to 10 SSRs markers in the Vitis vinifera genome, are reported. The wines were chosen based on increasing complexity; from monovarietal, experimental ones, to commercial monovarietals, to blended commercial wines. The results demonstrate that WDF, after calculation of different distance matrices and Neighbor-Joining input data, followed by Principal Component Analysis (PCA) can effectively describe the varietal nature of wines. In the unknown blended wines the WDF profiles were compared to possible varietal candidates (Merlot, Pinot Noir, Cabernet Sauvignon and Zinfandel), and the output graphs show the most probable varieties used in the blend as closeness to the tested wine. This pioneering work should be meant as to favor in perspective the multidisciplinary building-up of on-line databanks and bioinformatics toolkits on wine. The paper concludes with a discussion on an integrated decision support system based on bioinformatics, chemistry and cultural data to assess wine quality.
Collapse
Affiliation(s)
- Rita Vignani
- Department of Life Science, University of Siena, Siena, Italy
- Serge-genomics, Siena, Italy
| | - Pietro Liò
- Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
| | - Monica Scali
- Department of Life Science, University of Siena, Siena, Italy
| |
|
15
|
Di Stefano A, Scatà M, Vijayakumar S, Angione C, La Corte A, Liò P. Social dynamics modeling of chrono-nutrition. PLoS Comput Biol 2019; 15:e1006714. [PMID: 30699206 PMCID: PMC6370249 DOI: 10.1371/journal.pcbi.1006714] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/05/2017] [Revised: 02/11/2019] [Accepted: 12/14/2018] [Indexed: 12/13/2022] Open
Abstract
Gut microbiota and human relationships are strictly connected to each other. What we eat reflects our body-mind connection and synchronizes with people around us. However, how this impacts on gut microbiota and, conversely, how gut bacteria influence our dietary behaviors has not been explored yet. To quantify the complex dynamics of this interplay between gut and human behaviors we explore the "gut-human behavior axis" and its evolutionary dynamics in a real-world scenario represented by the social multiplex network. We consider a dual type of similarity, homophily and gut similarity, other than psychological and unconscious biases. We analyze the dynamics of social and gut microbial communities, quantifying the impact of human behaviors on diets and gut microbial composition and, backwards, through a control mechanism. Meal timing mechanisms and "chrono-nutrition" play a crucial role in feeding behaviors, along with the quality and quantity of food intake. Considering a population of shift workers, we explore the dynamic interplay between their eating behaviors and gut microbiota, modeling the social dynamics of chrono-nutrition in a multiplex network. Our findings allow us to quantify the relation between human behaviors and gut microbiota through the methodological introduction of gut metabolic modeling and statistical estimators, able to capture their dynamic interplay. Moreover, we find that the timing of gut microbial communities is slower than social interactions and shift-working, and the impact of shift-working on the dynamics of chrono-nutrition is a fluctuation of strategies with a major propensity for defection (e.g. high-fat meals). 
A deeper understanding of the relation between gut microbiota and the dietary behavioral patterns, by embedding also the related social aspects, allows improving the overall knowledge about metabolic models and their implications for human health, opening the possibility to design promising social therapeutic dietary interventions.
Affiliation(s)
- Alessandro Di Stefano
- Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
| | - Marialisa Scatà
- Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
| | - Supreeta Vijayakumar
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
| | - Aurelio La Corte
- Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
| | - Pietro Liò
- Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
| |
|
16
|
Abstract
BACKGROUND Ageing can be classified in two different ways, chronological ageing and biological ageing. While chronological age is a measure of the time that has passed since birth, biological (also known as transcriptomic) ageing is defined by how time and the environment affect an individual in comparison to other individuals of the same chronological age. Recent research studies have shown that transcriptomic age is associated with certain genes, and that each of those genes has an effect size. Using these effect sizes we can calculate the transcriptomic age of an individual from their age-associated gene expression levels. The limitation of this approach is that it does not consider how these changes in gene expression affect the metabolism of individuals and hence their observable cellular phenotype. RESULTS We propose a method based on poly-omic constraint-based models and machine learning in order to further the understanding of transcriptomic ageing. We use normalised CD4 T-cell gene expression data from peripheral blood mononuclear cells in 499 healthy individuals to create individual metabolic models. These models are then combined with a transcriptomic age predictor and chronological age to provide new insights into the differences between transcriptomic and chronological ageing. As a result, we propose a novel metabolic age predictor. CONCLUSIONS We show that our poly-omic predictors provide a more detailed analysis of transcriptomic ageing compared to gene-based approaches, and represent a basis for furthering our knowledge of the ageing mechanisms in human cells.
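The effect-size calculation described in the abstract can be sketched as follows. This assumes the simplest possible form, a weighted sum of expression levels over age-associated genes. The gene names, effect sizes and the function `transcriptomic_age_score` are hypothetical, and the published predictor may involve scaling or calibration steps that are not described here.

```python
def transcriptomic_age_score(expression, effect_sizes):
    """Weighted sum of expression levels over age-associated genes.

    expression:   gene name -> normalised expression level for one individual
    effect_sizes: gene name -> effect size of that gene on age
    Genes missing from either mapping are ignored.
    """
    shared_genes = expression.keys() & effect_sizes.keys()
    return sum(effect_sizes[g] * expression[g] for g in shared_genes)

# Hypothetical values, for illustration only.
effect_sizes = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {"GENE_A": 2.0, "GENE_B": 2.0, "GENE_C": 1.0}

score = transcriptomic_age_score(expression, effect_sizes)
```

`GENE_C` has no effect size and so does not contribute. A score of this kind only reflects expression levels, which is the limitation the abstract points out: it says nothing about the metabolic consequences of those changes.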
Affiliation(s)
- Elisabeth Yaneske
- Department of Computer Science and Information Systems, Teesside University, Borough Road, Middlesbrough, UK
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Borough Road, Middlesbrough, UK
| |
|
17
|
Bardozzo F, Lió P, Tagliaferri R. A study on multi-omic oscillations in Escherichia coli metabolic networks. BMC Bioinformatics 2018; 19:194. [PMID: 30066640 PMCID: PMC6069781 DOI: 10.1186/s12859-018-2175-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Two important challenges in the analysis of molecular biology information are data (multi-omic information) integration and the detection of patterns across large scale molecular networks and sequences. They are actually coupled because the integration of omic information may provide better means to detect multi-omic patterns that could reveal multi-scale or emerging properties at the phenotype levels. RESULTS Here we address the problem of integrating various types of molecular information (a large collection of gene expression and sequence data, codon usage and protein abundances) to analyse the E. coli metabolic response to treatments at the whole network level. Our algorithm, MORA (Multi-omic relations adjacency), is able to detect patterns which may represent metabolic network motifs at pathway and supra pathway levels which could hint at some functional role. We provide a description and insights on the algorithm by testing it on a large database of responses to antibiotics. Along with the algorithm MORA, a novel model for the analysis of oscillating multi-omics has been proposed. Interestingly, the resulting analysis suggests that some motifs reveal recurring oscillating or position variation patterns on multi-omics metabolic networks. Our framework, implemented in R, provides effective and friendly means to design intervention scenarios on real data. By analysing how multi-omics data build up multi-scale phenotypes, the software makes it possible to compare and test metabolic models, design new pathways or redesign existing metabolic pathways and validate in silico metabolic models using nearby species. CONCLUSIONS The integration of multi-omic data reveals that E. coli multi-omic metabolic networks contain position dependent and recurring patterns which could provide clues of long range correlations in the bacterial genome.
Affiliation(s)
- Francesco Bardozzo
- NeuRoNe Lab, DISA-MIS, University of Salerno, Via Giovanni Paolo II 132, Salerno, 84084 Fisciano, Italy
| | - Pietro Lió
- Computer Laboratory, Department of Computer Science, University of Cambridge, 15 JJ Thomson Ave, Cambridge, CB3 0FD, UK
| | - Roberto Tagliaferri
- NeuRoNe Lab, DISA-MIS, University of Salerno, Via Giovanni Paolo II 132, Salerno, 84084 Fisciano, Italy.
| |
|
18
|
Clinical trans-omics: an integration of clinical phenomes with molecular multiomics. Cell Biol Toxicol 2018; 34:163-166. [PMID: 29691682 DOI: 10.1007/s10565-018-9431-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 04/16/2018] [Accepted: 04/16/2018] [Indexed: 10/17/2022]
|
19
|
Zeng ISL, Lumley T. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science). Bioinform Biol Insights 2018; 12:1177932218759292. [PMID: 29497285 PMCID: PMC5824897 DOI: 10.1177/1177932218759292] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 10/29/2017] [Accepted: 01/24/2018] [Indexed: 12/14/2022] Open
Abstract
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, integrating knowledge from biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features, and the Bayesian approach when there is prior knowledge to be integrated, are also included in the commentary. For the completeness of the review, a table of currently available software and packages for omics from 23 publications is summarized in the appendix.
Affiliation(s)
- Irene Sui Lan Zeng
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
| | - Thomas Lumley
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
| |
|
20
|
Occhipinti A, Eyassu F, Rahman TJ, Rahman PKSM, Angione C. In silico engineering of Pseudomonas metabolism reveals new biomarkers for increased biosurfactant production. PeerJ 2018; 6:e6046. [PMID: 30588397 PMCID: PMC6301282 DOI: 10.7717/peerj.6046] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/02/2018] [Accepted: 10/30/2018] [Indexed: 01/29/2023] Open
Abstract
BACKGROUND Rhamnolipids, biosurfactants with a wide range of biomedical applications, are amphiphilic molecules produced on the surfaces of or excreted extracellularly by bacteria including Pseudomonas aeruginosa. However, Pseudomonas putida is a non-pathogenic model organism with greater metabolic versatility and potential for industrial applications. METHODS We investigate in silico the metabolic capabilities of P. putida for rhamnolipids biosynthesis using statistical, metabolic and synthetic engineering approaches after introducing key genes (RhlA and RhlB) from P. aeruginosa into a genome-scale model of P. putida. This pipeline combines machine learning methods with multi-omic modelling, and drives the engineered P. putida model toward an optimal production and export of rhamnolipids out of the membrane. RESULTS We identify a substantial increase in synthesis of rhamnolipids by the engineered model compared to the control model. We apply statistical and machine learning techniques on the metabolic reaction rates to identify distinct features on the structure of the variables and individual components driving the variation of growth and rhamnolipids production. We finally provide a computational framework for integrating multi-omics data and identifying latent pathways and genes for the production of rhamnolipids in P. putida. CONCLUSIONS We anticipate that our results will provide a versatile methodology for integrating multi-omics data for topological and functional analysis of P. putida toward maximization of biosurfactant production.
Affiliation(s)
- Annalisa Occhipinti
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
| | - Filmon Eyassu
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
| | - Thahira J. Rahman
- Technology Futures Institute, School of Science, Engineering and Design, Teesside University, Middlesbrough, UK
| | - Pattanathu K. S. M. Rahman
- Technology Futures Institute, School of Science, Engineering and Design, Teesside University, Middlesbrough, UK
- Institute of Biological and Biomedical Sciences, School of Biological Sciences, University of Portsmouth, Portsmouth, UK
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
| |
|
21
|
Fernandes K, Chicco D, Cardoso JS, Fernandes J. Supervised deep learning embeddings for the prediction of cervical cancer diagnosis. PeerJ Comput Sci 2018; 4:e154. [PMID: 33816808 PMCID: PMC7924508 DOI: 10.7717/peerj-cs.154] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/17/2018] [Accepted: 04/26/2018] [Indexed: 05/12/2023]
Abstract
Cervical cancer remains a significant cause of mortality all around the world, even if it can be prevented and cured by removing affected tissues in early stages. Providing universal and efficient access to cervical screening programs is a challenge that requires identifying vulnerable individuals in the population, among other steps. In this work, we present a computationally automated strategy for predicting the outcome of the patient biopsy, given risk patterns from individual medical records. We propose a machine learning technique that allows a joint and fully supervised optimization of dimensionality reduction and classification models. We also build a model able to highlight relevant properties in the low dimensional space, to ease the classification of patients. We instantiated the proposed approach with deep learning architectures, and achieved accurate prediction results (top area under the curve AUC = 0.6875) which outperform previously developed methods, such as denoising autoencoders. Additionally, we explored some clinical findings from the embedding spaces, and we validated them through the medical literature, making them reliable for physicians and biomedical researchers.
Affiliation(s)
- Kelwin Fernandes
- Instituto de Engenharia de Sistemas e Computadores Tecnologia e Ciencia (INESC TEC), Porto, Portugal
- Universidade do Porto, Porto, Portugal
| | - Davide Chicco
- Princess Margaret Cancer Centre, Toronto, ON, Canada
| | - Jaime S. Cardoso
- Instituto de Engenharia de Sistemas e Computadores Tecnologia e Ciencia (INESC TEC), Porto, Portugal
- Universidade do Porto, Porto, Portugal
| | | |
|
22
|
Vijayakumar S, Conway M, Lió P, Angione C. Optimization of Multi-Omic Genome-Scale Models: Methodologies, Hands-on Tutorial, and Perspectives. Methods Mol Biol 2018; 1716:389-408. [PMID: 29222764 DOI: 10.1007/978-1-4939-7528-0_18] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/21/2023]
Abstract
Genome-scale metabolic models are valuable tools for assessing the metabolic potential of living organisms. Being downstream of gene expression, metabolism is increasingly being used as an indicator of the phenotypic outcome for drugs and therapies. We here present a review of the principal methods used for constraint-based modelling in systems biology, and explore how the integration of multi-omic data can be used to improve phenotypic predictions of genome-scale metabolic models. We believe that the large-scale comparison of the metabolic response of an organism to different environmental conditions will be an important challenge for genome-scale models. Therefore, within the context of multi-omic methods, we describe a tutorial for multi-objective optimization using the metabolic and transcriptomics adaptation estimator (METRADE), implemented in MATLAB. METRADE uses microarray and codon usage data to model bacterial metabolic response to environmental conditions (e.g., antibiotics, temperatures, heat shock). Finally, we discuss key considerations for the integration of multi-omic networks into metabolic models, towards automatically extracting knowledge from such models.
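Multi-objective optimization returns a set of trade-off solutions, not a single optimum. The minimal sketch below filters candidate solutions down to the non-dominated (Pareto-optimal) ones when both objectives are maximized. It is not METRADE, which the abstract describes as a MATLAB implementation. The function names and the candidate scores are hypothetical, and the two objectives could stand for, say, growth rate and production of a target metabolite.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective_1, objective_2) scores of five candidate solutions.
candidates = [(0.9, 0.1), (0.6, 0.6), (0.5, 0.5), (0.1, 0.9), (0.6, 0.2)]
front = pareto_front(candidates)
```

Here `(0.5, 0.5)` and `(0.6, 0.2)` are both dominated by `(0.6, 0.6)`, so three trade-off solutions remain. The pairwise filter is quadratic in the number of candidates, which is acceptable for a small illustration but not how a large-scale optimizer would maintain its front.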
Affiliation(s)
- Supreeta Vijayakumar
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, Tees Valley TS1 3BX, UK
| | - Max Conway
- Computer Laboratory, University of Cambridge, Cambridge, UK
| | - Pietro Lió
- Computer Laboratory, University of Cambridge, Cambridge, UK
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, Tees Valley TS1 3BX, UK.
| |
|
23
|
|
24
|
Eyassu F, Angione C. Modelling pyruvate dehydrogenase under hypoxia and its role in cancer metabolism. R Soc Open Sci 2017; 4:170360. [PMID: 29134060 PMCID: PMC5666243 DOI: 10.1098/rsos.170360] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/20/2017] [Accepted: 09/25/2017] [Indexed: 05/18/2023]
Abstract
Metabolism is the only biological system that can be fully modelled at genome scale. As a result, metabolic models have been increasingly used to study the molecular mechanisms of various diseases. Hypoxia, a low-oxygen tension, is a well-known characteristic of many cancer cells. Pyruvate dehydrogenase (PDH) controls the flux of metabolites between glycolysis and the tricarboxylic acid cycle and is a key enzyme in metabolic reprogramming in cancer metabolism. Here, we develop and manually curate a constraint-based metabolic model to investigate the mechanism of pyruvate dehydrogenase under hypoxia. Our results characterize the activity of pyruvate dehydrogenase and its decline during hypoxia. This results in lactate accumulation, consistent with recent hypoxia studies and a well-known feature in cancer metabolism. We apply machine-learning techniques on the flux datasets to identify reactions that drive these variations. We also identify distinct features on the structure of the variables and individual metabolic components in the switch from normoxia to hypoxia. Our results provide a framework for future studies by integrating multi-omics data to predict condition-specific metabolic phenotypes under hypoxia.
|
25
|
Tian Z, Guo M, Wang C, Xing L, Wang L, Zhang Y. Constructing an integrated gene similarity network for the identification of disease genes. J Biomed Semantics 2017; 8:32. [PMID: 29297379 PMCID: PMC5763299 DOI: 10.1186/s13326-017-0141-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. RESULTS We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by literature. CONCLUSIONS RWRB is an effective and reliable algorithm in prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/.
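The random walk with restart step named in the abstract can be illustrated on a toy network. This is a minimal sketch of the generic algorithm on a single-layer, unweighted, undirected graph. RWRB itself runs on a phenotype-gene bilayer network, which is not reproduced here. The gene names, the restart probability of 0.7 and the function `random_walk_with_restart` are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency: node -> list of neighbours (symmetric, no isolated nodes)
    seeds:     set of nodes where the walker restarts
    Returns a dict node -> steady-state visiting probability.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: jump back to a seed with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its remaining mass evenly to neighbours.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Hypothetical chain network G1 - G2 - G3 - G4 with G1 as the known disease gene.
network = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2", "G4"], "G4": ["G3"]}
scores = random_walk_with_restart(network, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes are then prioritized by their steady-state probability, so genes closer to the seed in the network rank higher. In this chain the ranking simply follows distance from `G1`.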
Affiliation(s)
- Zhen Tian
- School of Computer Science and Engineering, Harbin Institute of Technology, Harbin, 150001 People’s Republic of China
| | - Maozu Guo
- School of Computer Science and Engineering, Harbin Institute of Technology, Harbin, 150001 People’s Republic of China
| | - Chunyu Wang
- School of Computer Science and Engineering, Harbin Institute of Technology, Harbin, 150001 People’s Republic of China
| | - LinLin Xing
- School of Computer Science and Engineering, Harbin Institute of Technology, Harbin, 150001 People’s Republic of China
| | - Lei Wang
- Institute of Health Service and Medical Information Academy of Military Medical Sciences Beijing, Beijing, 100850 China
| | - Yin Zhang
- Institute of Health Service and Medical Information Academy of Military Medical Sciences Beijing, Beijing, 100850 China
| |
|
26
|
Angione C. Integrating splice-isoform expression into genome-scale models characterizes breast cancer metabolism. Bioinformatics 2017; 34:494-501. [DOI: 10.1093/bioinformatics/btx562] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/06/2017] [Accepted: 09/06/2017] [Indexed: 12/20/2022] Open
Affiliation(s)
- Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
| |
|
27
|
Oshota O, Conway M, Fookes M, Schreiber F, Chaudhuri RR, Yu L, Morgan FJE, Clare S, Choudhary J, Thomson NR, Lio P, Maskell DJ, Mastroeni P, Grant AJ. Transcriptome and proteome analysis of Salmonella enterica serovar Typhimurium systemic infection of wild type and immune-deficient mice. PLoS One 2017; 12:e0181365. [PMID: 28796780 PMCID: PMC5552096 DOI: 10.1371/journal.pone.0181365] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/24/2017] [Accepted: 06/19/2017] [Indexed: 01/09/2023] Open
Abstract
Salmonella enterica are a threat to public health. Current vaccines are not fully effective. The ability to grow in infected tissues within phagocytes is required for S. enterica virulence in systemic disease. As the infection progresses the bacteria are exposed to a complex host immune response. Consequently, in order to continue growing in the tissues, S. enterica requires the coordinated regulation of fitness genes. Bacterial gene regulation has so far been investigated largely using exposure to artificial environmental conditions or to in vitro cultured cells, and little information is available on how S. enterica adapts in vivo to sustain cell division and survival. We have studied the transcriptome, proteome and metabolic flux of Salmonella, and the transcriptome of the host during infection of wild type C57BL/6 and immune-deficient gp91-/-phox mice. Our analyses advance the understanding of how S. enterica and the host behave during infection to a more sophisticated level than has previously been reported.
Affiliation(s)
- Olusegun Oshota
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Max Conway
- Computer Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, United Kingdom
| | - Maria Fookes
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Fernanda Schreiber
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Roy R. Chaudhuri
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Lu Yu
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Fiona J. E. Morgan
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Simon Clare
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Jyoti Choudhary
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Nicholas R. Thomson
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- The London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Pietro Lio
- Computer Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, United Kingdom
| | - Duncan J. Maskell
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Pietro Mastroeni
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Andrew J. Grant
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| |
|
28
|
Ye R, Huang M, Lu H, Qian J, Lin W, Chu J, Zhuang Y, Zhang S. Comprehensive reconstruction and evaluation of Pichia pastoris genome-scale metabolic model that accounts for 1243 ORFs. Bioresour Bioprocess 2017; 4:22. [PMID: 28546903 PMCID: PMC5423920 DOI: 10.1186/s40643-017-0152-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 03/28/2017] [Revised: 04/17/2017] [Accepted: 05/02/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Pichia pastoris is one of the most important cell factories for production of industrial enzymes and heterogenous proteins. A high-quality genome-scale metabolic model is crucial for comprehensive understanding of P. pastoris metabolism. METHODS In this paper, we upgraded the P. pastoris genome-scale metabolic model based on a combination of the latest genome annotations and the literature. The performance of the new model was then evaluated using the Cobra Toolbox v2.0. RESULTS Compared with the recently published model iMT1026, the reaction number in the new model iRY1243 was increased from 2035 to 2407 and the metabolite number was increased from 1018 to 1094. Accordingly, the unique ORF number was increased from 1026 to 1243. To improve the metabolic functions of the P. pastoris genome-scale metabolic model, the biosynthesis pathways of vitamins and cofactors were carefully added. iRY1243 performed well when predicting the growth capability on most of the reported carbon and nitrogen sources, the metabolic flux distribution with glucose as a sole carbon source, the essential and partially essential genes, and the effects of gene deletion or overexpression on cell growth and S-adenosyl-l-methionine production. CONCLUSION iRY1243 is an upgraded P. pastoris genome-scale metabolic model with significant improvements in metabolic coverage and prediction ability, and thus it will be a potential platform for further systematic investigation of P. pastoris metabolism.
Affiliation(s)
- Rui Ye
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Mingzhi Huang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Hongzhong Lu
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Jiangchao Qian
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Weilu Lin
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Ju Chu
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Yingping Zhuang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| | - Siliang Zhang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, No.130, Meilong Road, Shanghai, 200237 China
| |
|
29
|
Kashaf SS, Angione C, Lió P. Making life difficult for Clostridium difficile: augmenting the pathogen's metabolic model with transcriptomic and codon usage data for better therapeutic target characterization. BMC Syst Biol 2017; 11:25. [PMID: 28209199 PMCID: PMC5314682 DOI: 10.1186/s12918-017-0395-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 06/23/2016] [Accepted: 01/13/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Clostridium difficile is a bacterium which can infect various animal species, including humans. Infection with this bacterium is a leading healthcare-associated illness. A better understanding of this organism and the relationship between its genotype and phenotype is essential to the search for an effective treatment. Genome-scale metabolic models contain all known biochemical reactions of a microorganism and can be used to investigate this relationship. RESULTS We present icdf834, an updated metabolic network of C. difficile that builds on iMLTC806cdf and features 1227 reactions, 834 genes, and 807 metabolites. We used this metabolic network to reconstruct the metabolic landscape of this bacterium. The standard metabolic model cannot account for changes in the bacterial metabolism in response to different environmental conditions. To account for this limitation, we also integrated transcriptomic data, which details the gene expression of the bacterium in a wide array of environments. Importantly, to bridge the gap between gene expression levels and protein abundance, we accounted for the synonymous codon usage bias of the bacterium in the model. To our knowledge, this is the first time codon usage has been quantified and integrated into a metabolic model. The metabolic fluxes were defined as a function of protein abundance. To determine potential therapeutic targets using the model, we conducted gene essentiality and metabolic pathway sensitivity analyses and calculated flux control coefficients. We obtained 92.3% accuracy in predicting gene essentiality when compared to experimental data for C. difficile R20291 (ribotype 027) homologs. We validated our context-specific metabolic models using sensitivity and robustness analyses and compared model predictions with literature on C. difficile. The model predicts interesting facets of the bacterium's metabolism, such as changes in the bacterium's growth in response to different environmental conditions. 
CONCLUSIONS After an extensive validation process, we used icdf834 to obtain state-of-the-art predictions of therapeutic targets for C. difficile. We show how context-specific metabolic models augmented with codon usage information can be a beneficial resource for better understanding C. difficile and for identifying novel therapeutic targets. We remark that our approach can be applied to investigate and treat other pathogens.
Affiliation(s)
- Sara Saheb Kashaf
- Computer Laboratory, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD UK
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Borough road, Middlesbrough, TS1 3BA UK
| | - Pietro Lió
- Computer Laboratory, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD UK
| |
|