1
Khachatryan L, Xiang Y, Ivanov A, Glaab E, Graham G, Granata I, Giordano M, Maddalena L, Piccirillo M, Manipur I, Baruzzo G, Cappellato M, Avot B, Stan A, Battey J, Lo Sasso G, Boue S, Ivanov NV, Peitsch MC, Hoeng J, Falquet L, Di Camillo B, Guarracino MR, Ulyantsev V, Sierro N, Poussin C. Results and lessons learned from the sbv IMPROVER metagenomics diagnostics for inflammatory bowel disease challenge. Sci Rep 2023; 13:6303. [PMID: 37072468 PMCID: PMC10113391 DOI: 10.1038/s41598-023-33050-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/29/2022] [Accepted: 04/06/2023] [Indexed: 05/03/2023] [Open Access]
Abstract
A growing body of evidence links gut microbiota changes with inflammatory bowel disease (IBD), raising the potential benefit of exploiting metagenomics data for non-invasive IBD diagnostics. The sbv IMPROVER metagenomics diagnosis for inflammatory bowel disease challenge investigated computational metagenomics methods for discriminating IBD and nonIBD subjects. Participants in this challenge were given independent training and test metagenomics data from IBD and nonIBD subjects, which could be either raw read data (sub-challenge 1, SC1) or processed taxonomy- and function-based profiles (sub-challenge 2, SC2). A total of 81 anonymized submissions were received between September 2019 and March 2020. Most participants' predictions performed better than random predictions in classifying IBD versus nonIBD, Ulcerative Colitis (UC) versus nonIBD, and Crohn's Disease (CD) versus nonIBD. However, discrimination between UC and CD remains challenging, with the classification quality similar to the set of random predictions. We analyzed the class prediction accuracy, the metagenomics features used by the teams, and the computational methods used. These results will be openly shared with the scientific community to help advance IBD research and illustrate the application of a range of computational methodologies for effective metagenomic classification.
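The comparison against random predictions described in this abstract can be illustrated with a small sketch. It assumes AUROC as the scoring metric and uses made-up labels and scores; the abstract does not specify the challenge's actual scoring procedure, and the function names `auroc` and `random_baseline` are hypothetical.

```python
import random

def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def random_baseline(labels, n_draws=1000, seed=0):
    """AUROC values obtained from uniformly random scores (the null reference)."""
    rng = random.Random(seed)
    return [auroc(labels, [rng.random() for _ in labels]) for _ in range(n_draws)]

# Hypothetical toy data (not from the challenge): 1 = IBD, 0 = nonIBD.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.7, 0.4, 0.3, 0.2, 0.1]

observed = auroc(labels, scores)
baseline = random_baseline(labels)
# Fraction of random predictions that score at least as well as the submission.
p_empirical = sum(b >= observed for b in baseline) / len(baseline)
print(f"AUROC = {observed:.4f}, empirical p = {p_empirical:.3f}")
```

A submission counts as better than random in this sketch when only a small fraction of the random draws reach its AUROC.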
Affiliation(s)
- Lusine Khachatryan
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Yang Xiang
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Artem Ivanov
  - ITMO University, St. Petersburg, Russian Federation
- Enrico Glaab
  - University of Luxembourg, Luxembourg, Luxembourg
- Adrian Stan
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- James Battey
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Giuseppe Lo Sasso
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Stephanie Boue
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Nikolai V Ivanov
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Manuel C Peitsch
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Julia Hoeng
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Nicolas Sierro
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
- Carine Poussin
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
2
Altay G, Zapardiel-Gonzalo J, Peters B. RNA-seq preprocessing and sample size considerations for gene network inference. bioRxiv 2023:2023.01.02.522518. [PMID: 36711979 PMCID: PMC9881880 DOI: 10.1101/2023.01.02.522518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Background Gene network inference (GNI) methods have the potential to reveal functional relationships between different genes and their products. Most GNI algorithms have been developed for microarray gene expression datasets and their application to RNA-seq data is relatively recent. As the characteristics of RNA-seq data are different from microarray data, it is an unanswered question what preprocessing methods for RNA-seq data should be applied prior to GNI to attain optimal performance, or what the required sample size for RNA-seq data is to obtain reliable GNI estimates. Results We ran 9144 analyses of 7 different RNA-seq datasets to evaluate 300 different preprocessing combinations that include data transformations, normalizations and association estimators. We found that there was no single best performing preprocessing combination but that there were several good ones. The performance varied widely over various datasets, which emphasized the importance of choosing an appropriate preprocessing configuration before GNI. Two preprocessing combinations appeared promising in general: First, Log-2 TPM (transcripts per million) with Variance-stabilizing transformation (VST) and Pearson Correlation Coefficient (PCC) association estimator. Second, raw RNA-seq count data with PCC. Along with these two, we also identified 18 other good preprocessing combinations. Any of these algorithms might perform best in different datasets. Therefore, the GNI performances of these approaches should be measured on any new dataset to select the best performing one for it. In terms of the required biological sample size of RNA-seq data, we found that between 30 and 85 samples were required to generate reliable GNI estimates. Conclusions This study provides practical recommendations on default choices for data preprocessing prior to GNI analysis of RNA-seq data to obtain optimal performance results.
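One of the preprocessing combinations named in this abstract, log-2 TPM followed by a Pearson correlation association estimate, can be sketched roughly as follows. The counts and gene lengths are invented, the `+ 1` pseudocount is an assumption of this sketch, and the variance-stabilizing transformation mentioned in the abstract is deliberately left out, so this is not the authors' pipeline.

```python
import math

def tpm(counts, lengths_kb):
    """Convert one sample's raw counts to transcripts per million (TPM)."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]  # reads per kilobase
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy data: rows are samples, columns are genes.
counts = [
    [100, 200, 50],
    [150, 310, 40],
    [80, 150, 70],
    [200, 390, 30],
]
lengths_kb = [1.0, 2.0, 0.5]  # assumed gene lengths in kilobases

# log2(TPM + 1) per sample; the pseudocount avoids log2(0).
log_tpm = [[math.log2(v + 1) for v in tpm(sample, lengths_kb)] for sample in counts]
profiles = list(zip(*log_tpm))  # one expression profile per gene
pcc = [[pearson(a, b) for b in profiles] for a in profiles]

for row in pcc:
    print(["%.3f" % v for v in row])
```

The resulting matrix `pcc` holds the pairwise association scores from which a network would be thresholded; with only four samples it is purely illustrative, given that the abstract reports 30 to 85 samples as the range needed for reliable estimates.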
Affiliation(s)
- Gökmen Altay
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
- Bjoern Peters
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
3
A strategy to incorporate prior knowledge into correlation network cutoff selection. Nat Commun 2020; 11:5153. [PMID: 33056991 PMCID: PMC7560866 DOI: 10.1038/s41467-020-18675-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/19/2018] [Accepted: 08/27/2020] [Indexed: 12/16/2022] [Open Access]
Abstract
Correlation networks are frequently used to statistically extract biological interactions between omics markers. Network edge selection is typically based on the statistical significance of the correlation coefficients. This procedure, however, is not guaranteed to capture biological mechanisms. We here propose an alternative approach for network reconstruction: a cutoff selection algorithm that maximizes the overlap of the inferred network with available prior knowledge. We first evaluate the approach on IgG glycomics data, for which the biochemical pathway is known and well-characterized. Importantly, even in the case of incomplete or incorrect prior knowledge, the optimal network is close to the true optimum. We then demonstrate the generalizability of the approach with applications to untargeted metabolomics and transcriptomics data. For the transcriptomics case, we demonstrate that the optimized network is superior to statistical networks in systematically retrieving interactions that were not included in the biological reference used for optimization.
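The idea of choosing a correlation cutoff by its agreement with prior knowledge can be sketched as below. Scoring the overlap with a one-sided Fisher's exact test is an assumption of this sketch, as are the toy correlations, the prior edge set, and the function names; the abstract itself only states that the overlap with prior knowledge is maximized.

```python
from math import comb

def fisher_right_tail(overlap, n_selected, n_prior, n_total):
    """One-sided Fisher's exact test: P(X >= overlap) under the hypergeometric null."""
    upper = min(n_selected, n_prior)
    favourable = sum(
        comb(n_prior, i) * comb(n_total - n_prior, n_selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(n_total, n_selected)

def best_cutoff(abs_corr, prior_edges):
    """Return (cutoff, p_value) for the network that best matches the prior."""
    n_total = len(abs_corr)
    n_prior = len(prior_edges & set(abs_corr))
    results = []
    for cutoff in sorted(set(abs_corr.values())):
        # Network at this cutoff: every pair whose |r| reaches the threshold.
        selected = {edge for edge, r in abs_corr.items() if r >= cutoff}
        overlap = len(selected & prior_edges)
        p = fisher_right_tail(overlap, len(selected), n_prior, n_total)
        results.append((p, cutoff))
    p, cutoff = min(results)  # smallest p-value = strongest enrichment
    return cutoff, p

# Hypothetical absolute correlations for all pairs among five markers a..e.
abs_corr = {
    ("a", "b"): 0.92, ("a", "c"): 0.85, ("b", "c"): 0.80, ("a", "d"): 0.55,
    ("b", "d"): 0.50, ("c", "d"): 0.30, ("a", "e"): 0.25, ("b", "e"): 0.20,
    ("c", "e"): 0.15, ("d", "e"): 0.10,
}
# Hypothetical prior knowledge: edges known from a reference pathway.
prior_edges = {("a", "b"), ("a", "c"), ("b", "c")}

cutoff, p_value = best_cutoff(abs_corr, prior_edges)
print(cutoff, p_value)
```

In this toy case the selected cutoff is the one at which the network contains exactly the three prior edges; stricter cutoffs lose known edges and looser ones add unsupported edges, both of which weaken the enrichment.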
4
Transcriptomic determinants of the response of ST-111 Pseudomonas aeruginosa AG1 to ciprofloxacin identified by a top-down systems biology approach. Sci Rep 2020; 10:13717. [PMID: 32792590 PMCID: PMC7427096 DOI: 10.1038/s41598-020-70581-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/11/2020] [Accepted: 06/25/2020] [Indexed: 12/13/2022] [Open Access]
Abstract
Pseudomonas aeruginosa is an opportunistic pathogen that thrives in diverse environments and causes a variety of human infections. Pseudomonas aeruginosa AG1 (PaeAG1) is a high-risk sequence type 111 (ST-111) strain isolated from a Costa Rican hospital in 2010. PaeAG1 has both blaVIM-2 and blaIMP-18 genes encoding for metallo-β-lactamases, and it is resistant to β-lactams (including carbapenems), aminoglycosides, and fluoroquinolones. Ciprofloxacin (CIP) is an antibiotic commonly used to treat P. aeruginosa infections, and it is known to produce DNA damage, triggering a complex molecular response. In order to evaluate the effects of a sub-inhibitory CIP concentration on PaeAG1, growth curves using increasing CIP concentrations were compared. We then measured gene expression using RNA-Seq at three time points (0, 2.5 and 5 h) after CIP exposure to identify the transcriptomic determinants of the response (i.e. hub genes, gene clusters and enriched pathways). Changes in expression were determined using differential expression analysis and network analysis using a top–down systems biology approach. A hybrid model using database-based and co-expression analysis approaches was implemented to predict gene–gene interactions. We observed a reduction of the growth curve rate as the sub-inhibitory CIP concentrations were increased. In the transcriptomic analysis, we detected that over time CIP treatment resulted in the differential expression of 518 genes, showing a complex impact at the molecular level. The transcriptomic determinants were 14 hub genes, multiple gene clusters at different levels (associated to hub genes or as co-expression modules) and 15 enriched pathways. Down-regulation of genes implicated in several metabolism pathways, virulence elements and ribosomal activity was observed. In contrast, amino acid catabolism, RpoS factor, proteases, and phenazines genes were up-regulated. 
Remarkably, > 80 resident-phage genes were up-regulated after CIP treatment, which was validated at phenomic level using a phage plaque assay. Thus, reduction of the growth curve rate and increasing phage induction was evidenced as the CIP concentrations were increased. In summary, transcriptomic and network analyses, as well as the growth curves and phage plaque assays provide evidence that PaeAG1 presents a complex, concentration-dependent response to sub-inhibitory CIP exposure, showing pleiotropic effects at the systems level. Manipulation of these determinants, such as phage genes, could be used to gain more insights about the regulation of responses in PaeAG1 as well as the identification of possible therapeutic targets. To our knowledge, this is the first report of the transcriptomic analysis of CIP response in a ST-111 high-risk P. aeruginosa strain, in particular using a top-down systems biology approach.
5
De Bastiani MA, Klamt F. Integrated transcriptomics reveals master regulators of lung adenocarcinoma and novel repositioning of drug candidates. Cancer Med 2019; 8:6717-6729. [PMID: 31503425 PMCID: PMC6825976 DOI: 10.1002/cam4.2493] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/19/2019] [Revised: 07/18/2019] [Accepted: 07/31/2019] [Indexed: 12/24/2022] [Open Access]
Abstract
BACKGROUND Lung adenocarcinoma is the major cause of cancer-related deaths in the world. Given this, the importance of research on its pathophysiology and therapy remains a key health issue. To assist in this endeavor, recent oncology studies are adopting Systems Biology approaches and bioinformatics to analyze and understand omics data, bringing new insights about this disease and its treatment. METHODS We used reverse engineering of transcriptomic data to reconstruct nontumorous lung reference networks, focusing on transcription factors (TFs) and their inferred target genes, referred to as regulatory units or regulons. Afterwards, we used 13 case-control studies to identify TFs acting as master regulators of the disease and their regulatory units. Furthermore, the inferred activation patterns of regulons were used to evaluate patient survival and search for drug candidates for repositioning. RESULTS The regulatory units under the influence of ATOH8, DACH1, EPAS1, ETV5, FOXA2, FOXM1, HOXA4, SMAD6, and UHRF1 transcription factors were consistently associated with the pathological phenotype, suggesting that they may be master regulators of lung adenocarcinoma. We also observed that the inferred activity of FOXA2, FOXM1, and UHRF1 was significantly associated with risk of death in patients. Finally, we obtained deptropine, promazine, valproic acid, azacyclonol, methotrexate, and ChemBridge ID compound 5109870 as potential candidates to revert the molecular profile leading to decreased survival. CONCLUSION Using an integrated transcriptomics approach, we identified master regulator candidates involved in the development and prognosis of lung adenocarcinoma, as well as potential drugs for repurposing.
Affiliation(s)
- Marco Antônio De Bastiani
  - Laboratory of Cellular Biochemistry, Department of Biochemistry, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
  - National Institute of Science and Technology for Translational Medicine (INCT-TM), Porto Alegre, RS, Brazil
- Fábio Klamt
  - Laboratory of Cellular Biochemistry, Department of Biochemistry, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
  - National Institute of Science and Technology for Translational Medicine (INCT-TM), Porto Alegre, RS, Brazil
6
Khatibipour MJ, Kurtoğlu F, Çakır T. JacLy: a Jacobian-based method for the inference of metabolic interactions from the covariance of steady-state metabolome data. PeerJ 2018; 6:e6034. [PMID: 30564518 PMCID: PMC6286809 DOI: 10.7717/peerj.6034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/23/2018] [Accepted: 10/30/2018] [Indexed: 11/20/2022] [Open Access]
Abstract
Reverse engineering metabolome data to infer metabolic interactions is a challenging research topic. Here we introduce JacLy, a Jacobian-based method to infer metabolic interactions of small networks (<20 metabolites) from the covariance of steady-state metabolome data. The approach was applied to two different in silico small-scale metabolome datasets. The power of JacLy lies in the use of steady-state metabolome data to predict the Jacobian matrix of the system, which is a source of information on the structure and dynamic characteristics of the system. Besides its advantage of inferring directed interactions, its superiority over correlation-based network inference was especially clear in terms of the required number of replicates and the effect of the use of a priori knowledge in the inference. Additionally, we showed the use of the standard deviation of the replicate data as a suitable approximation for the magnitudes of metabolite fluctuations inherent in the system.
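For orientation, approaches that link a steady-state covariance matrix to a Jacobian commonly rest on a Lyapunov-type relation for small fluctuations around the steady state. The abstract does not state the equation, so the form below and its symbols ($J$, $C$, $D$) are assumptions of this note rather than the authors' exact formulation.

```latex
% J : Jacobian matrix of the system at steady state (the unknown to infer)
% C : covariance matrix of the steady-state metabolite data (measured)
% D : fluctuation matrix describing the magnitudes of intrinsic fluctuations
J\,C + C\,J^{\top} = -2D
```

Under this reading, the abstract's remark about replicate standard deviations would correspond to a way of approximating the entries of $D$.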
Affiliation(s)
- Mohammad Jafar Khatibipour
  - Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
  - Department of Chemical Engineering, Gebze Technical University, Gebze, Kocaeli, Turkey
- Furkan Kurtoğlu
  - Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
- Tunahan Çakır
  - Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
7
Bekkar A, Estreicher A, Niknejad A, Casals-Casas C, Bridge A, Xenarios I, Dorier J, Crespo I. Expert curation for building network-based dynamical models: a case study on atherosclerotic plaque formation. Database (Oxford) 2018; 2018:4960931. [PMID: 29688381 PMCID: PMC5887269 DOI: 10.1093/database/bay031] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/31/2017] [Accepted: 03/07/2018] [Indexed: 12/13/2022]
Abstract
Knowledgebases play an increasingly important role in scientific research, where the expert curation of biological knowledge in forms that are amenable to computational analysis (using ontologies, for example) provides a significant added value and enables new types of computational analyses for high throughput datasets. In this work, we demonstrate how expert curation can also play a more direct role in research, by supporting the use of network-based dynamical models to study a specific biological process. This curation effort is focused on the regulatory interactions between biological entities, such as genes or proteins and compounds, which may interact with each other in a complex manner, including regulatory complexes and conditional dependencies between co-regulators. This critical information has to be captured and encoded in a computable manner, which is currently far beyond the capabilities of automatically constructed networks. As a case study, we report here the prior knowledge network constructed by the sysVASC consortium to model the biological events leading to the formation of atherosclerotic plaques during the onset of cardiovascular disease, and discuss some specific examples to illustrate the main pitfalls and added value provided by the expert curation during this endeavor. Database URL: http://biomodels.caltech.edu
Affiliation(s)
- Amel Bekkar
  - Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, 1015 Lausanne, Switzerland
- Anne Estreicher
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Michel Servet, 1211 Geneva 4, Switzerland
- Anne Niknejad
  - Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, 1015 Lausanne, Switzerland
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Michel Servet, 1211 Geneva 4, Switzerland
- Cristina Casals-Casas
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Michel Servet, 1211 Geneva 4, Switzerland
- Alan Bridge
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Michel Servet, 1211 Geneva 4, Switzerland
- Ioannis Xenarios
  - Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, 1015 Lausanne, Switzerland
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Michel Servet, 1211 Geneva 4, Switzerland
- Julien Dorier
  - Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, 1015 Lausanne, Switzerland
- Isaac Crespo
  - Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, 1015 Lausanne, Switzerland
8
Mohorianu I, Fowler EK, Dalmay T, Chapman T. Control of seminal fluid protein expression via regulatory hubs in Drosophila melanogaster. Proc Biol Sci 2018; 285:20181681. [PMID: 30257913 PMCID: PMC6170815 DOI: 10.1098/rspb.2018.1681] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/26/2018] [Accepted: 09/03/2018] [Indexed: 12/25/2022] [Open Access]
Abstract
Highly precise, yet flexible and responsive coordination of expression across groups of genes underpins the integrity of many vital functions. However, our understanding of gene regulatory networks (GRNs) is often hampered by the lack of experimentally tractable systems, by significant computational challenges derived from the large number of genes involved or from difficulties in the accurate identification and characterization of gene interactions. Here we used a tractable experimental system in which to study GRNs: the genes encoding the seminal fluid proteins that are transferred along with sperm (the 'transferome') in Drosophila melanogaster fruit flies. The products of transferome genes are core determinants of reproductive success and, to date, only transcription factors have been implicated in the modulation of their expression. Hence, as yet, we know nothing about the post-transcriptional mechanisms underlying the tight, responsive and precise regulation of this important gene set. We investigated this omission in the current study. We first used bioinformatics to identify potential regulatory motifs that linked the transferome genes in a putative interaction network. This predicted the presence of putative microRNA (miRNA) 'hubs'. We then tested this prediction, that post-transcriptional regulation is important for the control of transferome genes, by knocking down miRNA expression in adult males. This abolished the ability of males to respond adaptively to the threat of sexual competition, indicating a regulatory role for miRNAs in the regulation of transferome function. Further bioinformatics analysis then identified candidate miRNAs as putative regulatory hubs and evidence for variation in the strength of miRNA regulation across the transferome gene set. The results revealed regulatory mechanisms that can underpin robust, precise and flexible regulation of multiple fitness-related genes. 
They also help to explain how males can adaptively modulate ejaculate composition.
Affiliation(s)
- Irina Mohorianu
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
  - School of Computing Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
- Emily K Fowler
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
- Tamas Dalmay
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
- Tracey Chapman
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
9
Botero D, Alvarado C, Bernal A, Danies G, Restrepo S. Network Analyses in Plant Pathogens. Front Microbiol 2018; 9:35. [PMID: 29441045 PMCID: PMC5797656 DOI: 10.3389/fmicb.2018.00035] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/22/2017] [Accepted: 01/09/2018] [Indexed: 11/14/2022] [Open Access]
Abstract
Even in the age of big data in Biology, studying the connections between the biological processes and the molecular mechanisms behind them is a challenging task. Systems biology arose as a transversal discipline between biology, chemistry, computer science, mathematics, and physics to facilitate the elucidation of such connections. A scenario, where the application of systems biology constitutes a very powerful tool, is the study of interactions between hosts and pathogens using network approaches. Interactions between pathogenic bacteria and their hosts, both in agricultural and human health contexts are of great interest to researchers worldwide. Large amounts of data have been generated in the last few years within this area of research. However, studies have been relatively limited to simple interactions. This has left great amounts of data that remain to be utilized. Here, we review the main techniques in network analysis and their complementary experimental assays used to investigate bacterial-plant interactions. Other host-pathogen interactions are presented in those cases where few or no examples of plant pathogens exist. Furthermore, we present key results that have been obtained with these techniques and how these can help in the design of new strategies to control bacterial pathogens. The review comprises metabolic simulation, protein-protein interactions, regulatory control of gene expression, host-pathogen modeling, and genome evolution in bacteria. The aim of this review is to offer scientists working on plant-pathogen interactions basic concepts around network biology, as well as an array of techniques that will be useful for a better and more complete interpretation of their data.
Affiliation(s)
- David Botero
  - Laboratory of Mycology and Plant Pathology (LAMFU), Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
  - Grupo de Diseño de Productos y Procesos, Department of Chemical Engineering, Universidad de Los Andes, Bogotá, Colombia
  - Grupo de Biología Computacional y Ecología Microbiana, Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
- Camilo Alvarado
  - Laboratory of Mycology and Plant Pathology (LAMFU), Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
- Adriana Bernal
  - Laboratory of Molecular Interactions of Agricultural Microbes, LIMMA, Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
- Giovanna Danies
  - Department of Design, Universidad de Los Andes, Bogotá, Colombia
- Silvia Restrepo
  - Laboratory of Mycology and Plant Pathology (LAMFU), Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
10
Magnusson R, Mariotti GP, Köpsén M, Lövfors W, Gawel DR, Jörnsten R, Linde J, Nordling TEM, Nyman E, Schulze S, Nestor CE, Zhang H, Cedersund G, Benson M, Tjärnberg A, Gustafsson M. LASSIM-A network inference toolbox for genome-wide mechanistic modeling. PLoS Comput Biol 2017. [PMID: 28640810 PMCID: PMC5501685 DOI: 10.1371/journal.pcbi.1005608] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/09/2023] [Open Access]
Abstract
Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM), which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE) for gene regulatory networks (GRNs). LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. 
In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models with truly systems-level data. We demonstrate the power of this approach by inferring a mechanistically motivated, genome-wide model of the Th2 transcription regulatory system, which plays an important role in several immune related diseases.
There are excellent methods to mathematically model time-resolved biological data on a small scale using accurate mechanistic models. Despite the rapidly increasing availability of such data, mechanistic models have not been applied on a genome-wide level due to excessive runtimes and the non-identifiability of model parameters. However, genome-wide, mechanistic models could potentially answer key clinical questions, such as finding the best drug combinations to induce an expression change from a disease to a healthy state. We present LASSIM, which is a toolbox built to infer parameters within mechanistic models on a genomic scale. This is made possible due to a property shared across biological systems, namely the existence of a subset of master regulators, here denoted the core system. The introduction of a core system of genes simplifies the network inference into small solvable sub-problems, and implies that all main regulatory actions on peripheral genes come from a small set of regulator genes. This separation allows substantial parts of computations to be solved in parallel, i.e. permitting the use of a computer cluster, which substantially reduces computation time.
Affiliation(s)
- Rasmus Magnusson
- Bioinformatics Unit, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Guido Pio Mariotti
- Bioinformatics Unit, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Mattias Köpsén
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Integrative Systems Biology, Department of Biomedical Engineering, Linköping University, Linköping, Sweden
| | - William Lövfors
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Integrative Systems Biology, Department of Biomedical Engineering, Linköping University, Linköping, Sweden
| | - Danuta R. Gawel
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
| | - Rebecka Jörnsten
- Mathematical Sciences, Chalmers University of Technology, University of Gothenburg, Gothenburg, Sweden
| | - Jörg Linde
- Leibniz-Institute for Natural Product Research and Infection Biology, Hans-Knoell-Institute, Research Group Systems Biology and Bioinformatics, Jena, Germany
- Research Group PiDOMICS, Leibniz Institute for Natural Product Research and Infection Biology -Hans Knöll Institute, Jena, Germany
| | - Torbjörn E. M. Nordling
- Department of Mechanical Engineering, National Cheng Kung University, Tainan, Taiwan
- Stockholm Bioinformatics Center, Science for Life Laboratory, Solna, Sweden
| | - Elin Nyman
- Integrative Systems Biology, Department of Biomedical Engineering, Linköping University, Linköping, Sweden
| | - Sylvie Schulze
- Leibniz-Institute for Natural Product Research and Infection Biology, Hans-Knoell-Institute, Research Group Systems Biology and Bioinformatics, Jena, Germany
| | - Colm E. Nestor
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
| | - Huan Zhang
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
| | - Gunnar Cedersund
- Integrative Systems Biology, Department of Biomedical Engineering, Linköping University, Linköping, Sweden
- Cell Biology, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
| | - Mikael Benson
- Centre for Personalised Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
| | - Andreas Tjärnberg
- Bioinformatics Unit, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Mika Gustafsson
- Bioinformatics Unit, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
- * E-mail:
| |
|
11
|
Guo W, Calixto CPG, Tzioutziou N, Lin P, Waugh R, Brown JWS, Zhang R. Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size. BMC SYSTEMS BIOLOGY 2017; 11:62. [PMID: 28629365 PMCID: PMC5477119 DOI: 10.1186/s12918-017-0440-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 11/04/2016] [Accepted: 06/09/2017] [Indexed: 12/18/2022]
Abstract
BACKGROUND Co-expression has been widely used to identify novel regulatory relationships using high throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performances of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflect normal experimental conditions, are generally uncharacterized and unknown. METHODS We proposed a novel network inference method based on Relevance Low order Partial Correlation (RLowPC). The RLowPC method uses a two-step approach: it first reduces the search space to the high-confidence edges by picking only the top-ranked genes from an initial partial correlation analysis, and then computes the partial correlations in the confined search space by removing only the linear dependencies from the shared neighbours, largely ignoring the genes showing lower association. RESULTS We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. 
We found improved performance by partitioning large networks into smaller co-expressed modules when assessing the method performance within these modules. CONCLUSIONS The evaluation results show that current methods suffer from low precision and recall for large co-expression networks where only a small number of profiles are available. The proposed RLowPC method effectively reduces the indirect edges predicted as regulatory relationships and increases the precision of top ranked predictions. Partitioning large networks into smaller highly co-expressed modules also helps to improve the performance of network inference methods. The RLowPC R package for network construction, refinement and evaluation is available at GitHub: https://github.com/wyguo/RLowPC .
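The idea of removing the linear dependency contributed by a shared neighbour can be illustrated with a first-order partial correlation. The following is a minimal sketch only, not the RLowPC package's implementation: the function names `pearson` and `partial_corr` and the toy profiles `x`, `y`, `z` are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical toy profiles: z is a shared neighbour driving both x and y.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]   # z plus noise that is uncorrelated with z
y = [2, 1, 2, 5, 4, 7, 8, 7]   # z plus independent noise

print(round(pearson(x, y), 2))        # strong marginal correlation between x and y
print(abs(partial_corr(x, y, z)) < 1e-9)  # the x-y association vanishes once z is accounted for
```

In this toy case the apparent x-y edge is entirely indirect, which is the kind of edge a partial-correlation refinement step aims to down-weight.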
Affiliation(s)
- Wenbin Guo
- Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, DD2 5DA, UK
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, DD2 5DA, UK
| | - Cristiane P G Calixto
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, DD2 5DA, UK
| | - Nikoleta Tzioutziou
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, DD2 5DA, UK
| | - Ping Lin
- Division of Mathematics, University of Dundee, Nethergate, Dundee, Scotland, DD1 4HN, UK
| | - Robbie Waugh
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, DD2 5DA, UK
- Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, DD2 5DA, UK
| | - John W S Brown
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, DD2 5DA, UK
- Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, DD2 5DA, UK
| | - Runxuan Zhang
- Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, DD2 5DA, UK.
| |
|
12
|
Schleicher J, Conrad T, Gustafsson M, Cedersund G, Guthke R, Linde J. Facing the challenges of multiscale modelling of bacterial and fungal pathogen-host interactions. Brief Funct Genomics 2017; 16:57-69. [PMID: 26857943 PMCID: PMC5439285 DOI: 10.1093/bfgp/elv064] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/17/2022] Open
Abstract
Recent and rapidly evolving progress on high-throughput measurement techniques and computational performance has led to the emergence of new disciplines, such as systems medicine and translational systems biology. At the core of these disciplines lies the desire to produce multiscale models: mathematical models that integrate multiple scales of biological organization, ranging from molecular, cellular and tissue models to organ, whole-organism and population scale models. Using such models, hypotheses can systematically be tested. In this review, we present state-of-the-art multiscale modelling of bacterial and fungal infections, considering both the pathogen and host as well as their interaction. Multiscale modelling of the interactions of bacteria, especially Mycobacterium tuberculosis, with the human host is quite advanced. In contrast, models for fungal infections are still in their infancy, in particular regarding infections with the most important human pathogenic fungi, Candida albicans and Aspergillus fumigatus. We reflect on the current availability of computational approaches for multiscale modelling of host-pathogen interactions and point out current challenges. Finally, we provide an outlook for future requirements of multiscale modelling.
Affiliation(s)
- Jörg Linde
- Corresponding author: Jörg Linde, Leibniz Institute for Natural Product Research and Infection Biology—Hans Knöll Institute, Jena, Germany. Tel.: +49-3641-532-1290; E-mail:
| |
|
13
|
Annavarapu CSR, Dara S, Banka H. Cancer microarray data feature selection using multi-objective binary particle swarm optimization algorithm. EXCLI JOURNAL 2016; 15:460-473. [PMID: 27822174 PMCID: PMC5083964 DOI: 10.17179/excli2016-481] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/27/2016] [Accepted: 07/12/2016] [Indexed: 11/12/2022]
Abstract
Investigations of microarray data play a major role in cancer analysis and treatment. Cancer microarray data consist of complex gene expression patterns. In this article, a Multi-Objective Binary Particle Swarm Optimization (MOBPSO) algorithm is proposed for analyzing cancer gene expression data. Because the data are high dimensional, a fast heuristic-based pre-processing technique is employed to remove some of the crude domain features from the initial feature set. Since these pre-processed and reduced features are still high dimensional, the proposed MOBPSO algorithm is used for finding further feature subsets. The objective functions are modeled by optimizing two conflicting objectives, i.e., the cardinality of feature subsets and the distinctive capability of those selected subsets. As these two objective functions are conflicting in nature, they are more suitable for multi-objective modeling. The experiments are carried out on benchmark gene expression datasets available in the literature, i.e., Colon, Lymphoma and Leukaemia. The performance of the selected feature subsets is assessed by their classification accuracy and validated using 10-fold cross-validation. A detailed comparative study is also made to show the superiority or competitiveness of the proposed algorithm.
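To make the two conflicting objectives concrete, the sketch below compares candidate feature subsets by Pareto dominance, the comparison that multi-objective optimizers commonly rely on. It is an illustration under stated assumptions and not the MOBPSO algorithm itself: the names `dominates` and `pareto_front` are hypothetical, the candidate values are invented, and classification error is used here as a stand-in for the subsets' distinctive capability.

```python
from typing import List, Tuple

# Each candidate is (number of selected features, classification error).
# Both objectives are minimised; the values below are hypothetical.
Candidate = Tuple[int, float]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]

candidates = [(50, 0.10), (20, 0.12), (20, 0.20), (5, 0.30), (60, 0.10)]
front = pareto_front(candidates)
print(front)  # [(50, 0.1), (20, 0.12), (5, 0.3)]
```

No single subset on the front is best on both counts: shrinking the subset costs accuracy and vice versa, which is why the problem suits a multi-objective formulation.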
Affiliation(s)
| | - Suresh Dara
- Department of Computer Science and Engineering, Indian School of Mines, Dhanbad-826004, Jharkhand, India
| | - Haider Banka
- Department of Computer Science and Engineering, Indian School of Mines, Dhanbad-826004, Jharkhand, India
| |
|
14
|
Izadi F, Zarrini HN, Kiani G, Jelodar NB. A comparative analytical assay of gene regulatory networks inferred using microarray and RNA-seq datasets. Bioinformation 2016; 12:340-346. [PMID: 28293077 PMCID: PMC5320930 DOI: 10.6026/97320630012340] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 07/07/2016] [Revised: 08/05/2016] [Accepted: 08/06/2016] [Indexed: 01/16/2023] Open
Abstract
A Gene Regulatory Network (GRN) is a collection of interactions between molecular regulators and their targets in cells governing gene expression level. The explosion of omics data generated by high-throughput genomic assays such as microarray and RNA-Seq technologies, together with the emergence of a number of pre-processing methods, demands suitable guidelines to determine the impact of transcript data platforms and normalization procedures on describing associations in GRNs. In this study, exploiting publicly available microarray and RNA-Seq datasets and a gold standard of transcriptional interactions in Arabidopsis, we performed a comparison between six GRNs derived from RNA-Seq and microarray data and different normalization procedures. As a result, we observed that the compared algorithms were highly data-specific and that networks reconstructed from RNA-Seq data showed considerable accuracy against the corresponding networks captured by microarrays. Topological analysis showed that GRNs inferred from the two platforms were similar in several topological features, although we observed more connectivity in the RNA-Seq-derived gene network. Taken together, transcriptional regulatory networks obtained from Robust Multiarray Averaging (RMA) and Variance-Stabilizing Transformed (VST) normalized data predicted a higher rate of true edges than the other methods used in this comparison.
Affiliation(s)
- Fereshteh Izadi
- Plant Breeding Department, Sari Agricultural Sciences and Natural Resources, Iran
| | - Hamid Najafi Zarrini
- Plant Breeding Department, Sari Agricultural Sciences and Natural Resources, Iran
| | - Ghaffar Kiani
- Plant Breeding Department, Sari Agricultural Sciences and Natural Resources, Iran
| |
|
15
|
Guthke R, Gerber S, Conrad T, Vlaic S, Durmuş S, Çakır T, Sevilgen FE, Shelest E, Linde J. Data-based Reconstruction of Gene Regulatory Networks of Fungal Pathogens. Front Microbiol 2016; 7:570. [PMID: 27148247 PMCID: PMC4840211 DOI: 10.3389/fmicb.2016.00570] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/26/2016] [Accepted: 04/05/2016] [Indexed: 12/17/2022] Open
Abstract
In the emerging field of systems biology of fungal infection, one of the central roles belongs to the modeling of gene regulatory networks (GRNs). Utilizing omics-data, GRNs can be predicted by mathematical modeling. Here, we review current advances of data-based reconstruction of both small-scale and large-scale GRNs for human pathogenic fungi. The advantage of large-scale genome-wide modeling is the possibility to predict central (hub) genes and thereby indicate potential biomarkers and drug targets. In contrast, small-scale GRN models provide hypotheses on the mode of gene regulatory interactions, which have to be validated experimentally. Due to the lack of sufficient quantity and quality of both experimental data and prior knowledge about regulator–target gene relations, the genome-wide modeling still remains problematic for fungal pathogens. While a first genome-wide GRN model has already been published for Candida albicans, the feasibility of such modeling for Aspergillus fumigatus is evaluated in the present article. Based on this evaluation, opinions are drawn on future directions of GRN modeling of fungal pathogens. The crucial point of genome-wide GRN modeling is the experimental evidence, both used for inferring the networks (omics ‘first-hand’ data as well as literature data used as prior knowledge) and for validation and evaluation of the inferred network models.
Affiliation(s)
- Reinhard Guthke
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| | - Silvia Gerber
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| | - Theresia Conrad
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| | - Sebastian Vlaic
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| | - Saliha Durmuş
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University Kocaeli, Turkey
| | - Tunahan Çakır
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University Kocaeli, Turkey
| | - F E Sevilgen
- Department of Computer Engineering, Gebze Technical University Kocaeli, Turkey
| | - Ekaterina Shelest
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| | - Jörg Linde
- Research Group Systems Biology and Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knoell Institute Jena, Germany
| |
|
16
|
Schulze S, Schleicher J, Guthke R, Linde J. How to Predict Molecular Interactions between Species? Front Microbiol 2016; 7:442. [PMID: 27065992 PMCID: PMC4814556 DOI: 10.3389/fmicb.2016.00442] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 12/09/2015] [Accepted: 03/18/2016] [Indexed: 12/21/2022] Open
Abstract
Organisms constantly interact with other species through physical contact which leads to changes on the molecular level, for example in the transcriptome. These changes can be monitored for all genes, with the help of high-throughput experiments such as RNA-seq or microarrays. The adaptation of the gene expression to environmental changes within cells is mediated through complex gene regulatory networks. Often, our knowledge of these networks is incomplete. Network inference predicts gene regulatory interactions based on transcriptome data. Dual transcriptomics experiments are an emerging application of high-throughput transcriptome studies. Here, the transcriptome of two or more interacting species is measured simultaneously. Based on a dual RNA-seq data set of murine dendritic cells infected with the fungal pathogen Candida albicans, the software tool NetGenerator was applied to predict an inter-species gene regulatory network. To promote further investigations of molecular inter-species interactions, we recently discussed dual RNA-seq experiments for host-pathogen interactions and extended the applied tool NetGenerator (Schulze et al., 2015). The updated version of NetGenerator makes use of measurement variances in the algorithmic procedure and accepts gene expression time series data with missing values. Additionally, we tested multiple modeling scenarios regarding the stimuli functions of the gene regulatory network. Here, we summarize the work by Schulze et al. (2015) and put it into a broader context. We review various studies making use of the dual transcriptomics approach to investigate the molecular basis of interacting species. Besides the application to host-pathogen interactions, dual transcriptomics data are also utilized to study mutualistic and commensalistic interactions. Furthermore, we give a short introduction to additional approaches for the prediction of gene regulatory networks and discuss their application to dual transcriptomics data. 
We conclude that the application of network inference on dual-transcriptomics data is a promising approach to predict molecular inter-species interactions.
Affiliation(s)
- Sylvie Schulze
- Research Group Systems Biology and Bioinformatics, Leibniz-Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany
| | - Jana Schleicher
- Research Group Systems Biology and Bioinformatics, Leibniz-Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany
| | - Reinhard Guthke
- Research Group Systems Biology and Bioinformatics, Leibniz-Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany
| | - Jörg Linde
- Research Group Systems Biology and Bioinformatics, Leibniz-Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany
| |
|
17
|
Budak G, Eren Ozsoy O, Aydin Son Y, Can T, Tuncbag N. Reconstruction of the temporal signaling network in Salmonella-infected human cells. Front Microbiol 2015; 6:730. [PMID: 26257716 PMCID: PMC4507143 DOI: 10.3389/fmicb.2015.00730] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 01/30/2015] [Accepted: 07/03/2015] [Indexed: 12/02/2022] Open
Abstract
Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism either by activating or inhibiting the host proteins. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on reverse engineering principles. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic dataset. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network conserves the information on temporality and direction of interactions, while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells such as CDC42, RHOA, 14-3-3δ, Syntaxin family, Oxysterol-binding proteins were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. We believe that integrated approaches, such as the one presented here, have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections.
Affiliation(s)
- Gungor Budak
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University Ankara, Turkey
| | - Oyku Eren Ozsoy
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University Ankara, Turkey
| | - Yesim Aydin Son
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University Ankara, Turkey
| | - Tolga Can
- Department of Computer Engineering, College of Engineering, Middle East Technical University Ankara, Turkey
| | - Nurcan Tuncbag
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University Ankara, Turkey
| |
|
18
|
Durmuş S, Çakır T, Özgür A, Guthke R. A review on computational systems biology of pathogen-host interactions. Front Microbiol 2015; 6:235. [PMID: 25914674 PMCID: PMC4391036 DOI: 10.3389/fmicb.2015.00235] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 12/01/2014] [Accepted: 03/10/2015] [Indexed: 12/27/2022] Open
Abstract
Pathogens manipulate the cellular mechanisms of host organisms via pathogen-host interactions (PHIs) in order to take advantage of the capabilities of host cells, leading to infections. The crucial role of these interspecies molecular interactions in initiating and sustaining infections necessitates a thorough understanding of the corresponding mechanisms. Unlike the traditional approach of considering the host or pathogen separately, a systems-level approach, considering the PHI system as a whole, is indispensable to elucidate the mechanisms of infection. Following the technological advances in the post-genomic era, PHI data have been produced on a large scale within the last decade. Systems biology-based methods for the inference and analysis of PHI regulatory, metabolic, and protein-protein networks to shed light on infection mechanisms are in increasing demand thanks to the availability of omics data. The knowledge derived from the PHIs may largely contribute to the identification of new and more efficient therapeutics to prevent or cure infections. There are recent efforts for the detailed documentation of these experimentally verified PHI data through Web-based databases. Despite these advances in data archiving, there are still large amounts of PHI data in the biomedical literature yet to be discovered, and novel text mining methods are in development to unearth such hidden data. Here, we review a collection of recent studies on computational systems biology of PHIs with a special focus on the methods for the inference and analysis of PHI networks, covering also the Web-based databases and text-mining efforts to unravel the data hidden in the literature.
Affiliation(s)
- Saliha Durmuş
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
| | - Tunahan Çakır
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
| | - Arzucan Özgür
- Department of Computer Engineering, Boǧaziçi University, Istanbul, Turkey
| | - Reinhard Guthke
- Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knoell-Institute, Jena, Germany
| |
|