1
Martini L, Baek SH, Lo I, Raby BA, Silverman E, Weiss S, Glass K, Halu A. Detecting and dissecting signaling crosstalk via the multilayer network integration of signaling and regulatory interactions. Nucleic Acids Res 2024; 52:e5. [PMID: 37953325 PMCID: PMC10783515 DOI: 10.1093/nar/gkad1035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 06/27/2023] [Accepted: 10/23/2023] [Indexed: 11/14/2023]
Abstract
The versatility of cellular response arises from the communication, or crosstalk, of signaling pathways in a complex network of signaling and transcriptional regulatory interactions. Understanding the various mechanisms underlying crosstalk on a global scale requires untargeted computational approaches. We present a network-based statistical approach, MuXTalk, that uses high-dimensional edges called multilinks to model the unique ways in which signaling and regulatory interactions can interface. We demonstrate that the signaling-regulatory interface is located primarily in the intermediary region between signaling pathways where crosstalk occurs, and that multilinks can differentiate between distinct signaling-transcriptional mechanisms. Using statistically over-represented multilinks as proxies of crosstalk, we infer crosstalk among 60 signaling pathways, expanding currently available crosstalk databases by more than five-fold. MuXTalk surpasses existing methods in terms of model performance metrics, identifies additions to manual curation efforts, and pinpoints potential mediators of crosstalk. Moreover, it accommodates the inherent context-dependence of crosstalk, allowing future applications to cell type- and disease-specific crosstalk.
Affiliation(s)
- Leonardo Martini
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Rome, 00185, Italy
- Seung Han Baek
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Ian Lo
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Benjamin A Raby
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Arda Halu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
2
Nasser R, Sharan R. BERTwalk for integrating gene networks to predict gene- to pathway-level properties. Bioinformatics Advances 2023; 3:vbad086. [PMID: 37448813 PMCID: PMC10336298 DOI: 10.1093/bioadv/vbad086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 06/14/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023]
Abstract
Motivation: Graph representation learning is a fundamental problem in the field of data science with applications to integrative analysis of biological networks. Previous work in this domain was mostly limited to shallow representation techniques. A recent deep representation technique, BIONIC, has achieved state-of-the-art results in a variety of tasks but used arbitrarily defined components. Results: Here, we present BERTwalk, an unsupervised learning scheme that combines the BERT masked language model with a network propagation regularization for graph representation learning. The transformation from networks to texts allows our method to naturally integrate different networks and provide features that inform not only nodes or edges but also pathway-level properties. We show that our BERTwalk model outperforms BIONIC, as well as four other recent methods, on two comprehensive benchmarks in yeast and human. We further show that our model can be utilized to infer functional pathways and their effects. Availability and implementation: Code and data are available at https://github.com/raminass/BERTwalk. Contact: roded@tauex.tau.ac.il.
Affiliation(s)
- Rami Nasser
  - School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
3
Jiang Y, Liang Y, Wang D, Xu D, Joshi T. A dynamic programing approach to integrate gene expression data and network information for pathway model generation. Bioinformatics 2020; 36:169-176. [PMID: 31168616 DOI: 10.1093/bioinformatics/btz467] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/07/2018] [Revised: 05/15/2019] [Accepted: 05/31/2019] [Indexed: 11/13/2022]
Abstract
Motivation: As large amounts of biological data continue to be rapidly generated, a major focus of bioinformatics research has been aimed toward integrating these data to identify active pathways or modules under certain experimental conditions or phenotypes. Although biologically significant modules can often be detected globally by many existing methods, it is often hard to interpret or make use of the results toward pathway model generation and testing. Results: To address this gap, we have developed the IMPRes algorithm, a new step-wise active pathway detection method using a dynamic programing approach. IMPRes takes advantage of the existing pathway interaction knowledge in Kyoto Encyclopedia of Genes and Genomes. Omics data are then used to assign penalties to genes, interactions and pathways. Finally, starting from one or multiple seed genes, a shortest path algorithm is applied to detect downstream pathways that best explain the gene expression data. Since dynamic programing enables the detection one step at a time, it is easy for researchers to trace the pathways, which may lead to more accurate drug design and more effective treatment strategies. The evaluation experiments conducted on three yeast datasets have shown that IMPRes can achieve competitive or better performance than other state-of-the-art methods. Furthermore, a case study on a human lung cancer dataset was performed and we provided several insights on genes and mechanisms involved in lung cancer, which had not been discovered before. Availability and implementation: The IMPRes visualization tool is available via web server at http://digbio.missouri.edu/impres. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuexu Jiang
  - Department of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Department of Electrical Engineering and Computer Science, Columbia, MO 65211, USA
- Yanchun Liang
  - Department of Computer Science and Technology, Jilin University, Changchun 130012, China
- Duolin Wang
  - Department of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Department of Electrical Engineering and Computer Science, Columbia, MO 65211, USA
- Dong Xu
  - Department of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Department of Electrical Engineering and Computer Science, Columbia, MO 65211, USA
  - Informatics Institute and Christopher S. Bond Life Sciences Center, Columbia, MO 65211, USA
- Trupti Joshi
  - Department of Electrical Engineering and Computer Science, Columbia, MO 65211, USA
  - Informatics Institute and Christopher S. Bond Life Sciences Center, Columbia, MO 65211, USA
  - Department of Health Management and Informatics, University of Missouri, Columbia, MO 65211, USA
4
Guan G, Fang M, Wong MK, Ho VWS, An X, Tang C, Huang X, Zhao Z. Multilevel regulation of muscle-specific transcription factor hlh-1 during Caenorhabditis elegans embryogenesis. Dev Genes Evol 2020; 230:265-278. [PMID: 32556563 PMCID: PMC7371654 DOI: 10.1007/s00427-020-00662-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/14/2020] [Accepted: 05/31/2020] [Indexed: 11/29/2022]
Abstract
hlh-1 is a myogenic transcription factor required for body-wall muscle specification during embryogenesis in Caenorhabditis elegans. Despite its well-known role in muscle specification, comprehensive regulatory control upstream of hlh-1 remains poorly defined. Here, we first established a statistical reference for the spatiotemporal expression of hlh-1 at single-cell resolution up to the second last round of divisions for most of the cell lineages (from 4- to 350-cell stage) using 13 wild-type embryos. We next generated lineal expression of hlh-1 after RNA interference (RNAi) perturbation of 65 genes, which were selected based on their degree of conservation, mutant phenotypes, and known roles in development. We then compared the expression profiles between wild-type and RNAi embryos by clustering according to their lineal expression patterns using mean-shift and density-based clustering algorithms, which not only confirmed the roles of existing genes but also uncovered the potential functions of novel genes in muscle specification at multiple levels, including cellular, lineal, and embryonic levels. By combining the public data on protein-protein interactions, protein-DNA interactions, and genetic interactions with our RNAi data, we inferred regulatory pathways upstream of hlh-1 that function globally or locally. This work not only revealed diverse and multilevel regulatory mechanisms coordinating muscle differentiation during C. elegans embryogenesis but also laid a foundation for further characterizing the regulatory pathways controlling muscle specification at the cellular, lineal (local), or embryonic (global) level.
Affiliation(s)
- Guoye Guan
  - Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Meichen Fang
  - School of Life Sciences, Peking University, Beijing, 100871, China
- Ming-Kin Wong
  - Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
- Vincy Wing Sze Ho
  - Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
- Xiaomeng An
  - Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
- Chao Tang
  - Center for Quantitative Biology, Peking University, Beijing, 100871, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
  - School of Physics, Peking University, Beijing, 100871, China
- Xiaotai Huang
  - School of Computer Science and Technology, Xidian University, Xi'an, 710126, Shaanxi, China
- Zhongying Zhao
  - Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
  - State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong, 999077, China
5
Cardner M, Meyer-Schaller N, Christofori G, Beerenwinkel N. Inferring signalling dynamics by integrating interventional with observational data. Bioinformatics 2020; 35:i577-i585. [PMID: 31510686 PMCID: PMC6612850 DOI: 10.1093/bioinformatics/btz325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
Motivation: In order to infer a cell signalling network, we generally need interventional data from perturbation experiments. If the perturbation experiments are time-resolved, then signal progression through the network can be inferred. However, such designs are infeasible for large signalling networks, where it is more common to have steady-state perturbation data on the one hand, and a non-interventional time series on the other. Such was the design in a recent experiment investigating the coordination of epithelial–mesenchymal transition (EMT) in murine mammary gland cells. We aimed to infer the underlying signalling network of transcription factors and microRNAs coordinating EMT, as well as the signal progression during EMT. Results: In the context of nested effects models, we developed a method for integrating perturbation data with a non-interventional time series. We applied the model to RNA sequencing data obtained from an EMT experiment. Part of the network inferred from RNA interference was validated experimentally using luciferase reporter assays. Our model extension is formulated as an integer linear programme, which can be solved efficiently using heuristic algorithms. This extension allowed us to infer the signal progression through the network during an EMT time course, and thereby assess when each regulator is necessary for EMT to advance. Availability and implementation: R package at https://github.com/cbg-ethz/timeseriesNEM. The RNA sequencing data and microscopy images can be explored through a Shiny app at https://emt.bsse.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mathias Cardner
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
6
Farahmand S, O'Connor C, Macoska JA, Zarringhalam K. Causal Inference Engine: a platform for directional gene set enrichment analysis and inference of active transcriptional regulators. Nucleic Acids Res 2020; 47:11563-11573. [PMID: 31701125 PMCID: PMC7145661 DOI: 10.1093/nar/gkz1046] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/10/2019] [Revised: 09/19/2019] [Accepted: 10/28/2019] [Indexed: 02/07/2023]
Abstract
Inference of active regulatory mechanisms underlying specific molecular and environmental perturbations is essential for understanding cellular response. The success of inference algorithms relies on the quality and coverage of the underlying network of regulator–gene interactions. Several commercial platforms provide large and manually curated regulatory networks and functionality to perform inference on these networks. Adaptation of such platforms for open-source academic applications has been hindered by the lack of availability of accurate, high-coverage networks of regulatory interactions and integration of efficient causal inference algorithms. In this work, we present CIE, an integrated platform for causal inference of active regulatory mechanisms from differential gene expression data. Using a regularized Gaussian Graphical Model, we construct a transcriptional regulatory network by integrating publicly available ChIP-seq experiments with gene-expression data from tissue-specific RNA-seq experiments. Our GGM approach identifies high confidence transcription factor (TF)–gene interactions and annotates the interactions with information on mode of regulation (activation vs. repression). Benchmarks against manually curated databases of TF–gene interactions show that our method can accurately detect mode of regulation. We demonstrate the ability of our platform to identify active transcriptional regulators by using controlled in vitro overexpression and stem-cell differentiation studies and utilize our method to investigate transcriptional mechanisms of fibroblast phenotypic plasticity.
Affiliation(s)
- Saman Farahmand
  - Computational Sciences PhD program, University of Massachusetts Boston, Boston, MA 02125, USA
- Corey O'Connor
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
- Jill A Macoska
  - Center for Personalized Cancer Therapy, University of Massachusetts Boston, Boston, MA 02125, USA
- Kourosh Zarringhalam
  - Computational Sciences PhD program, University of Massachusetts Boston, Boston, MA 02125, USA
  - Department of Mathematics, University of Massachusetts Boston, Boston, MA 02125, USA
7
IMPRes-Pro: A high dimensional multiomics integration method for in silico hypothesis generation. Methods 2020; 173:16-23. [DOI: 10.1016/j.ymeth.2019.06.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/04/2019] [Revised: 06/08/2019] [Accepted: 06/13/2019] [Indexed: 01/18/2023]
8
Silverbush D, Sharan R. A systematic approach to orient the human protein-protein interaction network. Nat Commun 2019; 10:3015. [PMID: 31289271 PMCID: PMC6617457 DOI: 10.1038/s41467-019-10887-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 02/10/2018] [Accepted: 06/06/2019] [Indexed: 11/16/2022]
Abstract
The protein-protein interaction (PPI) network of an organism serves as a skeleton for its signaling circuitry, which mediates cellular response to environmental and genetic cues. Understanding this circuitry could improve the prediction of gene function and cellular behavior in response to diverse signals. To realize this potential, one has to comprehensively map PPIs and their directions of signal flow. While the quality and the volume of identified human PPIs improved dramatically over the last decade, the directions of these interactions are still mostly unknown, thus precluding subsequent prediction and modeling efforts. Here we present a systematic approach to orient the human PPI network using drug response and cancer genomic data. We provide a diffusion-based method for the orientation task that significantly outperforms existing methods. The oriented network leads to improved prioritization of cancer driver genes and drug targets compared to the state-of-the-art unoriented network.
Affiliation(s)
- Dana Silverbush
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Roded Sharan
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
9
Kiblawi S, Chasman D, Henning A, Park E, Poon H, Gould M, Ahlquist P, Craven M. Augmenting subnetwork inference with information extracted from the scientific literature. PLoS Comput Biol 2019; 15:e1006758. [PMID: 31246951 PMCID: PMC6619809 DOI: 10.1371/journal.pcbi.1006758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2018] [Revised: 07/10/2019] [Accepted: 01/04/2019] [Indexed: 11/20/2022]
Abstract
Many biological studies involve either (i) manipulating some aspect of a cell or its environment and then simultaneously measuring the effect on thousands of genes, or (ii) systematically manipulating each gene and then measuring the effect on some response of interest. A common challenge that arises in these studies is to explain how genes identified as relevant in the given experiment are organized into a subnetwork that accounts for the response of interest. The task of inferring a subnetwork is typically dependent on the information available in publicly available, structured databases, which suffer from incompleteness. However, a wealth of potentially relevant information resides in the scientific literature, such as information about genes associated with certain concepts of interest, as well as interactions that occur among various biological entities. We contend that by exploiting this information, we can improve the explanatory power and accuracy of subnetwork inference in multiple applications. Here we propose and investigate several ways in which information extracted from the scientific literature can be used to augment subnetwork inference. We show that we can use literature-extracted information to (i) augment the set of entities identified as being relevant in a subnetwork inference task, (ii) augment the set of interactions used in the process, and (iii) support targeted browsing of a large inferred subnetwork by identifying entities and interactions that are closely related to concepts of interest. We use this approach to uncover the pathways involved in interactions between a virus and a host cell, and the pathways that are regulated by a transcription factor associated with breast cancer. Our experimental results demonstrate that these approaches can provide more accurate and more interpretable subnetworks. 
Integer program code, background network data, and pathfinding code are available at https://github.com/Craven-Biostat-Lab/subnetwork_inference.

There is a multitude of publicly available databases that contain information about biological entities (i.e., genes, proteins, and other small molecules) as well as information about how these entities interact together. However, these databases are often incomplete. There is a wealth of information present in the text of the scientific literature that is not yet available in these databases. Using tools that mine the scientific literature we are able to extract some of this potentially relevant information. In this work we show how we can use publicly available databases in conjunction with the information extracted from the scientific literature to infer the networks that are involved in specific biological processes, such as viral replication and cancer tumor growth.
Affiliation(s)
- Sid Kiblawi
  - Department of Computer Sciences, University of Wisconsin, Madison, WI, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
- Deborah Chasman
  - Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA
- Amanda Henning
  - Department of Oncology, University of Wisconsin, Madison, WI, USA
- Eunju Park
  - Institute for Molecular Virology, University of Wisconsin, Madison, WI, USA
  - Morgridge Institute for Research, Madison, WI, USA
- Michael Gould
  - Department of Oncology, University of Wisconsin, Madison, WI, USA
- Paul Ahlquist
  - Institute for Molecular Virology, University of Wisconsin, Madison, WI, USA
  - Morgridge Institute for Research, Madison, WI, USA
  - Howard Hughes Medical Institute, University of Wisconsin, Madison, WI, USA
- Mark Craven
  - Department of Computer Sciences, University of Wisconsin, Madison, WI, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
10
Abstract
Motivation: A chief goal of systems biology is the reconstruction of large-scale executable models of cellular processes of interest. While accurate continuous models are still beyond reach, a powerful alternative is to learn a logical model of the processes under study, which predicts the logical state of any node of the model as a Boolean function of its incoming nodes. Key to learning such models is the functional annotation of the underlying physical interactions with activation/repression (sign) effects. Such annotations are pretty common for a few well-studied biological pathways. Results: Here we present a novel optimization framework for large-scale sign annotation that employs different plausible models of signaling and combines them in a rigorous manner. We apply our framework to two large-scale knockout datasets in yeast and evaluate its different components as well as the combined model to predict signs of different subsets of physical interactions. Overall, we obtain an accurate predictor that outperforms previous work by a considerable margin. Availability and implementation: The code is publicly available at https://github.com/spatkar94/NetworkAnnotation.git.
Affiliation(s)
- Sushant Patkar
  - Computer Science, University of Maryland, College Park, MD, USA
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
11
Alim MA, Ay A, Hasan MM, Thai MT, Kahveci T. Construction of Signaling Pathways with RNAi Data and Multiple Reference Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1079-1091. [PMID: 30102599 DOI: 10.1109/tcbb.2017.2710129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Signaling networks are involved in almost all major diseases such as cancer. As a result of this, understanding how signaling networks function is vital for finding new treatments for many diseases. Using gene knockdown assays such as RNA interference (RNAi) technology, many genes involved in these networks can be identified. However, determining the interactions between these genes in the signaling networks using only experimental techniques is very challenging, as performing extensive experiments is very expensive and sometimes even impractical. Construction of signaling networks from RNAi data using computational techniques has been proposed as an alternative way to solve this challenging problem. However, the earlier approaches are either not scalable to large-scale networks, or their accuracy levels are not satisfactory. In this study, we integrate RNAi data given on a target network with multiple reference signaling networks and phylogenetic trees to construct the topology of the target signaling network. In our work, the network construction is considered as finding the minimum number of edit operations on given multiple reference networks, in which their contributions are weighted by their phylogenetic distances to the target network. The edit operations on the reference networks lead to a target network that satisfies the RNAi knockdown observations. Here, we propose two new reference-based signaling network construction methods that provide optimal results and scale well to large-scale signaling networks of hundreds of components. We compare the performance of these approaches to the state-of-the-art reference-based network construction method SiNeC on synthetic, semi-synthetic, and real datasets. Our analyses show that the proposed methods outperform the SiNeC method in terms of accuracy. Furthermore, we show that our methods function well even if evolutionarily distant reference networks are used. Application of our methods to the Apoptosis and Wnt signaling pathways recovers the known protein-protein interactions and suggests additional relevant interactions that can be tested experimentally.
12
Knowledge-Based Neuroendocrine Immunomodulation (NIM) Molecular Network Construction and Its Application. Molecules 2018; 23:molecules23061312. [PMID: 29848990 PMCID: PMC6099962 DOI: 10.3390/molecules23061312] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/16/2018] [Revised: 05/24/2018] [Accepted: 05/25/2018] [Indexed: 01/23/2023]
Abstract
Growing evidence shows that the neuroendocrine immunomodulation (NIM) network plays an important role in maintaining and modulating body function and the homeostasis of the internal environment. The disequilibrium of NIM in the body is closely associated with many diseases. In the present study, we first collected a core dataset of NIM signaling molecules based on our knowledge and obtained 611 NIM signaling molecules. Then, we built a NIM molecular network based on the MetaCore database and analyzed the signaling transduction characteristics of the core network. We found that the endocrine system played a pivotal role in the bridge between the nervous and immune systems and the signaling transduction between the three systems was not homogeneous. Finally, employing the forest algorithm, we identified the molecular hub playing an important role in the pathogenesis of rheumatoid arthritis (RA) and Alzheimer’s disease (AD), based on the NIM molecular network constructed by us. The results showed that GSK3B, SMARCA4, PSMD7, HNF4A, PGR, RXRA, and ESRRA might be the key molecules for RA, while RARA, STAT3, STAT1, and PSMD14 might be the key molecules for AD. The molecular hub may be a potentially druggable target for these two complex diseases based on the literature. This study suggests that the NIM molecular network in this paper combined with the forest algorithm might provide a useful tool for predicting drug targets and understanding the pathogenesis of diseases. Therefore, the NIM molecular network and the corresponding online tool will not only enhance research on complex diseases and systems biology, but also promote the communication of valuable clinical experience between modern medicine and Traditional Chinese Medicine (TCM).
13
MacGilvray ME, Shishkova E, Chasman D, Place M, Gitter A, Coon JJ, Gasch AP. Network inference reveals novel connections in pathways regulating growth and defense in the yeast salt response. PLoS Comput Biol 2018; 13:e1006088. [PMID: 29738528 PMCID: PMC5940180 DOI: 10.1371/journal.pcbi.1006088] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 08/17/2017] [Accepted: 03/13/2018] [Indexed: 11/18/2022]
Abstract
Cells respond to stressful conditions by coordinating a complex, multi-faceted response that spans many levels of physiology. Much of the response is coordinated by changes in protein phosphorylation. Although the regulators of transcriptome changes during stress are well characterized in Saccharomyces cerevisiae, the upstream regulatory network controlling protein phosphorylation is less well dissected. Here, we developed a computational approach to infer the signaling network that regulates phosphorylation changes in response to salt stress. We developed an approach to link predicted regulators to groups of likely co-regulated phospho-peptides responding to stress, thereby creating new edges in a background protein interaction network. We then use integer linear programming (ILP) to integrate wild type and mutant phospho-proteomic data and predict the network controlling stress-activated phospho-proteomic changes. The network we inferred predicted new regulatory connections between stress-activated and growth-regulating pathways and suggested mechanisms coordinating metabolism, cell-cycle progression, and growth during stress. We confirmed several network predictions with co-immunoprecipitations coupled with mass-spectrometry protein identification and mutant phospho-proteomic analysis. Results show that the cAMP-phosphodiesterase Pde2 physically interacts with many stress-regulated transcription factors targeted by PKA, and that reduced phosphorylation of those factors during stress requires the Rck2 kinase that we show physically interacts with Pde2. Together, our work shows how a high-quality computational network model can facilitate discovery of new pathway interactions during osmotic stress.

Cells sense and respond to stressful environments by utilizing complex signaling networks that integrate diverse signals to coordinate a multi-faceted physiological response. Much of this response is controlled by post-translational protein phosphorylation. Although many regulators that mediate changes in protein phosphorylation are known, how these regulators inter-connect in a single regulatory network that can transmit cellular signals is not known. It is also unclear how regulators that promote growth and regulators that activate the stress response interconnect to reorganize resource allocation during stress. Here, we developed an integrated experimental and computational workflow to infer the signaling network that regulates phosphorylation changes during osmotic stress in the budding yeast Saccharomyces cerevisiae. The workflow integrates data measuring protein phosphorylation changes in response to osmotic stress with known physical interactions between yeast proteins from large-scale datasets, along with other information about how regulators recognize their targets. The resulting network suggested new signaling connections between regulators and pathways, including those involved in regulating growth and defense, and predicted new regulators involved in stress defense. Our work highlights the power of using network inference to deliver new insight on how cells coordinate a diverse adaptive strategy to stress.
Affiliation(s)
- Matthew E. MacGilvray
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Evgenia Shishkova
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin–Madison, Madison, WI, United States of America
| | - Michael Place
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
| | - Joshua J. Coon
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, United States of America
- Genome Center of Wisconsin, Madison, WI, United States of America
| | - Audrey P. Gasch
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, United States of America
|
14
|
Sehl ME, Wicha MS. Modeling of Interactions between Cancer Stem Cells and their Microenvironment: Predicting Clinical Response. Methods Mol Biol 2018; 1711:333-349. [PMID: 29344897 PMCID: PMC6322404 DOI: 10.1007/978-1-4939-7493-1_16] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 04/15/2023]
Abstract
Mathematical models of cancer stem cells are useful in translational cancer research for facilitating the understanding of tumor growth dynamics and for predicting treatment response and resistance to combined targeted therapies. In this chapter, we describe appealing aspects of different methods used in mathematical oncology and discuss compelling questions in oncology that can be addressed with these modeling techniques. We describe a simplified version of a model of the breast cancer stem cell niche, illustrate the visualization of the model, and apply stochastic simulation to generate full distributions and average trajectories of cell type populations over time. We further discuss the advent of single-cell data in studying cancer stem cell heterogeneity and how these data can be integrated with modeling to advance understanding of the dynamics of invasive and proliferative populations during cancer progression and response to therapy.
Affiliation(s)
- Mary E Sehl
- Division of Hematology-Oncology, Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
- Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
| | - Max S Wicha
- Department of Internal Medicine, University of Michigan, 1500 East Medical Center Drive, Ann Arbor, MI, 48109, USA.
|
15
|
Ruffalo M, Stojanov P, Pillutla VK, Varma R, Bar-Joseph Z. Reconstructing cancer drug response networks using multitask learning. BMC Syst Biol 2017; 11:96. [PMID: 29017547 PMCID: PMC5635550 DOI: 10.1186/s12918-017-0471-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/21/2017] [Accepted: 10/02/2017] [Indexed: 01/03/2023]
Abstract
BACKGROUND Translating in vitro results to clinical tests is a major challenge in systems biology. Here we present a new Multi-Task learning framework which integrates thousands of cell line expression experiments to reconstruct drug specific response networks in cancer. RESULTS The reconstructed networks correctly identify several shared key proteins and pathways while simultaneously highlighting many cell type specific proteins. We used top proteins from each drug network to predict survival for patients prescribed the drug. CONCLUSIONS Predictions based on proteins from the in-vitro derived networks significantly outperformed predictions based on known cancer genes indicating that Multi-Task learning can indeed identify accurate drug response networks.
Affiliation(s)
- Matthew Ruffalo
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Petar Stojanov
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Venkata Krishna Pillutla
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Rohan Varma
- Electrical and Computer Engineering, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
16
|
Jain S, Arrais J, Venkatachari NJ, Ayyavoo V, Bar-Joseph Z. Reconstructing the temporal progression of HIV-1 immune response pathways. Bioinformatics 2017; 32:i253-i261. [PMID: 27307624 PMCID: PMC4908338 DOI: 10.1093/bioinformatics/btw254] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/20/2022] Open
Abstract
Motivation: Most methods for reconstructing response networks from high throughput data generate static models which cannot distinguish between early and late response stages. Results: We present TimePath, a new method that integrates time series and static datasets to reconstruct dynamic models of host response to stimulus. TimePath uses an Integer Programming formulation to select a subset of pathways that, together, explain the observed dynamic responses. Applying TimePath to study human response to HIV-1 led to accurate reconstruction of several known regulatory and signaling pathways and to novel mechanistic insights. We experimentally validated several of TimePath's predictions, highlighting the usefulness of temporal models. Availability and Implementation: Data, Supplementary text and the TimePath software are available from http://sb.cs.cmu.edu/timepath Contact: zivbj@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Siddhartha Jain
- Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Joel Arrais
- Department of Computer Science, University of Coimbra, Coimbra, Portugal
| | | | - Velpandi Ayyavoo
- Department of Infectious Diseases, University of Pittsburgh, Pittsburgh, PA, USA
| | - Ziv Bar-Joseph
- Computational Biology and Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
|
17
|
Sverchkov Y, Craven M. A review of active learning approaches to experimental design for uncovering biological networks. PLoS Comput Biol 2017; 13:e1005466. [PMID: 28570593 PMCID: PMC5453429 DOI: 10.1371/journal.pcbi.1005466] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022] Open
Abstract
Various types of biological knowledge describe networks of interactions among elementary entities. For example, transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. In recent years, various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model. This review describes the various models, experiment selection strategies, validation techniques, and successful applications described in the literature; highlights common themes and notable distinctions among methods; and identifies likely directions of future research and open problems in the area.
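The iterative loop described in this abstract (learn a model, propose the most informative experiment, update on the result) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the review: the "model" is simply the set of candidate networks still consistent with the data, an "experiment" tests whether one edge exists, and the hypothetical helpers `disagreement` and `active_learning` pick the edge on which the surviving candidates disagree most.

```python
def disagreement(edge, hypotheses):
    """How evenly the candidate networks split on whether `edge` exists."""
    with_edge = sum(edge in h for h in hypotheses)
    return min(with_edge, len(hypotheses) - with_edge)


def active_learning(hypotheses, experiments, run_experiment):
    """Toy version-space active learning (illustrative assumption only).

    hypotheses: candidate networks, each a frozenset of edges.
    experiments: edges that can be tested in the lab.
    run_experiment: callable returning True if the tested edge exists.
    """
    hypotheses = list(hypotheses)
    untested = list(experiments)
    history = []
    while len(hypotheses) > 1 and untested:
        # Propose the experiment the current model is least sure about.
        edge = max(untested, key=lambda e: disagreement(e, hypotheses))
        if disagreement(edge, hypotheses) == 0:
            break  # no remaining experiment can separate the candidates
        outcome = run_experiment(edge)
        history.append((edge, outcome))
        untested.remove(edge)
        # Update the model: keep only candidates that agree with the result.
        hypotheses = [h for h in hypotheses if (edge in h) == outcome]
    return hypotheses, history


if __name__ == "__main__":
    ab, bc, ac = ("A", "B"), ("B", "C"), ("A", "C")
    truth = frozenset({ab, bc})  # hidden network, known only to the "lab"
    candidates = [
        frozenset({ab, bc}),
        frozenset({ab}),
        frozenset({ac}),
        frozenset({ac, bc}),
    ]
    remaining, history = active_learning(
        candidates, [ab, bc, ac], lambda edge: edge in truth
    )
    print(remaining, history)
```

In this example two experiments are enough to single out the true network from four candidates; the methods surveyed in the review use richer models and selection criteria than this simple elimination scheme.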
Affiliation(s)
- Yuriy Sverchkov
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Mark Craven
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
|
18
|
Ren Y, Wang Q, Hasan MM, Ay A, Kahveci T. Identifying the topology of signaling networks from partial RNAi data. BMC Syst Biol 2016; 10 Suppl 2:53. [PMID: 27490106 PMCID: PMC4977480 DOI: 10.1186/s12918-016-0301-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Background Methods for inferring signaling networks using single gene knockdown RNAi experiments and reference networks have been proposed in recent years. These methods assume that RNAi information is available for all the genes in the signal transduction pathway, i.e., complete. This assumption does not always hold up since RNAi experiments are often incomplete and information for some genes is missing. Results In this article, we develop two methods to construct signaling networks from incomplete RNAi data with the help of a reference network. These methods infer the RNAi constraints for the missing genes such that the inferred network is closest to the reference network. We perform extensive experiments with both real and synthetic networks and demonstrate that these methods produce accurate results efficiently. Conclusions Application of our methods to Wnt signal transduction pathway has shown that our methods can be used to construct highly accurate signaling networks from experimental data in less than 100 ms. The two methods that produce accurate results efficiently show great promise of constructing real signaling networks.
Affiliation(s)
- Yuanfang Ren
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA.
| | - Qiyao Wang
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
| | - Md Mahmudul Hasan
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
| | - Ahmet Ay
- Department of Biology & Mathematics, Colgate University, Hamilton, 13346, NY, USA
| | - Tamer Kahveci
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
|
19
|
Chasman D, Fotuhi Siahpirani A, Roy S. Network-based approaches for analysis of complex biological systems. Curr Opin Biotechnol 2016; 39:157-166. [PMID: 27115495 DOI: 10.1016/j.copbio.2016.04.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 10/24/2015] [Revised: 04/04/2016] [Accepted: 04/05/2016] [Indexed: 12/22/2022]
Abstract
Cells function and respond to changes in their environment by the coordinated activity of their molecular components, including mRNAs, proteins and metabolites. At the heart of proper cellular function are molecular networks connecting these components to process extra-cellular environmental signals and drive dynamic, context-specific cellular responses. Network-based computational approaches aim to systematically integrate measurements from high-throughput experiments to gain a global understanding of cellular function under changing environmental conditions. We provide an overview of recent methodological developments toward solving two major computational problems within this field in the past two years (2013-2015): network reconstruction and network-based interpretation. Looking forward, we envision development of methods that can predict phenotypes with high accuracy as well as provide biologically plausible mechanistic hypotheses.
Affiliation(s)
- Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States
| | - Alireza Fotuhi Siahpirani
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States; Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States
| | - Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States; Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States.
|
20
|
Politano G, Orso F, Raimo M, Benso A, Savino A, Taverna D, Di Carlo S. CyTRANSFINDER: a Cytoscape 3.3 plugin for three-component (TF, gene, miRNA) signal transduction pathway construction. BMC Bioinformatics 2016; 17:157. [PMID: 27059647 PMCID: PMC4826505 DOI: 10.1186/s12859-016-0964-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/25/2015] [Accepted: 02/19/2016] [Indexed: 12/02/2022] Open
Abstract
Background Biological research increasingly relies on network models to study complex phenomena. Signal Transduction Pathways are molecular circuits that model how cells receive, process, and respond to information from the environment providing snapshots of the overall cell dynamics. Most of the attempts to reconstruct signal transduction pathways are limited to single regulator networks including only genes/proteins. However, networks involving a single type of regulator and neglecting transcriptional and post-transcriptional regulations mediated by transcription factors and microRNAs, respectively, may not fully reveal the complex regulatory mechanisms of a cell. We observed a lack of computational instruments supporting explorative analysis on this type of three-component signal transduction pathways. Results We have developed CyTRANSFINDER, a new Cytoscape plugin able to infer three-component signal transduction pathways based on user defined regulatory patterns and including miRNAs, TFs and genes. Since CyTRANSFINDER has been designed to support exploratory analysis, it does not rely on expression data. To show the potential of the plugin we have applied it in a study of two miRNAs that are particularly relevant in human melanoma progression, miR-146a and miR-214. Conclusions CyTRANSFINDER supports the reconstruction of small signal transduction pathways among groups of genes. Results obtained from its use in a real case study have been analyzed and validated through both literature data and preliminary wet-lab experiments, showing the potential of this tool when performing exploratory analysis. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-0964-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Gianfranco Politano
- Department of Control and Computer Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, 10129, Italy
| | - Francesca Orso
- Molecular Biotechnology Center (MBC), Via Nizza, 52, Torino, 10126, Italy; Dept. Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza, 52, Torino, 10126, Italy; Center for Complex Systems in Molecular Biology and Medicine, Via Accademia Albertina, 13, Torino, 10123, Italy
| | - Monica Raimo
- Molecular Biotechnology Center (MBC), Via Nizza, 52, Torino, 10126, Italy; Dept. Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza, 52, Torino, 10126, Italy
| | - Alfredo Benso
- Department of Control and Computer Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, 10129, Italy
| | - Alessandro Savino
- Department of Control and Computer Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, 10129, Italy
| | - Daniela Taverna
- Molecular Biotechnology Center (MBC), Via Nizza, 52, Torino, 10126, Italy; Dept. Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza, 52, Torino, 10126, Italy; Center for Complex Systems in Molecular Biology and Medicine, Via Accademia Albertina, 13, Torino, 10123, Italy
| | - Stefano Di Carlo
- Department of Control and Computer Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, 10129, Italy.
|
21
|
Crespo I, Doucey MA, Xenarios I. Social networks help to infer causality in the tumor microenvironment. BMC Res Notes 2016; 9:168. [PMID: 26979239 PMCID: PMC4793762 DOI: 10.1186/s13104-016-1976-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2016] [Accepted: 03/03/2016] [Indexed: 11/10/2022] Open
Abstract
Background Networks have become a popular way to conceptualize a system of interacting elements, such as electronic circuits, social communication, metabolism or gene regulation. Network inference, analysis, and modeling techniques have been developed in different areas of science and technology, such as computer science, mathematics, physics, and biology, with an active interdisciplinary exchange of concepts and approaches. However, some concepts seem to belong to a specific field without a clear transferability to other domains. At the same time, it is increasingly recognized that within some biological systems, such as the tumor microenvironment, where different types of resident and infiltrating cells interact to carry out their functions, the complexity of the system demands a theoretical framework, such as statistical inference, graph analysis and dynamical models, in order to assess and study the information derived from high-throughput experimental technologies. Results In this article we propose to adopt and adapt the concepts of influence and investment from the world of social network analysis to biological problems, and in particular to apply this approach to infer causality in the tumor microenvironment. We showed that constructing a bidirectional network of influence between cell and cell communication molecules allowed us to determine the direction of inferred regulations at the expression level and correctly recapitulate cause-effect relationships described in literature. Conclusions This work constitutes an example of a transfer of knowledge and concepts from the world of social network analysis to biomedical research, in particular to infer network causality in biological networks. This causality elucidation is essential to model the homeostatic response of biological systems to internal and external factors, such as environmental conditions, pathogens or treatments.
Electronic supplementary material The online version of this article (doi:10.1186/s13104-016-1976-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Isaac Crespo
- Vital-IT, SIB (Swiss Institute of Bioinformatics), University of Lausanne, Lausanne, Switzerland.
| | - Marie-Agnès Doucey
- Ludwig Center for Cancer Research, University of Lausanne, Epalinges, Switzerland
| | - Ioannis Xenarios
- Vital-IT, SIB (Swiss Institute of Bioinformatics), University of Lausanne, Lausanne, Switzerland.
|
22
|
De Maeyer D, Weytjens B, Renkens J, De Raedt L, Marchal K. PheNetic: network-based interpretation of molecular profiling data. Nucleic Acids Res 2015; 43:W244-50. [PMID: 25878035 PMCID: PMC4489255 DOI: 10.1093/nar/gkv347] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 01/31/2015] [Accepted: 04/03/2015] [Indexed: 12/17/2022] Open
Abstract
Molecular profiling experiments have become standard in current wet-lab practices. Classically, enrichment analysis has been used to identify biological functions related to these experimental results. Combining molecular profiling results with the wealth of currently available interactomics data, however, offers the opportunity to identify the molecular mechanism behind an observed molecular phenotype. In this paper, we therefore introduce ‘PheNetic’, a user-friendly web server for inferring a sub-network based on probabilistic logical querying. PheNetic extracts from an interactome the sub-network that best explains genes prioritized through a molecular profiling experiment. Depending on its run mode, PheNetic searches either for a regulatory mechanism that gave rise to the observed molecular phenotype or for the pathways (in)activated in the molecular phenotype. The web server provides access to a large number of interactomes, making sub-network inference readily applicable to a wide variety of organisms. The inferred sub-networks can be interactively visualized in the browser. PheNetic's method and use are illustrated using an example analysis of differential expression results of ampicillin treated Escherichia coli cells. The PheNetic web service is available at http://bioinformatics.intec.ugent.be/phenetic/.
Affiliation(s)
- Dries De Maeyer
- Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium; Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium
| | - Bram Weytjens
- Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium; Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium
| | - Joris Renkens
- Dept. of Computer Science, KULeuven, Leuven, 3000, Belgium
| | - Luc De Raedt
- Dept. of Computer Science, KULeuven, Leuven, 3000, Belgium
| | - Kathleen Marchal
- Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium; Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium; Dept. of Plant Biotechnology and Bioinformatics, U.Ghent, Ghent, 9052, Belgium
|
23
|
Chasman D, Ho YH, Berry DB, Nemec CM, MacGilvray ME, Hose J, Merrill AE, Lee MV, Will JL, Coon JJ, Ansari AZ, Craven M, Gasch AP. Pathway connectivity and signaling coordination in the yeast stress-activated signaling network. Mol Syst Biol 2014; 10:759. [PMID: 25411400 PMCID: PMC4299600 DOI: 10.15252/msb.20145120] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Indexed: 12/11/2022] Open
Abstract
Stressed cells coordinate a multi-faceted response spanning many levels of physiology. Yet knowledge of the complete stress-activated regulatory network as well as design principles for signal integration remains incomplete. We developed an experimental and computational approach to integrate available protein interaction data with gene fitness contributions, mutant transcriptome profiles, and phospho-proteome changes in cells responding to salt stress, to infer the salt-responsive signaling network in yeast. The inferred subnetwork presented many novel predictions by implicating new regulators, uncovering unrecognized crosstalk between known pathways, and pointing to previously unknown ‘hubs’ of signal integration. We exploited these predictions to show that Cdc14 phosphatase is a central hub in the network and that modification of RNA polymerase II coordinates induction of stress-defense genes with reduction of growth-related transcripts. We find that the orthologous human network is enriched for cancer-causing genes, underscoring the importance of the subnetwork's predictions in understanding stress biology.
Affiliation(s)
- Deborah Chasman
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA
| | - Yi-Hsuan Ho
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
| | - David B Berry
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
| | - Corey M Nemec
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
| | | | - James Hose
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
| | - Anna E Merrill
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - M Violet Lee
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Jessica L Will
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
| | - Joshua J Coon
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI, USA; Department of Biological Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Aseem Z Ansari
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI, USA
| | - Mark Craven
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Audrey P Gasch
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI, USA
|
24
|
Ghasemi O, Ma Y, Lindsey ML, Jin YF. Using systems biology approaches to understand cardiac inflammation and extracellular matrix remodeling in the setting of myocardial infarction. Wiley Interdiscip Rev Syst Biol Med 2014; 6:77-91. [PMID: 24741709 DOI: 10.1002/wsbm.1248] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/23/2022]
Abstract
Inflammation and extracellular matrix (ECM) remodeling are important components regulating the response of the left ventricle to myocardial infarction (MI). Significant cellular- and molecular-level contributors can be identified by analyzing data acquired through high-throughput genomic and proteomic technologies that provide expression levels for thousands of genes and proteins. Large-scale data provide both temporal and spatial information that need to be analyzed and interpreted using systems biology approaches in order to integrate this information into dynamic models that predict and explain mechanisms of cardiac healing post-MI. In this review, we summarize the systems biology approaches needed to computationally simulate post-MI remodeling, including data acquisition, data analysis for biomarker classification and identification, data integration to build dynamic models, and data interpretation for biological functions. An example for applying a systems biology approach to ECM remodeling is presented as a reference illustration.
|
25
|
Guo NL, Wan YW. Network-based identification of biomarkers coexpressed with multiple pathways. Cancer Inform 2014; 13:37-47. [PMID: 25392692 PMCID: PMC4218687 DOI: 10.4137/cin.s14054] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/23/2014] [Revised: 06/25/2014] [Accepted: 06/29/2014] [Indexed: 02/07/2023] Open
Abstract
Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; meanwhile, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with MSigDB database.
Affiliation(s)
- Nancy Lan Guo
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
| | - Ying-Wooi Wan
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
|
26
|
Chasman D, Gancarz B, Hao L, Ferris M, Ahlquist P, Craven M. Inferring host gene subnetworks involved in viral replication. PLoS Comput Biol 2014; 10:e1003626. [PMID: 24874113 PMCID: PMC4038467 DOI: 10.1371/journal.pcbi.1003626] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/15/2013] [Accepted: 02/06/2014] [Indexed: 12/16/2022] Open
Abstract
Systematic, genome-wide loss-of-function experiments can be used to identify host factors that directly or indirectly facilitate or inhibit the replication of a virus in a host cell. We present an approach that combines an integer linear program and a diffusion kernel method to infer the pathways through which those host factors modulate viral replication. The inputs to the method are a set of viral phenotypes observed in single-host-gene mutants and a background network consisting of a variety of host intracellular interactions. The output is an ensemble of subnetworks that provides a consistent explanation for the measured phenotypes, predicts which unassayed host factors modulate the virus, and predicts which host factors are the most direct interfaces with the virus. We infer host-virus interaction subnetworks using data from experiments screening the yeast genome for genes modulating the replication of two RNA viruses. Because a gold-standard network is unavailable, we assess the predicted subnetworks using both computational and qualitative analyses. We conduct a cross-validation experiment in which we predict whether held-aside test genes have an effect on viral replication. Our approach is able to make high-confidence predictions more accurately than several baselines, and about as well as the best baseline, which does not infer mechanistic pathways. We also examine two kinds of predictions made by our method: which host factors are nearest to a direct interaction with a viral component, and which unassayed host genes are likely to be involved in viral replication. Multiple predictions are supported by recent independent experimental data, or are components or functional partners of confirmed relevant complexes or pathways. Integer program code, background network data, and inferred host-virus subnetworks are available at http://www.biostat.wisc.edu/~craven/chasman_host_virus/.
Affiliation(s)
- Deborah Chasman
- Department of Computer Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Brandi Gancarz
- Luminex Corporation, Madison, Wisconsin, United States of America
- Institute for Molecular Virology, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Linhui Hao
- Institute for Molecular Virology, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Howard Hughes Medical Institute, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Michael Ferris
- Department of Computer Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Paul Ahlquist
- Institute for Molecular Virology, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Howard Hughes Medical Institute, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Morgridge Institute for Research, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Mark Craven
- Department of Computer Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
27
Abstract
The graph orientation problem calls for orienting the edges of a graph so as to maximize the number of pre-specified source-target vertex pairs that admit a directed path from the source to the target. Most algorithmic approaches to this problem share a common preprocessing step, in which the input graph is reduced to a tree by repeatedly contracting its cycles. Although this reduction is valid from an algorithmic perspective, the assignment of directions to the edges of the contracted cycles becomes arbitrary, and the connecting source-target paths may be arbitrarily long. In the context of biological networks, the connection of vertex pairs via shortest paths is highly motivated, leading to the following problem variant: given a graph and a collection of source-target vertex pairs, assign directions to the edges so as to maximize the number of pairs that are connected by a shortest (in the original graph) directed path. This problem is NP-complete and hard to approximate to within sub-polynomial factors. Here we provide a first polynomial-size integer linear program formulation for this problem, which allows its exact solution in seconds on current networks. We apply our algorithm to orient protein-protein interaction networks in yeast and compare it with two state-of-the-art algorithms. We find that our algorithm outperforms previous approaches and can orient considerable parts of the network, thus revealing its structure and function. AVAILABILITY AND IMPLEMENTATION The source code is available at www.cs.tau.ac.il/~roded/shortest.zip. CONTACT roded@post.tau.ac.il.
Affiliation(s)
- Dana Silverbush
- The Blavatnik School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
28
Lan A, Ziv-Ukelson M, Yeger-Lotem E. A context-sensitive framework for the analysis of human signalling pathways in molecular interaction networks. Bioinformatics 2013; 29:i210-6. [PMID: 23812986 PMCID: PMC3694656 DOI: 10.1093/bioinformatics/btt240] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023]
Abstract
MOTIVATION A major challenge in systems biology is to reveal the cellular pathways that give rise to specific phenotypes and behaviours. Current techniques often rely on a network representation of molecular interactions, where each node represents a protein or a gene and each interaction is assigned a single static score. However, the use of single interaction scores fails to capture the tendency of proteins to favour different partners under distinct cellular conditions. RESULTS Here, we propose a novel context-sensitive network model, in which genes and protein nodes are assigned multiple contexts based on their gene ontology annotations, and their interactions are associated with multiple context-sensitive scores. Using this model, we developed a new approach and a corresponding tool, ContextNet, based on a dynamic programming algorithm for identifying signalling paths linking proteins to their downstream target genes. ContextNet finds high-ranking context-sensitive paths in the interactome, thereby revealing the intermediate proteins in the path and their path-specific contexts. We validated the model using 18 348 manually curated cellular paths derived from the SPIKE database. We next applied our framework to elucidate the responses of human primary lung cells to influenza infection. Top-ranking paths were much more likely to contain infection-related proteins, and this likelihood was highly correlated with path score. Moreover, the contexts assigned by the algorithm pointed to putative, as well as previously known responses to viral infection. Thus, context sensitivity is an important extension to current network biology models and can be efficiently used to elucidate cellular response mechanisms. AVAILABILITY ContextNet is publicly available at http://netbio.bgu.ac.il/ContextNet. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alexander Lan
- Department of Computer Science, National Center for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
29
Abstract
MOTIVATION Several types of studies, including genome-wide association studies and RNA interference screens, strive to link genes to diseases. Although these approaches have had some success, genetic variants are often only present in a small subset of the population, and screens are noisy with low overlap between experiments in different labs. Neither provides a mechanistic model explaining how identified genes impact the disease of interest or the dynamics of the pathways those genes regulate. Such mechanistic models could be used to accurately predict downstream effects of knocking down pathway members and allow comprehensive exploration of the effects of targeting pairs or higher-order combinations of genes. RESULTS We developed methods to model the activation of signaling and dynamic regulatory networks involved in disease progression. Our model, SDREM, integrates static and time series data to link proteins and the pathways they regulate in these networks. SDREM uses prior information about proteins' likelihood of involvement in a disease (e.g. from screens) to improve the quality of the predicted signaling pathways. We used our algorithms to study the human immune response to H1N1 influenza infection. The resulting networks correctly identified many of the known pathways and transcriptional regulators of this disease. Furthermore, they accurately predict RNA interference effects and can be used to infer genetic interactions, greatly improving over other methods suggested for this task. Applying our method to the more pathogenic H5N1 influenza allowed us to identify several strain-specific targets of this infection. AVAILABILITY SDREM is available from http://sb.cs.cmu.edu/sdrem. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anthony Gitter
- Computer Science Department and Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
30
31
Blokh D, Segev D, Sharan R. The approximability of shortest path-based graph orientations of protein-protein interaction networks. J Comput Biol 2013; 20:945-57. [PMID: 24073924 DOI: 10.1089/cmb.2013.0064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
The graph orientation problem calls for orienting the edges of an undirected graph so as to maximize the number of prespecified source-target vertex pairs that admit a directed path from the source to the target. Most algorithmic approaches to this problem share a common preprocessing step, in which the input graph is reduced to a tree by repeatedly contracting its cycles. Although this reduction is valid from an algorithmic perspective, the assignment of directions to the edges of the contracted cycles becomes arbitrary and, consequently, the connecting source-target paths may be arbitrarily long. In the context of biological networks, the connection of vertex pairs via shortest paths is highly motivated, leading to the following variant: Given an undirected graph and a collection of source-target vertex pairs, assign directions to the edges so as to maximize the number of pairs that are connected by a shortest (in the original graph) directed path. Here we study this variant, provide strong inapproximability results for it, and propose approximation algorithms for the problem, as well as for relaxations where the connecting paths need only be approximately shortest.
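To make the problem variant concrete, here is a minimal brute-force sketch in Python. It is not the approximation algorithm studied in this paper, nor the integer linear program from the related work; it simply enumerates every orientation of a tiny graph and counts the source-target pairs joined by a directed path whose length equals the shortest-path distance in the original undirected graph. The function names `bfs_dist` and `best_orientation` and the toy graph are illustrative assumptions, and the enumeration is exponential in the number of edges, so it is only usable on very small inputs.

```python
from collections import deque
from itertools import product


def bfs_dist(adj, src):
    """Hop distances from src, following the adjacency lists in adj."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def best_orientation(edges, pairs):
    """Try every orientation; return (satisfied_pair_count, directed_edges)."""
    # Distances in the undirected graph define what "shortest" means.
    undirected = {}
    for u, v in edges:
        undirected.setdefault(u, []).append(v)
        undirected.setdefault(v, []).append(u)
    target_len = {(s, t): bfs_dist(undirected, s).get(t) for s, t in pairs}

    best_score, best_directed = -1, None
    for flips in product((False, True), repeat=len(edges)):
        directed = [(v, u) if f else (u, v) for (u, v), f in zip(edges, flips)]
        adj = {}
        for u, v in directed:
            adj.setdefault(u, []).append(v)
        score = 0
        for s, t in pairs:
            d = bfs_dist(adj, s).get(t)
            # A directed path can never be shorter than the undirected
            # distance, so equality means a shortest path was preserved.
            if d is not None and d == target_len[(s, t)]:
                score += 1
        if score > best_score:
            best_score, best_directed = score, directed
    return best_score, best_directed


# Hypothetical toy input: a triangle whose three pairs each need one edge.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
pairs = [("a", "c"), ("c", "b"), ("b", "a")]
score, directed = best_orientation(edges, pairs)
print(score, directed)
```

On the triangle, orienting the edges as a directed cycle satisfies all three pairs. On a simple path a-b-c with the pairs (a, c) and (c, a), at most one pair can be satisfied, which illustrates why the choice of orientation matters.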
Affiliation(s)
- Dima Blokh
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
32
Dynamics of the Saccharomyces cerevisiae transcriptome during bread dough fermentation. Appl Environ Microbiol 2013; 79:7325-33. [PMID: 24056467 DOI: 10.1128/aem.02649-13] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/18/2023]
Abstract
The behavior of yeast cells during industrial processes such as the production of beer, wine, and bioethanol has been extensively studied. In contrast, our knowledge about yeast physiology during solid-state processes, such as bread dough, cheese, or cocoa fermentation, remains limited. We investigated changes in the transcriptomes of three genetically distinct Saccharomyces cerevisiae strains during bread dough fermentation. Our results show that regardless of the genetic background, all three strains exhibit similar changes in expression patterns. At the onset of fermentation, expression of glucose-regulated genes changes dramatically, and the osmotic stress response is activated. The middle fermentation phase is characterized by the induction of genes involved in amino acid metabolism. Finally, at the latest time point, cells suffer from nutrient depletion and activate pathways associated with starvation and stress responses. Further analysis shows that genes regulated by the high-osmolarity glycerol (HOG) pathway, the major pathway involved in the response to osmotic stress and glycerol homeostasis, are among the most differentially expressed genes at the onset of fermentation. More importantly, deletion of HOG1 and other genes of this pathway significantly reduces the fermentation capacity. Together, our results demonstrate that cells embedded in a solid matrix such as bread dough suffer severe osmotic stress and that a proper induction of the HOG pathway is critical for optimal fermentation.
33
Tuncbag N, Braunstein A, Pagnani A, Huang SSC, Chayes J, Borgs C, Zecchina R, Fraenkel E. Simultaneous reconstruction of multiple signaling pathways via the prize-collecting steiner forest problem. J Comput Biol 2013; 20:124-36. [PMID: 23383998 DOI: 10.1089/cmb.2012.0092] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Indexed: 12/25/2022]
Abstract
Signaling and regulatory networks are essential for cells to control processes such as growth, differentiation, and response to stimuli. Although many "omic" data sources are available to probe signaling pathways, these data are typically sparse and noisy. Thus, it has been difficult to use these data to discover the cause of the diseases and to propose new therapeutic strategies. We overcome these problems and use "omic" data to reconstruct simultaneously multiple pathways that are altered in a particular condition by solving the prize-collecting Steiner forest problem. To evaluate this approach, we use the well-characterized yeast pheromone response. We then apply the method to human glioblastoma data, searching for a forest of trees, each of which is rooted in a different cell-surface receptor. This approach discovers both overlapping and independent signaling pathways that are enriched in functionally and clinically relevant proteins, which could provide the basis for new therapeutic strategies. Although the algorithm was not provided with any information about the phosphorylation status of receptors, it identifies a small set of clinically relevant receptors among hundreds present in the interactome.
Affiliation(s)
- Nurcan Tuncbag
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
34
Abstract
High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets.
Affiliation(s)
- Bonnie Berger
- Department of Mathematics and Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
35
Glass K, Huttenhower C, Quackenbush J, Yuan GC. Passing messages between biological networks to refine predicted interactions. PLoS One 2013; 8:e64832. [PMID: 23741402 PMCID: PMC3669401 DOI: 10.1371/journal.pone.0064832] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Received: 11/26/2012] [Accepted: 04/17/2013] [Indexed: 01/10/2023]
Abstract
Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.
Affiliation(s)
- Kimberly Glass
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Curtis Huttenhower
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- John Quackenbush
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Guo-Cheng Yuan
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
36
De Maeyer D, Renkens J, Cloots L, De Raedt L, Marchal K. PheNetic: network-based interpretation of unstructured gene lists in E. coli. Mol Biosyst 2013; 9:1594-603. [PMID: 23591551 DOI: 10.1039/c3mb25551d] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
At the present time, omics experiments are commonly used in wet lab practice to identify leads involved in interesting phenotypes. These omics experiments often result in unstructured gene lists, the interpretation of which in terms of pathways or the mode of action is challenging. To aid in the interpretation of such gene lists, we developed PheNetic, a decision theoretic method that exploits publicly available information, captured in a comprehensive interaction network to obtain a mechanistic view of the listed genes. PheNetic selects from an interaction network the sub-networks highlighted by these gene lists. We applied PheNetic to an Escherichia coli interaction network to reanalyse a previously published KO compendium, assessing gene expression of 27 E. coli knock-out mutants under mild acidic conditions. Being able to unveil previously described mechanisms involved in acid resistance demonstrated both the performance of our method and the added value of our integrated E. coli network. PheNetic is available at .
Affiliation(s)
- Dries De Maeyer
- Center of Microbial and Plant Genetics, Katholieke Universiteit Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
37
Verbeke LPC, Cloots L, Demeester P, Fostier J, Marchal K. EPSILON: an eQTL prioritization framework using similarity measures derived from local networks. Bioinformatics 2013; 29:1308-16. [PMID: 23595663 DOI: 10.1093/bioinformatics/btt142] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
MOTIVATION When genomic data are associated with gene expression data, the resulting expression quantitative trait loci (eQTL) will likely span multiple genes. eQTL prioritization techniques can be used to select the most likely causal gene affecting the expression of a target gene from a list of candidates. As an input, these techniques use physical interaction networks that often contain highly connected genes and unreliable or irrelevant interactions that can interfere with the prioritization process. We present EPSILON, an extendable framework for eQTL prioritization, which mitigates the effect of highly connected genes and unreliable interactions by constructing a local network before a network-based similarity measure is applied to select the true causal gene. RESULTS We tested the new method on three eQTL datasets derived from yeast data using three different association techniques. A physical interaction network was constructed, and each eQTL in each dataset was prioritized using the EPSILON approach: first, a local network was constructed using a k-trials shortest path algorithm, followed by the calculation of a network-based similarity measure. Three similarity measures were evaluated: random walks, the Laplacian Exponential Diffusion kernel and the Regularized Commute-Time kernel. The aim was to predict knockout interactions from a yeast knockout compendium. EPSILON outperformed two reference prioritization methods, random assignment and shortest path prioritization. Next, we found that using a local network significantly increased prioritization performance in terms of predicted knockout pairs when compared with using exactly the same network similarity measures on the global network, with an average increase in prioritization performance of 8 percentage points (P < 10^-5). AVAILABILITY The physical interaction network and the source code (Matlab/C++) of our implementation can be downloaded from http://bioinformatics.intec.ugent.be/epsilon. CONTACT lieven.verbeke@intec.ugent.be, kamar@psb.ugent.be, jan.fostier@intec.ugent.be SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lieven P C Verbeke
- Department of Information Technology, Ghent University - iMinds, 9050 Gent, Belgium.
38
Gosline SJC, Spencer SJ, Ursu O, Fraenkel E. SAMNet: a network-based approach to integrate multi-dimensional high throughput datasets. Integr Biol (Camb) 2013; 4:1415-27. [PMID: 23060147 DOI: 10.1039/c2ib20072d] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/20/2022]
Abstract
The rapid development of high throughput biotechnologies has led to an onslaught of data describing genetic perturbations and changes in mRNA and protein levels in the cell. Because each assay provides a one-dimensional snapshot of active signaling pathways, it has become desirable to perform multiple assays (e.g. mRNA expression and phospho-proteomics) to measure a single condition. However, as experiments expand to accommodate various cellular conditions, proper analysis and interpretation of these data have become more challenging. Here we introduce a novel approach called SAMNet, for Simultaneous Analysis of Multiple Networks, that is able to interpret diverse assays over multiple perturbations. The algorithm uses a constrained optimization approach to integrate mRNA expression data with upstream genes, selecting edges in the protein-protein interaction network that best explain the changes across all perturbations. The result is a putative set of protein interactions that succinctly summarizes the results from all experiments, highlighting the network elements unique to each perturbation. We evaluated SAMNet in both yeast and human datasets. The yeast dataset measured the cellular response to seven different transition metals, and the human dataset measured cellular changes in four different lung cancer models of Epithelial-Mesenchymal Transition (EMT), a crucial process in tumor metastasis. SAMNet was able to identify canonical yeast metal-processing genes unique to each commodity in the yeast dataset, as well as human genes such as β-catenin and TCF7L2/TCF4 that are required for EMT signaling but escaped detection in the mRNA and phospho-proteomic data. Moreover, SAMNet also highlighted drugs likely to modulate EMT, identifying a series of less canonical genes known to be affected by the BCR-ABL inhibitor imatinib (Gleevec), suggesting a possible influence of this drug on EMT.
Affiliation(s)
- Sara J C Gosline
- Dept. of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
39
Atias N, Sharan R. iPoint: an integer programming based algorithm for inferring protein subnetworks. Mol Biosyst 2013; 9:1662-9. [PMID: 23385645 DOI: 10.1039/c3mb25432a] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/06/2023]
Abstract
Large scale screening experiments have become the workhorse of molecular biology, producing data at an ever increasing scale. The interpretation of such data, particularly in the context of a protein interaction network, has the potential to shed light on the molecular pathways underlying the phenotype or the process in question. A host of approaches have been developed in recent years to tackle this reconstruction challenge. These approaches aim to infer a compact subnetwork that connects the genes revealed by the screen while optimizing local (individual path lengths) or global (likelihood) aspects of the subnetwork. Yosef et al. [Mol. Syst. Biol., 2009, 5, 248] were the first to provide a joint optimization of both criteria, albeit approximate in nature. Here we devise an integer linear programming formulation for the joint optimization problem, allowing us to solve it to optimality in minutes on current networks. We apply our algorithm, iPoint, to various data sets in yeast and human and evaluate its performance against state-of-the-art algorithms. We show that iPoint attains very compact and accurate solutions that outperform previous network inference algorithms with respect to their local and global attributes, their consistency across multiple experiments targeting the same pathway, and their agreement with current biological knowledge.
Affiliation(s)
- Nir Atias
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
40
Hashemikhabir S, Ayaz ES, Kavurucu Y, Can T, Kahveci T. Large-scale signaling network reconstruction. IEEE/ACM Trans Comput Biol Bioinform 2012; 9:1696-1708. [PMID: 23221085 DOI: 10.1109/tcbb.2012.128] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Reconstructing the topology of a signaling network by means of RNA interference (RNAi) technology is an underdetermined problem especially when a single gene in the network is knocked down or observed. In addition, the exponential search space limits the existing methods to small signaling networks of size 10-15 genes. In this paper, we propose integrating RNAi data with a reference physical interaction network. We formulate the problem of signaling network reconstruction as finding the minimum number of edit operations on a given reference network. The edit operations transform the reference network to a network that satisfies the RNAi observations. We show that using a reference network does not simplify the computational complexity of the problem. Therefore, we propose two methods which provide near optimal results and can scale well for reconstructing networks up to hundreds of components. We validate the proposed methods on synthetic and real data sets. Comparison with the state of the art on real signaling networks shows that the proposed methodology can scale better and generates biologically significant results.
41
Gitter A, Carmi M, Barkai N, Bar-Joseph Z. Linking the signaling cascades and dynamic regulatory networks controlling stress responses. Genome Res 2012; 23:365-76. [PMID: 23064748 PMCID: PMC3561877 DOI: 10.1101/gr.138628.112] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Indexed: 11/24/2022]
Abstract
Accurate models of the cross-talk between signaling pathways and transcriptional regulatory networks within cells are essential to understand complex response programs. We present a new computational method that combines condition-specific time-series expression data with general protein interaction data to reconstruct dynamic and causal stress response networks. These networks characterize the pathways involved in the response, their time of activation, and the affected genes. The signaling and regulatory components of our networks are linked via a set of common transcription factors that serve as targets in the signaling network and as regulators of the transcriptional response network. Detailed case studies of stress responses in budding yeast demonstrate the predictive power of our method. Our method correctly identifies the core signaling proteins and transcription factors of the response programs. It further predicts the involvement of additional transcription factors and other proteins not previously implicated in the response pathways. We experimentally verify several of these predictions for the osmotic stress response network. Our approach requires little condition-specific data: only a partial set of upstream initiators and time-series gene expression data, which are readily available for many conditions and species. Consequently, our method is widely applicable and can be used to derive accurate, dynamic response models in several species.
Affiliation(s)
- Anthony Gitter
- Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
42
Ho JWK. Application of a systems approach to study developmental gene regulation. Biophys Rev 2012; 4:245-253. [PMID: 28510076 DOI: 10.1007/s12551-012-0092-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/14/2012] [Accepted: 06/21/2012] [Indexed: 12/20/2022]
Abstract
All cells in a multicellular organism contain the same genome, yet different cell types express different sets of genes. Recent advances in high throughput genomic technologies have opened up new opportunities to understand the gene regulatory network in diverse cell types in a genome-wide manner. Here, I discuss recent advances in experimental and computational approaches for the study of gene regulation in embryonic development from a systems perspective. This review is written for computational biologists who have an interest in studying developmental gene regulation through integrative analysis of gene expression, chromatin landscape, and signaling pathways. I highlight the utility of publicly available data and tools, as well as some common analysis approaches.
Affiliation(s)
- Joshua W K Ho
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.
43
Suen S, Lu HHS, Yeang CH. Evolution of domain architectures and catalytic functions of enzymes in metabolic systems. Genome Biol Evol 2012; 4:976-93. [PMID: 22936075 PMCID: PMC3468959 DOI: 10.1093/gbe/evs072] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]
Abstract
Domain architectures and catalytic functions of enzymes constitute the centerpieces of a metabolic network. These types of information are formulated as a two-layered network consisting of domains, proteins, and reactions: a domain-protein-reaction (DPR) network. We propose an algorithm to reconstruct the evolutionary history of DPR networks across multiple species and categorize the mechanisms of metabolic systems evolution in terms of network changes. The reconstructed history reveals distinct patterns of evolutionary mechanisms between prokaryotic and eukaryotic networks. Although the evolutionary mechanisms in early ancestors of prokaryotes and eukaryotes are quite similar, more novel and duplicated domain compositions with identical catalytic functions arise along the eukaryotic lineage. In contrast, prokaryotic enzymes become more versatile by catalyzing multiple reactions with similar chemical operations. Moreover, different metabolic pathways are enriched with distinct network evolution mechanisms. For instance, although the pathways of steroid biosynthesis, protein kinases, and glycosaminoglycan biosynthesis all constitute prominent features of animal-specific physiology, their evolution of domain architectures and catalytic functions follows distinct patterns. Steroid biosynthesis is enriched with reaction creations but retains a relatively conserved repertoire of domain compositions and proteins. Protein kinases retain conserved reactions but possess many novel domains and proteins. In contrast, glycosaminoglycan biosynthesis has high rates of reaction/protein creations and domain recruitments. Finally, we elicit and validate two general principles underlying the evolution of DPR networks: 1) duplicated enzyme proteins possess similar catalytic functions and 2) the majority of novel domains arise to catalyze novel reactions. These results shed new light on the evolution of metabolic systems.
Affiliation(s)
- Summit Suen
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
|
44
|
Feiglin A, Hacohen A, Sarusi A, Fisher J, Unger R, Ofran Y. Static network structure can be used to model the phenotypic effects of perturbations in regulatory networks. Bioinformatics 2012; 28:2811-8. [PMID: 22923292 DOI: 10.1093/bioinformatics/bts517] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Biological processes are dynamic, whereas the networks that depict them are typically static. Quantitative modeling using differential equations or logic-based functions can offer quantitative predictions of the behavior of biological systems, but it requires detailed experimental characterization of interaction kinetics, which is typically unavailable. To determine to what extent complex biological processes can be modeled and analyzed using only the static structure of the network (i.e. the direction and sign of the edges), we attempt to predict the phenotypic effect of perturbations in biological networks from the static network structure. RESULTS: We analyzed three networks from different sources: the EGFR/MAPK and PI3K/AKT network from a detailed experimental study, the TNF regulatory network from the STRING database and a large network of all NCI-curated pathways from the Protein Interaction Database. Altogether, we predicted the effect of 39 perturbations (e.g. by one or two drugs) on 433 target proteins/genes. In up to 82% of the cases, an algorithm that used only the static structure of the network correctly predicted whether any given protein/gene is upregulated or downregulated as a result of perturbations of other proteins/genes. CONCLUSION: While quantitative modeling requires detailed experimental data and heavy computations, which limit its scalability for large networks, a wiring-based approach can use available data from pathway and interaction databases and may be scalable. These results lay the foundations for a large-scale approach of predicting phenotypes based on the schematic structure of networks.
Affiliation(s)
- Ariel Feiglin
- The Goodman faculty of life sciences, Bar Ilan University, Ramat Gan 52900, Israel
|
45
|
Bar-Joseph Z, Gitter A, Simon I. Studying and modelling dynamic biological processes using time-series gene expression data. Nat Rev Genet 2012; 13:552-64. [PMID: 22805708 DOI: 10.1038/nrg3244] [Citation(s) in RCA: 291] [Impact Index Per Article: 24.3] [Indexed: 12/19/2022]
Abstract
Biological processes are often dynamic; thus, researchers must monitor their activity at multiple time points. The most abundant source of information regarding such dynamic activity is time-series gene expression data. These data are used to identify the complete set of activated genes in a biological process, to infer their rates of change, their order and their causal effects, and to model dynamic systems in the cell. In this Review we discuss the basic patterns that have been observed in time-series experiments, how these patterns are combined to form expression programs, and the computational analysis, visualization and integration of these data to infer models of dynamic biological systems.
Affiliation(s)
- Ziv Bar-Joseph
- Lane Center for Computational Biology and Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
|
46
|
Tuncbag N, McCallum S, Huang SSC, Fraenkel E. SteinerNet: a web server for integrating 'omic' data to discover hidden components of response pathways. Nucleic Acids Res 2012; 40:W505-9. [PMID: 22638579 PMCID: PMC3394335 DOI: 10.1093/nar/gks445] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 11/14/2022] Open
Abstract
High-throughput technologies including transcriptional profiling, proteomics and reverse genetics screens provide detailed molecular descriptions of cellular responses to perturbations. However, it is difficult to integrate these diverse data to reconstruct biologically meaningful signaling networks. Previously, we have established a framework for integrating transcriptional, proteomic and interactome data by searching for the solution to the prize-collecting Steiner tree problem. Here, we present a web server, SteinerNet, to make this method available in a user-friendly format for a broad range of users with data from any species. At a minimum, a user only needs to provide a set of experimentally detected proteins and/or genes and the server will search for connections among these data from the provided interactomes for yeast, human, mouse, Drosophila melanogaster and Caenorhabditis elegans. More advanced users can upload their own interactome data as well. The server provides interactive visualization of the resulting optimal network and downloadable files detailing the analysis and results. We believe that SteinerNet will be useful for researchers who would like to integrate their high-throughput data for a specific condition or cellular response and to find biologically meaningful pathways. SteinerNet is accessible at http://fraenkel.mit.edu/steinernet.
Affiliation(s)
- Nurcan Tuncbag
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
47
|
Silberberg Y, Gottlieb A, Kupiec M, Ruppin E, Sharan R. Large-scale elucidation of drug response pathways in humans. J Comput Biol 2012; 19:163-74. [PMID: 22300318 DOI: 10.1089/cmb.2011.0264] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 12/16/2022] Open
Abstract
Elucidating signaling pathways is a fundamental step in understanding cellular processes and developing new therapeutic strategies. Here we introduce a method for the large-scale elucidation of signaling pathways involved in cellular response to drugs. Combining drug targets, drug response expression profiles, and the human physical interaction network, we infer 99 human drug response pathways and study their properties. Based on the newly inferred pathways, we develop a pathway-based drug-drug similarity measure and compare it to two common, gold standard drug-drug similarity measures. Remarkably, our measure provides better correspondence to these gold standards than similarity measures that are based on associations between drugs and known pathways, or on drug-specific gene expression profiles. It further improves the prediction of drug side effects and indications, elucidating specific response pathways that may be associated with these drug properties. Supplementary Material for this article is available at www.liebertonline.com/cmb.
Affiliation(s)
- Yael Silberberg
- Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel
|
48
|
Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem. Lecture Notes in Computer Science 2012. [DOI: 10.1007/978-3-642-29627-7_31] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
|
49
|
Affiliation(s)
- Nancy Lan Guo
- Mary Babb Randolph Cancer Center/Department of Community Medicine, School of Medicine, West Virginia University, Morgantown, WV 26506-9300
|
50
|
Novershtern N, Regev A, Friedman N. Physical Module Networks: an integrative approach for reconstructing transcription regulation. Bioinformatics 2011; 27:i177-85. [PMID: 21685068 PMCID: PMC3117354 DOI: 10.1093/bioinformatics/btr222] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 12/18/2022] Open
Abstract
Motivation: Deciphering the complex mechanisms by which regulatory networks control gene expression remains a major challenge. While some studies infer regulation from dependencies between the expression levels of putative regulators and their targets, others focus on measured physical interactions. Results: Here, we present Physical Module Networks, a unified framework that combines a Bayesian model describing modules of co-expressed genes and their shared regulation programs, and a physical interaction graph, describing the protein-protein interactions and protein-DNA binding events that coherently underlie this regulation. Using synthetic data, we demonstrate that a Physical Module Network model has similar recall and improved precision compared to a simple Module Network, as it omits many false positive regulators. Finally, we show the power of Physical Module Networks to reconstruct meaningful regulatory pathways in the genetically perturbed yeast and during the yeast cell cycle, as well as during the response of primary epithelial human cells to infection with H1N1 influenza. Availability: The PMN software is available, free for academic use, at http://www.compbio.cs.huji.ac.il/PMN/. Contact: aregev@broad.mit.edu; nirf@cs.huji.ac.il
Affiliation(s)
- Noa Novershtern
- School of Computer Science, Hebrew University, Jerusalem 91904, Israel
|