1. Boolean function metrics can assist modelers to check and choose logical rules. J Theor Biol 2022; 538:111025. [DOI: 10.1016/j.jtbi.2022.111025] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/11/2021] [Revised: 12/07/2021] [Accepted: 01/10/2022] [Indexed: 12/25/2022]

2. Alim MA, Ay A, Hasan MM, Thai MT, Kahveci T. Construction of Signaling Pathways with RNAi Data and Multiple Reference Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:1079-1091. [PMID: 30102599 DOI: 10.1109/tcbb.2017.2710129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Signaling networks are involved in almost all major diseases such as cancer. As a result, understanding how signaling networks function is vital for finding new treatments for many diseases. Using gene knockdown assays such as RNA interference (RNAi) technology, many genes involved in these networks can be identified. However, determining the interactions between these genes in the signaling networks using only experimental techniques is very challenging, as performing extensive experiments is very expensive and sometimes even impractical. Construction of signaling networks from RNAi data using computational techniques has been proposed as an alternative way to solve this challenging problem. However, the earlier approaches are either not scalable to large-scale networks, or their accuracy levels are not satisfactory. In this study, we integrate RNAi data given on a target network with multiple reference signaling networks and phylogenetic trees to construct the topology of the target signaling network. In our work, the network construction is considered as finding the minimum number of edit operations on given multiple reference networks, in which their contributions are weighted by their phylogenetic distances to the target network. The edit operations on the reference networks lead to a target network that satisfies the RNAi knockdown observations. Here, we propose two new reference-based signaling network construction methods that provide optimal results and scale well to large-scale signaling networks of hundreds of components. We compare the performance of these approaches to the state-of-the-art reference-based network construction method SiNeC on synthetic, semi-synthetic, and real datasets. Our analyses show that the proposed methods outperform the SiNeC method in terms of accuracy. Furthermore, we show that our methods function well even if evolutionarily distant reference networks are used. Application of our methods to the Apoptosis and Wnt signaling pathways recovers the known protein-protein interactions and suggests additional relevant interactions that can be tested experimentally.

3. Mathematical and Computational Modeling in Complex Biological Systems. BIOMED RESEARCH INTERNATIONAL 2017; 2017:5958321. [PMID: 28386558 PMCID: PMC5366773 DOI: 10.1155/2017/5958321] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 10/16/2016] [Revised: 12/20/2016] [Accepted: 01/16/2017] [Indexed: 12/22/2022]
Abstract
The biological process and molecular functions involved in the cancer progression remain difficult to understand for biologists and clinical doctors. Recent developments in high-throughput technologies urge the systems biology to achieve more precise models for complex diseases. Computational and mathematical models are gradually being used to help us understand the omics data produced by high-throughput experimental techniques. The use of computational models in systems biology allows us to explore the pathogenesis of complex diseases, improve our understanding of the latent molecular mechanisms, and promote treatment strategy optimization and new drug discovery. Currently, it is urgent to bridge the gap between the developments of high-throughput technologies and systemic modeling of the biological process in cancer research. In this review, we firstly studied several typical mathematical modeling approaches of biological systems in different scales and deeply analyzed their characteristics, advantages, applications, and limitations. Next, three potential research directions in systems modeling were summarized. To conclude, this review provides an update of important solutions using computational modeling approaches in systems biology.

4. Ren Y, Wang Q, Hasan MM, Ay A, Kahveci T. Identifying the topology of signaling networks from partial RNAi data. BMC SYSTEMS BIOLOGY 2016; 10 Suppl 2:53. [PMID: 27490106 PMCID: PMC4977480 DOI: 10.1186/s12918-016-0301-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Background Methods for inferring signaling networks using single gene knockdown RNAi experiments and reference networks have been proposed in recent years. These methods assume that RNAi information is available for all the genes in the signal transduction pathway, i.e., complete. This assumption does not always hold up since RNAi experiments are often incomplete and information for some genes is missing. Results In this article, we develop two methods to construct signaling networks from incomplete RNAi data with the help of a reference network. These methods infer the RNAi constraints for the missing genes such that the inferred network is closest to the reference network. We perform extensive experiments with both real and synthetic networks and demonstrate that these methods produce accurate results efficiently. Conclusions Application of our methods to Wnt signal transduction pathway has shown that our methods can be used to construct highly accurate signaling networks from experimental data in less than 100 ms. The two methods that produce accurate results efficiently show great promise of constructing real signaling networks.
Affiliation(s)
- Yuanfang Ren
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
- Qiyao Wang
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
- Md Mahmudul Hasan
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA
- Ahmet Ay
  - Department of Biology & Mathematics, Colgate University, Hamilton, 13346, NY, USA
- Tamer Kahveci
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, 32611, FL, USA

5. Acharya L, Reynolds R, Zhu D. Network inference through synergistic subnetwork evolution. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2015; 2015:12. [PMID: 26640480 PMCID: PMC4662719 DOI: 10.1186/s13637-015-0027-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2015] [Accepted: 08/21/2015] [Indexed: 12/02/2022]
Abstract
Study of signaling networks is important for a better understanding of cell behaviors, e.g., growth, differentiation, metabolism, and apoptosis, and for gaining deeper insights into the molecular mechanisms of complex diseases. While there have been many successes in developing computational approaches for identifying potential genes and proteins involved in cell signaling, new methods are needed for identifying network structures that depict underlying signal cascading mechanisms. In this paper, we propose a new computational approach for inferring signaling network structures from overlapping gene sets related to the networks. In the proposed approach, a signaling network is represented as a directed graph and is viewed as a union of many active paths representing linear and overlapping chains of signal cascading activities in the network. Gene sets represent the sets of genes participating in active paths without prior knowledge of the order in which genes occur within each path. From a compendium of unordered gene sets, the proposed algorithm reconstructs the underlying network structure through evolution of synergistic active paths. In our context, the extent of edge overlapping among active paths is used to define the synergy present in a network. We evaluated the performance of the proposed algorithm in terms of its convergence and recovering true active paths by utilizing four gene set compendiums derived from the KEGG database. Evaluation of the results demonstrates the ability of the algorithm to reconstruct the underlying networks with high accuracy and precision.
Affiliation(s)
- Lipi Acharya
  - Dow AgroSciences, 9330 Zionsville Road, Indianapolis, IN 46268 USA
- Robert Reynolds
  - Department of Computer Science, Wayne State University, 5057 Woodward Avenue, Detroit, MI 48202 USA
- Dongxiao Zhu
  - Department of Computer Science, Wayne State University, 5057 Woodward Avenue, Detroit, MI 48202 USA

6. Rodriguez A, Crespo I, Fournier A, del Sol A. Discrete Logic Modelling Optimization to Contextualize Prior Knowledge Networks Using PRUNET. PLoS One 2015; 10:e0127216. [PMID: 26058016 PMCID: PMC4461287 DOI: 10.1371/journal.pone.0127216] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 09/05/2014] [Accepted: 04/13/2015] [Indexed: 01/09/2023]
Abstract
High-throughput technologies have led to the generation of an increasing amount of data in different areas of biology. Datasets capturing the cell's response to its intra- and extra-cellular microenvironment allow such data to be incorporated as signed and directed graphs or influence networks. These prior knowledge networks (PKNs) represent our current knowledge of the causality of cellular signal transduction. New signalling data is often examined and interpreted in conjunction with PKNs. However, different biological contexts, such as cell type or disease states, may have distinct variants of signalling pathways, resulting in the misinterpretation of new data. The identification of inconsistencies between measured data and signalling topologies, as well as the training of PKNs using context-specific datasets (PKN contextualization), are necessary conditions to construct reliable, predictive models, which are current challenges in the systems biology of cell signalling. Here we present PRUNET, a user-friendly software tool designed to address the contextualization of PKNs to specific experimental conditions. As the input, the algorithm takes a PKN and the expression profile of two given stable steady states or cellular phenotypes. The PKN is iteratively pruned using an evolutionary algorithm to perform an optimization process. This optimization rests on a match between predicted attractors in a discrete logic model (Boolean) and a Booleanized representation of the phenotypes, within a population of alternative subnetworks that evolves iteratively. We validated the algorithm applying PRUNET to four biological examples and using the resulting contextualized networks to predict missing expression values and to simulate well-characterized perturbations. PRUNET constitutes a tool for the automatic curation of a PKN to make it suitable for describing biological processes under particular experimental conditions. The general applicability of the implemented algorithm makes PRUNET suitable for a variety of biological processes, for instance cellular reprogramming or transitions between healthy and disease states.
Affiliation(s)
- Ana Rodriguez
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Isaac Crespo
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Anna Fournier
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Antonio del Sol
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg

7. Amberkar SS, Kaderali L. An integrative approach for a network based meta-analysis of viral RNAi screens. Algorithms Mol Biol 2015; 10:6. [PMID: 25691914 PMCID: PMC4331137 DOI: 10.1186/s13015-015-0035-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/24/2014] [Accepted: 01/27/2015] [Indexed: 02/08/2023]
Abstract
BACKGROUND Big data is becoming ubiquitous in biology, and poses significant challenges in data analysis and interpretation. RNAi screening has become a workhorse of functional genomics, and has been applied, for example, to identify host factors involved in infection for a panel of different viruses. However, the analysis of data resulting from such screens is difficult, with often low overlap between hit lists, even when comparing screens targeting the same virus. This makes it a major challenge to select interesting candidates for further detailed, mechanistic experimental characterization. RESULTS To address this problem we propose an integrative bioinformatics pipeline that allows for a network based meta-analysis of viral high-throughput RNAi screens. Initially, we collate a human protein interaction network from various public repositories, which is then subjected to unsupervised clustering to determine functional modules. Modules that are significantly enriched with host dependency factors (HDFs) and/or host restriction factors (HRFs) are then filtered based on network topology and semantic similarity measures. Modules passing all these criteria are finally interpreted for their biological significance using enrichment analysis, and interesting candidate genes can be selected from the modules. CONCLUSIONS We apply our approach to seven screens targeting three different viruses, and compare results with other published meta-analyses of viral RNAi screens. We recover key hit genes, and identify additional candidates from the screens. While we demonstrate the application of the approach using viral RNAi data, the method is generally applicable to identify underlying mechanisms from hit lists derived from high-throughput experimental data, and to select a small number of most promising genes for further mechanistic studies.

8. Kramer A, Calderhead B, Radde N. Hamiltonian Monte Carlo methods for efficient parameter estimation in steady state dynamical systems. BMC Bioinformatics 2014; 15:253. [PMID: 25066046 PMCID: PMC4262080 DOI: 10.1186/1471-2105-15-253] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/15/2014] [Accepted: 07/07/2014] [Indexed: 11/22/2022]
Abstract
Background Parameter estimation for differential equation models of intracellular processes is a highly relevant but challenging task. The available experimental data do not usually contain enough information to identify all parameters uniquely, resulting in ill-posed estimation problems with often highly correlated parameters. Sampling-based Bayesian statistical approaches are appropriate for tackling this problem. The samples are typically generated via Markov chain Monte Carlo; however, such methods are computationally expensive and their convergence may be slow, especially if there are strong correlations between parameters. Monte Carlo methods based on Euclidean or Riemannian Hamiltonian dynamics have been shown to outperform other samplers by making proposal moves that take the local sensitivities of the system’s states into account and accepting these moves with high probability. However, the high computational cost involved with calculating the Hamiltonian trajectories prevents their widespread use for all but the smallest differential equation models. The further development of efficient sampling algorithms is therefore an important step towards improving the statistical analysis of predictive models of intracellular processes. Results We show how state-of-the-art Hamiltonian Monte Carlo methods may be significantly improved for steady state dynamical models. We present a novel approach for efficiently calculating the required geometric quantities by tracking steady states across the Hamiltonian trajectories using a Newton-Raphson method and employing local sensitivity information. Using our approach, we compare both Euclidean and Riemannian versions of Hamiltonian Monte Carlo on three models for intracellular processes with real data and demonstrate at least an order of magnitude improvement in the effective sampling speed. We further demonstrate the wider applicability of our approach to other gradient-based MCMC methods, such as those based on Langevin diffusions. Conclusion Our approach is strictly beneficial in all test cases. The Matlab sources implementing our MCMC methodology are available from https://github.com/a-kramer/ode_rmhmc. Electronic supplementary material The online version of this article (doi:10.1186/1471-2105-15-253) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Andrei Kramer
  - Institute for Systems Theory and Automatic Control, Pfaffenwaldring 9, 70550 Stuttgart, Germany

9. Kiani NA, Kaderali L. Dynamic probabilistic threshold networks to infer signaling pathways from time-course perturbation data. BMC Bioinformatics 2014; 15:250. [PMID: 25047753 PMCID: PMC4133630 DOI: 10.1186/1471-2105-15-250] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/09/2013] [Accepted: 07/15/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system's response after systematic perturbations are available. RESULTS We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway. CONCLUSIONS Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.
Affiliation(s)
- Narsis A Kiani
  - Technische Universität Dresden, Medical Faculty Carl Gustav Carus, Institute for Medical Informatics and Biometry, Fetscherstr. 74, 01307 Dresden, Germany

10. Trairatphisan P, Mizera A, Pang J, Tantar AA, Sauter T. optPBN: an optimisation toolbox for probabilistic Boolean networks. PLoS One 2014; 9:e98001. [PMID: 24983623 PMCID: PMC4077690 DOI: 10.1371/journal.pone.0098001] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 10/29/2013] [Accepted: 04/27/2014] [Indexed: 01/27/2023]
Abstract
BACKGROUND There exist several computational tools which allow for the optimisation and inference of biological networks using a Boolean formalism. Nevertheless, the results from such tools yield only limited quantitative insights into the complexity of biological systems because of the inherited qualitative nature of Boolean networks. RESULTS We introduce optPBN, a Matlab-based toolbox for the optimisation of probabilistic Boolean networks (PBN) which operates under the framework of the BN/PBN toolbox. optPBN offers an easy generation of probabilistic Boolean networks from rule-based Boolean model specification and it allows for flexible measurement data integration from multiple experiments. Subsequently, optPBN generates integrated optimisation problems which can be solved by various optimisers. In terms of functionalities, optPBN allows for the construction of a probabilistic Boolean network from a given set of potential constitutive Boolean networks by optimising the selection probabilities for these networks so that the resulting PBN fits experimental data. Furthermore, the optPBN pipeline can also be operated on large-scale computational platforms to solve complex optimisation problems. Apart from exemplary case studies in which we correctly inferred the original network, we also successfully applied optPBN to study a large-scale Boolean model of apoptosis where it allows identifying the inverse correlation between UVB irradiation, NFκB and Caspase 3 activations, and apoptosis in primary hepatocytes quantitatively. Also, the results from optPBN help elucidate the relevance of crosstalk interactions in the apoptotic network. SUMMARY The optPBN toolbox provides a simple yet comprehensive pipeline for integrated optimisation problem generation in the PBN formalism that can readily be solved by various optimisers on local or grid-based computational platforms. optPBN can be further applied to various biological studies such as the inference of gene regulatory networks or the identification of the relevance of interactions in signal transduction networks.
Affiliation(s)
- Panuwat Trairatphisan
  - Systems Biology Group, Life Sciences Research Unit, University of Luxembourg, Luxembourg, Luxembourg
- Andrzej Mizera
  - Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg, Luxembourg
- Jun Pang
  - Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg, Luxembourg
  - Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
- Alexandru Adrian Tantar
  - Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
- Thomas Sauter
  - Systems Biology Group, Life Sciences Research Unit, University of Luxembourg, Luxembourg, Luxembourg

11. Hasan MM, Kavurucu Y, Kahveci T. A scalable method for discovering significant subnetworks. BMC SYSTEMS BIOLOGY 2014; 7 Suppl 4:S3. [PMID: 24565174 PMCID: PMC3854656 DOI: 10.1186/1752-0509-7-s4-s3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND Study of biological networks is an essential first step to understand the complex functions they govern in different organisms. The topology of interactions that define how biological networks operate is often determined through high-throughput experiments. The noisy nature of high-throughput experiments, however, can result in multiple alternative network topologies that explain this data equally well. One key step to resolve the differences is to identify the subnetworks which appear significantly more frequently in a biological network data set than expected. METHOD We present a method named SiS (Significant Subnetworks) to find subnetworks with the largest probability to appear in a collection of biological networks. We define these subnetworks as the most probable subnetworks. SiS summarizes the interactions in the given collection of networks in a special template network. It uses the template network to guide the search for most probable subnetworks. It computes the lower and upper bound scores on how good the potential solutions are (i.e., the number of input networks that contain the subnetwork). As the search continues, it tightens the bound dynamically and prunes a massive number of unpromising solutions in that process. RESULTS AND CONCLUSIONS Experiments on comprehensive data sets show that the most probable subnetworks found by SiS in a large collection of networks are also very frequent. In the metabolic network data set, we found that subnetworks in eukaryotes are more conserved than those of prokaryotes. SiS also scales well to large data sets and subnetworks and runs orders of magnitude faster than an existing method, MULE. Depending on the size of the subnetwork in the same data set, the running time of SiS ranges from a few seconds to minutes; MULE, on the other hand, runs either for hours or does not even finish in days. In the human transcription regulatory network data set, SiS finds a large backbone subnetwork that appears frequently regardless of diverse cell types.

12. Knapp B, Kaderali L. Reconstruction of cellular signal transduction networks using perturbation assays and linear programming. PLoS One 2013; 8:e69220. [PMID: 23935958 PMCID: PMC3728289 DOI: 10.1371/journal.pone.0069220] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/17/2012] [Accepted: 06/06/2013] [Indexed: 12/23/2022]
Abstract
Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high-throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this question. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance in particular on larger networks over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4+ T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling the cell cycle progression.
Affiliation(s)
- Bettina Knapp
  - Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany
- Lars Kaderali
  - Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany

13. Ozsoy OE, Can T. A divide and conquer approach for construction of large-scale signaling networks from PPI and RNAi data using linear programming. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:869-883. [PMID: 24334382 DOI: 10.1109/tcbb.2013.80] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Inference of the topology of signaling networks from perturbation experiments is a challenging problem. Recently, the inference problem has been formulated as a reference network editing problem and it has been shown that finding the minimum number of edit operations on a reference network to comply with perturbation experiments is an NP-complete problem. In this paper, we propose an integer linear optimization (ILP) model for reconstruction of signaling networks from RNAi data and a reference network. The ILP model guarantees the optimal solution; however, it is practical only for small signaling networks of size 10-15 genes due to computational complexity. To scale for large signaling networks, we propose a divide and conquer-based heuristic, in which a given reference network is divided into smaller subnetworks that are solved separately and the solutions are merged together to form the solution for the large network. We validate our proposed approach on real and synthetic data sets, and comparison with the state of the art shows that our proposed approach is able to scale better for large networks while attaining similar or better biological accuracy.
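As a rough illustration of the reference network editing formulation described in this abstract, the Python sketch below searches for the network closest to a reference that agrees with knockdown observations. It is not the ILP model or the divide-and-conquer heuristic of the paper: it swaps in a brute-force search that only works on toy inputs. The function names, the four-gene network, and the constraint used here (knocking down a gene either blocks or does not block the signal from a source gene to a target gene) are all assumptions made for illustration.

```python
from itertools import combinations

def edit_distance(reference, candidate):
    """Number of edge insertions plus deletions that turn reference into candidate."""
    return len(set(reference) ^ set(candidate))

def reaches(edges, source, target, removed=None):
    """Depth-first search for a directed path that avoids the knocked-down gene."""
    if removed in (source, target):
        return False
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for u, v in edges:
            if u == node and v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def consistent(edges, source, target, blocks):
    """Assumed constraint: knocking down g blocks the signal iff blocks[g] is True."""
    if not reaches(edges, source, target):
        return False
    return all(reaches(edges, source, target, removed=g) != blocked
               for g, blocked in blocks.items())

def closest_network(reference, genes, source, target, blocks):
    """Try edit sets in order of increasing size; exponential, so toy sizes only."""
    possible = sorted((u, v) for u in genes for v in genes if u != v)
    for k in range(len(possible) + 1):
        for flipped in combinations(possible, k):
            candidate = set(reference) ^ set(flipped)  # toggle the chosen edges
            if consistent(candidate, source, target, blocks):
                return candidate, k
    return None

# Hypothetical reference with two parallel routes from receptor R to reporter T.
reference = {("R", "A"), ("A", "T"), ("R", "B"), ("B", "T")}
# Hypothetical observations: knocking down A blocks the signal, knocking down B does not.
network, edits = closest_network(reference, ["R", "A", "B", "T"], "R", "T",
                                 {"A": True, "B": False})
```

Because candidates are enumerated by increasing number of toggled edges, the first consistent network found has the minimum edit count. The search space grows exponentially with the number of genes, which is in line with the abstract's remark that the exact formulation is practical only for small networks.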
Affiliation(s)
- Tolga Can
  - Middle East Technical University, Ankara

14. Trairatphisan P, Mizera A, Pang J, Tantar AA, Schneider J, Sauter T. Recent development and biomedical applications of probabilistic Boolean networks. Cell Commun Signal 2013; 11:46. [PMID: 23815817 PMCID: PMC3726340 DOI: 10.1186/1478-811x-11-46] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 03/29/2013] [Accepted: 06/22/2013] [Indexed: 12/13/2022]
Abstract
Probabilistic Boolean network (PBN) modelling is a semi-quantitative approach widely used for the study of the topology and dynamic aspects of biological systems. The combined use of rule-based representation and probability makes PBN appealing for large-scale modelling of biological networks where degrees of uncertainty need to be considered. A considerable expansion of our knowledge in the field of theoretical research on PBN can be observed over the past few years, with a focus on network inference, network intervention and control. With respect to areas of applications, PBN is mainly used for the study of gene regulatory networks though with an increasing emergence in signal transduction, metabolic, and also physiological networks. At the same time, a number of computational tools, facilitating the modelling and analysis of PBNs, are continuously developed. A concise yet comprehensive review of the state-of-the-art on PBN modelling is offered in this article, including a comparative discussion on PBN versus similar models with respect to concepts and biomedical applications. Due to their many advantages, we consider PBN to stand as a suitable modelling framework for the description and analysis of complex biological systems, ranging from molecular to physiological levels.
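To make the combination of rule-based representation and probability concrete, here is a minimal Python sketch of a PBN in its usual textbook form: each node carries several candidate Boolean rules with selection probabilities, and at every synchronous step one rule per node is drawn and applied to the previous state. The three-node network, its rules, and its probabilities are invented for illustration and do not come from the review.

```python
import random

# Hypothetical three-node network (A, B, C). Each node owns a list of
# (selection probability, Boolean rule) pairs; a rule maps the old state to 0 or 1.
PBN = [
    [(1.0, lambda s: s[2])],                    # A copies C
    [(0.7, lambda s: s[0] and s[2]),            # B = A AND C, chosen with p = 0.7
     (0.3, lambda s: s[0] or s[2])],            # B = A OR C,  chosen with p = 0.3
    [(1.0, lambda s: 1 - s[1])],                # C = NOT B
]

def step(state, pbn, rng):
    """One synchronous update: draw one rule per node, apply it to the old state."""
    nxt = []
    for candidates in pbn:
        weights = [p for p, _ in candidates]
        rules = [r for _, r in candidates]
        rule = rng.choices(rules, weights=weights, k=1)[0]
        nxt.append(int(rule(state)))
    return tuple(nxt)

def simulate(state, pbn, steps, seed=0):
    """Return the trajectory of states, starting with the initial state."""
    rng = random.Random(seed)
    trajectory = [state]
    for _ in range(steps):
        state = step(state, pbn, rng)
        trajectory.append(state)
    return trajectory

trajectory = simulate((1, 0, 1), PBN, steps=2)
```

From the initial state `(1, 0, 1)` both candidate rules for B agree for the first two steps, so this short trajectory does not depend on the random draws; longer runs do, and averaging over many runs is one way to estimate state probabilities in such a model.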
Affiliation(s)
- Andrzej Mizera
  - Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
- Jun Pang
  - Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
- Alexandru Adrian Tantar
  - Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
  - Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg
- Jochen Schneider
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg
  - Saarland University Medical Center, Department of Internal Medicine II, Homburg, Saarland, Germany
- Thomas Sauter
  - Life Sciences Research Unit, University of Luxembourg, Luxembourg
|
15
|
Pei B, Shin DG. Reconstruction of biological networks by incorporating prior knowledge into Bayesian network models. J Comput Biol 2013; 19:1324-34. [PMID: 23210479 DOI: 10.1089/cmb.2011.0194] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/26/2023]
Abstract
The Bayesian network model is widely used for reverse engineering of biological network structures. An advantage of this model is its capability to integrate prior knowledge into the model learning process, which can improve the quality of the network reconstruction outcome. Some previous works have explored this area with a focus on using prior knowledge of direct molecular links, and a few recent ones have proposed examining the effects of molecular orderings. In this study, we propose a Bayesian network model that can integrate both direct links and orderings. Random weights are assigned to these two types of prior knowledge to alleviate bias toward certain types of information. We evaluate our model's performance using both synthetic data and biological data for the RAF signaling network, and illustrate the significant improvement in network structure reconstruction of the proposed models over the existing methods. We also examine the correlation between the improvement and the abundance of ordering prior knowledge. To address the issue of generating prior knowledge, we propose an approach to automatically extract potential molecular orderings from knowledge resources such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and Gene Ontology (GO) annotation.
Affiliation(s)
- Baikang Pei
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA.
|
16
|
Failmezger H, Praveen P, Tresch A, Fröhlich H. Learning gene network structure from time laps cell imaging in RNAi Knock downs. Bioinformatics 2013; 29:1534-40. [PMID: 23595660 DOI: 10.1093/bioinformatics/btt179] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
MOTIVATION: As RNA interference is becoming a standard method for targeted gene perturbation, computational approaches to reverse engineer parts of biological networks based on measurable effects of RNAi become increasingly relevant. The vast majority of these methods use gene expression data, but little attention has been paid so far to other data types. RESULTS: Here we present a method that can infer gene networks from high-dimensional phenotypic perturbation effects on single cells recorded by time-lapse microscopy. We use data from the Mitocheck project to extract multiple shape, intensity and texture features at each frame. Features from different cells and movies are then aligned along the cell cycle time. Subsequently we use Dynamic Nested Effects Models (dynoNEMs) to estimate parts of the network structure between perturbed genes via a Markov Chain Monte Carlo approach. Our simulation results indicate a high reconstruction quality of this method. A reconstruction based on 22 gene knockdowns yielded a network in which all edges could be explained via the biological literature. AVAILABILITY: The implementation of dynoNEMs is part of the Bioconductor R-package nem.
Affiliation(s)
- Henrik Failmezger
  - Computational Biology and Regulatory Networks, Max-Planck Institute for Plant Breeding Research, Carl-von-Linne-Weg 10, 50829 Cologne, Germany
|
17
|
Gambin A, Charzyńska A, Ellert-Miklaszewska A, Rybiński M. Computational models of the JAK1/2-STAT1 signaling. JAKSTAT 2013; 2:e24672. [PMID: 24069559 PMCID: PMC3772111 DOI: 10.4161/jkst.24672] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 01/02/2013] [Revised: 04/10/2013] [Accepted: 04/11/2013] [Indexed: 12/13/2022]
Abstract
Despite a conceptually simple mechanism of signaling, the JAK-STAT pathway exhibits considerable behavioral complexity. Computational pathway models are tools for investigating signaling processes in detail. They integrate well with experimental studies, helping to explain molecular dynamics and to state new hypotheses, most often about the structure of interactions. A relatively small amount of experimental data is available for the JAK1/2-STAT1 variant of the pathway; hence, only a few computational models have been developed. Here we review the dominant approach to kinetic modeling of the JAK1/2-STAT1 pathway, which is based on ordinary differential equations. We also give a brief overview of attempts to computationally infer the topology of this pathway.
Affiliation(s)
- Anna Gambin
  - Institute of Informatics, University of Warsaw, Warsaw, Poland
  - Mossakowski Medical Research Centre, Polish Academy of Sciences, Warsaw, Poland
|
18
|
Hashemikhabir S, Ayaz ES, Kavurucu Y, Can T, Kahveci T. Large-scale signaling network reconstruction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:1696-1708. [PMID: 23221085 DOI: 10.1109/tcbb.2012.128] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Reconstructing the topology of a signaling network by means of RNA interference (RNAi) technology is an underdetermined problem, especially when a single gene in the network is knocked down or observed. In addition, the exponential search space limits the existing methods to small signaling networks of 10-15 genes. In this paper, we propose integrating RNAi data with a reference physical interaction network. We formulate the problem of signaling network reconstruction as finding the minimum number of edit operations on a given reference network. The edit operations transform the reference network into a network that satisfies the RNAi observations. We show that using a reference network does not simplify the computational complexity of the problem. Therefore, we propose two methods that provide near-optimal results and scale well for reconstructing networks of up to hundreds of components. We validate the proposed methods on synthetic and real data sets. Comparison with the state of the art on real signaling networks shows that the proposed methodology scales better and generates biologically significant results.
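The edit-operation formulation can be illustrated with a small Python sketch. This is not the authors' method: it is an exhaustive search that only works for a handful of genes, and it assumes a simplified consistency model (an assumption made here, not stated in the abstract) in which a knockdown is observed to block the reporter exactly when removing that gene disconnects the source from the reporter. The function names, node names, and example data are hypothetical.

```python
from itertools import combinations

def reachable(edges, source, target, removed=None):
    """Depth-first search from source to target, skipping a knocked-down gene."""
    if removed in (source, target):
        return False
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for u, v in edges:
            if u == node and v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def consistent(edges, source, reporter, observations):
    """observations maps gene -> True if its knockdown blocked the reporter."""
    return all(
        (not reachable(edges, source, reporter, removed=g)) == blocked
        for g, blocked in observations.items()
    )

def min_edits(nodes, reference, source, reporter, observations):
    """Try edit sets of increasing size; each edit inserts or deletes one edge."""
    candidates = [(u, v) for u in nodes for v in nodes if u != v]
    reference = set(reference)
    for k in range(len(candidates) + 1):
        for flips in combinations(candidates, k):
            edited = reference.symmetric_difference(flips)
            if consistent(edited, source, reporter, observations):
                return k, edited
    return None

if __name__ == "__main__":
    nodes = ["R", "A", "B", "T"]  # hypothetical receptor, two genes, reporter
    reference = {("R", "A"), ("A", "T"), ("R", "B"), ("B", "T")}
    observations = {"A": True, "B": False}  # hypothetical RNAi readout
    print(min_edits(nodes, reference, "R", "T", observations))
```

In this toy example the reference network contains two parallel routes, so knocking down A alone cannot block the signal; a single edge edit that breaks the route through B makes the network agree with both observations. The number of candidate edit sets grows exponentially with network size, which is consistent with the abstract's motivation for scalable near-optimal methods.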
|
19
|
Niederberger T, Etzold S, Lidschreiber M, Maier KC, Martin DE, Fröhlich H, Cramer P, Tresch A. MC EMiNEM maps the interaction landscape of the Mediator. PLoS Comput Biol 2012; 8:e1002568. [PMID: 22737066 PMCID: PMC3380870 DOI: 10.1371/journal.pcbi.1002568] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/05/2012] [Accepted: 05/04/2012] [Indexed: 11/18/2022]
Abstract
The Mediator is a highly conserved, large multiprotein complex that is involved essentially in the regulation of eukaryotic mRNA transcription. It acts as a general transcription factor by integrating regulatory signals from gene-specific activators or repressors to the RNA Polymerase II. The internal network of interactions between Mediator subunits that conveys these signals is largely unknown. Here, we introduce MC EMiNEM, a novel method for the retrieval of functional dependencies between proteins that have pleiotropic effects on mRNA transcription. MC EMiNEM is based on Nested Effects Models (NEMs), a class of probabilistic graphical models that extends the idea of hierarchical clustering. It combines mode-hopping Monte Carlo (MC) sampling with an Expectation-Maximization (EM) algorithm for NEMs to increase sensitivity compared to existing methods. A meta-analysis of four Mediator perturbation studies in Saccharomyces cerevisiae, three of which are unpublished, provides new insight into the Mediator signaling network. In addition to the known modular organization of the Mediator subunits, MC EMiNEM reveals a hierarchical ordering of its internal information flow, which is putatively transmitted through structural changes within the complex. We identify the N-terminus of Med7 as a peripheral entity, entailing only local structural changes upon perturbation, while the C-terminus of Med7 and Med19 appear to play a central role. MC EMiNEM associates Mediator subunits to most directly affected genes, which, in conjunction with gene set enrichment analysis, allows us to construct an interaction map of Mediator subunits and transcription factors.
Affiliation(s)
- Theresa Niederberger
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Stefanie Etzold
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Michael Lidschreiber
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Kerstin C. Maier
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Dietmar E. Martin
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Holger Fröhlich
  - Bonn-Aachen International Center for IT (B-IT) Algorithmic Bioinformatics, Rheinische Friedrich-Wilhelms-University Bonn, Bonn, Germany
- Patrick Cramer
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
- Achim Tresch
  - Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany
  - Max Planck Institute for Plant Breeding Research, Cologne, Germany
  - Institute for Genetics, University of Cologne, Cologne, Germany
|
20
|
Böck M, Ogishima S, Tanaka H, Kramer S, Kaderali L. Hub-centered gene network reconstruction using automatic relevance determination. PLoS One 2012; 7:e35077. [PMID: 22570688 PMCID: PMC3343044 DOI: 10.1371/journal.pone.0035077] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/11/2011] [Accepted: 03/11/2012] [Indexed: 11/17/2022]
Abstract
Network inference deals with the reconstruction of biological networks from experimental data. A variety of different reverse engineering techniques are available; they differ in the underlying assumptions and mathematical models used. One common problem for all approaches stems from the complexity of the task, due to the combinatorial explosion of different network topologies for increasing network size. To handle this problem, constraints are frequently used, for example on the node degree, number of edges, or constraints on regulation functions between network components. We propose to exploit topological considerations in the inference of gene regulatory networks. Such systems are often controlled by a small number of hub genes, while most other genes have only limited influence on the network's dynamics. We model gene regulation using a Bayesian network with discrete, Boolean nodes. A hierarchical prior is employed to identify hub genes. The first layer of the prior is used to regularize weights on edges emanating from one specific node. A second prior on hyperparameters controls the magnitude of the former regularization for different nodes. The net effect is that central nodes tend to form in reconstructed networks. Network reconstruction is then performed by maximization of or sampling from the posterior distribution. We evaluate our approach on simulated and real experimental data, indicating that we can reconstruct main regulatory interactions from the data. We furthermore compare our approach to other state-of-the-art methods, showing superior performance in identifying hubs. Using a large publicly available dataset of over 800 cell cycle regulated genes, we are able to identify several main hub genes. Our method may thus provide a valuable tool to identify interesting candidate genes for further study. Furthermore, the approach presented may stimulate further developments in regularization methods for network reconstruction from data.
Affiliation(s)
- Matthias Böck
  - ViroQuant Research Group Modeling, University of Heidelberg, BioQuant BQ26, Heidelberg, Germany
  - Institute of Informatics I12, Technische Universität München, Garching, Germany
- Soichi Ogishima
  - Department of Bioinformatics, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Hiroshi Tanaka
  - Department of Bioinformatics, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Stefan Kramer
  - Institute of Informatics I12, Technische Universität München, Garching, Germany
  - Institute of Informatics, Johannes Gutenberg-Universität Mainz, Mainz, Germany
- Lars Kaderali
  - ViroQuant Research Group Modeling, University of Heidelberg, BioQuant BQ26, Heidelberg, Germany
  - Institute of Medical Informatics and Biometry, Technische Universität Dresden, Dresden, Germany
|
21
|
Kholodenko B, Yaffe MB, Kolch W. Computational approaches for analyzing information flow in biological networks. Sci Signal 2012; 5:re1. [PMID: 22510471 DOI: 10.1126/scisignal.2002961] [Citation(s) in RCA: 147] [Impact Index Per Article: 12.3] [Indexed: 12/13/2022]
Abstract
The advancements in "omics" (proteomics, genomics, lipidomics, and metabolomics) technologies have yielded large inventories of genes, transcripts, proteins, and metabolites. The challenge is to find out how these entities work together to regulate the processes by which cells respond to external and internal signals. Mathematical and computational modeling of signaling networks has a key role in this task, and network analysis provides insights into biological systems and has applications for medicine. Here, we review experimental and theoretical progress and future challenges toward this goal. We focus on how networks are reconstructed from data, how these networks are structured to control the flow of biological information, and how the design features of the networks specify biological decisions.
Affiliation(s)
- Boris Kholodenko
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin 4, Ireland
|
22
|
Acharya LR, Judeh T, Wang G, Zhu D. Optimal structural inference of signaling pathways from unordered and overlapping gene sets. Bioinformatics 2012; 28:546-56. [PMID: 22199386 PMCID: PMC3278757 DOI: 10.1093/bioinformatics/btr696] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/13/2011] [Revised: 11/16/2011] [Accepted: 12/18/2011] [Indexed: 11/14/2022]
Abstract
MOTIVATION: A plethora of bioinformatics analyses has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and broaden the scope from focusing only on pairwise interactions to the more general cascading events in the inference of signaling pathway structures. RESULTS: We propose a gene set based simulated annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each cascade represents a chain of molecular interactions from the cell surface to the nucleus. Gene sets in our context refer to discrete sets of genes participating in signal cascades, the basic building blocks of a signaling pathway, with no prior information about gene orderings in the cascades. From a compendium of gene sets related to a pathway, SA aims to search for signal cascades that characterize the optimal signaling pathway structure. In the search process, the extent of overlap among signal cascades is used to measure the optimality of a structure. Throughout, we treat gene sets as random samples from a first-order Markov chain model. We evaluated the performance of SA in three case studies. In the first study conducted on 83 KEGG pathways, SA demonstrated a significantly better performance than Bayesian network methods. Since both SA and Bayesian network methods accommodate discrete data, use a 'search and score' network learning strategy and output a directed network, they can be compared in terms of performance and computational time. In the second study, we compared SA and Bayesian network methods using four benchmark datasets from DREAM. In our final study, we showcased two context-specific signaling pathways activated in breast cancer. AVAILABILITY: Source codes are available from http://dl.dropbox.com/u/16000775/sa_sc.zip.
Affiliation(s)
- Lipi R Acharya
  - Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA
|
23
|
An Integrated Bayesian Framework for Identifying Phosphorylation Networks in Stimulated Cells. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2012; 736:59-80. [DOI: 10.1007/978-1-4419-7210-1_3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
|
24
|
|