1
|
Subbaroyan A, Sil P, Martin OC, Samal A. Leveraging developmental landscapes for model selection in Boolean gene regulatory networks. Brief Bioinform 2023; 24:7145905. [PMID: 37114653 DOI: 10.1093/bib/bbad160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/25/2023] [Revised: 03/26/2023] [Accepted: 04/03/2023] [Indexed: 04/29/2023]
Abstract
Boolean models are a well-established framework to model developmental gene regulatory networks (DGRNs) for acquisition of cellular identities. During the reconstruction of Boolean DGRNs, even if the network structure is given, there is generally a large number of combinations of Boolean functions that will reproduce the different cell fates (biological attractors). Here we leverage the developmental landscape to enable model selection on such ensembles using the relative stability of the attractors. First we show that previously proposed measures of relative stability are strongly correlated and we stress the usefulness of the one that captures best the cell state transitions via the mean first passage time (MFPT) as it also allows the construction of a cellular lineage tree. A property of great computational importance is the insensitivity of the different stability measures to changes in noise intensities. That allows us to use stochastic approaches to estimate the MFPT and thereby scale up the computations to large networks. Given this methodology, we revisit different Boolean models of Arabidopsis thaliana root development, showing that a most recent one does not respect the biologically expected hierarchy of cell states based on relative stabilities. We therefore developed an iterative greedy algorithm that searches for models which satisfy the expected hierarchy of cell states and found that its application to the root development model yields many models that meet this expectation. Our methodology thus provides new tools that can enable reconstruction of more realistic and accurate Boolean models of DGRNs.
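As a rough illustration of how a mean first passage time between attractors can be computed exactly for a very small network, the sketch below solves the standard linear system (I - Q) t = 1 for a noisy synchronous Boolean network. The two-gene toggle switch, the noise model (each gene flips independently with probability `eta` after every update), and all function names are assumptions made for this example; they are not taken from the paper, which reports using stochastic estimates for large networks.

```python
from itertools import product


def toggle_switch(state):
    """Hypothetical 2-gene network: each gene represses the other."""
    x1, x2 = state
    return (1 - x2, 1 - x1)


def transition_prob(update, src, dst, eta):
    """P(src -> dst): synchronous update, then each bit flips with prob. eta.

    This noise model is an assumption made for the sketch.
    """
    image = update(src)
    flips = sum(a != b for a, b in zip(image, dst))
    return eta ** flips * (1 - eta) ** (len(dst) - flips)


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mfpt(update, n_genes, source, target, eta):
    """Mean number of steps to first reach `target` starting from `source`."""
    others = [s for s in product((0, 1), repeat=n_genes) if s != target]
    # (I - Q) t = 1, where Q is the transition matrix restricted to
    # the non-target states.
    a = [[(1.0 if s == u else 0.0) - transition_prob(update, s, u, eta)
          for u in others] for s in others]
    times = solve_linear(a, [1.0] * len(others))
    return times[others.index(source)]


# The toggle switch has two fixed points, (0, 1) and (1, 0).
forward = mfpt(toggle_switch, 2, (0, 1), (1, 0), eta=0.1)
backward = mfpt(toggle_switch, 2, (1, 0), (0, 1), eta=0.1)
print(forward, backward)
```

Because this toy network is symmetric, the two passage times coincide, so neither fixed point is more stable than the other. In an asymmetric network, comparing the two directions is what would rank the attractors. The exact solve enumerates all 2^n states, so it is only practical for small n.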
Affiliation(s)
- Ajay Subbaroyan
  - The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai, 400094, India
- Priyotosh Sil
  - The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai, 400094, India
- Olivier C Martin
  - Université Paris-Saclay, CNRS, INRAE, Univ Evry, Institute of Plant Sciences Paris-Saclay (IPS2), 91405, Orsay, France
  - Université de Paris, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), 91405, Orsay, France
- Areejit Samal
  - The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India
  - Homi Bhabha National Institute (HBNI), Mumbai, 400094, India
|
2
|
Yang B, Bao W, Chen B. PGRNIG: novel parallel gene regulatory network identification algorithm based on GPU. Brief Funct Genomics 2022; 21:441-454. [PMID: 36064791 DOI: 10.1093/bfgp/elac028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022]
Abstract
Molecular biology has revealed that complex life phenomena can be treated as the result of many gene interactions. Investigating these interactions and understanding the intrinsic mechanisms of biological systems using gene expression data have attracted a lot of attention. As a typical gene regulatory network (GRN) inference method, the S-system has been utilized to deal with small-scale network identification. However, it is extremely difficult to optimize it to infer medium-to-large networks. This paper proposes a novel parallel swarm intelligent algorithm, PGRNIG, to optimize the parameters of the S-system. We employed the clone selection strategy to improve the whale optimization algorithm (CWOA). To enhance the time efficiency of CWOA optimization, we utilized a parallel CWOA (PCWOA) based on the compute unified device architecture (CUDA) platform. Decomposition strategy and L1 regularization were utilized to reduce the search space and complexity of GRN inference. We applied the PGRNIG algorithm on three synthetic datasets and two real time-series expression datasets of the species of Escherichia coli and Saccharomyces cerevisiae. Experimental results show that PGRNIG could infer the gene regulatory network more accurately than other state-of-the-art methods with a convincing computational speed-up. Our findings show that CWOA and PCWOA have faster convergence performances than WOA.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Wenzheng Bao
  - School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou 221018, China
- Baitong Chen
  - Xuzhou First People's Hospital, Xuzhou 221000, China
|
3
|
Nakajima N, Hayashi T, Fujiki K, Shirahige K, Akiyama T, Akutsu T, Nakato R. Codependency and mutual exclusivity for gene community detection from sparse single-cell transcriptome data. Nucleic Acids Res 2021; 49:e104. [PMID: 34291282 PMCID: PMC8501962 DOI: 10.1093/nar/gkab601] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/07/2021] [Revised: 05/25/2021] [Accepted: 07/04/2021] [Indexed: 12/04/2022]
Abstract
Single-cell RNA-seq (scRNA-seq) can be used to characterize cellular heterogeneity in thousands of cells. The reconstruction of a gene network based on coexpression patterns is a fundamental task in scRNA-seq analyses, and the mutual exclusivity of gene expression can be critical for understanding such heterogeneity. Here, we propose an approach for detecting communities from a genetic network constructed on the basis of coexpression properties. The community-based comparison of multiple coexpression networks enables the identification of functionally related gene clusters that cannot be fully captured through differential gene expression-based analysis. We also developed a novel metric referred to as the exclusively expressed index (EEI) that identifies mutually exclusive gene pairs from sparse scRNA-seq data. EEI quantifies and ranks the exclusive expression levels of all gene pairs from binary expression patterns while maintaining robustness against a low sequencing depth. We applied our methods to glioblastoma scRNA-seq data and found that gene communities were partially conserved after serum stimulation despite a considerable number of differentially expressed genes. We also demonstrate that the identification of mutually exclusive gene sets with EEI can improve the sensitivity of capturing cellular heterogeneity. Our methods complement existing approaches and provide new biological insights, even for a large, sparse dataset, in the single-cell analysis field.
Affiliation(s)
- Natsu Nakajima
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tomoatsu Hayashi
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Katsunori Fujiki
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Katsuhiko Shirahige
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tetsu Akiyama
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
- Ryuichiro Nakato
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
|
4
|
Boolean model of anchorage dependence and contact inhibition points to coordinated inhibition but semi-independent induction of proliferation and migration. Comput Struct Biotechnol J 2020; 18:2145-2165. [PMID: 32913583 PMCID: PMC7451872 DOI: 10.1016/j.csbj.2020.07.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/29/2019] [Revised: 06/23/2020] [Accepted: 07/22/2020] [Indexed: 12/16/2022]
Abstract
Epithelial cells respond to their physical neighborhood with mechano-sensitive behaviors required for development and tissue maintenance. These include anchorage dependence, matrix stiffness-dependent proliferation, contact inhibition of proliferation and migration, and collective migration that balances cell crawling with the maintenance of cell junctions. While required for development and tissue repair, these coordinated responses to the microenvironment also contribute to cancer metastasis. Predictive models of the signaling networks that coordinate these behaviors are critical in controlling cell behavior to halt disease. Here we propose a Boolean regulatory network model that synthesizes mechanosensitive signaling that links anchorage to a matrix of varying stiffness and cell density sensing to contact inhibition, proliferation, migration, and apoptosis. Our model can reproduce anchorage dependence and anoikis, detachment-induced cytokinesis errors, the effect of matrix stiffness on proliferation, and contact inhibition of proliferation and migration by two mechanisms that converge on the YAP transcription factor. In addition, we offer testable predictions related to cell cycle-dependent anoikis sensitivity, the molecular requirements for abolishing contact inhibition, and substrate stiffness dependent expression of the catalytic subunit of PI3K. Moreover, our model predicts heterogeneity in migratory vs. non-migratory phenotypes in sub-confluent monolayers, and co-inhibition but semi-independent induction of proliferation vs. migration as a function of cell density and mitogenic stimulation. Our model serves as a stepping-stone towards modeling mechanosensitive routes to the epithelial to mesenchymal transition, capturing the effects of the mesenchymal state on anoikis resistance, and understanding the balance between migration versus proliferation at each stage of the epithelial to mesenchymal transition.
|
5
|
Hoang DT, Jo J, Periwal V. Data-driven inference of hidden nodes in networks. Phys Rev E 2019; 99:042114. [PMID: 31108681 DOI: 10.1103/physreve.99.042114] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/07/2019] [Indexed: 01/12/2023]
Abstract
The explosion of activity in finding interactions in complex systems is driven by availability of copious observations of complex natural systems. However, such systems, e.g., the human brain, are rarely completely observable. Interaction network inference must then contend with hidden variables affecting the behavior of the observed parts of the system. We present an effective approach for model inference with hidden variables. From configurations of observed variables, we identify the observed-to-observed, hidden-to-observed, observed-to-hidden, and hidden-to-hidden interactions, the configurations of hidden variables, and the number of hidden variables. We demonstrate the performance of our method by simulating a kinetic Ising model, and show that our method outperforms existing methods. Turning to real data, we infer the hidden nodes in a neuronal network in the salamander retina and a stock market network. We show that predictive modeling with hidden variables is significantly more accurate than that without hidden variables. Finally, an important hidden variable problem is to find the number of clusters in a dataset. We apply our method to classify MNIST handwritten digits. We find that there are about 60 clusters which are roughly equally distributed among the digits.
Affiliation(s)
- Danh-Tai Hoang
  - Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
  - Department of Natural Sciences, Quang Binh University, Dong Hoi, Quang Binh 510000, Vietnam
- Junghyo Jo
  - Department of Statistics, Keimyung University, Daegu 42601, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
- Vipul Periwal
  - Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
|
6
|
Tavakkolkhah P, Zimmer R, Küffner R. Detection of network motifs using three-way ANOVA. PLoS One 2018; 13:e0201382. [PMID: 30080876 PMCID: PMC6078297 DOI: 10.1371/journal.pone.0201382] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/14/2017] [Accepted: 07/13/2018] [Indexed: 01/03/2023]
Abstract
Motivation: Gene regulatory networks (GRN) can be determined via various experimental techniques, and also by computational methods, which infer networks from gene expression data. However, these techniques treat interactions separately such that interdependencies of interactions forming meaningful subnetworks are typically not considered. Methods: For the investigation of network properties and for the classification of different (sub-)networks based on gene expression data, we consider biological network motifs consisting of three genes and up to three interactions, e.g. the cascade chain (CSC), feed-forward loop (FFL), and dense-overlapping regulon (DOR). We examine several conventional methods for the inference of network motifs, which typically consider each interaction individually. In addition, we propose a new method based on three-way ANOVA (ANalysis Of VAriance) (3WA) that analyzes entire subnetworks at once. To demonstrate the advantages of such a more holistic perspective, we compare the ability of 3WA and other methods to detect and categorize network motifs on large real and artificial datasets. Results: We find that conventional methods perform much better on artificial data (AUC up to 80%) than on real E. coli expression datasets (AUC 50% corresponding to random guessing). To explain this observation, we examine several important properties that differ between datasets and analyze predicted motifs in detail. We find that in case of real networks our new 3WA method outperforms (AUC 70% in E. coli) previous methods by exploiting the interdependencies in the full motif structure. Because of important differences between current artificial datasets and real measurements, the construction and testing of motif detection methods should focus on real data.
Affiliation(s)
- Pegah Tavakkolkhah
  - Department of Informatics, Ludwig-Maximilians-Universität München, München, Germany
- Ralf Zimmer
  - Department of Informatics, Ludwig-Maximilians-Universität München, München, Germany
- Robert Küffner
  - Department of Informatics, Ludwig-Maximilians-Universität München, München, Germany
  - Icahn School of Medicine at Mount Sinai, New York, NY, United States of America
|
7
|
Muñoz S, Carrillo M, Azpeitia E, Rosenblueth DA. Griffin: A Tool for Symbolic Inference of Synchronous Boolean Molecular Networks. Front Genet 2018; 9:39. [PMID: 29559993 PMCID: PMC5845696 DOI: 10.3389/fgene.2018.00039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/20/2017] [Accepted: 01/29/2018] [Indexed: 11/30/2022]
Abstract
Boolean networks are important models of biochemical systems, located at the high end of the abstraction spectrum. A number of Boolean gene networks have been inferred following essentially the same method. Such a method first considers experimental data for a typically underdetermined “regulation” graph. Next, Boolean networks are inferred by using biological constraints to narrow the search space, such as a desired set of (fixed-point or cyclic) attractors. We describe Griffin, a computer tool enhancing this method. Griffin incorporates a number of well-established algorithms, such as Dubrova and Teslenko's algorithm for finding attractors in synchronous Boolean networks. In addition, a formal definition of regulation allows Griffin to employ “symbolic” techniques, able to represent both large sets of network states and Boolean constraints. We observe that when the set of attractors is required to be an exact set, prohibiting additional attractors, a naive Boolean coding of this constraint may be unfeasible. Such cases may be intractable even with symbolic methods, as the number of Boolean constraints may be astronomically large. To overcome this problem, we employ an Artificial Intelligence technique known as “clause learning” considerably increasing Griffin's scalability. Without clause learning only toy examples prohibiting additional attractors are solvable: only one out of seven queries reported here is answered. With clause learning, by contrast, all seven queries are answered. We illustrate Griffin with three case studies drawn from the Arabidopsis thaliana literature. Griffin is available at: http://turing.iimas.unam.mx/griffin.
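To make the notion of fixed-point and cyclic attractors in a synchronous Boolean network concrete, here is a minimal brute-force sketch that follows every state until it revisits one. This is plain exhaustive enumeration, not Dubrova and Teslenko's algorithm and not Griffin's symbolic method; the three-gene network and the function names are hypothetical and serve only as an illustration.

```python
from itertools import product


def find_attractors(update, n_genes):
    """Enumerate attractors of a synchronous Boolean network.

    Every one of the 2**n_genes states is iterated until a state repeats,
    so this is feasible only for small networks.
    """
    attractors = set()
    for start in product((0, 1), repeat=n_genes):
        seen = {}   # state -> position on the current trajectory
        path = []
        state = start
        while state not in seen:
            seen[state] = len(path)
            path.append(state)
            state = update(state)
        cycle = path[seen[state]:]
        # Rotate the cycle to start at its smallest state so that the same
        # attractor reached from different starts is stored only once.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


def example_network(state):
    """Hypothetical 3-gene network: a' = b, b' = a, c' = a AND b."""
    a, b, c = state
    return (b, a, a & b)


for attractor in sorted(find_attractors(example_network, 3)):
    kind = "fixed point" if len(attractor) == 1 else "cycle"
    print(kind, attractor)
```

For this toy network the search finds two fixed points and one cycle of length two. Requiring an exact attractor set, as discussed in the abstract, amounts to checking that the returned set equals the desired one, which is the step that becomes infeasible by enumeration as the number of genes grows.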
Affiliation(s)
- Stalin Muñoz
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Facultad de Ingeniería, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Maestría en Ciencias de la Complejidad, Universidad Autónoma de la Ciudad de México, Mexico City, Mexico
- Miguel Carrillo
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Eugenio Azpeitia
  - Institut National de Recherche en Informatique et en Automatique Project-Team Virtual Plants, Inria, CIRAD, INRA, Montpellier, France
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- David A Rosenblueth
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico
|
8
|
Kim Y, Hao J, Gautam Y, Mersha TB, Kang M. DiffGRN: differential gene regulatory network analysis. Int J Data Min Bioinform 2018; 20:362-379. [PMID: 31114627 DOI: 10.1504/ijdmb.2018.094891] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
Identification of differential gene regulators with significant changes under disparate conditions is essential to understand complex biological mechanisms in a disease. Differential Network Analysis (DiNA) examines different biological processes based on gene regulatory networks that represent regulatory interactions between genes with a graph model. While most studies in DiNA have considered correlation-based inference to construct gene regulatory networks from gene expression data due to its intuitive representation and simple implementation, the approach is lacking in its representation of causal effects and multivariate effects between genes. In this paper, we propose an approach named Differential Gene Regulatory Network (DiffGRN) that infers differential gene regulation between two groups. We infer gene regulatory networks of two groups using Random LASSO, and then we identify differential gene regulations by the proposed significance test. The advantages of DiffGRN are to capture multivariate effects of genes that regulate a gene simultaneously, to identify causality of gene regulations, and to discover differential gene regulators between regression-based gene regulatory networks. We assessed DiffGRN by simulation experiments and showed that it outperforms the current state-of-the-art correlation-based method, DINGO. DiffGRN is applied to gene expression data in asthma. The DiNA with asthma data showed a number of gene regulations, such as ADAM12 and RELB, reported in biological literature.
Affiliation(s)
- Youngsoon Kim
  - Department of Computer Science, Kennesaw State University, Marietta, GA, USA
- Jie Hao
  - Analytics and Data Science Institute, Kennesaw State University, Kennesaw, GA, USA
- Yadu Gautam
  - Department of Pediatrics, University of Cincinnati, Cincinnati, OH, USA
- Tesfaye B Mersha
  - Department of Pediatrics, University of Cincinnati, Cincinnati, OH, USA
- Mingon Kang
  - Department of Computer Science, Kennesaw State University, Marietta, GA, USA
|
9
|
Altarawy D, Eid FE, Heath LS. PEAK: Integrating Curated and Noisy Prior Knowledge in Gene Regulatory Network Inference. J Comput Biol 2017; 24:863-873. [PMID: 28294630 DOI: 10.1089/cmb.2016.0199] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
With abundance of biological data, computational prediction of gene regulatory networks (GRNs) from gene expression data has become more feasible. Although incorporating other prior knowledge (PK), along with gene expression data, greatly improves prediction accuracy, the overall accuracy is still low. PK in GRN inference can be categorized into noisy and curated. In noisy PK, relations between genes do not necessarily correspond to regulatory relations and are thus considered inaccurate by inference algorithms such as transcription factor binding and protein-protein interactions. In contrast, curated PK is experimentally verified regulatory interactions in pathway databases. An issue in real data is that gene expression can poorly support the curated PK and thus most existing prediction algorithms cannot use these curated PK. Although several algorithms were proposed to incorporate noisy PK, none address curated PK with poor gene expression support. We present PEAK, a system to integrate both curated and noisy PK in GRN inference, especially with poor gene expression support. We introduce a novel method for GRN inference, CurInf, to effectively integrate curated PK, even when the gene expression data poorly support the PK. PEAK also uses the previously proposed method Modified Elastic Net to incorporate noisy PK, and we call it NoisInf. In our experiment, CurInf significantly incorporates curated PK, which was regarded as noise by previous methods. Using 100% curated PK, CurInf improves the area under precision-recall curve accuracy score over NoisInf by 27.3% in synthetic data, 86.5% in Escherichia coli data, and 31.1% in Saccharomyces cerevisiae data. Moreover, even when the noise in PK is 10 times more than true PK, PEAK performs better than inference without any PK. Better integration of curated PK helps biologists benefit from verified experimental data to predict more reliable GRN.
Affiliation(s)
- Doaa Altarawy
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia
  - Department of Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
- Fatma-Elzahraa Eid
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia
  - Department of Systems and Computer Engineering, Al-Azhar University, Cairo, Egypt
- Lenwood S Heath
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia
|
10
|
Kannan V, Tegner J. Adaptive input data transformation for improved network reconstruction with information theoretic algorithms. Stat Appl Genet Mol Biol 2016; 15:507-520. [PMID: 27875324 DOI: 10.1515/sagmb-2016-0013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We propose a novel systematic procedure of non-linear data transformation for an adaptive algorithm in the context of network reverse-engineering using information theoretic methods. Our methodology is rooted in elucidating and correcting for the specific biases in the estimation techniques for mutual information (MI) given a finite sample of data. These are, in turn, tied to lack of well-defined bounds for numerical estimation of MI for continuous probability distributions from finite data. The nature and properties of the inevitable bias is described, complemented by several examples illustrating their form and variation. We propose an adaptive partitioning scheme for MI estimation that effectively transforms the sample data using parameters determined from its local and global distribution guaranteeing a more robust and reliable reconstruction algorithm. Together with a normalized measure (Shared Information Metric) we report considerably enhanced performance both for in silico and real-world biological networks. We also find that the recovery of true interactions is in particular better for intermediate range of false positive rates, suggesting that our algorithm is less vulnerable to spurious signals of association.
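The finite-sample bias mentioned above can be seen with the simplest possible estimator. The sketch below is a plain plug-in mutual information estimate over equal-width bins. It is not the adaptive partitioning scheme or the Shared Information Metric proposed in the paper, and the function name, sample size, and bin counts are assumptions chosen only for illustration.

```python
import math
import random
from collections import Counter


def binned_mutual_information(xs, ys, n_bins):
    """Plug-in MI estimate (in nats) after equal-width binning of each variable."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # guard against constant input
        # The maximum value would fall in bin n_bins, so clamp it to the last bin.
        return [min(int((v - lo) / width), n_bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # sum over occupied cells of p(i, j) * log(p(i, j) / (p(i) * p(j)))
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


# Two independent samples: the true MI is zero, yet the estimate is not.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(50)]
ys = [random.gauss(0.0, 1.0) for _ in range(50)]
for n_bins in (2, 5, 10):
    print(n_bins, binned_mutual_information(xs, ys, n_bins))
```

The plug-in estimate can never be negative, so for independent variables any sampling fluctuation pushes it above the true value of zero, and the overshoot typically grows with the number of bins relative to the sample size. This dependence on the partition is the kind of effect a data-driven choice of bins is meant to control.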
|
11
|
Wang J, Wu Q, Hu XT, Tian T. An integrated approach to infer dynamic protein-gene interactions - A case study of the human P53 protein. Methods 2016; 110:3-13. [PMID: 27514497 DOI: 10.1016/j.ymeth.2016.08.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/15/2016] [Revised: 07/18/2016] [Accepted: 08/01/2016] [Indexed: 11/19/2022]
Abstract
Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation.
Affiliation(s)
- Junbai Wang
  - Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway
- Qianqian Wu
  - School of Mathematical Sciences, Monash University, Melbourne 3800, Victoria, Australia
  - School of Mathematics, Hefei University of Technology, Hefei, Anhui 230009, China
- Xiaohua Tony Hu
  - College of Computing and Informatics, Drexel University, Philadelphia, PA 19104, USA
- Tianhai Tian
  - School of Mathematical Sciences, Monash University, Melbourne 3800, Victoria, Australia
|
12
|
Mayer G, Marcus K, Eisenacher M, Kohl M. Boolean modeling techniques for protein co-expression networks in systems medicine. Expert Rev Proteomics 2016; 13:555-69. [PMID: 27105325 DOI: 10.1080/14789450.2016.1181546] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/21/2022]
Abstract
Introduction: Application of systems biology/systems medicine approaches is promising for proteomics/biomedical research, but requires selection of an adequate modeling type. Areas covered: This article reviews the existing Boolean network modeling approaches, which provide in comparison with alternative modeling techniques several advantages for the processing of proteomics data. Application of methods for inference, reduction and validation of protein co-expression networks that are derived from quantitative high-throughput proteomics measurements is presented. It is also shown how Boolean models can be used to derive system-theoretic characteristics that describe both the dynamical behavior of such networks as a whole and the properties of different cell states (e.g. healthy or diseased cell states). Furthermore, application of methods derived from control theory is proposed in order to simulate the effects of therapeutic interventions on such networks, which is a promising approach for the computer-assisted discovery of biomarkers and drug targets. Finally, the clinical application of Boolean modeling analyses is discussed. Expert commentary: Boolean modeling of proteomics data is still in its infancy. Progress in this field strongly depends on provision of a repository with public access to relevant reference models. Also required are community supported standards that facilitate input of both proteomics and patient related data (e.g. age, gender, laboratory results, etc.).
Affiliation(s)
- Gerhard Mayer
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Katrin Marcus
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Martin Eisenacher
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Michael Kohl
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
|
13
|
Inferring Broad Regulatory Biology from Time Course Data: Have We Reached an Upper Bound under Constraints Typical of In Vivo Studies? PLoS One 2015; 10:e0127364. [PMID: 25984725 PMCID: PMC4435750 DOI: 10.1371/journal.pone.0127364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 04/13/2015] [Indexed: 12/21/2022]
Abstract
There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers; however, questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE) models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI) integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE). Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA) and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. Limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
14
15
Hodgman T, Ajmera I. The successful application of systems approaches in plant biology. Prog Biophys Mol Biol 2015; 117:59-68. [DOI: 10.1016/j.pbiomolbio.2015.01.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/13/2014] [Accepted: 01/12/2015] [Indexed: 11/26/2022]
16
Dimitrova E, Stigler B. Data identification for improving gene network inference using computational algebra. Bull Math Biol 2014; 76:2923-40. [PMID: 25280666 DOI: 10.1007/s11538-014-9979-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/22/2013] [Accepted: 05/12/2014] [Indexed: 11/25/2022]
Abstract
Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.
Affiliation(s)
- Elena Dimitrova
- Department of Mathematical Sciences, Clemson University, Clemson, SC, 29634, USA
17
Windhager L, Zierer J, Küffner R. Refining ensembles of predicted gene regulatory networks based on characteristic interaction sets. PLoS One 2014; 9:e84596. [PMID: 24498260 PMCID: PMC3911903 DOI: 10.1371/journal.pone.0084596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2013] [Accepted: 11/14/2013] [Indexed: 11/30/2022]
Abstract
Different ensemble voting approaches have been successfully applied for reverse-engineering of gene regulatory networks. They are based on the assumption that a good approximation of true network structure can be derived by considering the frequencies of individual interactions in a large number of predicted networks. Such approximations are typically superior in terms of prediction quality and robustness as compared to considering a single best scoring network only. Nevertheless, ensemble approaches only work well if the predicted gene regulatory networks are sufficiently similar to each other. If the topologies of predicted networks are considerably different, an ensemble of all networks obscures interesting individual characteristics. Instead, networks should be grouped according to local topological similarities and ensemble voting performed for each group separately. We argue that the presence of sets of co-occurring interactions is a suitable indicator for grouping predicted networks. A stepwise bottom-up procedure is proposed, where first mutual dependencies between pairs of interactions are derived from predicted networks. Pairs of co-occurring interactions are subsequently extended to derive characteristic interaction sets that distinguish groups of networks. Finally, ensemble voting is applied separately to the resulting topologically similar groups of networks to create distinct group-ensembles. Ensembles of topologically similar networks constitute distinct hypotheses about the reference network structure. Such group-ensembles are easier to interpret as their characteristic topology becomes clear and dependencies between interactions are known. The availability of distinct hypotheses facilitates the design of further experiments to distinguish between plausible network structures. 
The proposed procedure is a reasonable refinement step for non-deterministic reverse-engineering applications that produce a large number of candidate predictions for a gene regulatory network, e.g. due to probabilistic optimization or a cross-validation procedure.
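The plain ensemble voting that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `ensemble_vote`, the `min_frequency` parameter and the toy edge sets are assumptions made here, and the grouping by characteristic interaction sets described in the abstract is not implemented.

```python
from collections import Counter


def ensemble_vote(networks, min_frequency=0.5):
    """Return the edges found in at least `min_frequency` of the networks.

    `networks` is a list of edge sets; each edge is a (regulator, target) pair.
    Both the name and the threshold parameter are hypothetical.
    """
    if not networks:
        return set()
    # Count in how many predicted networks each edge occurs.
    counts = Counter(edge for net in networks for edge in set(net))
    threshold = min_frequency * len(networks)
    return {edge for edge, n in counts.items() if n >= threshold}


# Toy predictions: three similar networks and one topologically different one.
predicted = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
    {("X", "Y")},
]
consensus = ensemble_vote(predicted, min_frequency=0.5)
strict = ensemble_vote(predicted, min_frequency=0.75)
```

In this toy case the outlier network's only edge never reaches the vote threshold, which illustrates the abstract's point that a single ensemble over dissimilar networks hides the characteristics of individual groups.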
Affiliation(s)
- Lukas Windhager
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- Jonas Zierer
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- Robert Küffner
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
18
Maetschke SR, Ragan MA. Characterizing cancer subtypes as attractors of Hopfield networks. Bioinformatics 2014; 30:1273-9. [PMID: 24407221 DOI: 10.1093/bioinformatics/btt773] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Cancer is a heterogeneous progressive disease caused by perturbations of the underlying gene regulatory network that can be described by dynamic models. These dynamics are commonly modeled as Boolean networks or as ordinary differential equations. Their inference from data is computationally challenging, and at least partial knowledge of the regulatory network and its kinetic parameters is usually required to construct predictive models. RESULTS: Here, we construct Hopfield networks from static gene-expression data and demonstrate that cancer subtypes can be characterized by different attractors of the Hopfield network. We evaluate the clustering performance of the network and find that it is comparable with traditional methods but offers additional advantages including a dynamic model of the energy landscape and a unification of clustering, feature selection and network inference. We visualize the Hopfield attractor landscape and propose a pruning method to generate sparse networks for feature selection and improved understanding of feature relationships.
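The idea of stored patterns acting as attractors can be illustrated with a textbook Hopfield network. The sketch below assumes binarized expression profiles with entries in {-1, +1}, a Hebbian weight rule and asynchronous sign updates; the function names and the two toy "subtype" patterns are hypothetical, and the paper's own construction from expression data and its pruning method may differ.

```python
def train_hopfield(patterns):
    """Hebbian weight matrix for patterns in {-1, +1}, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def recall(w, state, max_sweeps=20):
    """Update units one at a time until a sweep changes nothing (a fixed point)."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Two orthogonal toy profiles standing in for two subtypes (illustrative only).
subtype_a = [1, 1, 1, 1, -1, -1, -1, -1]
subtype_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([subtype_a, subtype_b])

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # subtype_a with its first entry flipped
recovered = recall(weights, noisy)
```

Here the perturbed profile relaxes back to the stored pattern it started closest to, which is the sense in which a sample can be assigned to the attractor of its subtype.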
Affiliation(s)
- Stefan R Maetschke
- The University of Queensland, Institute for Molecular Bioscience, Brisbane, QLD 4072, Australia and Australian Research Council Centre of Excellence in Bioinformatics, Australia
19
Reconstructing biological gene regulatory networks: where optimization meets big data. Evol Intell 2013. [DOI: 10.1007/s12065-013-0098-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 10/26/2022]
20
Fluck J, Hofmann-Apitius M. Text mining for systems biology. Drug Discov Today 2013; 19:140-4. [PMID: 24070668 DOI: 10.1016/j.drudis.2013.09.012] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 04/12/2013] [Revised: 09/05/2013] [Accepted: 09/12/2013] [Indexed: 01/08/2023]
Abstract
Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modelling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future.
Affiliation(s)
- Juliane Fluck
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany
- Martin Hofmann-Apitius
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for Information Technology (B-IT), Dahlmannstraße 2, 53113 Bonn, Germany.
21
Santra T, Kolch W, Kholodenko BN. Integrating Bayesian variable selection with Modular Response Analysis to infer biochemical network topology. BMC Syst Biol 2013; 7:57. [PMID: 23829771 PMCID: PMC3726398 DOI: 10.1186/1752-0509-7-57] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 10/23/2012] [Accepted: 06/28/2013] [Indexed: 12/31/2022]
Abstract
Background: Recent advancements in genetics and proteomics have led to the acquisition of large quantitative data sets. However, the use of these data to reverse engineer biochemical networks has remained a challenging problem. Many methods have been proposed to infer biochemical network topologies from different types of biological data. Here, we focus on unraveling network topologies from steady state responses of biochemical networks to successive experimental perturbations. Results: We propose a computational algorithm which combines a deterministic network inference method termed Modular Response Analysis (MRA) and a statistical model selection algorithm called Bayesian Variable Selection, to infer functional interactions in cellular signaling pathways and gene regulatory networks. It can be used to identify interactions among individual molecules involved in a biochemical pathway or reveal how different functional modules of a biological network interact with each other to exchange information. In cases where not all network components are known, our method reveals functional interactions which are not direct but correspond to the interaction routes through unknown elements. Using computer simulated perturbation responses of signaling pathways and gene regulatory networks from the DREAM challenge, we demonstrate that the proposed method is robust against noise and scalable to large networks. We also show that our method can infer network topologies using incomplete perturbation datasets. Consequently, we have used this algorithm to explore the ERBB regulated G1/S transition pathway in certain breast cancer cells to understand the molecular mechanisms which cause these cells to become drug resistant. The algorithm successfully inferred many well characterized interactions of this pathway by analyzing experimentally obtained perturbation data. Additionally, it identified some molecular interactions which promote drug resistance in breast cancer cells.
Conclusions: The proposed algorithm provides a robust, scalable and cost-effective solution for inferring network topologies from biological data. It can potentially be applied to explore novel pathways which play important roles in life-threatening diseases like cancer.
Affiliation(s)
- Tapesh Santra
- Systems Biology Ireland, Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland.
22
Higa CHA, Andrade TP, Hashimoto RF. Growing seed genes from time series data and thresholded Boolean networks with perturbation. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:37-49. [PMID: 23702542 DOI: 10.1109/tcbb.2012.169] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
Models of gene regulatory networks (GRN) have been proposed along with algorithms for inferring their structure. By structure, we mean the relationships among the genes of the biological system under study. Despite the large number of genes found in the genome of an organism, it is believed that a small set of genes is responsible for maintaining a specific core regulatory mechanism (small subnetworks). We propose an algorithm for inference of subnetworks of genes from a small initial set of genes called seed and time series gene expression data. The algorithm has two main steps: First, it grows the seed of genes by adding genes to it, and second, it searches for subnetworks that can be biologically meaningful. The seed growing step is treated as a feature selection problem and we used a thresholded Boolean network with a perturbation model to design the criterion function that is used to select the features (genes). Given that the reverse engineering of GRN is a problem that does not necessarily have one unique solution, the proposed algorithm has as output a set of networks instead of one single network. The algorithm also analyzes the dynamics of the networks, which can be time-consuming. Nevertheless, the algorithm is suitable when the number of genes is small. The results showed that the algorithm is capable of recovering an acceptable rate of gene interactions and of generating regulatory hypotheses that can be explored in the wet lab.
Affiliation(s)
- Carlos H A Higa
- College of Computing, Federal University of Mato Grosso do Sul, Campo Grande MS, Brazil.
23
Manshaei R, Sobhe Bidari P, Aliyari Shoorehdeli M, Feizi A, Lohrasebi T, Malboobi MA, Kyan M, Alirezaie J. Hybrid-controlled neurofuzzy networks analysis resulting in genetic regulatory networks reconstruction. ISRN Bioinform 2012; 2012:419419. [PMID: 25969749 PMCID: PMC4393070 DOI: 10.5402/2012/419419] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/10/2012] [Accepted: 08/15/2012] [Indexed: 12/03/2022]
Abstract
Reverse engineering of gene regulatory networks (GRNs) is the process of estimating genetic interactions of a cellular system from gene expression data. In this paper, we propose a novel hybrid systematic algorithm based on a neurofuzzy network for reconstructing GRNs from observational gene expression data when only a medium-small number of measurements are available. The approach uses fuzzy logic to transform gene expression values into qualitative descriptors that can be evaluated by using a set of defined rules. The algorithm uses a neurofuzzy network to model genes' effects on other genes, followed by four stages of decision making to extract gene interactions. One of the main features of the proposed algorithm is that an optimal number of fuzzy rules can be easily and rapidly extracted without overparameterizing. Data analysis and simulation are conducted on microarray expression profiles of S. cerevisiae cell cycle and demonstrate that the proposed algorithm not only selects the patterns of the time series gene expression data accurately, but also provides models with better reconstruction accuracy when compared with four published algorithms: DBNs, VBEM, time delay ARACNE, and PF subjected to LASSO. The accuracy of the proposed approach is evaluated in terms of recall and F-score for the network reconstruction task.
Affiliation(s)
- Roozbeh Manshaei
- Electrical and Computer Engineering Department, Ryerson University, Toronto, ON, Canada M5B 2K3
- Pooya Sobhe Bidari
- Electrical and Computer Engineering Department, Ryerson University, Toronto, ON, Canada M5B 2K3
- Mahdi Aliyari Shoorehdeli
- Electrical and Computer Engineering Department, K.N. Toosi University of Technology, Tehran 16315-1355, Iran
- Amir Feizi
- Department of Chemical and Biological Engineering, Systems and Synthetic Biology Group, Chalmers University, 41296 Gothenburg, Sweden
- Tahmineh Lohrasebi
- National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran 14965/161, Iran
- Mohammad Ali Malboobi
- National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran 14965/161, Iran
- Matthew Kyan
- Electrical and Computer Engineering Department, Ryerson University, Toronto, ON, Canada M5B 2K3
- Javad Alirezaie
- Electrical and Computer Engineering Department, Ryerson University, Toronto, ON, Canada M5B 2K3
24
Cheng TMK, Gulati S, Agius R, Bates PA. Understanding cancer mechanisms through network dynamics. Brief Funct Genomics 2012; 11:543-60. [PMID: 22811516 DOI: 10.1093/bfgp/els025] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 02/11/2024]
Abstract
Cancer is a complex, multifaceted disease. Cellular systems are perturbed both during the onset and development of cancer, and the behavioural change of tumour cells usually involves a broad range of dynamic variations. To an extent, the difficulty of monitoring the systemic change has been alleviated by recent developments in the high-throughput technologies. At both the genomic as well as proteomic levels, the technological advances in microarray and mass spectrometry, in conjunction with computational simulations and the construction of human interactome maps have facilitated the progress of identifying disease-associated genes. On a systems level, computational approaches developed for network analysis are becoming especially useful for providing insights into the mechanism behind tumour development and metastasis. This review emphasizes network approaches that have been developed to study cancer and provides an overview of our current knowledge of protein-protein interaction networks, and how their systemic perturbation can be analysed by two popular network simulation methods: Boolean network and ordinary differential equations.
Affiliation(s)
- Tammy M K Cheng
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields, London WC2A 3LY, UK
25
Gillet JP, Wang J, Calcagno AM, Green LJ, Varma S, Elstrand MB, Trope CG, Ambudkar SV, Davidson B, Gottesman MM. Clinical relevance of multidrug resistance gene expression in ovarian serous carcinoma effusions. Mol Pharm 2011; 8:2080-8. [PMID: 21761824 PMCID: PMC3224865 DOI: 10.1021/mp200240a] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Indexed: 12/29/2022]
Abstract
The presence of tumor cells in effusions within serosal cavities is a clinical manifestation of advanced-stage cancer and is generally associated with poor survival. Identifying molecular targets may help to design efficient treatments to eradicate these aggressive cancer cells and improve patient survival. Using a state-of-the-art TaqMan-based qRT-PCR assay, we investigated the multidrug resistance (MDR) transcriptome of 32 unpaired ovarian serous carcinoma effusion samples obtained at diagnosis or at disease recurrence following chemotherapy. MDR genes were selected a priori based on an extensive curation of the literature published during the last three decades. We found three gene signatures with a statistically significant correlation with overall survival (OS), response to treatment [complete response (CR) vs other], and progression free survival (PFS). The median log-rank p-values for the signatures were 0.023, 0.034, and 0.008, respectively. No correlation was found with residual tumor status after cytoreductive surgery, treatment (with or without chemotherapy) and stage defined according to the International Federation of Gynecology and Obstetrics. Further analyses demonstrated that gene expression alone can effectively predict the survival outcome of women with ovarian serous carcinoma (OS, log-rank p = 0.0000; and PFS, log-rank p = 0.002). Interestingly, the signature for overall survival is the same in patients at first presentation and those who had chemotherapy and relapsed. This pilot study highlights two new gene signatures that may help in optimizing the treatment for ovarian carcinoma patients with effusions.
Affiliation(s)
- Jean-Pierre Gillet
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Junbai Wang
- Division of Pathology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- Anna Maria Calcagno
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Lisa J. Green
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Sudhir Varma
- Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, Office of Science Management and Operations, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD
- Mari Bunkholt Elstrand
- Department of Gynecologic Oncology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- Claes G. Trope
- Department of Gynecologic Oncology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- The Medical Faculty, University of Oslo, N-0316 Oslo, Norway
- Suresh V. Ambudkar
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Ben Davidson
- Division of Pathology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- The Medical Faculty, University of Oslo, N-0316 Oslo, Norway
- Michael M. Gottesman
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
26
Wong L. Brief introduction to some new results in gene expression analysis, systems biology modeling, motif identification, and (noncoding) RNA analysis. J Bioinform Comput Biol 2011. [DOI: 10.1142/s0219720010005026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
27
Dimitrova ES, Mitra I, Jarrah AS. Probabilistic polynomial dynamical systems for reverse engineering of gene regulatory networks. EURASIP J Bioinform Syst Biol 2011; 2011:1. [PMID: 21910920 PMCID: PMC3171177 DOI: 10.1186/1687-4153-2011-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/09/2010] [Accepted: 06/06/2011] [Indexed: 02/08/2023]
Abstract
Elucidating the structure and/or dynamics of gene regulatory networks from experimental data is a major goal of systems biology. Stochastic models have the potential to absorb noise, account for uncertainty, and help avoid data overfitting. Within the framework of probabilistic polynomial dynamical systems, we present an algorithm for the reverse engineering of any gene regulatory network as a discrete, probabilistic polynomial dynamical system. The resulting stochastic model is assembled from all minimal models in the model space and the probability assignment is based on partitioning the model space according to the likeliness with which a minimal model explains the observed data. We used this method to identify stochastic models for two published synthetic network models. In both cases, the generated model retains the key features of the original model and compares favorably to the resulting models from other algorithms.
Affiliation(s)
- Elena S Dimitrova
- Department of Mathematical Sciences, Clemson University, Clemson, SC 29634-0975, USA
- Indranil Mitra
- Sealy Center of Molecular Medicine, University of Texas Medical Branch, Galveston, TX 77550, USA
- Abdul Salam Jarrah
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24061-0477, USA
- Department of Mathematics and Statistics, American University of Sharjah, Sharjah, UAE
28
Lopes FM, Cesar RM, Costa LDF. Gene expression complex networks: synthesis, identification, and analysis. J Comput Biol 2011; 18:1353-67. [PMID: 21548810 DOI: 10.1089/cmb.2010.0118] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 01/25/2023]
Abstract
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdős-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to average degree <k> variation, decreasing its network recovery rate with the increase of <k>. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.
Affiliation(s)
- Fabrício M Lopes
- Federal University of Technology-Paraná and Institute of Mathematics and Statistics, University of São Paulo, Brazil.
29
Abstract
Systems biology is all about networks. A recent trend has been to associate systems biology exclusively with the study of gene regulatory or protein-interaction networks. However, systems biology approaches can be applied at many other scales, from the subatomic to the ecosystem scales. In this review, we describe studies at the sub-cellular, tissue, whole plant and crop scales and highlight how these studies can be related to systems biology. We discuss the properties of system approaches at each scale as well as their current limits, and pinpoint in each case advances unique to the considered scale but representing potential for the other scales. We conclude by examining plant models bridging different scales and considering the future prospects of plant systems biology.
Affiliation(s)
- Mikaël Lucas
- Centre for Plant Integrative Biology, University of Nottingham, Nottingham, UK.