1
Kim H, Choi H, Lee D, Kim J. A review on gene regulatory network reconstruction algorithms based on single cell RNA sequencing. Genes Genomics 2024; 46:1-11. [PMID: 38032470 DOI: 10.1007/s13258-023-01473-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023]
Abstract
BACKGROUND Understanding gene regulatory networks (GRNs) is essential for unraveling the molecular mechanisms governing cellular behavior. With the advent of high-throughput transcriptome measurement technology, researchers have aimed to reverse engineer biological systems, extracting gene regulatory rules from their outputs, which are represented by gene expression data. Bulk RNA sequencing, a widely used method for measuring gene expression, has been employed for GRN reconstruction. However, it falls short in capturing dynamic changes in gene expression at the level of individual cells since it averages gene expression across mixed cell populations. OBJECTIVE In this review, we provide an overview of 15 GRN reconstruction tools and discuss their respective strengths and limitations, particularly in the context of single cell RNA sequencing (scRNA-seq). METHODS Recent advancements in scRNA-seq break new ground in GRN reconstruction. They offer snapshots of individual cell transcriptomes and capture dynamic changes. We emphasize how these technological breakthroughs have enhanced GRN reconstruction. CONCLUSION GRN reconstructors can be classified based on their requirement for a cellular trajectory, which represents a dynamical cellular process such as differentiation, aging, or disease progression. Benchmarking studies support the superiority of GRN reconstructors that do not require trajectory analysis in identifying regulator-target relationships. However, methods equipped with trajectory analysis demonstrate better performance in identifying key regulatory factors. In conclusion, researchers should select a suitable GRN reconstructor based on their specific research objectives.
Affiliation(s)
- Hyeonkyu Kim
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-Ro, Dongjak-Gu, Seoul, 06978, Republic of Korea
- Hwisoo Choi
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-Ro, Dongjak-Gu, Seoul, 06978, Republic of Korea
- Daewon Lee
  - School of Art and Technology, Chung-Ang University, 4726 Seodong-Daero, Anseong-Si, Gyeonggi-Do, 17546, Republic of Korea.
- Junil Kim
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-Ro, Dongjak-Gu, Seoul, 06978, Republic of Korea.
2
Bermudez A, Gonzalez Z, Zhao B, Salter E, Liu X, Ma L, Jawed MK, Hsieh CJ, Lin NYC. Supracellular measurement of spatially varying mechanical heterogeneities in live monolayers. Biophys J 2022; 121:3358-3369. [PMID: 36028999 PMCID: PMC9515370 DOI: 10.1016/j.bpj.2022.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Revised: 07/10/2022] [Accepted: 08/19/2022] [Indexed: 11/29/2022]
Abstract
The mechanical properties of tissues have profound impacts on a wide range of biological processes such as embryo development (1,2), wound healing (3-6), and disease progression (7). Specifically, the spatially varying moduli of cells largely influence local tissue deformation and intercellular interaction. Despite the importance of characterizing such heterogeneous mechanical properties, it has remained difficult to measure the supracellular modulus field in live cell layers with high throughput and minimal perturbation. In this work, we developed a monolayer effective modulus measurement by integrating a custom cell stretcher, light microscopy, and AI-based inference. Our approach first quantifies the heterogeneous deformation of a slightly stretched cell layer and then converts the measured strain fields into an effective modulus field using AI inference. This method allowed us to directly visualize the effective modulus distribution of thousands of cells virtually instantly. We characterized the mean value, SD, and correlation length of the effective cell modulus for epithelial cells and fibroblasts, which are in agreement with previous results. We also observed a mild correlation between cell area and stiffness in jammed epithelia, suggesting the influence of cell modulus on packing. Overall, our reported experimental platform provides a valuable alternative cell mechanics measurement tool that can be integrated with microscopy-based characterizations.
Affiliation(s)
- Alexandra Bermudez
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA; Department of Bioengineering, University of California, Los Angeles, California.
- Zachary Gonzalez
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA; Department of Physics and Astronomy, University of California, Los Angeles, California
- Bao Zhao
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA
- Ethan Salter
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA; Department of Bioengineering, University of California, Los Angeles, California
- Xuanqing Liu
  - Department of Computer Science, University of California, Los Angeles, California
- Leixin Ma
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA
- Mohammad Khalid Jawed
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA
- Cho-Jui Hsieh
  - Department of Computer Science, University of California, Los Angeles, California
- Neil Y C Lin
  - Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, California 90095, USA; Department of Bioengineering, University of California, Los Angeles, California; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, California.
3
Ajmal HB, Madden MG. Dynamic Bayesian Network Learning to Infer Sparse Models From Time Series Gene Expression Data. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2794-2805. [PMID: 34181549 DOI: 10.1109/tcbb.2021.3092879] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
One of the key challenges in systems biology is to derive gene regulatory networks (GRNs) from complex high-dimensional sparse data. Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) have been widely applied to infer GRNs from gene expression data. GRNs are typically sparse, but traditional approaches to BN structure learning for elucidating GRNs often produce many spurious (false positive) edges. We present two new BN scoring functions, which are extensions of the Bayesian Information Criterion (BIC) score with additional penalty terms, and use them in conjunction with DBN structure search methods to find a graph structure that maximises the proposed scores. Our BN scoring functions offer better solutions for inferring networks with fewer spurious edges compared to the BIC score. The proposed methods are evaluated extensively on autoregressive and DREAM4 benchmarks. We found that they significantly improve the precision of the learned graphs, relative to the BIC score. The proposed methods are also evaluated on three real time series gene expression datasets. The results demonstrate that our algorithms are able to learn sparse graphs from high-dimensional time series data. The implementation of these algorithms is open source and is available in the form of an R package on GitHub at https://github.com/HamdaBinteAjmal/DBN4GRN, along with the documentation and tutorials.
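To make the baseline concrete, the following is a minimal sketch of the plain BIC score for one node of a linear-Gaussian DBN, written so that higher is better. It is not the extended scoring functions of the paper (their additional penalty terms are not given in the abstract), and it is not the DBN4GRN API; the function name `bic_score`, the simulated data, and the linear-Gaussian assumption are all illustrative.

```python
import numpy as np

def bic_score(y, X_prev, parents):
    """Plain BIC score (higher is better) for one target gene.

    y       : expression of the target gene at time t+1, shape (n,)
    X_prev  : expression of all genes at time t, shape (n, p)
    parents : list of column indices of X_prev used as parents
    Assumes a linear-Gaussian conditional distribution (an assumption of this sketch).
    """
    n = len(y)
    columns = [np.ones(n)] + [X_prev[:, j] for j in parents]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / n                      # ML estimate of noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = design.shape[1] + 1                         # regression weights + variance
    return log_lik - 0.5 * k * np.log(n)

# Hypothetical data: the target depends only on gene 1 at the previous time step.
rng = np.random.default_rng(0)
X_prev = rng.normal(size=(199, 3))
y = 0.8 * X_prev[:, 1] + 0.1 * rng.normal(size=199)

score_empty = bic_score(y, X_prev, [])       # no parents
score_true = bic_score(y, X_prev, [1])       # the true parent
score_extra = bic_score(y, X_prev, [1, 2])   # true parent plus a spurious one
```

A structure search would compare such scores across candidate parent sets. Under plain BIC, the spurious parent in `score_extra` is penalized only by `0.5 * log(n)`, which is the kind of weak sparsity pressure that stronger penalty terms are meant to address.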
4
Subbaroyan A, Martin OC, Samal A. A preference for link operator functions can drive Boolean biological networks towards critical dynamics. J Biosci 2022; 47. [DOI: 10.1007/s12038-022-00256-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
5
Kim J, Jakobsen ST, Natarajan KN, Won KJ. TENET: gene network reconstruction using transfer entropy reveals key regulatory factors from single cell transcriptomic data. Nucleic Acids Res 2021; 49:e1. [PMID: 33170214 PMCID: PMC7797076 DOI: 10.1093/nar/gkaa1014] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/03/2020] [Revised: 10/05/2020] [Accepted: 10/14/2020] [Indexed: 12/22/2022]
Abstract
Accurate prediction of gene regulatory rules is important for understanding cellular processes. Existing computational algorithms devised for bulk transcriptomics typically require a large number of time points to infer gene regulatory networks (GRNs), are applicable only to a small number of genes, and fail to detect potential causal relationships effectively. Here, we propose a novel approach, 'TENET', to reconstruct GRNs from single cell RNA sequencing (scRNAseq) datasets. Employing transfer entropy (TE) to measure the amount of causal relationships between genes, TENET predicts large-scale gene regulatory cascades/relationships from scRNAseq data. TENET showed better performance than other GRN reconstructors in identifying key regulators from public datasets. Specifically, from scRNAseq, TENET identified key transcriptional factors in embryonic stem cells (ESCs) and during direct cardiomyocyte reprogramming, where other predictors failed. We further demonstrate that known target genes have significantly higher TE values, and that genes with higher TENET-predicted TE were more influenced by the perturbation of their regulator. Using TENET, we identified and validated that Nme2 is a culture condition specific stem cell factor. These results indicate that TENET is uniquely capable of identifying key regulators from scRNAseq data.
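As an illustration of the quantity involved, here is a minimal sketch of a plug-in transfer entropy estimate between two discretized, ordered series with history length 1. It is not the TENET implementation; the function name `transfer_entropy`, the binary toy data, and the history length are assumptions made for this example.

```python
from collections import Counter
from math import log2
import random

def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y', y, x) * log2( p(y' | y, x) / p(y' | y) ),
    where y' is the next target value, y the current target value,
    and x the current source value.
    """
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))
    pairs_yy = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])
    n = len(target) - 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_full = count / pairs_yx[(y_now, x_now)]          # p(y' | y, x)
        p_self = pairs_yy[(y_next, y_now)] / singles[y_now]  # p(y' | y)
        te += (count / n) * log2(p_full / p_self)
    return te

# Hypothetical toy data: y copies x with a one-step delay.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # close to 1 bit: x's past determines y
te_yx = transfer_entropy(y, x)  # close to 0: y's past says nothing new about x
```

The asymmetry between `te_xy` and `te_yx` is what makes transfer entropy a directed measure, in contrast to symmetric measures such as correlation or mutual information.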
Affiliation(s)
- Junil Kim
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
- Simon T. Jakobsen
  - Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
- Kedar N Natarajan
  - Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
  - Danish Institute of Advanced Study (D-IAS), University of Southern Denmark, Denmark
- Kyoung-Jae Won
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
6
Cheung FKM, Qin J. The Methods and Tools for Molecular Network Construction. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11464-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
7
Ajmal HB, Madden MG. Inferring dynamic gene regulatory networks with low-order conditional independencies – an evaluation of the method. Stat Appl Genet Mol Biol 2020. [DOI: 10.1515/sagmb-2020-0051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Over a decade ago, Lèbre (2009) proposed an inference method, G1DBN, to learn the structure of gene regulatory networks (GRNs) from high-dimensional, sparse time-series gene expression data. Their approach is based on the concept of low-order conditional independence graphs, which they extend to dynamic Bayesian networks (DBNs). They present results to demonstrate that their method yields better structural accuracy compared to the related Lasso and Shrinkage methods, particularly where the data is sparse, that is, the number of time measurements n is much smaller than the number of genes p. This paper challenges these claims using a careful experimental analysis, to show that the GRNs reverse engineered from time-series data using the G1DBN approach are less accurate than claimed by Lèbre (2009). We also show that the Lasso method yields higher structural accuracy for graphs learned from the simulated data, compared to the G1DBN method, particularly when the data is sparse ($n \ll p$). The Lasso method is also better than G1DBN at identifying the transcription factors (TFs) involved in the cell cycle of Saccharomyces cerevisiae.
Affiliation(s)
- Hamda B. Ajmal
  - School of Computer Science, National University of Ireland, Galway, Ireland
- Michael G. Madden
  - School of Computer Science, National University of Ireland, Galway, Ireland
8
Khalid M, Khan S, Ahmad J, Shaheryar M. Identification of self-regulatory network motifs in reverse engineering gene regulatory networks using microarray gene expression data. IET Syst Biol 2019; 13:55-68. [PMID: 33444479 PMCID: PMC8687352 DOI: 10.1049/iet-syb.2018.5001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/13/2018] [Revised: 11/01/2018] [Accepted: 12/10/2018] [Indexed: 11/19/2022]
Abstract
Gene Regulatory Networks (GRNs) are reconstructed from microarray gene expression data through diverse computational approaches. This process results in symmetric and diagonal interactions of gene pairs that cannot be modelled as direct activation, inhibition, and self-regulatory interactions. The values of gene co-expressions could help in identifying co-regulations among them. The proposed approach aims at computing the differences in variances of co-expressed genes rather than computing differences in values of mean expressions across experimental conditions. It adopts multivariate co-variances using principal component analysis (PCA) to predict an asymmetric and non-diagonal gene interaction matrix, to select only those gene pair interactions that exhibit the maximum variances in gene regulatory expressions. The asymmetric gene regulatory interactions help in identifying the controlling regulatory agents, thus lowering the false positive rate by minimizing the connections between previously unlinked network components. The experimental results on real as well as in silico datasets, including time-series RTX therapy, Arabidopsis thaliana, DREAM-3, and DREAM-8 datasets, in comparison with existing state-of-the-art approaches, demonstrated the enhanced performance of the proposed approach for predicting positive and negative feedback loops and self-regulatory interactions. The generated GRNs hold potential for determining the real nature of gene pair regulatory interactions.
Affiliation(s)
- Mehrosh Khalid
  - School of Electrical Engineering and Computer Science, National University of Sciences and Technology, Islamabad, Pakistan
- Sharifullah Khan
  - School of Electrical Engineering and Computer Science, National University of Sciences and Technology, Islamabad, Pakistan
- Jamil Ahmad
  - Research Centre for Modelling and Simulation, National University of Sciences and Technology, Islamabad, Pakistan
- Muhammad Shaheryar
  - Department of Computer Science, Capital University of Science and Technology, Islamabad, Pakistan
9
Alexiou A, Chatzichronis S, Perveen A, Hafeez A, Ashraf GM. Algorithmic and Stochastic Representations of Gene Regulatory Networks and Protein-Protein Interactions. Curr Top Med Chem 2019; 19:413-425. [PMID: 30854971 DOI: 10.2174/1568026619666190311125256] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/27/2018] [Revised: 10/15/2018] [Accepted: 12/26/2018] [Indexed: 02/06/2023]
Abstract
BACKGROUND Recent studies reveal the importance of protein-protein interactions for physiologic functions and biological structures. Several stochastic and algorithmic methods have been published to date for modeling the complex nature of biological systems. OBJECTIVE Computational modeling of biological networks is still a challenging task. The formulation of complex cellular interactions is a research field of great interest. In this review paper, several computational methods for the modeling of GRN and PPI are presented analytically. METHODS Several well-known GRN and PPI models are presented and discussed in this review study, such as: Graph representation, Boolean Networks, Generalized Logical Networks, Bayesian Networks, Relevance Networks, Graphical Gaussian models, Weight Matrices, Reverse Engineering Approach, Evolutionary Algorithms, Forward Modeling Approach, Deterministic models, Static models, Hybrid models, Stochastic models, Petri Nets, BioAmbients calculus and Differential Equations. RESULTS GRN and PPI methods have already been applied in various clinical processes with potentially positive results, establishing promising diagnostic tools. CONCLUSION In the literature, many stochastic algorithms focus on the simulation, analysis and visualization of the various biological networks and their dynamic interactions, which are referenced and described in depth in this review paper.
Affiliation(s)
- Asma Perveen
  - Glocal School of Life Sciences, Glocal University, Mirzapur Pole, Saharanpur, Uttar Pradesh, India
- Abdul Hafeez
  - Glocal School of Pharmacy, Glocal University, Mirzapur Pole, Saharanpur, Uttar Pradesh, India
- Ghulam Md. Ashraf
  - King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
10
Raza K. Fuzzy logic based approaches for gene regulatory network inference. Artif Intell Med 2018; 97:189-203. [PMID: 30573378 DOI: 10.1016/j.artmed.2018.12.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/07/2018] [Revised: 12/10/2018] [Accepted: 12/12/2018] [Indexed: 12/26/2022]
Abstract
The rapid advancements in high-throughput techniques have fueled large-scale production of biological data at very affordable costs. Some of these techniques are microarrays and next-generation sequencing, which provide genome-level insight into living cells. As a result, the size of most biological databases, such as NCBI-GEO, NCBI-SRA, etc., is growing exponentially. These biological data are analyzed using various computational techniques for knowledge discovery, which is also one of the objectives of bioinformatics research. A gene regulatory network (GRN) is a gene-gene interaction network which plays a pivotal role in understanding gene regulation processes and disease mechanisms at the molecular level. Over the last couple of decades, researchers have been interested in developing computational algorithms for GRN inference (GRNI) from high-throughput experimental data. Several computational approaches have been proposed for inferring GRNs from gene expression data, including statistical techniques (correlation coefficient), information theory (mutual information), regression-based approaches, probabilistic approaches (Bayesian networks, naïve Bayes), artificial neural networks and fuzzy logic. Fuzzy logic, along with its hybridization with other intelligent approaches, is a well-studied technique in GRNI due to its several advantages. In this paper, we present a consolidated review of fuzzy logic and its hybrid approaches developed during the last two decades for GRNI.
Affiliation(s)
- Khalid Raza
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India.
11
Abstract
Inference of biochemical network models from experimental data is a crucial problem in systems and synthetic biology that includes parameter calibration but also identification of unknown interactions. Stochastic modelling from single-cell data is known to improve identifiability of reaction network parameters for specific systems. However, general results are lacking, and the advantage over deterministic, population-average approaches has not been explored for network reconstruction. In this work, we study identifiability and propose new reconstruction methods for biochemical interaction networks. Focusing on population-snapshot data and networks with reaction rates affine in the state, for parameter estimation, we derive general methods to test structural identifiability and demonstrate them in connection with practical identifiability for a reporter gene in silico case study. In the same framework, we next develop a two-step approach to the reconstruction of unknown networks of interactions. We apply it to compare the achievable network reconstruction performance in a deterministic and a stochastic setting, showing the advantage of the latter, and demonstrate it on population-snapshot data from a simulated example.
12
Melkman AA, Cheng X, Ching WK, Akutsu T. Identifying a Probabilistic Boolean Threshold Network From Samples. IEEE Trans Neural Netw Learn Syst 2018; 29:869-881. [PMID: 28129190 DOI: 10.1109/tnnls.2017.2648039] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/06/2023]
Abstract
This paper studies the problem of exactly identifying the structure of a probabilistic Boolean network (PBN) from a given set of samples, where PBNs are probabilistic extensions of Boolean networks. Cheng et al. studied the problem while focusing on PBNs consisting of pairs of AND/OR functions. This paper considers PBNs consisting of Boolean threshold functions while focusing on those threshold functions that have unit coefficients. The treatment of Boolean threshold functions, and triplets and -tuplets of such functions, necessitates a deepening of the theoretical analyses. It is shown that wide classes of PBNs with such threshold functions can be exactly identified from samples under reasonable constraints, which include: 1) PBNs in which any number of threshold functions can be assigned provided that all have the same number of input variables and 2) PBNs consisting of pairs of threshold functions with different numbers of input variables. It is also shown that the problem of deciding the equivalence of two Boolean threshold functions is solvable in pseudopolynomial time but remains co-NP complete.
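For readers unfamiliar with the objects involved, the following minimal sketch shows a Boolean threshold function and a brute-force equivalence check. It is only an illustration of the definitions, not the identification algorithm of the paper. The reading of "unit coefficients" as weights restricted to +1 or -1, and the names `threshold_fn` and `equivalent`, are assumptions of this sketch.

```python
from itertools import product

def threshold_fn(signs, theta):
    """Return f(x) = 1 if sum(signs[i] * x[i]) >= theta, else 0.

    signs : tuple of +1 / -1 weights (assumed meaning of "unit coefficients")
    theta : threshold
    """
    def f(x):
        return int(sum(s * xi for s, xi in zip(signs, x)) >= theta)
    return f

def equivalent(f, g, n_inputs):
    """Check equivalence by enumerating all 2**n_inputs Boolean inputs.

    This exhaustive check is exponential in n_inputs and is meant only
    to illustrate what "equivalence" of two functions means.
    """
    return all(f(x) == g(x) for x in product((0, 1), repeat=n_inputs))

and_fn = threshold_fn((1, 1), 2)            # x1 AND x2
or_fn = threshold_fn((1, 1), 1)             # x1 OR x2
and_fn_alt = threshold_fn((1, 1), 1.5)      # different threshold, same function
and_not_fn = threshold_fn((1, -1), 1)       # x1 AND NOT x2

same = equivalent(and_fn, and_fn_alt, 2)    # True
different = equivalent(and_fn, or_fn, 2)    # False
```

The pair `and_fn` and `and_fn_alt` shows why equivalence is a nontrivial question: distinct parameter choices can define the same Boolean function.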
13
Abstract
In metazoans, epithelial architecture provides a context that dynamically modulates most if not all epithelial cell responses to intrinsic and extrinsic signals, including growth or survival signalling and transforming oncogene action. Three-dimensional (3D) epithelial culture systems provide tractable models to interrogate the function of human genetic determinants in the establishment of context-dependency. We performed an arrayed genetic shRNA screen in mammary epithelial 3D cultures to identify new determinants of epithelial architecture, finding that the key phenotype-impacting shRNAs altered not only the data population average but, even more noticeably, the population distribution. The broad distributions were attributable to sporadic gene silencing actions by shRNA in unselected populations. We employed the Maximum Mean Discrepancy concept to capture similar population distribution patterns and demonstrate here the feasibility of the test in identifying an impact of shRNA in populations of 3D structures. Integration of the clustered morphometric data with protein-protein interaction data enabled hypothesis generation of novel biological pathways underlying similar 3D phenotype alterations. The results present a new strategy for 3D phenotype-driven pathway analysis, which is expected to accelerate discovery of context-dependent gene functions in epithelial biology and tumorigenesis.
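The Maximum Mean Discrepancy (MMD) compares two samples as whole distributions rather than by their means alone. The sketch below computes the standard biased estimate of squared MMD with a Gaussian (RBF) kernel. The kernel choice, the bandwidth, and the simulated one-dimensional "morphometric" samples are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD: mean(Kxx) + mean(Kyy) - 2 * mean(Kxy)."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Hypothetical data: a control population and two perturbed populations.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(200, 1))
replicate = rng.normal(0.0, 1.0, size=(200, 1))   # same distribution as control
shifted = rng.normal(2.0, 1.0, size=(200, 1))     # clearly different distribution

mmd_same = mmd2_biased(control, replicate)
mmd_shifted = mmd2_biased(control, shifted)
```

Because the statistic depends on the full kernel embedding of each sample, it can also separate populations whose means agree but whose spreads or shapes differ, which is the situation the abstract describes for sporadic shRNA silencing.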
Affiliation(s)
- Elsa Marques
  - Cancer Cell Circuitry Laboratory, Research Programs Unit/Translational Cancer Biology & Medicum, University of Helsinki, P.O Box 63 (street address: Haartmaninkatu 8), 00014 University of Helsinki, Helsinki, Finland
- Tomi Peltola
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, PO BOX 15400, FI-00076, Aalto, Finland
- Samuel Kaski
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, PO BOX 15400, FI-00076, Aalto, Finland
- Juha Klefström
  - Cancer Cell Circuitry Laboratory, Research Programs Unit/Translational Cancer Biology & Medicum, University of Helsinki, P.O Box 63 (street address: Haartmaninkatu 8), 00014 University of Helsinki, Helsinki, Finland.
14
Moran B, Rahman A, Palonen K, Lanigan FT, Gallagher WM. Master Transcriptional Regulators in Cancer: Discovery via Reverse Engineering Approaches and Subsequent Validation. Cancer Res 2017; 77:2186-2190. [PMID: 28428271 DOI: 10.1158/0008-5472.can-16-1813] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/11/2016] [Revised: 09/08/2016] [Accepted: 02/22/2017] [Indexed: 11/16/2022]
Abstract
Reverse engineering of transcriptional networks using gene expression data enables identification of genes that underpin the development and progression of different cancers. Methods to this end have been available for over a decade and, with a critical mass of transcriptomic data in the oncology arena having been reached, they are ever more applicable. Extensive and complex networks can be distilled into a small set of key master transcriptional regulators (MTR), genes that are very highly connected and have been shown to be involved in processes of known importance in disease. Interpreting and validating the results of standardized bioinformatic methods is of crucial importance in determining the inherent value of MTRs. In this review, we briefly describe how MTRs are identified and focus on providing an overview of how MTRs can and have been validated for use in clinical decision making in malignant diseases, along with serving as tractable therapeutic targets. Cancer Res; 77(9); 2186-90. ©2017 AACR.
Affiliation(s)
- Bruce Moran
  - Cancer Biology and Therapeutics Laboratory, UCD School of Biomolecular and Biomedical Research, UCD Conway Institute, University College Dublin, Dublin, Ireland; OncoMark Limited, NovaUCD, Belfield Innovation Park, Belfield, Dublin, Ireland
- Arman Rahman
  - Cancer Biology and Therapeutics Laboratory, UCD School of Biomolecular and Biomedical Research, UCD Conway Institute, University College Dublin, Dublin, Ireland; OncoMark Limited, NovaUCD, Belfield Innovation Park, Belfield, Dublin, Ireland
- Katja Palonen
  - Cancer Biology and Therapeutics Laboratory, UCD School of Biomolecular and Biomedical Research, UCD Conway Institute, University College Dublin, Dublin, Ireland; OncoMark Limited, NovaUCD, Belfield Innovation Park, Belfield, Dublin, Ireland
- Fiona T Lanigan
  - Cancer Biology and Therapeutics Laboratory, UCD School of Biomolecular and Biomedical Research, UCD Conway Institute, University College Dublin, Dublin, Ireland
- William M Gallagher
  - Cancer Biology and Therapeutics Laboratory, UCD School of Biomolecular and Biomedical Research, UCD Conway Institute, University College Dublin, Dublin, Ireland; OncoMark Limited, NovaUCD, Belfield Innovation Park, Belfield, Dublin, Ireland
15
Raza K, Alam M. Recurrent neural network based hybrid model for reconstructing gene regulatory network. Comput Biol Chem 2016; 64:322-34. [PMID: 27570069 DOI: 10.1016/j.compbiolchem.2016.08.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 01/08/2016] [Revised: 05/01/2016] [Accepted: 08/13/2016] [Indexed: 11/22/2022]
Abstract
One of the exciting problems in systems biology research is to decipher how the genome controls the development of complex biological systems. Gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer fruitful information related to the functional role of individual genes in a cellular system. Discovering GRNs leads to a wide range of applications, including identifying disease-related pathways that provide novel tentative drug targets, predicting disease response, and assisting in the diagnosis of various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with a generalized extended Kalman filter for weight update in the backpropagation through time training algorithm. The RNN is a complex neural network that offers a better compromise between biological closeness and mathematical flexibility for modelling GRNs, and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy, and the Kalman filter performs well for estimation problems even with noisy data. Hence, we applied a non-linear version of the Kalman filter, known as the generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks: the DNA SOS repair network, the IRMA network, and two synthetic networks from the DREAM Challenge. We performed a comparison of our results with other state-of-the-art techniques, which shows the superiority of our proposed model. Further, 5% Gaussian noise was added to the dataset, and the results of the proposed model show a negligible effect of noise, demonstrating the noise tolerance capability of the model.
16
Berrones A, Jiménez E, Alcorta-García MA, Almaguer FJ, Peña B. Parameter inference of general nonlinear dynamical models of gene regulatory networks from small and noisy time series. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.10.095] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
|
17
|
Zhan C, Li BYS, Yeung LF. Structural and practical identifiability analysis of S-system. IET Syst Biol 2015; 9:285-293. [PMID: 26577163 PMCID: PMC8687182 DOI: 10.1049/iet-syb.2015.0014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/27/2015] [Revised: 08/03/2015] [Accepted: 08/14/2015] [Indexed: 10/08/2023]
Abstract
In the field of systems biology, biological reaction networks are usually modelled by ordinary differential equations. A sub-class, the S-system representation, is a widely used form of modelling. Existing S-system identification techniques assume that the system itself is always structurally identifiable. However, due to practical limitations, biological reaction networks are often only partially measured. In addition, the captured data cover only a limited trajectory, so the data can only be considered a local snapshot of the system responses with respect to the complete set of state trajectories over the entire state space. Hence the estimated model can only reflect partial system dynamics and may not be unique. To improve the identification quality, the structural and practical identifiability of the S-system are studied. The S-system is shown to be identifiable under a set of assumptions. Then, an application to the yeast fermentation pathway was conducted. Two case studies were chosen: the first is based on a larger set of state trajectories and the second on a smaller one. By expanding the dataset so that it spans a relatively larger state space, the uncertainty of the estimated system can be reduced. The results indicated that the initial concentration is related to practical identifiability.
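For readers unfamiliar with the model class, the canonical S-system form can be written as below. This is the standard textbook formulation rather than an equation quoted from the abstract, and the symbols $X_i$, $\alpha_i$, $\beta_i$, $g_{ij}$ and $h_{ij}$ are notation assumed here.

```latex
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n} X_j^{h_{ij}}, \qquad i = 1, \dots, n
```

Here $\alpha_i, \beta_i \ge 0$ are rate constants and $g_{ij}$, $h_{ij}$ are kinetic orders. In this notation, identifiability asks whether these parameters can be determined uniquely from measured trajectories of the $X_i$.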
Affiliation(s)
- Choujun Zhan
- Department of Electronics Communication and Software Engineering, Nanfang College of Sun Yat-Sen University, Guangdong 510970, People's Republic of China
- Benjamin Yee Shing Li
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, Hong Kong
- Lam Fat Yeung
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, Hong Kong
|
18
|
Vashishtha S, Broderick G, Craddock TJ, Fletcher MA, Klimas NG. Inferring Broad Regulatory Biology from Time Course Data: Have We Reached an Upper Bound under Constraints Typical of In Vivo Studies? PLoS One 2015; 10:e0127364. [PMID: 25984725 DOI: 10.1371/journal.pone.0127364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 04/13/2015] [Indexed: 12/21/2022]
Abstract
There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers; however, questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE) models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI) integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE). Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA) and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. The limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
|
19
|
|
20
|
Zhan C, Situ W, Yeung LF, Tsang PWM, Yang G. A Parameter Estimation Method for Biological Systems modelled by ODE/DDE Models Using Spline Approximation and Differential Evolution Algorithm. IEEE/ACM Trans Comput Biol Bioinform 2014; 11:1066-1076. [PMID: 26357044 DOI: 10.1109/tcbb.2014.2322360] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 06/05/2023]
Abstract
The inverse problem of identifying unknown parameters of dynamical biological systems of known structure, which are modelled by ordinary differential equations or delay differential equations, from experimental data is treated in this paper. A two-stage approach is adopted: first, by combining spline theory and nonlinear programming (NLP), the parameter estimation problem is formulated as an optimization problem with only algebraic constraints; then, a new differential evolution (DE) algorithm is proposed to find a feasible solution. The approach is designed to handle problems of realistic size with noisy observation data. Three cases are studied to evaluate the performance of the proposed algorithm: two are based on benchmark models with a priori determined structure and parameters; the other is a particular biological system with unknown model structure. In the last case, only a set of observation data is available, and a nominal model is adopted for the identification. All the test systems were successfully identified using a reasonable amount of experimental data within an acceptable computation time. Experimental evaluation reveals that the proposed method is capable of fast estimation of the unknown parameters with good precision.
|
21
|
Dimitrova E, Stigler B. Data identification for improving gene network inference using computational algebra. Bull Math Biol 2014; 76:2923-40. [PMID: 25280666 DOI: 10.1007/s11538-014-9979-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/22/2013] [Accepted: 05/12/2014] [Indexed: 11/25/2022]
Abstract
Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.
Affiliation(s)
- Elena Dimitrova
- Department of Mathematical Sciences, Clemson University, Clemson, SC, 29634, USA
|
22
|
Farhangmehr F, Maurya MR, Tartakovsky DM, Subramaniam S. Information theoretic approach to complex biological network reconstruction: application to cytokine release in RAW 264.7 macrophages. BMC Syst Biol 2014; 8:77. [PMID: 24964861 PMCID: PMC4094931 DOI: 10.1186/1752-0509-8-77] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/11/2013] [Accepted: 06/04/2014] [Indexed: 12/27/2022]
Abstract
BACKGROUND High-throughput methods for biological measurements generate vast amounts of quantitative data, which necessitate the development of advanced approaches to data analysis to help understand the underlying mechanisms and networks. Reconstruction of biological networks from measured data of different components is a significant challenge in systems biology. RESULTS We use an information theoretic approach to reconstruct phosphoprotein-cytokine networks in RAW 264.7 macrophage cells. Cytokines are secreted upon activation of a wide range of regulatory signals transduced by the phosphoprotein network. Identifying these components can help identify regulatory modules responsible for the inflammatory phenotype. The information theoretic approach is based on estimation of mutual information of interactions by using kernel density estimators. Mutual information provides a measure of statistical dependencies between interacting components. Using the topology of the network derived, we develop a data-driven parsimonious input-output model of the phosphoprotein-cytokine network. CONCLUSIONS We demonstrate the applicability of our information theoretic approach to reconstruction of biological networks. For the phosphoprotein-cytokine network, this approach not only captures most of the known signaling components involved in cytokine release but also predicts new signaling components involved in the release of cytokines. The results of this study are important for gaining a clear understanding of macrophage activation during the inflammation process.
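The kernel-density estimate of mutual information mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian product kernel, the normal-reference bandwidth rule, the resubstitution estimator, and all function names (`normal_reference_bandwidth`, `gaussian_kernel`, `kde_mutual_information`) are assumptions chosen for illustration, and the data are synthetic.

```python
import math
import random


def normal_reference_bandwidth(values):
    """Rule-of-thumb bandwidth 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)


def gaussian_kernel(u, h):
    """Gaussian kernel with bandwidth h evaluated at offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))


def kde_mutual_information(xs, ys):
    """Resubstitution estimate of I(X;Y) in nats from paired samples."""
    n = len(xs)
    hx = normal_reference_bandwidth(xs)
    hy = normal_reference_bandwidth(ys)
    total = 0.0
    for xi, yi in zip(xs, ys):
        kx = [gaussian_kernel(xi - xj, hx) for xj in xs]
        ky = [gaussian_kernel(yi - yj, hy) for yj in ys]
        p_x = sum(kx) / n                               # marginal density of X at xi
        p_y = sum(ky) / n                               # marginal density of Y at yi
        p_xy = sum(a * b for a, b in zip(kx, ky)) / n   # joint density at (xi, yi)
        total += math.log(p_xy / (p_x * p_y))
    return total / n


# Synthetic demonstration: one strongly dependent pair and one independent pair.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y_dep = [v + 0.1 * random.gauss(0, 1) for v in x]
y_ind = [random.gauss(0, 1) for _ in range(200)]
mi_dependent = kde_mutual_information(x, y_dep)
mi_independent = kde_mutual_information(x, y_ind)
print(mi_dependent, mi_independent)
```

In a network reconstruction setting, such pairwise scores would typically be thresholded to decide which edges to keep. The estimate is sensitive to the bandwidth, so the rule used here should be treated as a starting point only.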
Affiliation(s)
- Shankar Subramaniam
- Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, 92093-0412 La Jolla, CA, USA.
|
23
|
Chavez-Alvarez R, Chavoya A, Mendez-Vazquez A. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases. PLoS One 2014; 9:e93233. [PMID: 24699245 PMCID: PMC3974722 DOI: 10.1371/journal.pone.0093233] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 10/14/2013] [Accepted: 03/04/2014] [Indexed: 11/19/2022]
Abstract
DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution to some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to be active during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOMs, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.
Affiliation(s)
- Rocio Chavez-Alvarez
- Department of Information Systems CUCEA, Universidad de Guadalajara, Zapopan, Jalisco, Mexico
- Arturo Chavoya
- Department of Information Systems CUCEA, Universidad de Guadalajara, Zapopan, Jalisco, Mexico
- Andres Mendez-Vazquez
- Department of Electrical Engineering and Computer Science Campus Guadalajara, Cinvestav, Zapopan, Jalisco, Mexico
|
24
|
Nakajima N, Akutsu T. Exact and heuristic methods for network completion for time-varying genetic networks. Biomed Res Int 2014; 2014:684014. [PMID: 24738067 PMCID: PMC3966496 DOI: 10.1155/2014/684014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/13/2013] [Revised: 01/09/2014] [Accepted: 01/22/2014] [Indexed: 12/02/2022]
Abstract
Robustness in biological networks can be regarded as an important feature of living systems. A system maintains its functions against internal and external perturbations, leading to topological changes in the network with varying delays. To understand the flexibility of biological networks, we propose a novel approach to analyze time-dependent networks, based on the framework of network completion, which aims to make the minimum amount of modifications to a given network so that the resulting network is most consistent with the observed data. We have developed a novel network completion method for time-varying networks by extending our previous method for the completion of stationary networks. In particular, we introduce a double dynamic programming technique to identify change time points and required modifications. Although this extended method allows us to guarantee the optimality of the solution, this method has relatively low computational efficiency. In order to resolve this difficulty, we developed a heuristic method for speeding up the calculation of minimum least squares errors. We demonstrate the effectiveness of our proposed methods through computational experiments using synthetic data and real microarray gene expression data. The results indicate that our methods exhibit good performance in terms of completing and inferring gene association networks with time-varying structures.
Affiliation(s)
- Natsu Nakajima
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
|
25
|
Abstract
Biological interaction networks represent a powerful tool for characterizing intracellular functional relationships, such as transcriptional regulation and protein interactions. Although artificial neural networks are routinely employed for a broad range of applications across computational biology, their underlying connectionist basis has not been extensively applied to modeling biological interaction networks. In particular, the Hopfield network offers nonlinear dynamics that represent the minimization of a system energy function through temporally distinct rewiring events. Here, a scaled energy minimization model is presented to test the feasibility of deriving a composite biological interaction network from multiple constituent data sets using the Hebbian learning principle. The performance of the scaled energy minimization model is compared against the standard Hopfield model using simulated data. Several networks are also derived from real data, compared to one another, and then combined to produce an aggregate network. The utility and limitations of the proposed model are discussed, along with possible implications for a genomic learning analogy where the fundamental Hebbian postulate is rendered into its genomic equivalent: Genes that function together junction together.
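As background for the comparison described in this abstract, the sketch below illustrates only the standard Hopfield baseline with Hebbian learning, not the scaled energy minimization model the abstract proposes. The function names (`hebbian_weights`, `energy`, `recall`) and the two toy patterns are hypothetical and chosen purely for illustration.

```python
def hebbian_weights(patterns):
    """Hebbian rule: w[i][j] = (1/n) * sum over patterns of p[i] * p[j], zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, state):
    """Hopfield energy E = -1/2 * sum_ij w[i][j] * s[i] * s[j]."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(w, state, sweeps=5):
    """Asynchronous sign updates in fixed order until no unit changes."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * state[j] for j in range(n))
            if field > 0:
                new_value = 1
            elif field < 0:
                new_value = -1
            else:
                new_value = state[i]  # tie: keep the current state
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two orthogonal toy patterns; the first is corrupted in one position.
stored = [[1, 1, 1, 1, -1, -1, -1, -1],
          [1, -1, 1, -1, 1, -1, 1, -1]]
weights = hebbian_weights(stored)
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
recalled = recall(weights, noisy)
print(recalled, energy(weights, noisy), energy(weights, recalled))
```

Units that are active together in the stored patterns receive positive weights, which is the "function together, junction together" idea in the abstract's closing analogy. Each asynchronous update can only lower or preserve the energy, which is why recall settles on a stored pattern.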
|
26
|
Villaverde AF, Ross J, Banga JR. Reverse engineering cellular networks with information theoretic methods. Cells 2013; 2:306-29. [PMID: 24709703 PMCID: PMC3972682 DOI: 10.3390/cells2020306] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 02/26/2013] [Revised: 04/22/2013] [Accepted: 04/27/2013] [Indexed: 11/16/2022]
Abstract
Building mathematical models of cellular networks lies at the core of systems biology. It involves, among other tasks, the reconstruction of the structure of interactions between molecular components, which is known as network inference or reverse engineering. Information theory can help in the goal of extracting as much information as possible from the available data. A large number of methods founded on these concepts have been proposed in the literature, not only in biology journals, but in a wide range of areas. Their critical comparison is difficult due to the different focuses and the adoption of different terminologies. Here we attempt to review some of the existing information theoretic methodologies for network inference, and clarify their differences. While some of these methods have achieved notable success, many challenges remain, among which we can mention dealing with incomplete measurements, noisy data, counterintuitive behaviour emerging from nonlinear relations or feedback loops, and computational burden of dealing with large data sets.
Affiliation(s)
- John Ross
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA.
- Julio R Banga
- Bioprocess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo 36208, Spain.
|
27
|
|
28
|
Raza K, Parveen R. Soft Computing Approach for Modeling Genetic Regulatory Networks. Advances in Computing and Information Technology 2013. [DOI: 10.1007/978-3-642-31600-5_1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 02/13/2023]
|
29
|
Vlaic S, Schmidt-Heck W, Matz-Soja M, Marbach E, Linde J, Meyer-Baese A, Zellmer S, Guthke R, Gebhardt R. The extended TILAR approach: a novel tool for dynamic modeling of the transcription factor network regulating the adaption to in vitro cultivation of murine hepatocytes. BMC Syst Biol 2012. [PMID: 23190768 PMCID: PMC3573979 DOI: 10.1186/1752-0509-6-147] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 01/12/2023]
Abstract
BACKGROUND Network inference is an important tool to reveal the underlying interactions of biological systems. In the liver, a complex system of transcription factors is active to distribute signals and induce the cellular response following extracellular stimuli. Plenty of information is available about single transcription factors important for the different functions of the liver, but little is known about their causal relations to each other. RESULTS Given a DNA microarray time series dataset of murine hepatocytes cultured on collagen monolayers, we identified 22 differentially expressed genes for which the corresponding protein is known to exhibit transcription factor activity. We developed the Extended TILAR (ExTILAR) network inference algorithm based on the modeling concept of the previously published TILAR algorithm. Using ExTILAR, we inferred a transcription factor network based on gene expression data which puts these important genes into a functional context. This way, we identified a previously unknown relationship between Tgif1 and Atf3 which we validated experimentally. Besides its known role in metabolic processes, this extends the knowledge about Tgif1 in hepatocytes towards a possible influence on processes such as proliferation and cell cycle. Moreover, two positive (i.e. double negative) regulatory loops were predicted that could give rise to bistable behavior. We further evaluated the performance of ExTILAR by systematic inference of an in silico network. CONCLUSIONS We present the ExTILAR algorithm, which combines the advantages of the regression-based inference algorithm TILAR, such as the ability to process large networks at low computational cost, with the advantages of dynamic network models based on ordinary differential equations (i.e. in silico knock-down simulations). Like TILAR, ExTILAR makes use of various prior-knowledge types such as transcription factor binding site information and gene interaction knowledge to infer biologically meaningful gene regulatory networks. Therefore, ExTILAR is especially useful when a large number of genes is modeled using a small number of experimental data points.
Affiliation(s)
- Sebastian Vlaic
- Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, D-07745 Jena, Germany.
|
30
|
Asadi B, Maurya M, Subramaniam S, Tartakovsky D. Comparison of statistical and optimisation-based methods for data-driven network reconstruction of biochemical systems. IET Syst Biol 2012; 6:155-63. [DOI: 10.1049/iet-syb.2011.0052] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]
|
31
|
Wang XD, Qi YX, Jiang ZL. Reconstruction of transcriptional network from microarray data using combined mutual information and network-assisted regression. IET Syst Biol 2011; 5:95-102. [PMID: 21405197 DOI: 10.1049/iet-syb.2010.0041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022]
Abstract
Many methods have been developed for inferring transcriptional networks from gene expression data. However, it is still necessary to design new methods that disclose more detailed and exact network information. Using network-assisted regression, the authors combined the averaged three-way mutual information (AMI3) and a non-linear ordinary differential equation (ODE) model to infer the transcriptional network, and to obtain both the topological structure and the regulatory dynamics. Synthetic and experimental data were used to evaluate the performance of the above approach. In comparison with the previous methods based on mutual information, AMI3 obtained higher precision with the same sensitivity. To describe the regulatory dynamics between transcription factors and target genes, network-assisted regression and regression without network, respectively, were applied to the steady-state and time series microarray data. The results revealed that, compared with regression without network, network-assisted regression increased the precision but decreased the goodness of fit. Then, the authors reconstructed the transcriptional network of Escherichia coli and simulated the regulatory dynamics of genes. Furthermore, the authors' approach identified potential transcription factors regulating the yeast cell cycle. In conclusion, network-assisted regression, combining AMI3 and the ODE model, was a more precise way to infer the topological structure and the regulatory dynamics of transcriptional networks from microarray data. [Includes supplementary material].
Affiliation(s)
- X-D Wang
- Shanghai Jiao Tong University, Institute of Mechanobiology and Medical Engineering, Shanghai, People's Republic of China
|
32
|
Ao S, Palade V. Ensemble of Elman neural networks and support vector machines for reverse engineering of gene regulatory networks. Appl Soft Comput 2011. [DOI: 10.1016/j.asoc.2010.05.014] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/01/2022]
|
33
|
Visconti A, Esposito R, Cordero F. Tackling the DREAM Challenge for Gene Regulatory Networks Reverse Engineering. AI*IA 2011: Artificial Intelligence Around Man and Beyond 2011. [DOI: 10.1007/978-3-642-23954-0_34] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/17/2023]
|
34
|
Abstract
Gene regulatory network models are a major area of study in systems and computational biology, and the construction of network models is among the most important problems in these disciplines. The critical epistemological issue concerns validation. Validity can be approached from two different perspectives: (i) given a hypothesized network model, its scientific validity relates to the ability to make predictions from the model that can be checked against experimental observations; and (ii) the validity of a network inference procedure must be evaluated relative to its ability to infer a network from sample points generated by the network. This article examines both perspectives in the framework of a distance function between two networks. It considers some of the obstacles to validation and provides examples of both validation paradigms.
Affiliation(s)
- Edward R Dougherty
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA.
|
35
|
Hecker M, Goertsches RH, Fatum C, Koczan D, Thiesen HJ, Guthke R, Zettl UK. Network analysis of transcriptional regulation in response to intramuscular interferon-β-1a multiple sclerosis treatment. Pharmacogenomics J 2010; 12:134-46. [PMID: 20956993 DOI: 10.1038/tpj.2010.77] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/09/2022]
Abstract
Interferon-β (IFN-β) is one of the major drugs for multiple sclerosis (MS) treatment. The purpose of this study was to characterize the transcriptional effects induced by intramuscular IFN-β-1a therapy in patients with relapsing-remitting form of MS. By using Affymetrix DNA microarrays, we obtained genome-wide expression profiles of peripheral blood mononuclear cells of 24 MS patients within the first 4 weeks of IFN-β administration. We identified 121 genes that were significantly up- or downregulated compared with baseline, with stronger changed expression at 1 week after start of therapy. Eleven transcription factor-binding sites (TFBS) are overrepresented in the regulatory regions of these genes, including those of IFN regulatory factors and NF-κB. We then applied TFBS-integrating least angle regression, a novel integrative algorithm for deriving gene regulatory networks from gene expression data and TFBS information, to reconstruct the underlying network of molecular interactions. An NF-κB-centered sub-network of genes was highly expressed in patients with IFN-β-related side effects. Expression alterations were confirmed by real-time PCR and literature mining was applied to evaluate network inference accuracy.
Affiliation(s)
- M Hecker
- Leibniz Institute for Natural Product Research and Infection Biology-Hans-Knoell-Institute, Jena, Germany.
|
36
|
Swain MT, Mandel JJ, Dubitzky W. Comparative study of three commonly used continuous deterministic methods for modeling gene regulation networks. BMC Bioinformatics 2010; 11:459. [PMID: 20840745 PMCID: PMC2949891 DOI: 10.1186/1471-2105-11-459] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 04/21/2010] [Accepted: 09/14/2010] [Indexed: 11/13/2022]
Abstract
BACKGROUND A gene-regulatory network (GRN) refers to DNA segments that interact through their RNA and protein products and thereby govern the rates at which genes are transcribed. Creating accurate dynamic models of GRNs is gaining importance in biomedical research and development. To improve our understanding of continuous deterministic modeling methods employed to construct dynamic GRN models, we have carried out a comprehensive comparative study of three commonly used systems of ordinary differential equations: the S-system (SS), artificial neural networks (ANNs), and the general rate law of transcription (GRLOT) method. These were thoroughly evaluated in terms of their ability to replicate the reference models' regulatory structure and dynamic gene expression behavior under varying conditions. RESULTS While the ANN and GRLOT methods appeared to produce robust models even when the model parameters deviated considerably from those of the reference models, SS-based models exhibited a notable loss of performance even when the parameters of the reverse-engineered models corresponded closely to those of the reference models; this is due to the high number of power terms in the SS method and the manner in which they are combined. In cross-method reverse-engineering experiments the different characteristics, biases and idiosyncrasies of the methods were revealed. Based on limited training data, with only one experimental condition, all methods produced dynamic models that were able to reproduce the training data accurately. However, an accurate reproduction of regulatory network features was only possible with training data originating from multiple experiments under varying conditions. CONCLUSIONS The studied GRN modeling methods produced dynamic GRN models exhibiting marked differences in their ability to replicate the reference models' structure and behavior. Our results suggest that care should be taken when a method is chosen for a particular application. In particular, reliance on only a single method might unduly bias the results.
Collapse
Affiliation(s)
- Martin T Swain
- University of Ulster, School of Biomedical Sciences, Cromore Road, Coleraine BT52 1SA, Co. Londonderry, UK
- Werner Dubitzky
- University of Ulster, School of Biomedical Sciences, Cromore Road, Coleraine BT52 1SA, Co. Londonderry, UK
|
37
|
de Jong H, Ranquet C, Ropers D, Pinel C, Geiselmann J. Experimental and computational validation of models of fluorescent and luminescent reporter genes in bacteria. BMC Syst Biol 2010; 4:55. [PMID: 20429918 PMCID: PMC2877006 DOI: 10.1186/1752-0509-4-55] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Received: 10/27/2009] [Accepted: 04/29/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Fluorescent and luminescent reporter genes have become popular tools for the real-time monitoring of gene expression in living cells. However, mathematical models are necessary for extracting biologically meaningful quantities from the primary data. RESULTS We present a rigorous method for deriving relative protein synthesis rates (mRNA concentrations) and protein concentrations by means of kinetic models of gene expression. We experimentally and computationally validate this approach in the case of the protein Fis, a global regulator of transcription in Escherichia coli. We show that the mRNA and protein concentration profiles predicted from the models agree quite well with direct measurements obtained by Northern and Western blots, respectively. Moreover, we present computational procedures for taking into account systematic biases like the folding time of the fluorescent reporter protein and differences in the half-lives of reporter and host gene products. The results show that large differences in protein half-lives, more than mRNA half-lives, may be critical for the interpretation of reporter gene data in the analysis of the dynamics of regulatory systems. CONCLUSIONS The paper contributes to the development of sound methods for the interpretation of reporter gene data, notably in the context of the reconstruction and validation of models of regulatory networks. The results have wide applicability for the analysis of gene expression in bacteria and may be extended to higher organisms.
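The abstract does not give the model equations, so the following is only a sketch under an assumed first-order kinetic model, dp/dt = s(t) - gamma * p(t), in which the relative synthesis rate is recovered as s(t) = dp/dt + gamma * p(t). The function name, the degradation constant and the synthetic data are all illustrative assumptions, and reporter folding time is ignored.

```python
import math

def synthesis_rate(times, reporter, gamma):
    """Estimate s(t) = dp/dt + gamma * p(t) from a reporter time series,
    using central finite differences for the derivative (interior points only)."""
    rates = []
    for k in range(1, len(times) - 1):
        dp_dt = (reporter[k + 1] - reporter[k - 1]) / (times[k + 1] - times[k - 1])
        rates.append(dp_dt + gamma * reporter[k])
    return rates

# Synthetic check: a reporter produced at constant rate s and degraded at rate gamma.
gamma, s = 0.5, 3.0
times = [0.1 * k for k in range(101)]
reporter = [(s / gamma) * (1.0 - math.exp(-gamma * t)) for t in times]
estimated = synthesis_rate(times, reporter, gamma)
```

Because the degradation term enters the estimate directly, a wrong value of gamma biases the inferred rate, which is consistent with the abstract's point that differences in protein half-lives matter for interpreting reporter data.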
Affiliation(s)
- Hidde de Jong
- INRIA Grenoble - Rhône-Alpes, 655 Av. de l'Europe, Montbonnot, 38334 St Ismier Cedex, France
- Caroline Ranquet
- Institut Jean Roget, LAPM, UMR5163, Campus Santé, Université Joseph Fourier, Domaine de la Merci, 38700 La Tronche, France
- INRIA Grenoble - Rhône-Alpes, 655 Av. de l'Europe, Montbonnot, 38334 St Ismier Cedex, France
- Delphine Ropers
- INRIA Grenoble - Rhône-Alpes, 655 Av. de l'Europe, Montbonnot, 38334 St Ismier Cedex, France
- Corinne Pinel
- Institut Jean Roget, LAPM, UMR5163, Campus Santé, Université Joseph Fourier, Domaine de la Merci, 38700 La Tronche, France
- INRIA Grenoble - Rhône-Alpes, 655 Av. de l'Europe, Montbonnot, 38334 St Ismier Cedex, France
- Johannes Geiselmann
- Institut Jean Roget, LAPM, UMR5163, Campus Santé, Université Joseph Fourier, Domaine de la Merci, 38700 La Tronche, France
- INRIA Grenoble - Rhône-Alpes, 655 Av. de l'Europe, Montbonnot, 38334 St Ismier Cedex, France
|
38
|
Gustafsson M, Hörnquist M. Gene expression prediction by soft integration and the elastic net-best performance of the DREAM3 gene expression challenge. PLoS One 2010; 5:e9134. [PMID: 20169069 PMCID: PMC2821917 DOI: 10.1371/journal.pone.0009134] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 08/11/2009] [Accepted: 10/06/2009] [Indexed: 12/02/2022] Open
Abstract
Background Predicting gene expression is an important endeavour within computational systems biology. It can both be a way to explore how drugs affect the system and provide a framework for finding which genes are interrelated in a certain process. A practical problem, however, is how to assess and discriminate among the various algorithms which have been developed for this purpose. Therefore, in 2008 the DREAM project issued a challenge for predicting gene expression values, and here we present the algorithm with the best performance. Methodology/Principal Findings We develop an algorithm by exploring various regression schemes with different model selection procedures. It turns out that the most effective scheme is based on least squares, with a penalty term of a recently developed form called the “elastic net”. Key components in the algorithm are the integration of expression data from experimental conditions other than those presented for the challenge and the utilization of transcription factor binding data for guiding the inference process towards known interactions. Also of importance is a cross-validation procedure in which each form of external data is used only to the extent that it increases the expected performance. Conclusions/Significance Our algorithm demonstrates both the possibility of extracting information from large-scale expression data concerning prediction of gene levels and the benefits of integrating different data sources for improving the inference. We believe the former is an important message to those still hesitant about the possibilities of computational approaches, while the latter is part of an important way forward for the future development of the field of computational systems biology.
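To make the "least squares with an elastic net penalty" idea concrete, here is a minimal sketch that minimises (1/2n)||y - Xw||^2 + lam * (l1_ratio * ||w||_1 + (1 - l1_ratio)/2 * ||w||_2^2). It is not the authors' algorithm: the cyclic coordinate-descent solver, the parameter names, and the toy data are assumptions, and the integration of external data and the cross-validation step described in the abstract are not shown.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the L1 proximal operator)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net(X, y, lam=0.1, l1_ratio=0.5, sweeps=200):
    """Elastic-net regression without intercept, fitted by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam * l1_ratio) / (z + lam * (1.0 - l1_ratio))
    return w

# Hypothetical toy data: the target gene depends on regulator 0 only.
X = [[1.0, 0.2], [2.0, -0.1], [3.0, 0.3], [4.0, -0.2], [5.0, 0.1]]
y = [2.0 * row[0] for row in X]
weights = elastic_net(X, y)
```

On this toy input the L1 part of the penalty sets the weight of the irrelevant regulator to exactly zero, while the weight of the true regulator is shrunk slightly below 2.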
Affiliation(s)
- Mika Gustafsson
- Department of Science and Technology, Linköping University, Norrköping, Sweden
- Michael Hörnquist
- Department of Science and Technology, Linköping University, Norrköping, Sweden
|
39
|
Gupta S, Maurya MR, Subramaniam S. Identification of crosstalk between phosphoprotein signaling pathways in RAW 264.7 macrophage cells. PLoS Comput Biol 2010; 6:e1000654. [PMID: 20126526 PMCID: PMC2813256 DOI: 10.1371/journal.pcbi.1000654] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 07/06/2009] [Accepted: 12/21/2009] [Indexed: 11/25/2022] Open
Abstract
Signaling pathways mediate the effect of external stimuli on gene expression in cells. The signaling proteins in these pathways interact with each other and their phosphorylation levels often serve as indicators for the activity of signaling pathways. Several signaling pathways have been identified in mammalian cells but the crosstalk between them is not well understood. The Alliance for Cellular Signaling (AfCS) has measured time-course data in RAW 264.7 macrophage cells on important phosphoproteins, such as the mitogen-activated protein kinases (MAPKs) and signal transducers and activators of transcription (STATs), in single- and double-ligand stimulation experiments for 22 ligands. In the present work, we have used a data-driven approach to analyze the AfCS data to decipher the interactions and crosstalk between signaling pathways in stimulated macrophage cells. We have used dynamic mapping to develop a predictive model using a partial least squares approach. Significant interactions were selected through statistical hypothesis testing and were used to reconstruct the phosphoprotein signaling network. The proposed data-driven approach is able to identify most of the known signaling interactions, such as protein kinase B (Akt) --> glycogen synthase kinase 3alpha/beta (GSKalpha/beta), and predicts potential novel interactions such as P38 --> RSK and GSK --> ezrin/radixin/moesin. We have also shown that the model has good predictive power for extrapolation. Our novel approach captures the temporal causality and directionality in intracellular signaling pathways. Further, case-specific analysis of the phosphoproteins in the network has led us to propose a hypothesis about inhibition (phosphorylation) of GSKalpha/beta via P38.
Affiliation(s)
- Shakti Gupta
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
- Mano Ram Maurya
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
- Shankar Subramaniam
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
- Department of Chemistry, University of California, San Diego, La Jolla, California, United States of America
- Department of Biochemistry, University of California, San Diego, La Jolla, California, United States of America
- Cellular & Molecular Medicine, University of California, San Diego, La Jolla, California, United States of America
- Graduate Program in Bioinformatics, University of California, San Diego, La Jolla, California, United States of America
|
40
|
Knabe JF, Wegner K, Nehaniv CL, Schilstra MJ. Genetic algorithms and their application to in silico evolution of genetic regulatory networks. Methods Mol Biol 2010; 673:297-321. [PMID: 20835807 DOI: 10.1007/978-1-60761-842-3_19] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
A genetic algorithm (GA) is a procedure that mimics processes occurring in Darwinian evolution to solve computational problems. A GA introduces variation through "mutation" and "recombination" in a "population" of possible solutions to a problem, encoded as strings of characters in "genomes," and allows this population to evolve, using selection procedures that favor the gradual enrichment of the gene pool with the genomes of the "fitter" individuals. GAs are particularly suitable for optimization problems in which an effective system design or set of parameter values is sought. In nature, genetic regulatory networks (GRNs) form the basic control layer in the regulation of gene expression levels. GRNs are composed of regulatory interactions between genes and their gene products, and are, inter alia, at the basis of the development of single fertilized cells into fully grown organisms. This paper describes how GAs may be applied to find functional regulatory schemes and parameter values for models that capture the fundamental GRN characteristics. The central ideas behind evolutionary computation and GRN modeling, and the considerations in GA design and use are discussed, and illustrated with an extended example. In this example, a GRN-like controller is sought for a developmental system based on Lewis Wolpert's French flag model for positional specification, in which cells in a growing embryo secrete and detect morphogens to attain a specific spatial pattern of cellular differentiation.
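The loop of mutation, recombination and selection described in this abstract can be sketched as follows. This is a generic illustration, not the chapter's French flag example: the bit-list genome encoding, tournament selection, elitism, all parameter values and the toy "OneMax" fitness function are assumptions chosen for brevity.

```python
import random

def evolve(fitness, genome_length, pop_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Evolve bit-list genomes; returns the best final genome and the
    best fitness recorded at the start of each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = [ranked[0][:]]  # elitism: carry the fittest genome over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of three random individuals becomes a parent.
            mother = max(rng.sample(population, 3), key=fitness)
            father = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, genome_length)  # one-point recombination
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), history

# Toy objective ("OneMax"): fitness is the number of 1-bits in the genome.
best, history = evolve(fitness=sum, genome_length=20)
```

For a GRN application, the genome would instead encode regulatory links and parameter values, and the fitness function would score how well the simulated network reproduces the target behavior.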
|
41
|
Mazur J, Ritter D, Reinelt G, Kaderali L. Reconstructing nonlinear dynamic models of gene regulation using stochastic sampling. BMC Bioinformatics 2009; 10:448. [PMID: 20038296 PMCID: PMC2811124 DOI: 10.1186/1471-2105-10-448] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 03/24/2009] [Accepted: 12/28/2009] [Indexed: 12/01/2022] Open
Abstract
Background The reconstruction of gene regulatory networks from time series gene expression data is one of the most difficult problems in systems biology. This is due to several reasons, among them the combinatorial explosion of possible network topologies, limited information content of the experimental data with high levels of noise, and the complexity of gene regulation at the transcriptional, translational and post-translational levels. At the same time, quantitative, dynamic models, ideally with probability distributions over model topologies and parameters, are highly desirable. Results We present a novel approach to infer such models from data, based on nonlinear differential equations, which we embed into a stochastic Bayesian framework. We thus address both the stochasticity of experimental data and the need for quantitative dynamic models. Furthermore, the Bayesian framework makes it easy to integrate prior knowledge into the inference process. Using stochastic sampling from the Bayesian posterior distribution, our approach can infer different likely network topologies and model parameters along with their respective probabilities from given data. We evaluate our approach on simulated data and the challenge #3 data from the DREAM 2 initiative. On the simulated data, we study effects of different levels of noise and dataset sizes. Results on real data show that the dynamics and main regulatory interactions are correctly reconstructed. Conclusions Our approach combines dynamic modeling using differential equations with a stochastic learning framework, thus bridging the gap between biophysical modeling and stochastic inference approaches. Results show that the method can reap the advantages of both worlds, and allows the reconstruction of biophysically accurate dynamic models from noisy data. In addition, the stochastic learning framework used permits the computation of probability distributions over models and model parameters, which holds interesting prospects for experimental design purposes.
Affiliation(s)
- Johanna Mazur
- Viroquant Research Group Modeling, University of Heidelberg, Bioquant BQ26, INF 267, D-69120 Heidelberg, Germany.
|
42
|
Saez-Rodriguez J, Alexopoulos LG, Epperlein J, Samaga R, Lauffenburger DA, Klamt S, Sorger PK. Discrete logic modelling as a means to link protein signalling networks with functional analysis of mammalian signal transduction. Mol Syst Biol 2009; 5:331. [PMID: 19953085 DOI: 10.1038/msb.2009.87] [Citation(s) in RCA: 278] [Impact Index Per Article: 18.5] [Received: 03/19/2009] [Accepted: 10/28/2009] [Indexed: 01/13/2023] Open
Abstract
Large-scale protein signalling networks are useful for exploring complex biochemical pathways but do not reveal how pathways respond to specific stimuli. Such specificity is critical for understanding disease and designing drugs. Here we describe a computational approach—implemented in the free CNO software—for turning signalling networks into logical models and calibrating the models against experimental data. When a literature-derived network of 82 proteins covering the immediate-early responses of human cells to seven cytokines was modelled, we found that training against experimental data dramatically increased predictive power, despite the crudeness of Boolean approximations, while significantly reducing the number of interactions. Thus, many interactions in literature-derived networks do not appear to be functional in the liver cells from which we collected our data. At the same time, CNO identified several new interactions that improved the match of model to data. Although missing from the starting network, these interactions have literature support. Our approach, therefore, represents a means to generate predictive, cell-type-specific models of mammalian signalling from generic protein signalling networks.
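As a rough illustration of what a Boolean logic model of a signalling network looks like, the sketch below evaluates a tiny rule set to its steady state under two conditions. It does not use or reproduce the CNO software: the node names, the rules, and the synchronous update scheme are hypothetical, and the calibration of the model against experimental data described in the abstract is not shown.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: rule(state) for node, rule in rules.items()}

def simulate(state, rules, max_steps=20):
    """Iterate until a fixed point is reached or max_steps is exhausted."""
    for _ in range(max_steps):
        nxt = step(state, rules)
        if nxt == state:
            break
        state = nxt
    return state

# Hypothetical pathway: ligand activates kinase unless an inhibitor is present;
# kinase in turn activates a transcription factor (tf).
rules = {
    "ligand": lambda s: s["ligand"],        # inputs hold their value
    "inhibitor": lambda s: s["inhibitor"],
    "kinase": lambda s: s["ligand"] and not s["inhibitor"],
    "tf": lambda s: s["kinase"],
}
start = {"ligand": True, "inhibitor": False, "kinase": False, "tf": False}
stimulated = simulate(start, rules)
blocked = simulate({**start, "inhibitor": True}, rules)
```

Training such a model against data then amounts to searching over which rules to keep so that the predicted on/off states match the measured readouts for each stimulus and inhibitor combination.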
|
43
|
Tanaka H, Yi TM. Reverse engineering a signaling network using alternative inputs. PLoS One 2009; 4:e7622. [PMID: 19898612 PMCID: PMC2764141 DOI: 10.1371/journal.pone.0007622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2009] [Accepted: 10/06/2009] [Indexed: 11/19/2022] Open
Abstract
One of the goals of systems biology is to reverse engineer in a comprehensive fashion the arrow diagrams of signal transduction systems. An important tool for ordering pathway components is genetic epistasis analysis, and here we present a strategy termed Alternative Inputs (AIs) to perform systematic epistasis analysis. An alternative input is defined as any genetic manipulation that can activate the signaling pathway instead of the natural input. We introduced the concept of an "AIs-Deletions matrix" that summarizes the outputs of all combinations of alternative inputs and deletions. We developed the theory and algorithms to construct a pairwise relationship graph from the AIs-Deletions matrix capturing both functional ordering (upstream, downstream) and logical relationships (AND, OR), and then to interpret these relationships as a standard arrow diagram. As a proof-of-principle, we applied this methodology to a subset of genes involved in yeast mating signaling. This experimental pilot study highlights the robustness of the approach and important technical challenges. In summary, this research formalizes and extends classical epistasis analysis from linear pathways to more complex networks, facilitating computational analysis and reconstruction of signaling arrow diagrams.
Affiliation(s)
- Hiromasa Tanaka
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, California, United States of America
- Center for Complex Biological Systems, University of California Irvine, Irvine, California, United States of America
- Tau-Mu Yi
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, California, United States of America
- Center for Complex Biological Systems, University of California Irvine, Irvine, California, United States of America
|
44
|
Sima C, Hua J, Jung S. Inference of gene regulatory networks using time-series data: a survey. Curr Genomics 2009; 10:416-29. [PMID: 20190956 PMCID: PMC2766792 DOI: 10.2174/138920209789177610] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 12/04/2008] [Revised: 02/28/2009] [Accepted: 03/02/2009] [Indexed: 11/22/2022] Open
Abstract
The advent of high-throughput technology like microarrays has provided the platform for studying how different cellular components work together, and has thus created enormous interest in mathematically modeling biological networks, particularly gene regulatory networks (GRNs). Of particular interest is modeling and inference on time-series data, which capture a more thorough picture of the system than non-temporal data do. We have given an extensive review of methodologies that have been used on time-series data. Recognizing that validation is an inseparable part of the inference paradigm, we have also presented a discussion on the principles and challenges in performance evaluation of different methods. This survey gives a panoramic view of these topics, in the anticipation that readers will be inspired to improve and/or expand the repository of GRN inference and validation tools.
Affiliation(s)
- Chao Sima
- Address correspondence to this author at the Computational Biology Division, Translational Genomics Research Institute, Phoenix, AZ 85004, USA; Tel: 1(602)343-8485; Fax: 1(602)343-8740
|
45
|
Hecker M, Goertsches RH, Engelmann R, Thiesen HJ, Guthke R. Integrative modeling of transcriptional regulation in response to antirheumatic therapy. BMC Bioinformatics 2009; 10:262. [PMID: 19703281 PMCID: PMC2757030 DOI: 10.1186/1471-2105-10-262] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 02/10/2009] [Accepted: 08/24/2009] [Indexed: 12/29/2022] Open
Abstract
Background The investigation of gene regulatory networks is an important issue in molecular systems biology and significant progress has been made by combining different types of biological data. The purpose of this study was to characterize the transcriptional program induced by etanercept therapy in patients with rheumatoid arthritis (RA). Etanercept is known to reduce disease symptoms and progression in RA, but the underlying molecular mechanisms have not been fully elucidated. Results Using a DNA microarray dataset providing genome-wide expression profiles of 19 RA patients within the first week of therapy we identified significant transcriptional changes in 83 genes. Most of these genes are known to control the human body's immune response. A novel algorithm called TILAR was then applied to construct a linear network model of the genes' regulatory interactions. The inference method derives a model from the data based on the Least Angle Regression while incorporating DNA-binding site information. As a result we obtained a scale-free network that exhibits a self-regulating and highly parallel architecture, and reflects the pleiotropic immunological role of the therapeutic target TNF-alpha. Moreover, we could show that our integrative modeling strategy performs much better than algorithms using gene expression data alone. Conclusion We present TILAR, a method to deduce gene regulatory interactions from gene expression data by integrating information on transcription factor binding sites. The inferred network uncovers gene regulatory effects in response to etanercept and thus provides useful hypotheses about the drug's mechanisms of action.
Affiliation(s)
- Michael Hecker
- Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knoell-Institute, Jena, Germany.
|
46
|
Abstract
Designing and conducting experiments are routine practices for modern biologists. The real challenge, especially in the post-genome era, usually comes not from acquiring data, but from subsequent activities such as data processing, analysis, knowledge generation and gaining insight into the research question of interest. The approach of inferring gene regulatory networks (GRNs) has been flourishing for many years, and new methods from mathematics, information science, engineering and social sciences have been applied. We review different kinds of computational methods biologists use to infer networks of varying levels of accuracy and complexity. The primary concern of biologists is how to translate the inferred network into hypotheses that can be tested with real-life experiments. Taking the biologists' viewpoint, we scrutinized several methods for predicting GRNs in mammalian cells, and more importantly show how the power of different knowledge databases of different types can be used to identify modules and subnetworks, thereby reducing complexity and facilitating the generation of testable hypotheses.
Affiliation(s)
- Wei-Po Lee
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan.
|
47
|
He F, Balling R, Zeng AP. Reverse engineering and verification of gene networks: principles, assumptions, and limitations of present methods and future perspectives. J Biotechnol 2009; 144:190-203. [PMID: 19631244 DOI: 10.1016/j.jbiotec.2009.07.013] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 03/30/2009] [Revised: 07/13/2009] [Accepted: 07/16/2009] [Indexed: 12/21/2022]
Abstract
Reverse engineering of gene networks aims at revealing the structure of the gene regulation network in a biological system by reasoning backward directly from experimental data. Many methods have recently been proposed for reverse engineering of gene networks by using gene transcript expression data measured by microarray. Whereas the potential of the methods has been well demonstrated, the assumptions and limitations behind them are often not clearly stated or not well understood. In this review, we first briefly explain the principles of the major methods, identify the assumptions behind them and pinpoint the limitations and possible pitfalls in applying them to real biological questions. With regard to applications, we then discuss challenges in the experimental verification of gene networks generated from reverse engineering methods. We further propose an optimal experimental design for allocating the sampling schedule and possible strategies for reducing the limitations of some of the current reverse engineering methods. Finally, we examine the perspectives for the development of reverse engineering and stress the need to move from revealing network structure to the dynamics of biological systems.
Affiliation(s)
- Feng He
- Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany
|
48
|
Weinreb GE, Kapustina MT, Jacobson K, Elston TC. In silico generation of alternative hypotheses using causal mapping (CMAP). PLoS One 2009; 4:e5378. [PMID: 19401774 PMCID: PMC2671158 DOI: 10.1371/journal.pone.0005378] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/23/2008] [Accepted: 03/29/2009] [Indexed: 11/18/2022] Open
Abstract
Previously, we introduced causal mapping (CMAP) as an easy-to-use systems biology tool for studying the behavior of biological processes that occur at the cellular and molecular level. CMAP is a coarse-grained graphical modeling approach in which the system of interest is modeled as an interaction map between functional elements of the system, in a manner similar to portrayals of signaling pathways commonly used by molecular cell biologists. CMAP describes details of the interactions while maintaining the simplicity of other qualitative methods (e.g., Boolean networks). In this paper, we use the CMAP methodology as a tool for generating hypotheses about the mechanisms that regulate molecular and cellular systems. Furthermore, our approach allows competing hypotheses to be ranked according to a fitness index and suggests experimental tests to distinguish competing high-fitness hypotheses. To motivate CMAP as a hypothesis-generating tool and demonstrate the methodology, we first apply this protocol to a simple test case of a three-element signaling module. Our methods are next applied to the more complex phenomenon of cortical oscillations observed in spreading cells. This analysis produces two high-fitness hypotheses for the mechanism that underlies this dynamic behavior and suggests experiments to distinguish the hypotheses. The method can be widely applied to other cellular systems to generate and compare alternative hypotheses based on experimentally observed data and using computer simulations.
Affiliation(s)
- Gabriel E Weinreb
- Department of Cell and Developmental Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America.
|
49
|
Lagomarsino MC, Bassetti B, Castellani G, Remondini D. Functional models for large-scale gene regulation networks: realism and fiction. Mol Biosyst 2009; 5:335-44. [PMID: 19396369 DOI: 10.1039/b816841p] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
Abstract
High-throughput experiments are shedding light on the topology of large regulatory networks and at the same time their functional states, namely the states of activation of the nodes (for example transcript or protein levels) in different conditions, times, environments. We now possess a certain amount of information about these two levels of description, stored in libraries, databases and ontologies. A current challenge is to bridge the gap between topology and function, i.e. developing quantitative models aimed at characterizing the expression patterns of large sets of genes. However, approaches that work well for small networks become impossible to master at large scales, mainly because parameters proliferate. In this review we discuss the state of the art of large-scale functional network models, addressing the issue of what can be considered as "realistic" and what the main limitations may be. We also show some directions for future work, trying to set the goals that future models should try to achieve. Finally, we will emphasize the possible benefits in the understanding of biological mechanisms underlying complex multifactorial diseases, and in the development of novel strategies for the description and the treatment of such pathologies.
|
50
|
Porreca R, Drulhe S, de Jong H, Ferrari-Trecate G. Structural identification of piecewise-linear models of genetic regulatory networks. J Comput Biol 2009; 15:1365-80. [PMID: 19040369 DOI: 10.1089/cmb.2008.0109] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022] Open
Abstract
We present a method for the structural identification of genetic regulatory networks (GRNs), based on the use of a class of Piecewise-Linear (PL) models. These models consist of a set of decoupled linear models describing the different modes of operation of the GRN and discrete switches between the modes accounting for the nonlinear character of gene regulation. They thus form a compromise between the mathematical simplicity of linear models and the biological expressiveness of nonlinear models. The input of the PL identification method consists of time-series measurements of concentrations of gene products. As output it produces estimates of the modes of operation of the GRN, as well as all possible minimal combinations of threshold concentrations of the gene products accounting for switches between the modes of operation. The applicability of the PL identification method has been evaluated using simulated data obtained from a model of the carbon starvation response in the bacterium Escherichia coli. This has allowed us to systematically test the performance of the method under different data characteristics, notably variations in the noise level and the sampling density.
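To show what "linear modes joined by discrete switches" means in practice, the sketch below simulates a small piecewise-linear model forward in time. It illustrates the model class only, not the identification method of the paper (which solves the inverse problem from data): the two-gene system, the step-function form, every parameter value and the Euler integrator are assumptions.

```python
def step_plus(x, theta):
    """Step function s+(x, theta): 1 above the threshold, 0 otherwise."""
    return 1.0 if x > theta else 0.0

def simulate_pl(a0, b0, kappa_b=2.0, gamma_a=0.5, gamma_b=1.0, theta_a=1.0,
                dt=0.001, t_end=10.0):
    """Regulator a decays linearly; target b is synthesised at rate kappa_b only
    while a exceeds theta_a. Returns final concentrations and the mode sequence."""
    a, b = a0, b0
    modes = []
    for _ in range(int(round(t_end / dt))):
        mode = "on" if a > theta_a else "off"
        if not modes or modes[-1] != mode:
            modes.append(mode)  # record each switch between modes of operation
        da = -gamma_a * a
        db = kappa_b * step_plus(a, theta_a) - gamma_b * b
        a, b = a + dt * da, b + dt * db
    return a, b, modes

a_final, b_final, modes = simulate_pl(a0=4.0, b0=0.0)
```

Within each mode the equations are linear, so the identification task described in the abstract reduces to finding which mode each time-series segment belongs to and which threshold crossings explain the switches.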
Affiliation(s)
- Riccardo Porreca
- Dipartimento di Informatica e Sistemistica, Università degli Studi di Pavia, Pavia, Italy
|