451
Farina L, De Santis A, Salvucci S, Morelli G, Ruberti I. Embedding mRNA stability in correlation analysis of time-series gene expression data. PLoS Comput Biol 2008; 4:e1000141. [PMID: 18670596 PMCID: PMC2453326 DOI: 10.1371/journal.pcbi.1000141] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 03/14/2008] [Accepted: 06/24/2008] [Indexed: 12/23/2022] Open
Abstract
Current methods for identifying putatively co-regulated genes directly from gene expression time profiles are based on the similarity of the time profiles. Such association metrics, despite their central role in gene network inference and machine learning, have largely ignored the impact of dynamics or variation in mRNA stability. Here we introduce a simple but powerful new similarity metric, called lead-lag R², that accounts for the properties of gene dynamics, including varying mRNA degradation and delays. Using yeast cell-cycle time-series gene expression data, we demonstrate that the predictive power of lead-lag R² for the identification of co-regulated genes is significantly higher than that of standard similarity measures, thus allowing the selection of a large number of entirely new putatively co-regulated genes. Furthermore, the lead-lag metric can also be used to uncover the relationship between gene expression time-series and the dynamics of formation of multiple protein complexes. Remarkably, we found a high lead-lag R² value among genes coding for a transient complex.
Microarrays provide snapshots of the transcriptional state of the cell at some point in time. Multiple snapshots can be taken sequentially in time, thus providing insight into the dynamics of change. Since genome-wide expression data report on the abundance of mRNA, not on the underlying activity of genes, we developed a novel method to relate the expression patterns of genes detected in a time-series experiment, using a similarity measure, called lead-lag R², that incorporates mRNA decay. We used the lead-lag R² similarity measure to predict the presence of common transcription factors between gene pairs using an integrated dataset consisting of 13 yeast cell-cycles. The method was benchmarked against six well-established similarity measures and obtained the best true positive rate, around 95%. We believe that lead-lag analysis can also be used to predict the presence of a common mechanism able to modulate the degradation rate of specific transcripts. Finally, we envisage extending our analysis to different experimental conditions and organisms, thus providing a simple off-the-shelf computational tool to support the understanding of the transcriptional and post-transcriptional regulation layer and its role in many diseases, such as cancer.
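The abstract does not give the formula for lead-lag R², so the following is only a sketch of how a decay-aware similarity of this kind could be computed. It assumes (our assumption, not a statement from the abstract) that each transcript follows first-order kinetics dx/dt = -k·x + u(t) with a shared transcriptional input u(t); integrating that equation makes one profile a linear combination of the other profile and the running integrals of both, so the R² of that regression can be compared with the plain correlation R². All function names (`cumulative_integral`, `r_squared`, `lead_lag_r2`, `simulate`) and the simulated data are hypothetical.

```python
import math

def cumulative_integral(x, dt):
    """Running left-Riemann integral of x, starting at 0."""
    total, out = 0.0, []
    for value in x:
        out.append(total)
        total += value * dt
    return out

def solve_linear(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta

def r_squared(y, columns):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    cols = columns + [[1.0] * len(y)]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve_linear(gram, rhs)
    fitted = [sum(b * c[t] for b, c in zip(beta, cols)) for t in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot

def lead_lag_r2(x1, x2, dt):
    """Assumed form: regress x1 on x2 and on the running integrals of x1 and x2."""
    return r_squared(x1, [x2, cumulative_integral(x1, dt), cumulative_integral(x2, dt)])

def simulate(k, u, dt):
    """Euler simulation of dx/dt = -k*x + u(t), starting from x = 0."""
    x, out = 0.0, []
    for ut in u:
        out.append(x)
        x += dt * (-k * x + ut)
    return out

# Hypothetical data: two transcripts share one input but decay at different rates.
dt = 0.1
u = [1.0 + math.sin(i * dt) for i in range(200)]
stable = simulate(0.2, u, dt)    # slow decay (stable mRNA)
unstable = simulate(2.0, u, dt)  # fast decay (unstable mRNA)

plain_r2 = r_squared(stable, [unstable])       # squared Pearson correlation
decay_aware_r2 = lead_lag_r2(stable, unstable, dt)
print(plain_r2, decay_aware_r2)
```

In this toy setting the two profiles differ only through their decay rates, which the plain correlation penalizes and the integral-augmented regression absorbs. The sketch breaks down if the two profiles are identical, because the regressors then become collinear.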
Affiliation(s)
- Lorenzo Farina
- Dipartimento di Informatica e Sistemistica Antonio Ruberti, Sapienza Università di Roma, Rome, Italy.
452
Abstract
In recent years, a number of technical and experimental advances have allowed us to obtain an unprecedented amount of information about living systems on a genomic scale. Although the complete genomes of many organisms are available thanks to progress in sequencing technology, the challenge of understanding how individual genes are regulated within the cell remains. Here, I provide an overview of current computational methods to investigate transcriptional regulation. I will first discuss how representing protein-DNA interactions as a network provides us with a conceptual framework to understand the organization of regulatory interactions in an organism. I will then describe methods to predict transcription factors and cis-regulatory elements using information such as sequence, structure and evolutionary conservation. Finally, I will discuss approaches to infer genome-scale transcriptional regulatory networks, both by using experimentally characterized interactions from model organisms and by reverse-engineering regulatory interactions from gene expression data and genome-wide location data. The methods summarized here can be exploited to discover previously uncharacterized transcriptional pathways in organisms whose genome sequence is known. In addition, such a framework and approach can be invaluable for investigating transcriptional regulation in complex microbial communities such as the human gut flora or populations of emerging pathogens. Apart from these medical applications, the concepts and methods discussed can be used to understand the combinatorial logic of transcriptional regulation and can be exploited in biotechnological applications, such as synthetic biology experiments aimed at engineering regulatory circuits for various purposes.
Affiliation(s)
- M Madan Babu
- MRC Laboratory of Molecular Biology, Cambridge, UK.
453
Bezerianos A, Maraziotis IA. Computational models reconstruct gene regulatory networks. Mol Biosyst 2008; 4:993-1000. [PMID: 19082138 DOI: 10.1039/b800446n] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
The post-genomic era is flooded with data from high-throughput techniques such as cDNA microarrays. In the field of systems biology the reconstruction of gene regulatory networks from gene expression data is one of the major problems in understanding complex cell functions. Drawing conclusions from microarray data requires sophisticated computational analyses that will explore causal genetic relations. In this paper we provide a brief summary of some of the most recent and promising computational models and mathematical frameworks used to reconstruct, model and infer gene regulatory networks from data.
454
Yang L, Vondriska TM, Han Z, Maclellan WR, Weiss JN, Qu Z. Deducing topology of protein-protein interaction networks from experimentally measured sub-networks. BMC Bioinformatics 2008; 9:301. [PMID: 18598366 PMCID: PMC2474618 DOI: 10.1186/1471-2105-9-301] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/22/2008] [Accepted: 07/03/2008] [Indexed: 01/17/2023] Open
Abstract
Background: Protein-protein interaction networks are commonly sampled using yeast two-hybrid approaches. However, whether topological information reaped from these experimentally measured sub-networks can be extrapolated to complete protein-protein interaction networks is unclear.
Results: By analyzing various experimental protein-protein interaction datasets, we found that they are not random samples of the parent networks. Based on the experimental bait-prey behaviors, our computer simulations show that these non-random sampling features may affect the topological information. We tested the hypothesis that a core sub-network exists within the experimentally sampled network that better maintains the topological characteristics of the parent protein-protein interaction network. We developed a method to filter the experimentally sampled network to yield a core sub-network that more accurately reflects the topology of the parent network. These findings have fundamental implications for large-scale protein interaction studies and for our understanding of the behavior of cellular networks.
Conclusion: The topological information from experimentally measured networks, taken as is, may not be the correct source of topological information about the parent protein-protein interaction network. We define a core sub-network that more accurately reflects the topology of the parent network.
Affiliation(s)
- Ling Yang
- Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California 90095, USA.
455
Macedo C, Magalhães DA, Tonani M, Marques MC, Junta CM, Passos GAS. Genes that code for T cell signaling proteins establish transcriptional regulatory networks during thymus ontogeny. Mol Cell Biochem 2008; 318:63-71. [DOI: 10.1007/s11010-008-9857-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/15/2007] [Accepted: 06/13/2008] [Indexed: 01/20/2023]
456
Della Gatta G, Bansal M, Ambesi-Impiombato A, Antonini D, Missero C, di Bernardo D. Direct targets of the TRP63 transcription factor revealed by a combination of gene expression profiling and reverse engineering. Genome Res 2008; 18:939-48. [PMID: 18441228 PMCID: PMC2413161 DOI: 10.1101/gr.073601.107] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 10/30/2007] [Accepted: 02/14/2008] [Indexed: 01/24/2023]
Abstract
Genome-wide identification of bona-fide targets of transcription factors in mammalian cells is still a challenge. We present a novel integrated computational and experimental approach to identify direct targets of a transcription factor. This consists of measuring time-course (dynamic) gene expression profiles upon perturbation of the transcription factor under study, and in applying a novel "reverse-engineering" algorithm (TSNI) to rank genes according to their probability of being direct targets. Using primary keratinocytes as a model system, we identified novel transcriptional target genes of TRP63, a crucial regulator of skin development. TSNI-predicted TRP63 target genes were validated by Trp63 knockdown and by ChIP-chip to identify TRP63-bound regions in vivo. Our study revealed that short sampling times, in the order of minutes, are needed to capture the dynamics of gene expression in mammalian cells. We show that TRP63 transiently regulates a subset of its direct targets, thus highlighting the importance of considering temporal dynamics when identifying transcriptional targets. Using this approach, we uncovered a previously unsuspected transient regulation of the AP-1 complex by TRP63 through direct regulation of a subset of AP-1 components. The integrated experimental and computational approach described here is readily applicable to other transcription factors in mammalian systems and is complementary to genome-wide identification of transcription-factor binding sites.
Affiliation(s)
- Mukesh Bansal
- Telethon Institute of Genetics and Medicine, 80131 Naples, Italy
- Caterina Missero
- Telethon Institute of Genetics and Medicine, 80131 Naples, Italy
- Diego di Bernardo
- Telethon Institute of Genetics and Medicine, 80131 Naples, Italy
- Department of Computer and Systems Engineering, University of Naples, Federico II, 80125 Naples, Italy
457
Ellis T, Wang X, Collins JJ. Gene regulation: hacking the network on a sugar high. Mol Cell 2008; 30:1-2. [PMID: 18406319 DOI: 10.1016/j.molcel.2008.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In a recent issue of Molecular Cell, Kaplan et al. (2008) determine the input functions for 19 E. coli sugar-utilization genes by using a two-dimensional high-throughput approach. The resulting input-function map reveals that gene network regulation follows non-Boolean, and often nonmonotonic, logic.
Affiliation(s)
- Tom Ellis
- Center for BioDynamics, Boston University, 44 Cummington Street, Boston, MA 02115, USA
458
Marinazzo D, Pellicoro M, Stramaglia S. Kernel-Granger causality and the analysis of dynamical networks. Phys Rev E Stat Nonlin Soft Matter Phys 2008; 77:056215. [PMID: 18643150 DOI: 10.1103/physreve.77.056215] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Received: 03/20/2008] [Indexed: 05/26/2023]
Abstract
We propose a method of analysis of dynamical networks based on a recent measure of Granger causality between time series, based on kernel methods. The generalization of kernel-Granger causality to the multivariate case, here presented, shares the following features with the bivariate measures: (i) the nonlinearity of the regression model can be controlled by choosing the kernel function and (ii) the problem of false causalities, arising as the complexity of the model increases, is addressed by a selection strategy of the eigenvectors of a reduced Gram matrix whose range represents the additional features due to the second time series. Moreover, there is no a priori assumption that the network must be a directed acyclic graph. We apply the proposed approach to a network of chaotic maps and to a simulated genetic regulatory network: it is shown that the underlying topology of the network can be reconstructed from time series of node's dynamics, provided that a sufficient number of samples is available. Considering a linear dynamical network, built by preferential attachment scheme, we show that for limited data use of the bivariate Granger causality is a better choice than methods using L1 minimization. Finally we consider real expression data from HeLa cells, 94 genes and 48 time points. The analysis of static correlations between genes reveals two modules corresponding to well-known transcription factors; Granger analysis puts in evidence 19 causal relationships, all involving genes related to tumor development.
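For orientation, the bivariate Granger idea mentioned in the abstract can be sketched in its plain linear form; this is not the kernel method of the paper, only the standard lag-1 linear variant, and the function names and simulated series are hypothetical. A series `source` is said to Granger-cause `target` when adding the past of `source` to a model that predicts `target` from its own past reduces the residual error.

```python
import math
import random

def _center(series):
    mean = sum(series) / len(series)
    return [v - mean for v in series]

def _rss_one(y, a):
    """Residual sum of squares of the least-squares fit y ~ c*a (no intercept)."""
    c = sum(p * q for p, q in zip(a, y)) / sum(p * p for p in a)
    return sum((yt - c * at) ** 2 for yt, at in zip(y, a))

def _rss_two(y, a, b):
    """Residual sum of squares of the least-squares fit y ~ ca*a + cb*b."""
    saa = sum(p * p for p in a)
    sbb = sum(p * p for p in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * q for p, q in zip(a, y))
    sby = sum(p * q for p, q in zip(b, y))
    det = saa * sbb - sab * sab
    ca = (say * sbb - sab * sby) / det
    cb = (saa * sby - sab * say) / det
    return sum((yt - ca * at - cb * bt) ** 2 for yt, at, bt in zip(y, a, b))

def granger_index(source, target):
    """ln(RSS_restricted / RSS_full) for lag-1 linear prediction of target."""
    s, t = _center(source), _center(target)
    future, past_t, past_s = t[1:], t[:-1], s[:-1]
    restricted = _rss_one(future, past_t)       # target's own past only
    full = _rss_two(future, past_t, past_s)     # plus the source's past
    return math.log(restricted / full)

# Hypothetical data: x drives y with a one-step delay; y does not drive x.
random.seed(1)
n = 500
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0.0, 1.0) for t in range(1, n)]

x_to_y = granger_index(x, y)
y_to_x = granger_index(y, x)
print(x_to_y, y_to_x)
```

The index is large in the driving direction and close to zero in the reverse direction. A real analysis would choose the lag order from the data and test significance, which this sketch omits.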
Affiliation(s)
- D Marinazzo
- Dipartimento Interateneo di Fisica, Università di Bari, I-70126 Bari, Italy.
459
Abstract
The completion of genome sequences and subsequent high-throughput mapping of molecular networks have allowed us to study biology from the network perspective. Experimental, statistical and mathematical modeling approaches have been employed to study the structure, function and dynamics of molecular networks, and have begun to reveal important links between various network properties and the functions of biological systems. In agreement with these functional links, evolutionary selection of a network is apparently based on the function, rather than directly on the structure, of the network. Dynamic modularity is one of the prominent features of molecular networks. Taking advantage of this feature may simplify network-based biological studies through the construction of process-specific modular networks and provide functional and mechanistic insights linking genotypic variations to complex traits or diseases, which is likely to be a key approach in the next wave of understanding complex human diseases. With the development of ready-to-use network analysis and modeling tools, network approaches will be infused into everyday biological research in the near future.
Affiliation(s)
- Jing-Dong Jackie Han
- Chinese Academy of Sciences Key Laboratory of Molecular Developmental Biology and Center for Molecular Systems Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Datun Road, Beijing 100101, China.
460
461
Computational identification of the normal and perturbed genetic networks involved in myeloid differentiation and acute promyelocytic leukemia. Genome Biol 2008; 9:R38. [PMID: 18291030 PMCID: PMC2374711 DOI: 10.1186/gb-2008-9-2-r38] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 08/02/2007] [Revised: 01/01/2008] [Accepted: 02/21/2008] [Indexed: 01/04/2023] Open
Abstract
A dissection of the genetic networks and circuitries is described for two forms of leukaemia. By integrating transcription factor binding and gene expression profiling, networks are revealed that underlie this important human disease.
Background: Acute myeloid leukemia (AML) comprises a group of diseases characterized by the abnormal development of malignant myeloid cells. Recent studies have demonstrated an important role for aberrant transcriptional regulation in AML pathophysiology. Although several transcription factors (TFs) involved in myeloid development and leukemia have been studied extensively and independently, how these TFs coordinate with others and how their dysregulation perturbs the genetic circuitry underlying myeloid differentiation is not yet known. We propose an integrated approach for mammalian genetic network construction by combining the analysis of gene expression profiling data and the identification of TF binding sites.
Results: We utilized our approach to construct the genetic circuitries operating in normal myeloid differentiation versus acute promyelocytic leukemia (APL), a subtype of AML. In the normal and disease networks, we found that multiple transcriptional regulatory cascades converge on the TFs Rora and Rxra, respectively. Furthermore, the TFs dysregulated in APL participate in a common regulatory pathway and may perturb the normal network through Fos. Finally, a model of APL pathogenesis is proposed in which the chimeric TF PML-RARα activates the dysregulation in APL through six mediator TFs.
Conclusion: This report demonstrates the utility of our approach to construct mammalian genetic networks, and to obtain new insights regarding regulatory circuitries operating in complex diseases in humans.
462
Hirose O, Yoshida R, Imoto S, Yamaguchi R, Higuchi T, Charnock-Jones DS, Print C, Miyano S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics 2008; 24:932-42. [PMID: 18292116 DOI: 10.1093/bioinformatics/btm639] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 02/05/2023]
Abstract
Motivation: Statistical inference of gene networks by using time-course microarray gene expression profiles is an essential step towards understanding the temporal structure of gene regulatory mechanisms. Unfortunately, most of the current studies have been limited to analysing a small number of genes because the length of time-course gene expression profiles is fairly short. One promising approach to overcome such a limitation is to infer gene networks by exploring the potential transcriptional modules, which are sets of genes sharing a common function or involved in the same pathway.
Results: In this article, we present a novel approach based on the state space model to identify the transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g. of order 10^3, from time-course gene expression profiles. Particularly, we succeeded in the identification of a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae in which the length of the time-course and number of genes were 24 and 4382, respectively. However, when analysing shorter time-course data, e.g. of length 10 or less, the parameter estimations of the state space model often fail due to overfitting. To extend the applicability of the state space model, we provide an approach to use the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly-efficient inferences of gene networks with short time-course data. The potential of the proposed method has been demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells (HUVECs) undergoing growth factor deprivation-induced apoptosis.
Availability: Supplementary Information and the software (TRANS-MNET) are available at http://daweb.ism.ac.jp/~yoshidar/software/ssm/.
Affiliation(s)
- Osamu Hirose
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
463
Abstract
All organisms possess a diverse set of genetic programs that are used to alter cellular physiology in response to environmental cues. The gram-negative bacterium Escherichia coli mounts what is known as the "SOS response" following DNA damage, replication fork arrest, and a myriad of other environmental stresses. For over 50 years, E. coli has served as the paradigm for our understanding of the transcriptional and physiological changes that occur following DNA damage (400). In this chapter, we summarize the current view of the SOS response and discuss how this genetic circuit is regulated. In addition to examining the E. coli SOS response, we also include a discussion of the SOS regulatory networks in other bacteria to provide a broader perspective on how prokaryotes respond to DNA damage.
464
A systems biology approach to prediction of oncogenes and molecular perturbation targets in B-cell lymphomas. Mol Syst Biol 2008; 4:169. [PMID: 18277385 PMCID: PMC2267731 DOI: 10.1038/msb.2008.2] [Citation(s) in RCA: 150] [Impact Index Per Article: 8.8] [Received: 10/02/2007] [Accepted: 12/14/2007] [Indexed: 01/03/2023] Open
Abstract
The computational identification of oncogenic lesions is still a key open problem in cancer biology. Although several methods have been proposed, they fail to model how such events are mediated by the network of molecular interactions in the cell. In this paper, we introduce a systems biology approach, based on the analysis of molecular interactions that become dysregulated in specific tumor phenotypes. Such a strategy provides important insights into tumorigenesis, effectively extending and complementing existing methods. Furthermore, we show that the same approach is highly effective in identifying the targets of molecular perturbations in a human cellular context, a task virtually unaddressed by existing computational methods. To identify interactions that are dysregulated in three distinct non-Hodgkin's lymphomas and in samples perturbed with CD40 ligand, we use the B-cell interactome (BCI), a genome-wide compendium of human B-cell molecular interactions, in combination with a large set of microarray expression profiles. The method consistently ranked the known gene in the top 20 (0.3%), outperforming conventional approaches in 3 of 4 cases.
465
Napoletani D, Sauer TD. Reconstructing the topology of sparsely connected dynamical networks. Phys Rev E Stat Nonlin Soft Matter Phys 2008; 77:026103. [PMID: 18352086 DOI: 10.1103/physreve.77.026103] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 09/14/2007] [Revised: 12/07/2007] [Indexed: 05/09/2023]
Abstract
Given a general physical network and measurements of node dynamics, methods are proposed for reconstructing the network topology. We focus on networks whose connections are sparse and where data are limited. Under these conditions, common in many biological networks, constrained optimization techniques based on the L1 vector norm are found to be superior for inference of the network connections.
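To illustrate why an L1 norm favours sparse connection estimates when data are limited, here is a minimal sketch. It swaps in the penalized (lasso) form solved by coordinate descent rather than the constrained optimization described in the abstract, and every name and number in it (`lasso_coordinate_descent`, the 10-node network, the penalty 0.05) is a hypothetical choice for illustration.

```python
import random

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, sweeps=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / scale[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w

# Hypothetical setup: 40 observed states of a 10-node network; the response of
# one target node depends on only two of the other nodes (sparse connectivity).
random.seed(0)
n_samples, n_nodes = 40, 10
true_weights = [0.0] * n_nodes
true_weights[1], true_weights[6] = 2.0, -1.5
states = [[random.gauss(0.0, 1.0) for _ in range(n_nodes)] for _ in range(n_samples)]
response = [sum(wj * xj for wj, xj in zip(true_weights, row)) for row in states]

estimated = lasso_coordinate_descent(states, response, lam=0.05)
inferred_inputs = [j for j, wj in enumerate(estimated) if abs(wj) > 1e-6]
print(inferred_inputs)
```

The nonzero entries of `estimated` are read as inferred incoming connections for the target node; repeating the fit for each node yields a candidate adjacency matrix. The penalty shrinks the surviving weights slightly toward zero, so magnitudes are biased even when the support is recovered.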
Affiliation(s)
- Domenico Napoletani
- Department of Mathematical Sciences, George Mason University, Fairfax, Virginia 22030, USA
466
Chen G, Larsen P, Almasri E, Dai Y. Rank-based edge reconstruction for scale-free genetic regulatory networks. BMC Bioinformatics 2008; 9:75. [PMID: 18237422 PMCID: PMC2275249 DOI: 10.1186/1471-2105-9-75] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 03/19/2007] [Accepted: 01/31/2008] [Indexed: 11/12/2022] Open
Abstract
Background: The reconstruction of genetic regulatory networks from microarray gene expression data has been a challenging task in bioinformatics. Various approaches to this problem have been proposed; however, they do not take into account the topological characteristics of the targeted networks while reconstructing them.
Results: In this study, an algorithm that exploits the scale-free topology of networks was proposed, based on a modification of a rank-based algorithm for network reconstruction. The new algorithm was evaluated using both simulated and microarray gene expression data. The results demonstrated that the proposed algorithm outperforms the original rank-based algorithm. In addition, in comparison with the Bayesian Network approach, the results show that the proposed algorithm gives much better recovery of the underlying network when the sample size is much smaller than the number of genes.
Conclusion: The proposed algorithm is expected to be useful in the reconstruction of biological networks whose degree distributions follow the scale-free topology.
Affiliation(s)
- Guanrao Chen
- Department of Computer Science (MC152), University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, USA.
467
Kimura S, Sonoda K, Yamane S, Maeda H, Matsumura K, Hatakeyama M. Function approximation approach to the inference of reduced NGnet models of genetic networks. BMC Bioinformatics 2008; 9:23. [PMID: 18194576 PMCID: PMC2258286 DOI: 10.1186/1471-2105-9-23] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 08/02/2007] [Accepted: 01/14/2008] [Indexed: 11/10/2022] Open
Abstract
Background: The inference of a genetic network is a problem in which mutual interactions among genes are deduced using time-series of gene expression patterns. While a number of models have been proposed to describe genetic regulatory networks, this study focuses on a set of differential equations since it has the ability to model dynamic behavior of gene expression. When we use a set of differential equations to describe genetic networks, the inference problem can be defined as a function approximation problem. On the basis of this problem definition, we propose in this study a new method to infer reduced NGnet models of genetic networks.
Results: Through numerical experiments on artificial genetic network inference problems, we demonstrated that our method has the ability to infer genetic networks correctly and it was faster than the other inference methods. We then applied the proposed method to actual expression data of the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulations. When our method inferred the genetic network from the actual data, it required about 4.7 min on a single-CPU personal computer.
Conclusion: The proposed method has an ability to obtain reasonable networks with a short computational time. As a high performance computer is not always available at every laboratory, the short computational time of our method is a preferable feature. There does not seem to be a perfect model for the inference of genetic networks yet. Therefore, in order to extract reliable information from the observed gene expression data, we should infer genetic networks using multiple inference methods based on different models. Our approach could be used as one of the promising inference methods.
Affiliation(s)
- Shuhei Kimura
- Faculty of Engineering, Tottori University, 4-101 Koyama-Minami, Tottori, Japan.
468
Tan K, Tegner J, Ravasi T. Integrated approaches to uncovering transcription regulatory networks in mammalian cells. Genomics 2008; 91:219-31. [PMID: 18191937 DOI: 10.1016/j.ygeno.2007.11.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 09/17/2007] [Revised: 11/14/2007] [Accepted: 11/16/2007] [Indexed: 11/16/2022]
Abstract
Integrative systems biology has emerged as an exciting research approach in molecular biology and functional genomics that involves the integration of genomics, proteomics, and metabolomics datasets. These endeavors establish a systematic paradigm by which to interrogate, model, and iteratively refine our knowledge of the regulatory events within a cell. Here we review the latest technologies available to collect high-throughput measurements of a cellular state as well as the most successful methods for the integration and interrogation of these measurements. In particular we will focus on methods available to infer transcription regulatory networks in mammals.
Affiliation(s)
- Kai Tan
- Department of Bioengineering, Jacobs School of Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA.
469
Dhurjati P, Mahadevan R. Systems Biology: The synergistic interplay between biology and mathematics. Can J Chem Eng 2008. [DOI: 10.1002/cjce.20025] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 01/17/2023]
470
471
Abstract
Common human diseases like obesity and diabetes are driven by complex networks of genes and any number of environmental factors. To understand this complexity in hopes of identifying targets and developing drugs against disease, a systematic approach is required to elucidate the genetic and environmental factors and interactions among and between these factors, and to establish how these factors induce changes in gene networks that in turn lead to disease. The explosion of large-scale, high-throughput technologies in the biological sciences has enabled researchers to take a more systems biology approach to study complex traits like disease. Genotyping of hundreds of thousands of DNA markers and profiling tens of thousands of molecular phenotypes simultaneously in thousands of individuals is now possible, and this scale of data is making it possible for the first time to reconstruct whole gene networks associated with disease. In the following sections, we review different approaches for integrating genetic expression and clinical data to infer causal relationships among gene expression traits and between expression and disease traits. We further review methods to integrate these data in a more comprehensive manner to identify common pathways shared by the causal factors driving disease, including the reconstruction of association and probabilistic causal networks. Particular attention is paid to integrating diverse information to refine these types of networks so that they are more predictive. To highlight these different approaches in practice, we step through an example on how Insig2 was identified as a causal factor for plasma cholesterol levels in mice.
472
de Bivort B, Huang S, Bar-Yam Y. Empirical multiscale networks of cellular regulation. PLoS Comput Biol 2007; 3:1968-78. [PMID: 17953478 PMCID: PMC2041980 DOI: 10.1371/journal.pcbi.0030207] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 12/18/2006] [Accepted: 09/07/2007] [Indexed: 11/25/2022] Open
Abstract
Grouping genes by similarity of expression across multiple cellular conditions enables the identification of cellular modules. The known functions of genes enable the characterization of the aggregate biological functions of these modules. In this paper, we use a high-throughput approach to identify the effective mutual regulatory interactions between modules composed of mouse genes from the Alliance for Cell Signaling (AfCS) murine B-lymphocyte database which tracks the response of ∼15,000 genes following chemokine perturbation. This analysis reveals principles of cellular organization that we discuss along four conceptual axes. (1) Regulatory implications: the derived collection of influences between any two modules quantifies intuitive as well as unexpected regulatory interactions. (2) Behavior across scales: trends across global networks of varying resolution (composed of various numbers of modules) reveal principles of assembly of high-level behaviors from smaller components. (3) Temporal behavior: tracking the mutual module influences over different time intervals provides features of regulation dynamics such as duration, persistence, and periodicity. (4) Gene Ontology correspondence: the association of modules to known biological roles of individual genes describes the organization of functions within coexpressed modules of various sizes. We present key specific results in each of these four areas, as well as derive general principles of cellular organization. At the coarsest scale, the entire transcriptional network contains five divisions: two divisions devoted to ATP production/biosynthesis and DNA replication that activate all other divisions, an “extracellular interaction” division that represses all other divisions, and two divisions (proliferation/differentiation and membrane infrastructure) that activate and repress other divisions in specific ways consistent with cell cycle control. 
In a eukaryotic organism such as the mouse, the complete transcriptional network contains ∼15,000 genes and up to 225 million regulatory relationships between pairs of genes. Determining all of these relationships is currently intractable using traditional experimental techniques, and, thus, a comprehensive description of the entire mouse transcriptional network is elusive. Alternatively, one can apply the limited amount of experimental data to determine the entire transcriptional network at a less detailed, higher level. This is analogous to considering a map of the world resolved to the kilometer rather than to the millimeter. Here, we derive from mouse microarray data several high-scale transcriptional networks by determining the mutual effective regulatory influences of large modules of genes. In particular, global transcriptional networks containing 12 to 72 modules are derived, and analysis of these multiscale networks reveals properties of the transcriptional network that are universal at all scales (e.g., maintenance of homeostasis) and properties that vary as a function of scale (e.g., the fractions of module pairs that exert mutual regulation). In addition, we describe how cellular functions associated with large modules (those containing many genes) are composed of more specific functions associated with smaller modules.
Affiliation(s)
- Benjamin de Bivort
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA.
473
Socolovsky M, Murrell M, Liu Y, Pop R, Porpiglia E, Levchenko A. Negative autoregulation by FAS mediates robust fetal erythropoiesis. PLoS Biol 2007; 5:e252. [PMID: 17896863 PMCID: PMC1988857 DOI: 10.1371/journal.pbio.0050252] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Received: 01/11/2007] [Accepted: 07/27/2007] [Indexed: 01/22/2023] Open
Abstract
Tissue development is regulated by signaling networks that control developmental rate and determine ultimate tissue mass. Here we present a novel computational algorithm used to identify regulatory feedback and feedforward interactions between progenitors in developing erythroid tissue. The algorithm makes use of dynamic measurements of red cell progenitors between embryonic days 12 and 15 in the mouse. It selects for intercellular interactions that reproduce the erythroid developmental process and endow it with robustness to external perturbations. This analysis predicts that negative autoregulatory interactions arise between early erythroblasts of similar maturation stage. By studying embryos mutant for the death receptor FAS, or for its ligand, FASL, and by measuring the rate of FAS-mediated apoptosis in vivo, we show that FAS and FASL are pivotal negative regulators of fetal erythropoiesis, in the manner predicted by the computational model. We suggest that apoptosis in erythroid development mediates robust homeostasis regulating the number of red blood cells reaching maturity.
Affiliation(s)
- Merav Socolovsky
- Department of Pediatrics, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America.
474
Kiesel J, Miller C, Abu-Amer Y, Aurora R. Systems level analysis of osteoclastogenesis reveals intrinsic and extrinsic regulatory interactions. Dev Dyn 2007; 236:2181-97. [PMID: 17584858 DOI: 10.1002/dvdy.21206] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
Osteoclasts are bone-resorbing cells derived from the myeloid lineage that play a central role in bone remodeling and inflammatory bone erosion diseases. The receptor activator of NF-kappaB ligand (RANKL) produced by osteoblasts and activated immune cells initiates the development of osteoclasts in the bone marrow. Using time series gene expression data, the intrinsic processes and the extrinsic factors that control osteoclastogenesis were identified. The gene expression profiles display distinct commitment and differentiation phases. Analysis of the time course revealed several mechanistic details, including the complex role of cholesterol in osteoclast development. Epistatic interactions and the coordination between cellular processes that regulate development were inferred from the coexpression network. The coexpression network indicated that osteoclasts induce angiogenesis and recruit T-cells to the site of osteoclastogenesis early in the commitment phase. The resulting model provides an essential framework for a better understanding of the epigenetic program of osteoclastogenesis.
Affiliation(s)
- Jennifer Kiesel
- Department of Molecular Microbiology and Immunology, Saint Louis University School of Medicine, Saint Louis, Missouri, USA
475
Dong W, Tang X, Yu Y, Griffith J, Nilsen R, Choi D, Baldwin J, Hilton L, Kelps K, Mcguire J, Morgan R, Smith M, Case M, Arnold J, Schüttler HB, Wang Q, Liu J, Reeves J, Logan D. Systems biology of the neurospora biological clock. IET Syst Biol 2007; 1:257-65. [PMID: 17907673 DOI: 10.1049/iet-syb:20060080] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] Open
Abstract
A major challenge of systems biology is explaining complex traits, such as the biological clock, in terms of the kinetics of macromolecules. The clock poses at least four challenges for systems biology: (i) identifying the genetic network to explain the clock mechanism quantitatively; (ii) specifying the clock's functional connection to a thousand or more genes and their products in the genome; (iii) explaining the clock's response to light and other environmental cues; and (iv) explaining how the clock's genetic network evolves. Here, the authors illustrate an approach to these problems by fitting an ensemble of genetic networks to microarray data derived from oligonucleotide arrays with approximately all 11 000 Neurospora crassa genes represented. A promising genetic network for the clock mechanism is identified.
Affiliation(s)
- W Dong
- Genetics Department, University of Georgia, Athens, GA 30602, USA
476
Roth CL, Mastronardi C, Lomniczi A, Wright H, Cabrera R, Mungenast AE, Heger S, Jung H, Dubay C, Ojeda SR. Expression of a tumor-related gene network increases in the mammalian hypothalamus at the time of female puberty. Endocrinology 2007; 148:5147-61. [PMID: 17615149 DOI: 10.1210/en.2007-0634] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 01/23/2023]
Abstract
Much has been learned in recent years about the central mechanisms controlling the initiation of mammalian puberty. It is now clear that this process requires the interactive participation of several genes. Using high-throughput, molecular, and bioinformatics strategies in combination with a systems biology approach, we singled out from the hypothalamus of nonhuman primates and rats a group of related genes whose expression increases at the time of female puberty. Although these genes [henceforth termed tumor-related genes (TRGs)] have diverse cellular functions, they share the common feature of having been earlier identified as involved in tumor suppression/tumor formation. A prominent member of this group is KiSS1, a gene recently shown to be essential for the occurrence of puberty. Cis-regulatory analysis revealed the presence of a hierarchically arranged gene set containing five major hubs (CDP/CUTL1, MAF, p53, YY1, and USF2) controlling the network at the transcriptional level. In turn, these hubs are heavily connected to non-TRGs involved in the transcriptional regulation of the pubertal process. TRGs may be expressed in the mammalian hypothalamus as components of a regulatory gene network that facilitates and integrates cellular and cell-cell communication programs required for the acquisition of female reproductive competence.
Affiliation(s)
- Christian L Roth
- Division of Neuroscience, Oregon National Primate Research Center, Oregon Health and Science University, Beaverton, OR 97006, USA
477
Nam D, Yoon SH, Kim JF. Ensemble learning of genetic networks from time-series expression data. Bioinformatics 2007; 23:3225-31. [PMID: 17977884 DOI: 10.1093/bioinformatics/btm514] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Inferring genetic networks from time-series expression data has attracted a great deal of interest. In most cases, however, the number of genes exceeds the number of data points, which, in principle, makes it impossible to recover the underlying networks. To address the dimensionality problem, we apply the subset selection method to a linear system of difference equations. Previous approaches assign the single most likely combination of regulators to each target gene, which often causes over-fitting to the small number of data points. RESULTS: Here, we propose a new algorithm, named LEARNe, which merges the predictions from all the combinations of regulators that have a certain level of likelihood. LEARNe provides more accurate and robust predictions than previous methods for the structure of genetic networks under the linear system model. We tested LEARNe for reconstructing the SOS regulatory network of Escherichia coli and the cell cycle regulatory network of yeast from real experimental data, where LEARNe also exhibited better performance than previous methods. AVAILABILITY: The MATLAB codes are available upon request from the authors.
Affiliation(s)
- Dougu Nam
- Korea Research Institute of Bioscience and Biotechnology (KRIBB), PO Box 115, Yuseong, Daejeon 305-600, Republic of Korea.
478
Kluger Y, Kluger H, Tuck D. Association between pathways in regulatory networks. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2007; 2006:2036-40. [PMID: 17946929 DOI: 10.1109/iembs.2006.260730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
During cell progression from one state to another, such as transformation from benign to malignant conditions, cells undergo changes in gene regulation. To reveal state-dependent circuitries in human regulatory networks, we employed drafts of normal and malignant cell networks. Using these condition specific networks, gene profiles and annotated pathways we studied: a) the capacity to separate samples or cell states based on the collective expression of all the genes in each pathway rather than individual genes, b) the degree of regulatory network connectivity within and between pathways. Distinct cell types reveal notable differences in transcriptional activity in numerous pathways. On the other hand, in datasets from breast cancer patients with variable outcome the capacity of single pathway expression signatures to predict disease outcome is very limited, though this can be somewhat improved by combining multiple pathways. Remarkable connectivity between pathways on the transcriptional regulatory level revealed a non-modular network structure. Overall, network blueprints enable us to quantify the degree of interaction between condition specific co-regulated pathways. This can contribute to understanding deregulated processes associated with cancer.
Affiliation(s)
- Yuval Kluger
- Dept. of Cell Biol., New York Univ. Sch. of Medicine, NY, NY 10016, USA.
479
Stolovitzky G, Monroe D, Califano A. Dialogue on reverse-engineering assessment and methods: the DREAM of high-throughput pathway inference. Ann N Y Acad Sci 2007; 1115:1-22. [PMID: 17925349 DOI: 10.1196/annals.1407.021] [Citation(s) in RCA: 227] [Impact Index Per Article: 12.6] [Indexed: 12/24/2022]
Abstract
The biotechnological advances of the last decade have confronted us with an explosion of genetics, genomics, transcriptomics, proteomics, and metabolomics data. These data need to be organized and structured before they may provide a coherent biological picture. To accomplish this formidable task, the availability of an accurate map of the physical interactions in the cell that are responsible for cellular behavior and function would be exceedingly helpful, as these data are ultimately the result of such molecular interactions. However, all we have at this time is, at best, a fragmentary and only partially correct representation of the interactions between genes, their byproducts, and other cellular entities. If we want to succeed in our quest for understanding the biological whole as more than the sum of the individual parts, we need to build more comprehensive and cell-context-specific maps of the biological interaction networks. DREAM, the Dialogue on Reverse Engineering Assessment and Methods, is fostering a concerted effort by computational and experimental biologists to understand the limitations and to enhance the strengths of the efforts to reverse engineer cellular networks from high-throughput data. In this chapter we will discuss the salient arguments of the first DREAM conference. We will highlight both the state of the art in the field of reverse engineering as well as some of its challenges and opportunities.
Affiliation(s)
- Gustavo Stolovitzky
- IBM Computational Biology Center, P.O. Box 218, Yorktown Heights, NY 10598, USA.
480
Margolin AA, Califano A. Theory and limitations of genetic network inference from microarray data. Ann N Y Acad Sci 2007; 1115:51-72. [PMID: 17925348 DOI: 10.1196/annals.1407.019] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]
Abstract
Since the advent of gene expression microarray technology more than 10 years ago, many computational approaches have been developed aimed at using statistical associations between mRNA abundance profiles to predict transcriptional regulatory interactions. The ultimate goal is to develop causal network models describing the transcriptional influences that genes exert on each other (via their protein products), which can be used to predict network disruptions (e.g., mutations) leading to a disease phenotype, as well as the appropriate therapeutic intervention. However, microarray data measure only a small component of the interacting variables in a genetic regulatory network, as cells are known to regulate gene expression via many diverse mechanisms. Although many researchers have acknowledged the questionable interpretation of statistical dependencies between mRNA profiles, very little work has been done on theoretically characterizing the nature of inferred dependencies using models that account for unobserved interacting variables. In this work, we review the theory behind reverse engineering algorithms derived from three separate disciplines (system control theory, graphical models, and information theory) and highlight several mathematical relationships between the various methods. We then apply recent theoretical work on constructing graphical models with latent variables to the context of reverse engineering genetic networks. We demonstrate that even the addition of simple latent variables induces statistical dependencies between non-directly interacting (e.g., co-regulated) genes that cannot be eliminated by conditioning on any observed variables.
Affiliation(s)
- Adam A Margolin
- Department of Biomedical Informatics, 1130 St. Nicholas Avenue, Room 917, New York, NY 10032, USA.
481
Camacho D, Vera Licona P, Mendes P, Laubenbacher R. Comparison of reverse-engineering methods using an in silico network. Ann N Y Acad Sci 2007; 1115:73-89. [PMID: 17925358 DOI: 10.1196/annals.1407.006] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
The reverse engineering of biochemical networks is a central problem in systems biology. In recent years several methods have been developed for this purpose, using techniques from a variety of fields. A systematic comparison of the different methods is complicated by their widely varying data requirements, making benchmarking difficult. Also, because of the lack of detailed knowledge about most real networks, it is not easy to use experimental data for this purpose. This paper contains a comparison of four reverse-engineering methods using data from a simulated network. The network is sufficiently realistic and complex to include many of the challenges that data from real networks pose. Our results indicate that the two methods based on genetic perturbations of the network outperform the other methods, including dynamic Bayesian networks and a partial correlation method.
Affiliation(s)
- Diogo Camacho
- Applied Biodynamics Lab, Biomedical Engineering Department, Boston University, MA, USA
482
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 2007; 5:e8. [PMID: 17214507 PMCID: PMC1764438 DOI: 10.1371/journal.pbio.0050008] [Citation(s) in RCA: 1013] [Impact Index Per Article: 56.3] [Received: 05/03/2006] [Accepted: 11/07/2006] [Indexed: 11/19/2022] Open
Abstract
Machine learning approaches offer the potential to systematically identify transcriptional regulatory interactions from a compendium of microarray expression profiles. However, experimental validation of the performance of these methods at the genome scale has remained elusive. Here we assess the global performance of four existing classes of inference algorithms using 445 Escherichia coli Affymetrix arrays and 3,216 known E. coli regulatory interactions from RegulonDB. We also developed and applied the context likelihood of relatedness (CLR) algorithm, a novel extension of the relevance networks class of algorithms. CLR demonstrates an average precision gain of 36% relative to the next-best performing algorithm. At a 60% true positive rate, CLR identifies 1,079 regulatory interactions, of which 338 were in the previously known network and 741 were novel predictions. We tested the predicted interactions for three transcription factors with chromatin immunoprecipitation, confirming 21 novel interactions and verifying our RegulonDB-based performance estimates. CLR also identified a regulatory link providing central metabolic control of iron transport, which we confirmed with real-time quantitative PCR. The compendium of expression data compiled in this study, coupled with RegulonDB, provides a valuable model system for further improvement of network inference algorithms using experimental data.
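The abstract above names the CLR algorithm without giving its scoring rule. As commonly described, CLR judges each gene pair by how far its similarity (typically mutual information) stands out against the background of similarities for both genes involved. The sketch below illustrates that idea only; the function name `clr_scores`, the precomputed similarity matrix, and the toy values are assumptions for illustration, not the authors' implementation.

```python
import math


def clr_scores(sim):
    """CLR-style scores from a symmetric similarity matrix `sim`
    (for example, pairwise mutual information between expression profiles)."""
    n = len(sim)
    means, stds = [], []
    for i in range(n):
        # Background distribution for gene i: its similarity to all other genes.
        background = [sim[i][j] for j in range(n) if j != i]
        mu = sum(background) / len(background)
        var = sum((v - mu) ** 2 for v in background) / len(background)
        means.append(mu)
        stds.append(math.sqrt(var))

    def z(value, k):
        # z-score of `value` against gene k's background; negatives clip to 0.
        if stds[k] == 0.0:
            return 0.0
        return max(0.0, (value - means[k]) / stds[k])

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Combine the two z-scores so a pair must stand out for both genes.
            s = math.hypot(z(sim[i][j], i), z(sim[i][j], j))
            scores[i][j] = scores[j][i] = s
    return scores


# Hypothetical toy matrix: genes 0 and 1 are strongly related, the rest are not.
toy = [
    [0.0, 1.0, 0.25, 0.25],
    [1.0, 0.0, 0.25, 0.25],
    [0.25, 0.25, 0.0, 0.25],
    [0.25, 0.25, 0.25, 0.0],
]
scores = clr_scores(toy)
print(scores[0][1], scores[0][2])
```

Scoring against per-gene backgrounds, rather than applying one global cutoff to the raw similarities, is what distinguishes this family from plain relevance networks: a gene that is weakly similar to everything does not generate edges merely because of its baseline.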
Affiliation(s)
- Jeremiah J Faith
- Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Boris Hayete
- Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Joshua T Thaden
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Ilaria Mogno
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Department of Computer and Systems Science A. Ruberti, University of Rome, La Sapienza, Rome, Italy
- Jamey Wierzbowski
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Cellicon Biotechnologies, Boston, Massachusetts, United States of America
- Guillaume Cottarel
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Cellicon Biotechnologies, Boston, Massachusetts, United States of America
- Simon Kasif
- Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- James J Collins
- Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Timothy S Gardner
- Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
483
Abstract
In this review we give an overview of computational and statistical methods to reconstruct cellular networks. Although this area of research is vast and fast developing, we show that most currently used methods can be organized by a few key concepts. The first part of the review deals with conditional independence models including Gaussian graphical models and Bayesian networks. The second part discusses probabilistic and graph-based methods for data from experimental interventions and perturbations.
Affiliation(s)
- Florian Markowetz
- Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
- Princeton University, Lewis-Sigler Institute for Integrative Genomics and Dept. of Computer Science, Princeton, NJ 08544, USA
- Rainer Spang
- Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
- Present affiliation: University Regensburg, Institute of Functional Genomics, Josef-Engert-Str. 9, 93053 Regensburg, Germany
484
Margolin AA, Wang K, Lim WK, Kustagi M, Nemenman I, Califano A. Reverse engineering cellular networks. Nat Protoc 2007; 1:662-71. [PMID: 17406294 DOI: 10.1038/nprot.2006.106] [Citation(s) in RCA: 243] [Impact Index Per Article: 13.5] [Indexed: 01/22/2023]
Abstract
We describe a computational protocol for the ARACNE algorithm, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. Similar to other algorithms, ARACNE predicts potential functional associations among genes, or novel functions for uncharacterized genes, by identifying statistical dependencies between gene products. However, based on biochemical validation, literature searches and DNA binding site enrichment analysis, ARACNE has also proven effective in identifying bona fide transcriptional targets, even in complex mammalian networks. Thus we envision that predictions made by ARACNE, especially when supplemented with prior knowledge or additional data sources, can provide appropriate hypotheses for the further investigation of cellular networks. While the examples in this protocol use only gene expression profile data, the algorithm's theoretical basis readily extends to a variety of other high-throughput measurements, such as pathway-specific or genome-wide proteomics, microRNA and metabolomics data. As these data become readily available, we expect that ARACNE might prove increasingly useful in elucidating the underlying interaction models. For a microarray data set containing approximately 10,000 probes, reconstructing the network around a single probe completes in several minutes using a desktop computer with a Pentium 4 processor. Reconstructing a genome-wide network generally requires a computational cluster, especially if the recommended bootstrapping procedure is used.
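The abstract above calls ARACNE information-theoretic but does not describe its pruning step. As usually described, the method thresholds pairwise mutual information and then applies the data processing inequality: within any fully connected triplet, the weakest edge is treated as an indirect interaction and removed. The following is a minimal sketch of that pruning step only, assuming a precomputed mutual information matrix; `dpi_prune`, its parameters, and the toy values are hypothetical names chosen here, and the exact form of the tolerance is an assumption.

```python
from itertools import combinations


def dpi_prune(mi, threshold=0.0, tolerance=0.0):
    """Keep gene pairs whose mutual information exceeds `threshold`, then drop
    the weakest edge of every fully connected triplet (data processing inequality)."""
    n = len(mi)
    edges = {(i, j) for i, j in combinations(range(n), 2) if mi[i][j] > threshold}
    to_remove = set()
    for i, j, k in combinations(range(n), 3):
        trio = [(i, j), (i, k), (j, k)]
        if not all(edge in edges for edge in trio):
            continue
        weakest = min(trio, key=lambda e: mi[e[0]][e[1]])
        others = [e for e in trio if e != weakest]
        strongest_bound = min(mi[a][b] for a, b in others)
        # Edges are only marked here and removed at the end, so the result
        # does not depend on the order in which triplets are visited.
        if mi[weakest[0]][weakest[1]] < (1.0 - tolerance) * strongest_bound:
            to_remove.add(weakest)
    return edges - to_remove


# Hypothetical chain 0 -> 1 -> 2: the 0-2 dependency is indirect and weakest.
toy_mi = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.7],
    [0.3, 0.7, 0.0],
]
kept = dpi_prune(toy_mi, threshold=0.1)
print(sorted(kept))
```

A nonzero `tolerance` makes the pruning more conservative, which is useful when mutual information estimates are noisy and the weakest edge of a triplet is not clearly weaker than the other two.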
Affiliation(s)
- Adam A Margolin
- Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA
485
Abstract
Genes interact with each other in complex networks that enable the processing of information and the metabolism of nutrients inside the cell. A novel inference algorithm based on linear ordinary differential equations is proposed. The algorithm can infer the local network of gene-gene interactions surrounding a gene of interest from time-series gene expression profiles. The performance of the algorithm has been tested on in silico simulated gene expression data and on a nine-gene subnetwork that is part of the DNA-damage response pathway (SOS pathway) in the bacterium Escherichia coli. This approach can infer regulatory interactions even when only a small number of measurements is available.
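To make the linear ODE idea concrete, the sketch below fits one row of a model dx/dt = A x for a single gene of interest: it approximates that gene's time derivative by forward differences and regresses it on the expression of all genes. This is a simplified illustration, not the published algorithm. The small ridge term is a stand-in assumption for whatever the authors use to cope with few measurements, and `solve_linear`, `infer_local_network`, and the toy network are hypothetical names and values.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def infer_local_network(series, target, dt=1.0, ridge=1e-8):
    """Estimate the weights a[target][j] in dx_target/dt = sum_j a[target][j] * x_j.

    `series[t][g]` is the expression of gene g at time index t (uniform spacing dt).
    """
    n_genes = len(series[0])
    xs = series[:-1]
    # Forward-difference estimate of the target gene's derivative.
    dy = [(series[t + 1][target] - series[t][target]) / dt
          for t in range(len(series) - 1)]
    # Ridge-regularised normal equations: (X^T X + ridge * I) a = X^T dy.
    xtx = [[sum(x[p] * x[q] for x in xs) + (ridge if p == q else 0.0)
            for q in range(n_genes)] for p in range(n_genes)]
    xty = [sum(x[p] * d for x, d in zip(xs, dy)) for p in range(n_genes)]
    return solve_linear(xtx, xty)


# Toy data: Euler simulation of a known two-gene network (gene 0 activates gene 1).
true_a = [[-0.5, 0.0], [1.0, -0.3]]
dt = 0.1
series = [[1.0, 0.0]]
for _ in range(10):
    x = series[-1]
    series.append([x[i] + dt * sum(true_a[i][j] * x[j] for j in range(2))
                   for i in range(2)])

weights = infer_local_network(series, target=1, dt=dt)
print(weights)
```

Fitting only the row belonging to the gene of interest keeps the number of unknowns equal to the number of genes rather than its square, which is the practical appeal of inferring a local network from short time series.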
Affiliation(s)
- M Bansal
- Telethon Institute of Genetics and Medicine, Via P. Castellino 111, Naples 80131, Italy.
486
Fujita A, Sato JR, Garay-Malpartida HM, Yamaguchi R, Miyano S, Sogayar MC, Ferreira CE. Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Syst Biol 2007; 1:39. [PMID: 17761000 PMCID: PMC2048982 DOI: 10.1186/1752-0509-1-39] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Received: 04/25/2007] [Accepted: 08/30/2007] [Indexed: 12/25/2022]
Abstract
BACKGROUND: To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene product networks involved is required. To define and understand such molecular networks, several statistical methods have been proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems still need to be overcome. Firstly, information flow needs to be inferred, in addition to the correlation between genes. Secondly, we usually try to identify large networks from a large number of genes (parameters) originating from a smaller number of microarray experiments (samples). Due to this situation, which is rather frequent in bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most of the models are based on dimension reduction using clustering techniques; therefore, the resulting network is not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive (SVAR) model as a solution to these problems. RESULTS: We have applied the Sparse Vector Autoregressive model to estimate gene regulatory networks based on gene expression profiles obtained from time-series microarray experiments. Through extensive simulations, by applying the SVAR method to artificial regulatory networks, we show that SVAR can infer true positive edges even under conditions in which the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage when compared to other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well-known transcription factor targets.
CONCLUSION: The proposed SVAR method is able to model gene regulatory networks in frequent situations in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.
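The abstract above mentions controlling the false discovery rate over inferred edges but does not say which procedure is used. A common choice for this kind of control is the Benjamini-Hochberg step-up procedure, sketched below on hypothetical per-edge p-values. That SVAR uses this particular procedure is an assumption made here for illustration, and the function name and example numbers are invented.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices of hypotheses rejected at false discovery rate `q`
    using the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is under q * rank / m.
        if pvalues[idx] <= q * rank / m:
            largest_passing_rank = rank
    # Everything ranked at or below that point is rejected, even if an
    # individual p-value in between missed its own threshold.
    return sorted(order[:largest_passing_rank])


# Hypothetical p-values, one per candidate regulatory edge.
edge_pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
accepted_edges = benjamini_hochberg(edge_pvalues, q=0.05)
print(accepted_edges)
```

Controlling the expected fraction of false edges, instead of ranking edges by score and picking an arbitrary cutoff, is the advantage the abstract claims over rank-based or score-based network methods.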
Affiliation(s)
- André Fujita
- Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010 – São Paulo, 05508-090, SP, Brazil
- Chemistry Institute, University of São Paulo, Av. Lineu Prestes, 748 – São Paulo, 05513-970, SP, Brazil
- João R Sato
- Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010 – São Paulo, 05508-090, SP, Brazil
- Humberto M Garay-Malpartida
- Chemistry Institute, University of São Paulo, Av. Lineu Prestes, 748 – São Paulo, 05513-970, SP, Brazil
- School of Arts, Science and Humanities, University of São Paulo, Av. Arlindo Bettio, 1000 – São Paulo, 03828-000, SP, Brazil
- Rui Yamaguchi
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
- Satoru Miyano
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
- Mari C Sogayar
- Chemistry Institute, University of São Paulo, Av. Lineu Prestes, 748 – São Paulo, 05513-970, SP, Brazil
- Carlos E Ferreira
- Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010 – São Paulo, 05508-090, SP, Brazil
| |
Collapse
|
487
|
Luo F, Yang Y, Zhong J, Gao H, Khan L, Thompson DK, Zhou J. Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory. BMC Bioinformatics 2007; 8:299. [PMID: 17697349 PMCID: PMC2212665 DOI: 10.1186/1471-2105-8-299] [Citation(s) in RCA: 170] [Impact Index Per Article: 9.4] [Received: 09/20/2006] [Accepted: 08/14/2007] [Indexed: 11/16/2022] Open
Abstract
Background Large-scale sequencing of entire genomes has ushered in a new age in biology. One of the next grand challenges is to dissect the cellular networks consisting of many individual functional modules. Defining co-expression networks without ambiguity based on genome-wide microarray data is difficult, and current methods are neither robust nor consistent across different data sets. This is particularly problematic for little-understood organisms, since not much existing biological knowledge can be exploited for determining the threshold to differentiate true correlation from random noise. Random matrix theory (RMT), which has been widely and successfully used in physics, is a powerful approach to distinguish system-specific, non-random properties embedded in complex systems from random noise. Here, we have hypothesized that the universal predictions of RMT are also applicable to biological systems and that the correlation threshold can be determined by characterizing the correlation matrix of microarray profiles using random matrix theory. Results Application of random matrix theory to microarray data of S. oneidensis, E. coli, yeast, A. thaliana, Drosophila, mouse and human indicates that there is a sharp transition of the nearest neighbour spacing distribution (NNSD) of the correlation matrix after gradually removing certain elements inside the matrix. Testing on an in silico modular model has demonstrated that this transition can be used to determine the correlation threshold for revealing modular co-expression networks. The co-expression network derived from yeast cell cycling microarray data is supported by gene annotation. The topological properties of the resulting co-expression network agree well with the general properties of biological networks. Computational evaluations have shown that the RMT approach is sensitive and robust. 
Furthermore, evaluation on sampled expression data of an in silico modular gene system has shown that under-sampled expression data do not affect the recovery of the gene co-expression network. Moreover, the cellular roles of 215 functionally unknown genes from yeast, E. coli and S. oneidensis are predicted by the gene co-expression networks using the guilt-by-association principle, many of which are supported by existing information or our experimental verification, further demonstrating the reliability of this approach for gene function prediction. Conclusion Our rigorous analysis of gene expression microarray profiles using RMT has shown that the transition of the NNSD of the correlation matrix of microarray profiles provides a profound theoretical criterion to determine the correlation threshold for identifying gene co-expression networks.
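The thresholding step described above can be sketched roughly as follows: zero out weak correlations, take the eigenvalues of what remains, and look at the spacings between neighbouring eigenvalues. This is a simplified illustration, not the authors' pipeline. In particular, dividing by the mean spacing is a crude stand-in for proper spectral unfolding, and the two reference curves (the Wigner surmise for the Gaussian orthogonal ensemble and the Poisson spacing law) are the standard RMT references, assumed here rather than taken from the abstract. The function names are hypothetical.

```python
import numpy as np

def thresholded_spacings(expr, cutoff):
    """Eigenvalue spacings of a thresholded gene-gene correlation matrix.

    expr: (genes, samples) array with no constant rows; cutoff in [0, 1).
    Returns the nearest-neighbour spacings rescaled to unit mean.
    """
    corr = np.corrcoef(expr)
    # Drop weak correlations; the matrix stays symmetric.
    corr = np.where(np.abs(corr) >= cutoff, corr, 0.0)
    eig = np.sort(np.linalg.eigvalsh(corr))
    spacings = np.diff(eig)
    return spacings / spacings.mean()

def wigner_surmise(s):
    """Spacing density expected for strongly correlated (GOE-like) spectra."""
    return (np.pi / 2.0) * s * np.exp(-np.pi * s ** 2 / 4.0)

def poisson_spacing(s):
    """Spacing density expected for uncorrelated eigenvalues."""
    return np.exp(-s)
```

In use, one would scan `cutoff` over a grid and compare the histogram of spacings with the two reference curves at each value; the cutoff at which the histogram switches from one shape to the other is the candidate threshold. A cutoff high enough to remove every off-diagonal entry leaves an identity matrix with zero spacings, so the scan should stop before that point.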
Affiliation(s)
- Feng Luo
  - Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
  - School of Computing, Clemson University, Clemson, SC, 29634, USA
- Yunfeng Yang
  - Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
- Jianxin Zhong
  - Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
  - Department of Physics, Xiangtan University, Hunan 411105, PR China
- Haichun Gao
  - Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
  - Institute for Environmental Genomics, and Department of Botany and Microbiology, University of Oklahoma, Norman, OK, 73019, USA
- Latifur Khan
  - Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083, USA
- Dorothea K Thompson
  - Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Jizhong Zhou
  - Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
  - Institute for Environmental Genomics, and Department of Botany and Microbiology, University of Oklahoma, Norman, OK, 73019, USA
|
488
|
Bussemaker HJ, Foat BC, Ward LD. Predictive modeling of genome-wide mRNA expression: from modules to molecules. Annu Rev Biophys Biomol Struct 2007; 36:329-47. [PMID: 17311525 DOI: 10.1146/annurev.biophys.36.040306.132725] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
Abstract
Various algorithms are available for predicting mRNA expression and modeling gene regulatory processes. They differ in whether they rely on the existence of modules of coregulated genes or build a model that applies to all genes, whether they represent regulatory activities as hidden variables or as mRNA levels, and whether they implicitly or explicitly model the complex cis-regulatory logic of multiple interacting transcription factors binding the same DNA. The fact that functional genomics data of different types reflect the same molecular processes provides a natural strategy for integrative computational analysis. One promising avenue toward an accurate and comprehensive model of gene regulation combines biophysical modeling of the interactions among proteins, DNA, and RNA with the use of large-scale functional genomics data to estimate regulatory network connectivity and activity parameters. As the ability of these models to represent complex cis-regulatory logic increases, the need for approaches based on cross-species conservation may diminish.
Affiliation(s)
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA.
|
489
|
Kim S, Kim J, Cho KH. Inferring gene regulatory networks from temporal expression profiles under time-delay and noise. Comput Biol Chem 2007; 31:239-45. [PMID: 17631421 DOI: 10.1016/j.compbiolchem.2007.03.013] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 01/15/2007] [Accepted: 03/30/2007] [Indexed: 11/22/2022]
Abstract
Ordinary differential equations (ODEs) have been widely used for modeling and analysis of dynamic gene networks in systems biology. In this paper, we propose an optimization method that can infer a gene regulatory network from time-series gene expression data. Specifically, the following four cases are considered: (1) reconstruction of a gene network from synthetic gene expression data with noise, (2) reconstruction of a gene network from synthetic gene expression data with time-delay, (3) reconstruction of a gene network from synthetic gene expression data with noise and time-delay, and (4) reconstruction of a gene network from experimental time-series data from the budding yeast cell cycle.
Affiliation(s)
- Shinuk Kim
- Bio-MAX Institute, Seoul National University, Gwanak-gu, Seoul 151-818, Republic of Korea
|
490
|
Korcsmáros T, Szalay MS, Böde C, Kovács IA, Csermely P. How to design multi-target drugs. Expert Opin Drug Discov 2007; 2:799-808. [DOI: 10.1517/17460441.2.6.799] [Citation(s) in RCA: 114] [Impact Index Per Article: 6.3] [Indexed: 12/31/2022]
|
491
|
Cho KH, Choo SM, Jung SH, Kim JR, Choi HS, Kim J. Reverse engineering of gene regulatory networks. IET Syst Biol 2007; 1:149-63. [PMID: 17591174 DOI: 10.1049/iet-syb:20060075] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Indexed: 11/20/2022] Open
Abstract
Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches of inferring GRNs from gene expression profiles and biological information, including machine learning approaches, have been reviewed, with a brief introduction of DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced and discussions on the challenges and opportunities for future research are provided.
Affiliation(s)
- K H Cho
- College of Medicine, Seoul National University, Jongnogu, Seoul 110-799, South Korea.
|
492
|
Srividhya J, Crampin EJ, McSharry PE, Schnell S. Reconstructing biochemical pathways from time course data. Proteomics 2007; 7:828-38. [PMID: 17370261 DOI: 10.1002/pmic.200600428] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
Abstract
Time series data on biochemical reactions reveal transient behavior, away from chemical equilibrium, and contain information on the dynamic interactions among reacting components. However, this information can be difficult to extract using conventional analysis techniques. We present a new method to infer biochemical pathway mechanisms from time course data using a global nonlinear modeling technique to identify the elementary reaction steps which constitute the pathway. The method involves the generation of a complete dictionary of polynomial basis functions based on the law of mass action. Using these basis functions, there are two approaches to model construction, namely the general to specific and the specific to general approach. We demonstrate that our new methodology reconstructs the chemical reaction steps and connectivity of the glycolytic pathway of Lactococcus lactis from time course experimental data.
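A dictionary of mass-action basis functions can be sketched as follows. This is only an illustration of the general idea under stated assumptions: the dictionary is limited to first- and second-order terms (unimolecular and bimolecular steps), rates are fitted by ordinary least squares on numerically differentiated data, and the general-to-specific or specific-to-general model selection described in the abstract is not implemented. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def mass_action_dictionary(C):
    """Build monomials c_i and c_i * c_j from concentrations C of shape (T, n).

    Returns (terms, Phi): terms is a list of index tuples, and column k of
    Phi holds the product of the species listed in terms[k] at each time.
    """
    n = C.shape[1]
    terms = [(i,) for i in range(n)]
    terms += list(combinations_with_replacement(range(n), 2))
    Phi = np.column_stack([np.prod(C[:, list(term)], axis=1) for term in terms])
    return terms, Phi

def fit_rates(t, C):
    """Least-squares fit of dC/dt onto the dictionary (no term selection)."""
    dCdt = np.gradient(C, t, axis=0, edge_order=2)  # finite-difference slopes
    terms, Phi = mass_action_dictionary(C)
    coef, *_ = np.linalg.lstsq(Phi, dCdt, rcond=None)
    return terms, coef
```

Conservation relations among species make the dictionary columns collinear, so the fitted coefficients are generally not unique even when the fit itself is good. That is one reason a term-selection step of the kind the abstract describes is needed on top of a plain regression.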
Affiliation(s)
- Jeyaraman Srividhya
- Indiana University School of Informatics and Biocomplexity Institute, Bloomington, IN 47406, USA
|
493
|
Loscalzo J, Kohane I, Barabasi AL. Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol Syst Biol 2007; 3:124. [PMID: 17625512 PMCID: PMC1948102 DOI: 10.1038/msb4100163] [Citation(s) in RCA: 375] [Impact Index Per Article: 20.8] [Received: 03/12/2007] [Accepted: 05/04/2007] [Indexed: 02/06/2023] Open
Abstract
Contemporary classification of human disease derives from observational correlation between pathological analysis and clinical syndromes. Characterizing disease in this way established a nosology that has served clinicians well to the current time, and depends on observational skills and simple laboratory tools to define the syndromic phenotype. Yet, this time-honored diagnostic strategy has significant shortcomings that reflect both a lack of sensitivity in identifying preclinical disease, and a lack of specificity in defining disease unequivocally. In this paper, we focus on the latter limitation, viewing it as a reflection both of the different clinical presentations of many diseases (variable phenotypic expression), and of the excessive reliance on Cartesian reductionism in establishing diagnoses. The purpose of this perspective is to provide a logical basis for a new approach to classifying human disease that uses conventional reductionism and incorporates the non-reductionist approach of systems biomedicine.
Affiliation(s)
- Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA.
|
494
|
Abstract
The authors analyze whether heart transplantation still has a role in the current era of complex technologies. To this end, the authors first discuss the known benefits of the different therapeutic modalities currently available for patients who have end-stage heart failure, including pharmacologic management, electrophysiologic therapies, high-risk surgical strategies, implantation of mechanical circulatory support devices, and heart transplantation. The authors then evaluate current developments and future perspectives in the field that may influence the likelihood that heart transplantation will remain the therapeutic modality of choice for end-stage heart failure.
Affiliation(s)
- Martin Cadeiras
- College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA
|
495
|
Amato F, Cosentino C, Curatola W, di Bernardo D. LMI-based Algorithm for the Reconstruction of Biological Networks. American Control Conference 2007. [DOI: 10.1109/acc.2007.4282913] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
496
|
A model-based optimization framework for the inference of regulatory interactions using time-course DNA microarray expression data. BMC Bioinformatics 2007; 8:228. [PMID: 17603872 PMCID: PMC1940027 DOI: 10.1186/1471-2105-8-228] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 10/20/2006] [Accepted: 06/29/2007] [Indexed: 12/02/2022] Open
Abstract
Background Proteins are the primary regulatory agents of transcription even though mRNA expression data alone, from systems like DNA microarrays, are widely used. In addition, the regulation process in genetic systems is inherently non-linear in nature, and most studies employ a time-course analysis of mRNA expression. These considerations should be taken into account in the development of methods for the inference of regulatory interactions in genetic networks. Results We use an S-system based model for the transcription and translation process. We propose an optimization-based regulatory network inference approach that uses time-varying data from DNA microarray analysis. Currently, this seems to be the only model-based method that can be used for the analysis of time-course "relative" expressions (expression ratios). We perform an analysis of the dynamic behavior of the system when the number of experimental samples available is varied, when there are different levels of noise in the data and when there are genes that are not considered by the experimenter. Our studies show that the principal factor affecting the ability of a method to infer interactions correctly is the similarity in the time profiles of some or all the genes. The less similar the profiles are to each other the easier it is to infer the interactions. We propose a heuristic method for resolving networks and show that it displays reasonable performance on a synthetic network. Finally, we validate our approach using real experimental data for a chosen subset of genes involved in the sporulation cascade of Bacillus anthracis. We show that the method captures most of the important known interactions between the chosen genes. Conclusion The performance of any inference method for regulatory interactions between genes depends on the noise in the data, the existence of unknown genes affecting the network genes, and the similarity in the time profiles of some or all genes. 
Though subject to these issues, the inference method proposed in this paper would be useful because of its ability to infer important interactions, the fact that it can be used with time-course DNA microarray data and because it is based on a non-linear model of the process that explicitly accounts for the regulatory role of proteins.
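For orientation, the canonical S-system form is given below. This is the general power-law formalism; the abstract does not state the exact variant the authors use for transcription and translation, so the equation should be read as background rather than as their model.

```latex
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n} X_j^{h_{ij}}, \qquad i = 1, \dots, n
```

Here $X_i$ is the level of component $i$, $\alpha_i \ge 0$ and $\beta_i \ge 0$ are production and degradation rate constants, and $g_{ij}$ and $h_{ij}$ are kinetic orders. A positive $g_{ij}$ means component $j$ promotes production of component $i$, a negative value means it inhibits production, and zero means no direct influence, so inferring the network amounts to estimating which kinetic orders are non-zero.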
|
497
|
Stender JD, Frasor J, Komm B, Chang KCN, Kraus WL, Katzenellenbogen BS. Estrogen-regulated gene networks in human breast cancer cells: involvement of E2F1 in the regulation of cell proliferation. Mol Endocrinol 2007; 21:2112-23. [PMID: 17550982 DOI: 10.1210/me.2006-0474] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Indexed: 02/08/2023] Open
Abstract
Estrogens generally stimulate the proliferation of estrogen receptor (ER)-containing breast cancer cells, but they also suppress proliferation of some ER-positive breast tumors. Using a genome-wide analysis of gene expression in two ER-positive human breast cancer cell lines that differ in their proliferative response to estrogen, we sought to identify genes involved in estrogen-regulated cell proliferation. To this end, we compared the transcriptional profiles of MCF-7 and MDA-MB-231ER+ cells, which have directionally opposite 17beta-estradiol (E2)-dependent proliferation patterns, MCF-7 cells being stimulated and 231ER+ cells suppressed by E2. We identified a set of approximately 70 genes regulated by E2 in both cells, with most being regulated by hormone in an opposite fashion. Using a variety of bioinformatics approaches, we found the E2F binding site to be overrepresented in the potential regulatory regions of many cell cycle-related genes stimulated by estrogen in MCF-7 but inhibited by estrogen in 231ER+ cells. Biochemical analyses confirmed that E2F1 and E2F downstream target genes were increased in MCF-7 and decreased in 231ER+ cells upon estrogen treatment. Furthermore, RNA interference-mediated knockdown of E2F1 blocked estrogen regulation of E2F1 target genes and resulted in loss of estrogen regulation of proliferation. These results demonstrate that regulation by estrogen of E2F1, and subsequently its downstream target genes, is critical for hormone regulation of the proliferative program of these breast cancer cells, and that gene expression profiling combined with bioinformatic analyses of transcription factor binding site enrichment in regulated genes can identify key components associated with nuclear receptor hormonal regulation of important cellular functions.
Affiliation(s)
- Joshua D Stender
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801-3704, USA
|
498
|
Tung TQ, Ryu T, Lee KH, Lee D. Inferring Gene Regulatory Networks from Microarray Time Series Data Using Transfer Entropy. IEEE International Symposium on Computer-Based Medical Systems (CBMS) 2007. [DOI: 10.1109/cbms.2007.60] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
|
499
|
Timme M. Revealing network connectivity from response dynamics. Phys Rev Lett 2007; 98:224101. [PMID: 17677845 DOI: 10.1103/physrevlett.98.224101] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.3] [Received: 08/18/2006] [Indexed: 05/03/2023]
Abstract
We present a method to infer the complete connectivity of a network from its stable response dynamics. As a paradigmatic example, we consider networks of coupled phase oscillators and explicitly study their long-term stationary response to temporally constant driving. For a given driving condition, measuring the phase differences and the collective frequency reveals information about how the units are interconnected. Sufficiently many repetitions for different driving conditions yield the entire network connectivity (the absence or presence of each connection) from measuring the response dynamics only. For sparsely connected networks, we obtain good predictions of the actual connectivity even for formally underdetermined problems.
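The reasoning can be made concrete with a small sketch, assuming Kuramoto-type sinusoidal coupling (an assumption made here for illustration; the abstract only says "coupled phase oscillators"). In a phase-locked state under constant driving I_i, every unit rotates at the collective frequency Omega, so Omega = omega_i + I_i + sum_j J_ij sin(theta_j - theta_i). Each driving experiment therefore contributes one linear equation for row i of the coupling matrix J, and enough experiments determine the row by least squares. The function and argument names are hypothetical.

```python
import numpy as np

def infer_couplings(theta, omega_coll, omega_nat, drive):
    """Recover coupling strengths from phase-locked responses.

    theta:      (M, N) locked phases, one row per driving experiment
    omega_coll: (M,)   collective frequency measured in each experiment
    omega_nat:  (N,)   natural frequencies (assumed known here)
    drive:      (M, N) constant driving applied to each unit
    Returns J of shape (N, N) with a zero diagonal.
    """
    M, N = theta.shape
    J = np.zeros((N, N))
    for i in range(N):
        others = [j for j in range(N) if j != i]
        # One row per experiment: sin(theta_j - theta_i) for every j != i.
        S = np.sin(theta[:, others] - theta[:, [i]])
        rhs = omega_coll - omega_nat[i] - drive[:, i]
        J[i, others] = np.linalg.lstsq(S, rhs, rcond=None)[0]
    return J
```

With fewer experiments than unknowns per row, `lstsq` returns the minimum-norm solution, which is generally not sparse. Handling the underdetermined case for sparse networks, as the abstract mentions, would require a sparsity-promoting solver instead.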
Affiliation(s)
- Marc Timme
- Network Dynamics Group, Max Planck Institute for Dynamics and Self-Organization, and Bernstein Center for Computational Neuroscience, Bunsenstrasse 10, 37073 Göttingen, Germany
|
500
|
Gennemark P, Wedelin D. Efficient algorithms for ordinary differential equation model identification of biological systems. IET Syst Biol 2007; 1:120-9. [PMID: 17441553 DOI: 10.1049/iet-syb:20050098] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022] Open
Abstract
Algorithms for parameter estimation and model selection that identify both the structure and the parameters of an ordinary differential equation model from experimental data are presented. The work presented here focuses on the case of an unknown structure and some time course information available for every variable to be analysed, and this is exploited to make the algorithms as efficient as possible. The algorithms are designed to handle problems of realistic size, where reactions can be nonlinear in the parameters and where data can be sparse and noisy. To achieve computational efficiency, parameters are mostly estimated for one equation at a time, giving a fast and accurate parameter estimation algorithm compared with other algorithms in the literature. The model selection is done with an efficient heuristic search algorithm, where the structure is built incrementally. Two test systems that have previously been used to evaluate identification algorithms are used: a metabolic pathway and a genetic network. Both test systems were successfully identified by using a reasonable amount of simulated data. In addition, realistic levels of measurement noise can be handled. In comparison to other methods that were used for these test systems, the main strengths of the presented algorithms are that a fully specified model, and not only a structure, is identified, and that they are considerably faster than other identification algorithms.
Affiliation(s)
- P Gennemark
- Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg SE-412 96, Sweden.
|