151
Inferring causal networks using fuzzy cognitive maps and evolutionary algorithms with application to gene regulatory network reconstruction. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.08.039] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 11/17/2022]
152
Guner U, Jang H, Realff MJ, Lee JH. An Extended Constrained Total Least-Squares Method for the Identification of Genetic Networks from Noisy Measurements. Ind Eng Chem Res 2015. [DOI: 10.1021/acs.iecr.5b01418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Ugur Guner
  - School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Hong Jang
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
- Matthew J. Realff
  - School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Jay H. Lee
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
153
Takenaka Y, Seno S, Matsuda H. Detecting shifts in gene regulatory networks during time-course experiments at single-time-point temporal resolution. J Bioinform Comput Biol 2015; 13:1543002. [PMID: 26508425 DOI: 10.1142/s0219720015430027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Comprehensively understanding the dynamics of biological systems is one of the greatest challenges in biology. Vastly improved biological technologies have provided vast amounts of information that must be understood by bioinformatics and systems biology researchers. Gene regulations have been frequently modeled by ordinary differential equations or graphical models based on time-course gene expression profiles. The state-of-the-art computational approaches for analyzing gene regulations assume that their models are the same throughout time-course experiments. However, these approaches cannot easily analyze transient changes at a time point, such as diauxic shift. We propose a score that analyzes the gene regulations at each time point. The score is based on the information gains of information criterion values. The method detects the shifts in gene regulatory networks (GRNs) during time-course experiments with single-time-point resolution. The effectiveness of the method is evaluated on the diauxic shift from glucose to lactose in Escherichia coli. Gene regulation shifts were detected at two time points: the first corresponding to the time at which the growth of E. coli ceased and the second corresponding to the end of the experiment, when the nutrient sources (glucose and lactose) had become exhausted. According to these results, the proposed score and method can appropriately detect the time of gene regulation shifts. The method based on the proposed score provides a new tool for analyzing dynamic biological systems. Because the score value indicates the strength of gene regulation at each time point in a gene expression profile, it can potentially infer hidden GRNs from time-course experiments.
Affiliation(s)
- Yoichi Takenaka
  - Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
- Shigeto Seno
  - Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
- Hideo Matsuda
  - Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
154
Liu LZ, Wu FX, Zhang WJ. Properties of sparse penalties on inferring gene regulatory networks from time-course gene expression data. IET Syst Biol 2015; 9:16-24. [PMID: 25569860 DOI: 10.1049/iet-syb.2013.0060] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022] Open
Abstract
Genes regulate each other and form a gene regulatory network (GRN) to realise biological functions. Elucidating GRNs from experimental data remains a challenging problem in systems biology. Numerous techniques have been developed, and sparse linear regression methods have become a promising approach to infer accurate GRNs. However, most linear methods are either based on steady-state gene expression data or their statistical properties are not analysed. Here, two sparse penalties, adaptive least absolute shrinkage and selection operator and smoothly clipped absolute deviation, are proposed to infer GRNs from time-course gene expression data based on an auto-regressive model, and their oracle properties are proved under mild conditions. The effectiveness of those methods is demonstrated by applications to in silico and real biological data.
Affiliation(s)
- Li-Zhi Liu
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Fang-Xiang Wu
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Wen-Jun Zhang
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
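For orientation, the two penalties named in the abstract of entry 154 can be written in their standard textbook forms. This is only a sketch under assumptions: the first-order auto-regressive model, the row-wise least-squares loss, and the symbols $a_i$ (row $i$ of the coefficient matrix), $\tilde{a}_{ij}$ (an initial estimate), $\lambda$, $\gamma$ and $a$ are illustrative notation, not taken from the paper, whose exact formulation may differ.

```latex
% Row-wise penalized least squares for gene i (assumed first-order AR model)
\hat{a}_i = \arg\min_{a_i} \sum_{t=1}^{T-1} \bigl( x_i(t+1) - a_i^{\top} x(t) \bigr)^2
            + \sum_{j=1}^{n} p_{\lambda}\bigl(|a_{ij}|\bigr)

% Adaptive LASSO: weighted L1 penalty, weights taken from an initial estimate \tilde{a}_{ij}
p_{\lambda}\bigl(|a_{ij}|\bigr) = \lambda \, \frac{|a_{ij}|}{|\tilde{a}_{ij}|^{\gamma}}, \qquad \gamma > 0

% SCAD, defined through its derivative (a > 2; a = 3.7 is a commonly used default)
p'_{\lambda}(\theta) = \lambda \Bigl\{ \mathbf{1}(\theta \le \lambda)
            + \frac{(a\lambda - \theta)_{+}}{(a - 1)\lambda} \, \mathbf{1}(\theta > \lambda) \Bigr\},
            \qquad \theta > 0
```

Both penalties shrink small coefficients to exactly zero while penalizing large coefficients less than a plain L1 penalty does, which is the behaviour usually associated with oracle-type results.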
155
Woo JH, Shimoni Y, Yang WS, Subramaniam P, Iyer A, Nicoletti P, Rodríguez Martínez M, López G, Mattioli M, Realubit R, Karan C, Stockwell BR, Bansal M, Califano A. Elucidating Compound Mechanism of Action by Network Perturbation Analysis. Cell 2015; 162:441-451. [PMID: 26186195 DOI: 10.1016/j.cell.2015.05.056] [Citation(s) in RCA: 273] [Impact Index Per Article: 27.3] [Received: 11/24/2014] [Revised: 02/17/2015] [Accepted: 05/28/2015] [Indexed: 01/01/2023]
Abstract
Genome-wide identification of the mechanism of action (MoA) of small-molecule compounds characterizing their targets, effectors, and activity modulators represents a highly relevant yet elusive goal, with critical implications for assessment of compound efficacy and toxicity. Current approaches are labor intensive and mostly limited to elucidating high-affinity binding target proteins. We introduce a regulatory network-based approach that elucidates genome-wide MoA proteins based on the assessment of the global dysregulation of their molecular interactions following compound perturbation. Analysis of cellular perturbation profiles identified established MoA proteins for 70% of the tested compounds and elucidated novel proteins that were experimentally validated. Finally, unknown-MoA compound analysis revealed altretamine, an anticancer drug, as an inhibitor of glutathione peroxidase 4 lipid repair activity, which was experimentally confirmed, thus revealing unexpected similarity to the activity of sulfasalazine. This suggests that regulatory network analysis can provide valuable mechanistic insight into the elucidation of small-molecule MoA and compound similarity.
Affiliation(s)
- Jung Hoon Woo
  - Department of Biomedical Informatics (DBMI), Columbia University, New York, NY 10032, USA
- Yishai Shimoni
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Wan Seok Yang
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Prem Subramaniam
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Archana Iyer
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Paola Nicoletti
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- María Rodríguez Martínez
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Gonzalo López
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Michela Mattioli
  - Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), 20139 Milano, Italy
- Ronald Realubit
  - Columbia Genome Center, High Throughput Screening Facility, Columbia University, New York, NY 10032, USA
- Charles Karan
  - Columbia Genome Center, High Throughput Screening Facility, Columbia University, New York, NY 10032, USA
- Brent R Stockwell
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biological Sciences, Columbia University, New York, NY 10027, USA; Department of Chemistry, Columbia University, New York, NY 10027, USA; Howard Hughes Medical Institute, Columbia University, New York, NY 10027, USA
- Mukesh Bansal
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA
- Andrea Califano
  - Department of Biomedical Informatics (DBMI), Columbia University, New York, NY 10032, USA; Department of Systems Biology, Columbia University, New York, NY 10032, USA; Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Institute for Cancer Genetics, Columbia University, New York, NY 10032, USA; Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY 10032, USA
156
Discriminating direct and indirect connectivities in biological networks. Proc Natl Acad Sci U S A 2015; 112:12893-8. [PMID: 26420864 DOI: 10.1073/pnas.1507168112] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 01/21/2023] Open
Abstract
Reverse engineering of biological pathways involves an iterative process between experiments, data processing, and theoretical analysis. Despite concurrent advances in quality and quantity of data as well as computing resources and algorithms, difficulties in deciphering direct and indirect network connections are prevalent. Here, we adopt the notions of abstraction, emulation, benchmarking, and validation in the context of discovering features specific to this family of connectivities. After subjecting benchmark synthetic circuits to perturbations, we inferred the network connections using a combination of nonparametric single-cell data resampling and modular response analysis. Intriguingly, we discovered that recovered weights of specific network edges undergo divergent shifts under differential perturbations, and that the particular behavior is markedly different between topologies. Our results point to a conceptual advance for reverse engineering beyond weight inference. Investigating topological changes under differential perturbations may address the longstanding problem of discriminating direct and indirect connectivities in biological networks.
157
Kim JR, Choo SM, Choi HS, Cho KH. Identification of Gene Networks with Time Delayed Regulation Based on Temporal Expression Profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:1161-1168. [PMID: 26451827 DOI: 10.1109/tcbb.2015.2394312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
There are fundamental limitations in inferring the functional interaction structure of a gene (regulatory) network only from sequence information such as binding motifs. To overcome such limitations, various approaches have been developed to infer the functional interaction structure from expression profiles. However, most of them have not been so successful due to the experimental limitations and computational complexity. Hence, there is a pressing need to develop a simple but effective methodology that can systematically identify the functional interaction structure of a gene network from time-series expression profiles. In particular, we need to take into account the different time delay effects in gene regulation since they are ubiquitously present. We have considered a new experiment that measures the overall expression changes after a perturbation on a specific gene. Based on this experiment, we have proposed a new inference method that can take account of the time delay induced while the perturbation affects its primary target genes. Specifically, we have developed an algebraic equation from which we can identify the subnetwork structure around the perturbed gene. We have also analyzed the influence of time delay on the inferred network structure. The proposed method is particularly useful for identification of a gene network with small variations in the time delay of gene regulation.
158
Farine DR, Whitehead H. Constructing, conducting and interpreting animal social network analysis. J Anim Ecol 2015; 84:1144-63. [PMID: 26172345 PMCID: PMC4973823 DOI: 10.1111/1365-2656.12418] [Citation(s) in RCA: 500] [Impact Index Per Article: 50.0] [Received: 01/05/2015] [Accepted: 06/25/2015] [Indexed: 11/27/2022]
Abstract
1. Animal social networks are descriptions of social structure which, aside from their intrinsic interest for understanding sociality, can have significant bearing across many fields of biology.
2. Network analysis provides a flexible toolbox for testing a broad range of hypotheses, and for describing the social system of species or populations in a quantitative and comparable manner. However, it requires careful consideration of underlying assumptions, in particular differentiating real from observed networks and controlling for inherent biases that are common in social data.
3. We provide a practical guide for using this framework to analyse animal social systems and test hypotheses. First, we discuss key considerations when defining nodes and edges, and when designing methods for collecting data. We discuss different approaches for inferring social networks from these data and displaying them. We then provide an overview of methods for quantifying properties of nodes and networks, as well as for testing hypotheses concerning network structure and network processes. Finally, we provide information about assessing the power and accuracy of an observed network.
4. Alongside this manuscript, we provide appendices containing background information on common programming routines and worked examples of how to perform network analysis using the R programming language.
5. We conclude by discussing some of the major current challenges in social network analysis and interesting future directions. In particular, we highlight the under-exploited potential of experimental manipulations on social networks to address research questions.
Affiliation(s)
- Damien R Farine
  - Department of Zoology, Edward Grey Institute of Field Ornithology, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK
  - Department of Anthropology (Evolutionary), University of California Davis, 1 Shields Avenue, Davis, CA, 95616, USA
  - Smithsonian Tropical Research Institute, Ancon, Panama
- Hal Whitehead
  - Department of Biology, Dalhousie University, 1355 Oxford St, Halifax, NS, Canada, B3H 4J1
159
Zhang W, Zhou T. A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data. PLoS One 2015. [PMID: 26207991 PMCID: PMC4514654 DOI: 10.1371/journal.pone.0130979] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022] Open
Abstract
Motivation: Identifying gene regulatory networks (GRNs), which consist of a large number of interacting units, has become a problem of paramount importance in systems biology. In many situations, the causal interactions among these units must be reconstructed from measured expression data and other a priori information. Although numerous classical methods have been developed to unravel the interactions of GRNs, these methods either have higher computational complexity or lower estimation accuracy. Great similarities exist between identifying the genes that directly regulate a specific gene and sparse vector reconstruction, which often involves determining the number, location and magnitude of the nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel sparse reconstruction framework to identify the structure of a GRN, so as to increase the accuracy of causal regulation estimates and to reduce their computational complexity.
Results: In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, this approach is well suited to large-scale underdetermined problems of inferring a sparse vector. We investigate how to combine noisy steady-state experiment data with a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art references: the widely adopted total least-squares (TLS) method and the available results from the DREAM project. The results show that, with a lower computational cost, the proposed method can significantly enhance estimation accuracy and greatly reduce false positive and negative errors. Furthermore, numerical calculations demonstrate that the proposed algorithm may have faster convergence speed and smaller fluctuation than other methods when either estimation error or estimation bias is considered.
Affiliation(s)
- Wanhong Zhang
  - School of Chemical Machinery, Qinghai University, Qinghai, China
  - Department of Automation, Tsinghua University, Beijing, China
- Tong Zhou
  - School of Chemical Machinery, Qinghai University, Qinghai, China
  - Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China
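The abstract of entry 159 frames regulator identification as recovering a sparse vector x from an underdetermined system y = Φx. The sketch below illustrates that general idea with iterative soft-thresholding (ISTA) on an L1-penalized least-squares objective. It is an assumption-laden illustration, not the algorithm of the cited paper: the solver choice, the function names, the penalty weight `lam` and the toy matrix `PHI` are all hypothetical.

```python
# Minimal ISTA sketch for sparse recovery from y = Phi x (illustrative only;
# not the algorithm used in the cited paper).


def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal map of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_objective(phi, y, x, lam):
    """Return 0.5 * ||y - Phi x||^2 + lam * ||x||_1."""
    residual = [
        y_i - sum(p_ij * x_j for p_ij, x_j in zip(row, x))
        for row, y_i in zip(phi, y)
    ]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def ista(phi, y, lam, iterations=500):
    """Iterative soft-thresholding for the objective above, starting from zero."""
    n_rows, n_cols = len(phi), len(phi[0])
    # The squared Frobenius norm is a loose but safe upper bound on the
    # Lipschitz constant of the gradient, so 1 / lipschitz is a valid step size.
    # (Assumes phi is not all zeros.)
    lipschitz = sum(p * p for row in phi for p in row)
    x = [0.0] * n_cols
    for _ in range(iterations):
        residual = [
            sum(p_ij * x_j for p_ij, x_j in zip(row, x)) - y_i
            for row, y_i in zip(phi, y)
        ]
        gradient = [
            sum(phi[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        x = [
            soft_threshold(x_j - g_j / lipschitz, lam / lipschitz)
            for x_j, g_j in zip(x, gradient)
        ]
    return x


# Hypothetical toy problem: 2 measurements, 3 candidate regulators.
PHI = [[1.0, 0.5, 0.0],
       [0.0, 0.5, 1.0]]
Y = [1.0, 0.0]  # consistent with a single active regulator, x = [1, 0, 0]
x_hat = ista(PHI, Y, lam=0.05)
```

With a step size of one over the Lipschitz bound, each iteration cannot increase the objective, so the estimate is never worse than the all-zero starting point. In practice a tighter bound or an accelerated variant would usually be preferred.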
160
Nair A, Chetty M, Wangikar PP. Improving gene regulatory network inference using network topology information. Molecular BioSystems 2015; 11:2449-63. [PMID: 26126758 DOI: 10.1039/c5mb00122f] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
Abstract
Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
Affiliation(s)
- Ajay Nair
  - IITB-Monash Research Academy, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
161
Lin W, Wang Y, Ying H, Lai YC, Wang X. Consistency between functional and structural networks of coupled nonlinear oscillators. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics 2015; 92:012912. [PMID: 26274252 DOI: 10.1103/physreve.92.012912] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/24/2015] [Indexed: 06/04/2023]
Abstract
In data-based reconstruction of complex networks, dynamical information can be measured and exploited to generate a functional network, but is it a true representation of the actual (structural) network? That is, when do the functional and structural networks match and is a perfect matching possible? To address these questions, we use coupled nonlinear oscillator networks and investigate the transition in the synchronization dynamics to identify the conditions under which the functional and structural networks are best matched. We find that, as the coupling strength is increased in the weak-coupling regime, the consistency between the two networks first increases and then decreases, reaching a maximum in an optimal coupling regime. Moreover, by changing the network structure, we find that both the optimal regime and the maximum consistency will be affected. In particular, the consistency for heterogeneous networks is generally weaker than that for homogeneous networks. Based on the stability of the functional network, we further propose an efficient method to identify the optimal coupling regime in realistic situations where detailed information about the network structure, such as the network size and the number of edges, is not available. Two real-world examples are given: the corticocortical network of the cat brain and the Nepal power grid. Our results provide new insights not only into the fundamental interplay between network structure and dynamics but also into the development of methodologies to reconstruct complex networks from data.
Affiliation(s)
- Weijie Lin
  - Department of Physics, Zhejiang University, Hangzhou 310027, China
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710062, China
- Yafeng Wang
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710062, China
  - Institute of Theoretical & Computational Physics, Shaanxi Normal University, Xi'an 710062, China
- Heping Ying
  - Department of Physics, Zhejiang University, Hangzhou 310027, China
- Ying-Cheng Lai
  - School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, Arizona 85287, USA
- Xingang Wang
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710062, China
  - Institute of Theoretical & Computational Physics, Shaanxi Normal University, Xi'an 710062, China
162
Shao B, Liu X, Zhang D, Wu J, Ouyang Q. From Boolean Network Model to Continuous Model Helps in Design of Functional Circuits. PLoS One 2015; 10:e0128630. [PMID: 26061094 PMCID: PMC4464762 DOI: 10.1371/journal.pone.0128630] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/31/2015] [Accepted: 04/29/2015] [Indexed: 11/19/2022] Open
Abstract
Computational circuit design with desired functions in a living cell is a challenging task in synthetic biology. To achieve this task, numerous methods that either focus on small scale networks or use evolutionary algorithms have been developed. Here, we propose a two-step approach to facilitate the design of functional circuits. In the first step, the search space of possible topologies for target functions is reduced by reverse engineering using a Boolean network model. In the second step, continuous simulation is applied to evaluate the performance of these topologies. We demonstrate the usefulness of this method by designing an example biological function: the SOS response of E. coli. Our numerical results show that the desired function can be faithfully reproduced by candidate networks with different parameters and initial conditions. Possible circuits are ranked according to their robustness against perturbations in parameter and gene expressions. The biological network is among the candidate networks, yet novel designs can be generated. Our method provides a scalable way to design robust circuits that can achieve complex functions, and makes it possible to uncover design principles of biological networks.
Affiliation(s)
- Bin Shao
  - The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing, China
  - The Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Xiang Liu
  - The Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Dongliang Zhang
  - The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing, China
- Jiayi Wu
  - The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing, China
- Qi Ouyang
  - The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing, China
  - The Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
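The first step described in the abstract of entry 162, narrowing the space of candidate topologies with a Boolean model, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the threshold update rule (a common choice in Boolean models of regulatory networks), the sign-matrix encoding of a topology, the function names and the two-gene `TOGGLE` example are hypothetical and not taken from the paper.

```python
from itertools import product


def step(state, weights):
    """Synchronous threshold update (assumed rule): node i switches on if its
    summed input is positive, off if negative, and keeps its value if zero.
    weights[i][j] is the signed effect of node j on node i."""
    new_state = []
    for i, current in enumerate(state):
        total = sum(w * s for w, s in zip(weights[i], state))
        if total > 0:
            new_state.append(1)
        elif total < 0:
            new_state.append(0)
        else:
            new_state.append(current)
    return tuple(new_state)


def fixed_points(weights):
    """Enumerate all 2^n states and keep those that map to themselves."""
    n = len(weights)
    return [s for s in product((0, 1), repeat=n) if step(s, weights) == s]


def candidate_topologies(n, required_fixed_point):
    """Scan every n x n sign matrix (entries -1, 0, +1) and keep those for
    which `required_fixed_point` is a steady state of the Boolean dynamics."""
    kept = []
    for flat in product((-1, 0, 1), repeat=n * n):
        weights = [list(flat[r * n:(r + 1) * n]) for r in range(n)]
        if step(required_fixed_point, weights) == required_fixed_point:
            kept.append(weights)
    return kept


# Hypothetical two-gene toggle: each gene represses the other.
TOGGLE = [[0, -1],
          [-1, 0]]

steady_states = fixed_points(TOGGLE)
survivors = candidate_topologies(2, (1, 0))
```

Exhaustive enumeration grows as 3^(n*n), so it is only feasible for very small circuits. The surviving topologies would then be passed to a continuous simulation, as in the second step the abstract describes.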
163
Inferring regulatory networks from experimental morphological phenotypes: a computational method reverse-engineers planarian regeneration. PLoS Comput Biol 2015; 11:e1004295. [PMID: 26042810 PMCID: PMC4456145 DOI: 10.1371/journal.pcbi.1004295] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 11/24/2014] [Accepted: 04/21/2015] [Indexed: 01/18/2023] Open
Abstract
Transformative applications in biomedicine require the discovery of complex regulatory networks that explain the development and regeneration of anatomical structures, and reveal what external signals will trigger desired changes of large-scale pattern. Despite recent advances in bioinformatics, extracting mechanistic pathway models from experimental morphological data is a key open challenge that has resisted automation. The fundamental difficulty of manually predicting emergent behavior of even simple networks has limited the models invented by human scientists to pathway diagrams that show necessary subunit interactions but do not reveal the dynamics that are sufficient for complex, self-regulating pattern to emerge. To finally bridge the gap between high-resolution genetic data and the ability to understand and control patterning, it is critical to develop computational tools to efficiently extract regulatory pathways from the resultant experimental shape phenotypes. For example, planarian regeneration has been studied for over a century, but despite increasing insight into the pathways that control its stem cells, no constructive, mechanistic model has yet been found by human scientists that explains more than one or two key features of its remarkable ability to regenerate its correct anatomical pattern after drastic perturbations. We present a method to infer the molecular products, topology, and spatial and temporal non-linear dynamics of regulatory networks recapitulating in silico the rich dataset of morphological phenotypes resulting from genetic, surgical, and pharmacological experiments. We demonstrated our approach by inferring complete regulatory networks explaining the outcomes of the main functional regeneration experiments in the planarian literature. By analyzing all the datasets together, our system inferred the first systems-biology comprehensive dynamical model explaining patterning in planarian regeneration.
This method provides an automated, highly generalizable framework for identifying the underlying control mechanisms responsible for the dynamic regulation of growth and form. Developmental and regenerative biology experiments are producing a huge number of morphological phenotypes from functional perturbation experiments. However, existing pathway models do not generally explain the dynamic regulation of anatomical shape due to the difficulty of inferring and testing non-linear regulatory networks responsible for appropriate form, shape, and pattern. We present a method that automates the discovery and testing of regulatory networks explaining morphological outcomes directly from the resultant phenotypes, producing network models as testable hypotheses explaining regeneration data. Our system integrates a formalization of the published results in planarian regeneration, an in silico simulator in which the patterning properties of regulatory networks can be quantitatively tested in a regeneration assay, and a machine learning module that evolves networks whose behavior in this assay optimally matches the database of planarian results. We applied our method to explain the key experiments in planarian regeneration, and discovered the first comprehensive model of anterior-posterior patterning in planaria under surgical, pharmacological, and genetic manipulations. Beyond the planarian data, our approach is readily generalizable to facilitate the discovery of testable regulatory networks in developmental biology and biomedicine, and represents the first developmental model discovered de novo from morphological outcomes by an automated system.
164
Matos MRA, Knapp B, Kaderali L. lpNet: a linear programming approach to reconstruct signal transduction networks. Bioinformatics 2015; 31:3231-3. [PMID: 26026168 DOI: 10.1093/bioinformatics/btv327] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/17/2014] [Accepted: 05/19/2015] [Indexed: 11/14/2022] Open
Abstract
With the widespread availability of high-throughput experimental technologies, it has become possible to study hundreds to thousands of cellular factors simultaneously, such as coding- or non-coding mRNA or protein concentrations. Still, extracting information about the underlying regulatory or signaling interactions from these data remains a difficult challenge. We present a flexible approach towards network inference based on linear programming. Our method reconstructs the interactions of factors from a combination of perturbation/non-perturbation and steady-state/time-series data. We show on both simulated and real data that our methods are able to reconstruct the underlying networks quickly and efficiently, thus shedding new light on biological processes and, in particular, on the mechanisms of action of diseases. We have implemented the approach as an R package available through Bioconductor. Availability and implementation: This R package is freely available under the GNU General Public License (GPL-3) from bioconductor.org (http://bioconductor.org/packages/release/bioc/html/lpNet.html) and is compatible with most operating systems (Windows, Linux, Mac OS) and hardware architectures. Contact: bettina.knapp@helmholtz-muenchen.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marta R A Matos
  - Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
- Bettina Knapp
  - Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Lars Kaderali
  - Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
|
165
|
Inferring Broad Regulatory Biology from Time Course Data: Have We Reached an Upper Bound under Constraints Typical of In Vivo Studies? PLoS One 2015; 10:e0127364. [PMID: 25984725 PMCID: PMC4435750 DOI: 10.1371/journal.pone.0127364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 04/13/2015] [Indexed: 12/21/2022] Open
Abstract
There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers; however, questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE) models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI) integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE). Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA) and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. The limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
|
166
|
Shao B, Wu J, Tian B, Ouyang Q. Minimum network constraint on reverse engineering to develop biological regulatory networks. J Theor Biol 2015; 380:9-15. [PMID: 25981630 DOI: 10.1016/j.jtbi.2015.05.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/18/2014] [Revised: 04/23/2015] [Accepted: 05/04/2015] [Indexed: 12/20/2022]
Abstract
Reconstructing the topological structure of biological regulatory networks from microarray expression data or protein expression profiles is one of the major tasks in systems biology. In recent years, various mathematical methods have been developed to meet this task. Here, based on our previously reported reverse engineering method, we propose a new constraint, i.e., the minimum network constraint, to facilitate the reconstruction of biological networks. Three well-studied regulatory networks (the budding yeast cell cycle network, the fission yeast cell cycle network, and the SOS network of Escherichia coli) were used as test sets to verify the performance of this method. Numerical results show that biological networks prefer to use minimal networks to fulfill their functional tasks, making it possible to apply minimal network criteria in the network reconstruction process. Two scenarios were considered in the reconstruction process: generating data using different initial conditions, and generating data from knockout and over-expression experiments. In both cases, network structures are revealed faithfully in a few steps using our approach.
Affiliation(s)
- Bin Shao
  - Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences at Peking University, Beijing 100871, China
- Jiayi Wu
  - School of Physics and the State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, Peking University, Beijing 100871, China
- Binghui Tian
  - Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences at Peking University, Beijing 100871, China
- Qi Ouyang
  - Center for Quantitative Biology and Peking-Tsinghua Center for Life Sciences at Peking University, Beijing 100871, China
  - School of Physics and the State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, Peking University, Beijing 100871, China
|
167
|
Grabow C, Grosskinsky S, Kurths J, Timme M. Collective relaxation dynamics of small-world networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2015; 91:052815. [PMID: 26066220 DOI: 10.1103/physreve.91.052815] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/29/2015] [Indexed: 06/04/2023]
Abstract
Complex networks exhibit a wide range of collective dynamic phenomena, including synchronization, diffusion, relaxation, and coordination processes. Their asymptotic dynamics is generically characterized by the local Jacobian, graph Laplacian, or a similar linear operator. The structure of networks with regular, small-world, and random connectivities is reasonably well understood, but their collective dynamical properties remain largely unknown. Here we present a two-stage mean-field theory to derive analytic expressions for network spectra. A single formula covers the spectrum from regular via small-world to strongly randomized topologies in Watts-Strogatz networks, explaining the simultaneous dependencies on network size N, average degree k, and topological randomness q. We present simplified analytic predictions for the second-largest and smallest eigenvalue, and numerical checks confirm our theoretical predictions for zero, small, and moderate topological randomness q, including the entire small-world regime. For large q of the order of one, we apply standard random matrix theory, thereby covering the full range from regular to randomized network topologies. These results may contribute to our analytic and mechanistic understanding of collective relaxation phenomena of network dynamical systems.
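As a rough illustration of the regular limit only (q = 0), the sketch below builds the graph Laplacian of a ring lattice and compares its numerical spectrum with the standard closed form for circulant matrices. It does not reproduce the two-stage mean-field theory of the paper; the function names `ring_laplacian` and `analytic_spectrum` and the values n = 20, k = 4 are assumptions chosen for illustration.

```python
import numpy as np

def ring_laplacian(n, k):
    """Graph Laplacian of a ring of n nodes, each linked to its k nearest
    neighbours (k even, k < n). This is a Watts-Strogatz graph with q = 0."""
    adj = np.zeros((n, n))
    for i in range(n):
        for d in range(1, k // 2 + 1):
            adj[i, (i + d) % n] = 1.0
            adj[(i + d) % n, i] = 1.0
    return np.diag(adj.sum(axis=1)) - adj

def analytic_spectrum(n, k):
    """Eigenvalues of the ring Laplacian from the circulant-matrix formula:
    lambda_m = k - 2 * sum_{d=1}^{k/2} cos(2*pi*m*d/n), m = 0..n-1."""
    modes = np.arange(n)
    offsets = np.arange(1, k // 2 + 1)
    return k - 2.0 * np.cos(2.0 * np.pi * np.outer(modes, offsets) / n).sum(axis=1)

n, k = 20, 4  # illustrative sizes, not taken from the paper
numeric = np.sort(np.linalg.eigvalsh(ring_laplacian(n, k)))
analytic = np.sort(analytic_spectrum(n, k))
print(np.allclose(numeric, analytic))
```

Rewiring edges with probability q > 0 breaks the circulant symmetry, which is where an approximation such as the mean-field treatment described in the abstract becomes necessary.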
Affiliation(s)
- Carsten Grabow
  - Research Domain on Transdisciplinary Concepts and Methods, Potsdam Institute for Climate Impact Research, P.O. Box 60 12 03, 14412 Potsdam, Germany
  - Network Dynamics, Max Planck Institute for Dynamics and Self-Organization (MPIDS), 37077 Göttingen, Germany
- Stefan Grosskinsky
  - Mathematics Institute and Centre for Complexity Science, University of Warwick, Coventry CV4 7AL, United Kingdom
- Jürgen Kurths
  - Research Domain on Transdisciplinary Concepts and Methods, Potsdam Institute for Climate Impact Research, P.O. Box 60 12 03, 14412 Potsdam, Germany
  - Department of Physics, Humboldt University of Berlin, Newtonstr. 15, 12489 Berlin, Germany
  - Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen AB24 3UE, United Kingdom
- Marc Timme
  - Network Dynamics, Max Planck Institute for Dynamics and Self-Organization (MPIDS), 37077 Göttingen, Germany
  - Institute for Nonlinear Dynamics, Faculty for Physics, Georg August University Göttingen, 37077 Göttingen, Germany
  - Bernstein Center for Computational Neuroscience Göttingen, 37077 Göttingen, Germany
|
168
|
Albert R, Thakar J. Boolean modeling: a logic-based dynamic approach for understanding signaling and regulatory networks and for making useful predictions. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2015; 6:353-69. [PMID: 25269159 DOI: 10.1002/wsbm.1273] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 11/09/2022]
Abstract
The biomolecules inside or near cells form a complex interacting system. Cellular phenotypes and behaviors arise from the totality of interactions among the components of this system. A fruitful way of modeling interacting biomolecular systems is by network-based dynamic models that characterize each component by a state variable, and describe the change in the state variables due to the interactions in the system. Dynamic models can capture the stable state patterns of this interacting system and can connect them to different cell fates or behaviors. A Boolean or logic model characterizes each biomolecule by a binary state variable that relates the abundance of that molecule to a threshold abundance necessary for downstream processes. The regulation of this state variable is described in a parameter-free manner, making Boolean modeling a practical choice for systems whose kinetic parameters have not been determined. Boolean models integrate the body of knowledge regarding the components and interactions of biomolecular systems, and capture the system's dynamic repertoire, for example the existence of multiple cell fates. These models have been used for a variety of systems and have led to important insights and predictions. Boolean models serve as efficient exploratory models, as guides for follow-up experiments, and as a foundation for more quantitative models.
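The idea of binary state variables with parameter-free update rules can be sketched with a toy network. The three nodes A, B and C and their rules (A and B repress each other, C follows A) are hypothetical and chosen only so that two stable states exist, loosely mirroring the "multiple cell fates" mentioned above; the synchronous update scheme is likewise an assumption, not something the review prescribes.

```python
from itertools import product

def step(state):
    """Synchronous update of a hypothetical 3-node network (A, B, C).
    A and B repress each other; C is activated by A."""
    a, b, _c = state
    return (not b, not a, a)

def fixed_points():
    """Enumerate all 2**3 states and keep those unchanged by one update."""
    return [s for s in product((False, True), repeat=3) if step(s) == s]

def trajectory(state, max_steps=10):
    """Follow the dynamics until a previously visited state recurs."""
    seen = [state]
    for _ in range(max_steps):
        state = step(state)
        revisited = state in seen
        seen.append(state)
        if revisited:
            break
    return seen

print(fixed_points())                        # the two stable "fates"
print(trajectory((True, False, False))[-1])  # settles into the A-on fate
```

Starting from a state with A and B both off, the same rules produce a two-state oscillation under synchronous updating, which is one reason update schemes matter when interpreting such models.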
|
169
|
Bastiaens P, Birtwistle MR, Blüthgen N, Bruggeman FJ, Cho KH, Cosentino C, de la Fuente A, Hoek JB, Kiyatkin A, Klamt S, Kolch W, Legewie S, Mendes P, Naka T, Santra T, Sontag E, Westerhoff HV, Kholodenko BN. Silence on the relevant literature and errors in implementation. Nat Biotechnol 2015; 33:336-9. [PMID: 25850052 DOI: 10.1038/nbt.3185] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 02/01/2023]
Affiliation(s)
- Philippe Bastiaens
  - Department of Systemic Cell Biology, Max Planck Institute of Molecular Physiology, Dortmund, Germany
- Marc R Birtwistle
  - Icahn School of Medicine at Mount Sinai, Dept. of Pharmacology and Systems Therapeutics, New York, New York, USA
- Nils Blüthgen
  - Institut für Pathologie Charite, Universitätsmedizin Berlin, Campus Mitte, Berlin, Germany
  - Integrative Research Institute for the Life Sciences, Humboldt University Berlin, Berlin, Germany
- Kwang-Hyun Cho
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Yuseong-gu, Daejeon, Republic of Korea
- Carlo Cosentino
  - Department of Experimental and Clinical Medicine, Magna Graecia University of Catanzaro, Campus Salvatore Venuta, Catanzaro, Italy
- Alberto de la Fuente
  - Department of Biomathematics and Bioinformatics, Institute for Genetics and Biometry, Leibniz Institute for Farm Animal Biology, Dummerstorf, Mecklenburg-Vorpommern, Germany
- Jan B Hoek
  - Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, Pennsylvania, USA
- Anatoly Kiyatkin
  - Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Steffen Klamt
  - Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
- Walter Kolch
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
  - Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland
  - School of Medicine and Medical Science, University College Dublin, Belfield, Dublin, Ireland
- Pedro Mendes
  - Manchester Centre for Integrative Systems Biology, Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
  - School of Computer Science, Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
  - Center for Quantitative Medicine, University of Connecticut Health Center, Farmington, Connecticut, USA
- Takashi Naka
  - Faculty of Information Science, Kyushu Sangyo University, Higashi-ku, Fukuoka, Japan
- Tapesh Santra
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
- Eduardo Sontag
  - Department of Mathematics and Cancer Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
- Hans V Westerhoff
  - Manchester Centre for Integrative Systems Biology, Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
  - Molecular Cell Physiology, VU University, Amsterdam, The Netherlands
  - Synthetic Systems Biology, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
- Boris N Kholodenko
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
  - Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland
  - School of Medicine and Medical Science, University College Dublin, Belfield, Dublin, Ireland
|
170
|
Zhang X, Zhao J, Hao JK, Zhao XM, Chen L. Conditional mutual inclusive information enables accurate quantification of associations in gene regulatory networks. Nucleic Acids Res 2015; 43:e31. [PMID: 25539927 PMCID: PMC4357691 DOI: 10.1093/nar/gku1315] [Citation(s) in RCA: 98] [Impact Index Per Article: 9.8] [Received: 09/15/2014] [Revised: 12/03/2014] [Accepted: 12/05/2014] [Indexed: 11/13/2022] Open
Abstract
Mutual information (MI), a quantity describing the nonlinear dependence between two random variables, has been widely used to construct gene regulatory networks (GRNs). Despite its good performance, MI cannot separate the direct regulations from indirect ones among genes. Although the conditional mutual information (CMI) is able to identify the direct regulations, it generally underestimates the regulation strength, i.e. it may result in false negatives when inferring gene regulations. In this work, to overcome the problems, we propose a novel concept, namely conditional mutual inclusive information (CMI2), to describe the regulations between genes. Furthermore, with CMI2, we develop a new approach, namely CMI2NI (CMI2-based network inference), for reverse-engineering GRNs. In CMI2NI, CMI2 is used to quantify the mutual information between two genes given a third one through calculating the Kullback-Leibler divergence between the postulated distributions of including and excluding the edge between the two genes. The benchmark results on the GRNs from DREAM challenge as well as the SOS DNA repair network in Escherichia coli demonstrate the superior performance of CMI2NI. Specifically, even for gene expression data with small sample size, CMI2NI can not only infer the correct topology of the regulation networks but also accurately quantify the regulation strength between genes. As a case study, CMI2NI was also used to reconstruct cancer-specific GRNs using gene expression data from The Cancer Genome Atlas (TCGA). CMI2NI is freely accessible at http://www.comp-sysbio.org/cmi2ni.
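To make the contrast between MI and CMI concrete, the sketch below uses the closed-form expressions that hold for jointly Gaussian variables, where both quantities reduce to log-determinants of covariance submatrices. It illustrates plain MI and CMI only, not the CMI2 measure proposed in the paper; the function name `gaussian_cmi`, the Gaussian assumption and the simulated chain X → Z → Y are all assumptions made for illustration.

```python
import numpy as np

def gaussian_cmi(data, i, j, cond=()):
    """I(X_i; X_j | X_cond) in nats, assuming the columns of `data` are
    jointly Gaussian. With an empty `cond` this is ordinary MI."""
    cov = np.cov(data, rowvar=False)

    def logdet(idx):
        idx = list(idx)
        if not idx:
            return 0.0  # determinant of an empty matrix is 1
        return np.linalg.slogdet(cov[np.ix_(idx, idx)])[1]

    cond = list(cond)
    return 0.5 * (logdet([i] + cond) + logdet([j] + cond)
                  - logdet(cond) - logdet([i, j] + cond))

# Simulated regulatory chain X -> Z -> Y with no direct X -> Y edge.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)
data = np.column_stack([x, z, y])

mi_xy = gaussian_cmi(data, 0, 2)          # indirect link still shows up in MI
cmi_xy_z = gaussian_cmi(data, 0, 2, [1])  # near zero once Z is conditioned on
print(f"MI(X;Y) = {mi_xy:.3f}, CMI(X;Y|Z) = {cmi_xy_z:.4f}")
```

In this chain the MI between X and Y is clearly positive although the regulation is indirect, while conditioning on Z removes almost all of it, which is the behaviour the abstract attributes to MI and CMI respectively.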
Affiliation(s)
- Xiujun Zhang
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - Department of Mathematics, Xinyang Normal University, Xinyang 464000, China
  - School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore 637459, Singapore
- Juan Zhao
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Jin-Kao Hao
  - LERIA, Department of Computer Science, University of Angers, Angers 49045, France
- Xing-Ming Zhao
  - Department of Computer Science, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
- Luonan Chen
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - Collaborative Research Center for Innovative Mathematical Modelling, Institute of Industrial Science, University of Tokyo, Tokyo 153-8505, Japan
|
171
|
Gong W, Koyano-Nakagawa N, Li T, Garry DJ. Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data. BMC Bioinformatics 2015; 16:74. [PMID: 25887857 PMCID: PMC4359553 DOI: 10.1186/s12859-015-0460-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 07/23/2014] [Accepted: 01/12/2015] [Indexed: 02/07/2023] Open
Abstract
Background: Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. Results: We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoters or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10^−100), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately −9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (−9435 to −8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. Conclusion: We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0460-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wuming Gong
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
- Naoko Koyano-Nakagawa
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
- Tongbin Li
  - AccuraScience LLC, 5721 Merle Hay Road, Suite #16B, Johnston, IA, 50131, USA
- Daniel J Garry
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
|
172
|
Linde J, Schulze S, Henkel SG, Guthke R. Data- and knowledge-based modeling of gene regulatory networks: an update. EXCLI JOURNAL 2015; 14:346-78. [PMID: 27047314 PMCID: PMC4817425 DOI: 10.17179/excli2015-168] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 01/29/2015] [Accepted: 02/10/2015] [Indexed: 02/01/2023]
Abstract
Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of next-generation sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss in detail its application to network inference. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions.
Affiliation(s)
- Jörg Linde
  - Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
- Sylvie Schulze
  - Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
- Reinhard Guthke
  - Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
|
173
|
Abstract
Behaviours of complex biomolecular systems are often irreducible to the elementary properties of their individual components. Explanatory and predictive mathematical models are therefore useful for fully understanding and precisely engineering cellular functions. The development and analyses of these models require their adaptation to the problems that need to be solved and the type and amount of available genetic or molecular data. Quantitative and logic modelling are among the main methods currently used to model molecular and gene networks. Each approach comes with inherent advantages and weaknesses. Recent developments show that hybrid approaches will become essential for further progress in synthetic biology and in the development of virtual organisms.
Affiliation(s)
- Nicolas Le Novère
  - Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
|
174
|
Prabakaran S, Gunawardena J, Sontag E. Paradoxical results in perturbation-based signaling network reconstruction. Biophys J 2015; 106:2720-8. [PMID: 24940789 DOI: 10.1016/j.bpj.2014.04.031] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 01/23/2014] [Revised: 04/18/2014] [Accepted: 04/23/2014] [Indexed: 11/29/2022] Open
Abstract
Mathematical models are extensively employed to understand physicochemical processes in biological systems. In the absence of detailed mechanistic knowledge, models are often based on network inference methods, which in turn rely upon perturbations to nodes by biochemical means. We have discovered a potential pitfall of the approach underpinning such methods when applied to signaling networks. We first show experimentally, and then explain mathematically, how even in the simplest signaling systems, perturbation methods may lead to paradoxical conclusions: for any given pair of two components X and Y, and depending upon the specific intervention on Y, either an activation or a repression of X could be inferred. This effect is of a different nature from incomplete network identification due to underdetermined data and is a phenomenon intrinsic to perturbations. Our experiments are performed in an in vitro minimal system, thus isolating the effect and showing that it cannot be explained by feedbacks due to unknown intermediates. Moreover, our in vitro system utilizes proteins from a pathway in mammalian (and other eukaryotic) cells that play a central role in proliferation, gene expression, differentiation, mitosis, cell survival, and apoptosis. This pathway is the perturbation target of contemporary therapies for various types of cancers. The results presented here show that the simplistic view of intracellular signaling networks being made up of activation and repression links is seriously misleading, and call for a fundamental rethinking of signaling network analysis and inference methods.
Affiliation(s)
- Jeremy Gunawardena
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts
- Eduardo Sontag
  - Department of Mathematics & BioMaPs Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey
|
175
|
Aghdam R, Ganjali M, Zhang X, Eslahchi C. CN: a consensus algorithm for inferring gene regulatory networks using the SORDER algorithm and conditional mutual information test. MOLECULAR BIOSYSTEMS 2015; 11:942-9. [PMID: 25607659 DOI: 10.1039/c4mb00413b] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 01/03/2023]
Abstract
Inferring Gene Regulatory Networks (GRNs) from gene expression data is a major challenge in systems biology. The Path Consistency (PC) algorithm is one of the popular methods in this field. However, as an order dependent algorithm, PC algorithm is not robust because it achieves different network topologies if gene orders are permuted. In addition, the performance of this algorithm depends on the threshold value used for independence tests. Consequently, selecting suitable sequential ordering of nodes and an appropriate threshold value for the inputs of PC algorithm are challenges to infer a good GRN. In this work, we propose a heuristic algorithm, namely SORDER, to find a suitable sequential ordering of nodes. Based on the SORDER algorithm and a suitable interval threshold for Conditional Mutual Information (CMI) tests, a network inference method, namely the Consensus Network (CN), has been developed. In the proposed method, for each edge of the complete graph, a weighted value is defined. This value is considered as the reliability value of dependency between two nodes. The final inferred network, obtained using the CN algorithm, contains edges with a reliability value of dependency of more than a defined threshold. The effectiveness of this method is benchmarked through several networks from the DREAM challenge and the widely used SOS DNA repair network in Escherichia coli. The results indicate that the CN algorithm is suitable for learning GRNs and it considerably improves the precision of network inference. The source of data sets and codes are available at .
Affiliation(s)
- Rosa Aghdam
  - Faculty of Mathematical Sciences, Department of Statistics, Shahid Beheshti University, G.C., Tehran, Iran
|
176
|
Liu LZ, Wu FX, Zhang WJ. Properties of sparse penalties on inferring gene regulatory networks from time-course gene expression data. IET Syst Biol 2015. [PMID: 25569860 DOI: 10.1049/iet-syb.2013.0060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022] Open
Abstract
Genes regulate each other and form a gene regulatory network (GRN) to realise biological functions. Elucidating GRNs from experimental data remains a challenging problem in systems biology. Numerous techniques have been developed, and sparse linear regression methods have become a promising approach to inferring accurate GRNs. However, most linear methods are either based on steady-state gene expression data or lack an analysis of their statistical properties. Here, two sparse penalties, adaptive least absolute shrinkage and selection operator and smoothly clipped absolute deviation, are proposed to infer GRNs from time-course gene expression data based on an auto-regressive model, and their Oracle properties are proved under mild conditions. The effectiveness of those methods is demonstrated by applications to in silico and real biological data.
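For orientation, the adaptive LASSO penalty named above can be written out for a simple case. The block below assumes a first-order auto-regressive model with p genes and T time points; the model order, the symbols and the choice of initial estimate are assumptions for illustration, since the abstract does not give the paper's exact formulation.

```latex
% Assumed first-order auto-regressive model of expression dynamics
x(t+1) = A\,x(t) + \varepsilon(t), \qquad t = 1, \dots, T-1

% Adaptive LASSO estimate of row i of A (regulators of gene i)
\hat{a}_i = \arg\min_{a_i}
  \sum_{t=1}^{T-1} \bigl( x_i(t+1) - a_i^{\top} x(t) \bigr)^2
  + \lambda \sum_{j=1}^{p} w_{ij}\, \lvert a_{ij} \rvert,
\qquad
w_{ij} = \frac{1}{\lvert \tilde{a}_{ij} \rvert^{\gamma}}, \quad \gamma > 0
```

Here \tilde{a}_{ij} denotes an initial consistent estimate (for example, ordinary least squares), so that coefficients that start out small receive a heavier penalty and are more readily shrunk to zero.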
Affiliation(s)
- Li-Zhi Liu
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Fang-Xiang Wu
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Wen-Jun Zhang
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
|
177
|
Tjärnberg A, Nordling TEM, Studham M, Nelander S, Sonnhammer ELL. Avoiding pitfalls in L1-regularised inference of gene networks. MOLECULAR BIOSYSTEMS 2015; 11:287-96. [DOI: 10.1039/c4mb00419a] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/15/2023]
Abstract
L1 regularisation methods fail to infer the correct network even when the data are so informative that all existing links can be proven to exist.
Affiliation(s)
- Andreas Tjärnberg
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, 17121 Solna, Sweden
  - Department of Biochemistry and Biophysics
- Torbjörn E. M. Nordling
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, 17121 Solna, Sweden
  - Department of Immunology
- Matthew Studham
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, 17121 Solna, Sweden
- Sven Nelander
  - Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck laboratory, 75185 Uppsala
- Erik L. L. Sonnhammer
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, 17121 Solna, Sweden
  - Department of Biochemistry and Biophysics
|
178
|
Chan SC, Zhang L, Wu HC, Tsui KM. A Maximum A Posteriori Probability and Time-Varying Approach for Inferring Gene Regulatory Networks from Time Course Gene Microarray Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:123-135. [PMID: 26357083 DOI: 10.1109/tcbb.2014.2343951] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]
Abstract
Unlike most conventional techniques, which assume a static model, this paper aims to estimate the time-varying model parameters and identify significant genes involved at different timepoints from time course gene microarray data. We first formulate the parameter identification problem as a new maximum a posteriori probability estimation problem so that prior information can be incorporated as regularization terms to reduce the large estimation variance of the high-dimensional estimation problem. Under this framework, sparsity and temporal consistency of the model parameters are imposed using L1-regularization and novel continuity constraints, respectively. The resulting problem is solved using the L-BFGS method with the initial guess obtained from the partial least squares method. A novel forward validation measure is also proposed for the selection of regularization parameters, based on both forward and current prediction errors. The proposed method is evaluated using a synthetic benchmark data set and a publicly available yeast Saccharomyces cerevisiae cell cycle microarray data set. For the latter in particular, a number of significant genes identified at different timepoints are found to be biologically significant according to previous findings in biological experiments. These results suggest that the proposed approach may serve as a valuable tool for inferring time-varying gene regulatory networks in biological studies.
|
179
|
Zhang Z, Zheng Z, Niu H, Mi Y, Wu S, Hu G. Solving the inverse problem of noise-driven dynamic networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2015; 91:012814. [PMID: 25679664 DOI: 10.1103/physreve.91.012814] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/19/2014] [Indexed: 06/04/2023]
Abstract
Massive amounts of data are now available for analysis in natural and social systems, and the task of inferring system structures from the data, i.e., the inverse problem, has become one of the central issues across a wide range of interdisciplinary fields. In this paper, we study the inverse problem of dynamic complex networks driven by white noise. A simple and universal inference formula, the double correlation matrices and noise-decorrelation (DCMND) method, is derived analytically, and numerical simulations confirm that the DCMND method can accurately depict both network structures and noise correlations by using available output data only. Such inference performance has not previously been regarded as possible by theoretical derivation, numerical computation, and experimental design.
Affiliation(s)
- Zhaoyang Zhang
- Department of Physics, Beijing Normal University, Beijing 100875, China
| | - Zhigang Zheng
- Department of Physics, Beijing Normal University, Beijing 100875, China
| | - Haijing Niu
- State Key Laboratory of Cognitive Neuroscience and Learning and International Digital Group (IDG)/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China and Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China
| | - Yuanyuan Mi
- State Key Laboratory of Cognitive Neuroscience and Learning and International Digital Group (IDG)/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China and Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China
| | - Si Wu
- State Key Laboratory of Cognitive Neuroscience and Learning and International Digital Group (IDG)/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China and Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China
| | - Gang Hu
- Department of Physics, Beijing Normal University, Beijing 100875, China
|
180
|
Controlling networks of nonlinearly-coupled nodes using response surfaces. Sci Rep 2014; 4:7574. [PMID: 25524558 PMCID: PMC4271252 DOI: 10.1038/srep07574] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2014] [Accepted: 12/02/2014] [Indexed: 11/08/2022] Open
Abstract
Control of complex processes is a major goal of network analyses. Most approaches to control nonlinearly coupled systems require the network topology and/or network dynamics. Unfortunately, neither the full set of participating nodes nor the network topology is known for many important systems. On the other hand, system responses to perturbations are often easily measured. We show how the collection of such responses, a response surface, can be used for network control. Analyses of model systems show that response surfaces are smooth and hence can be approximated using low-order polynomials. Importantly, these approximations are largely insensitive to stochastic fluctuations in data or measurement errors. They can be used to compute how a small set of nodes needs to be altered in order to direct the network close to a pre-specified target state. These ideas, illustrated on a nonlinear electrical circuit, can prove useful in many contexts, including the reprogramming of cellular states.
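The low-order polynomial approximation mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' code: the `response` function, the noise level, the polynomial degree, and the target value are all hypothetical choices made for illustration. The sketch fits a cubic to noisy perturbation responses and inverts the fit to find the perturbation that should bring the output close to a target.

```python
import numpy as np

def response(u):
    # Hypothetical smooth, nonlinear steady-state response of one node
    # to a perturbation u applied at another node (an assumption).
    return np.tanh(0.8 * u) + 0.1 * u

rng = np.random.default_rng(1)
u_samples = np.linspace(-1.0, 1.0, 21)
# Noisy measurements of the response (noise level is an assumption).
y_samples = response(u_samples) + 0.01 * rng.standard_normal(u_samples.size)

# Approximate the one-dimensional "response surface" with a cubic polynomial.
coeffs = np.polyfit(u_samples, y_samples, deg=3)

# Solve p(u) = target by finding the roots of p(u) - target.
target = 0.5
roots = np.roots(coeffs - np.array([0.0, 0.0, 0.0, target]))
real_roots = roots[np.isreal(roots)].real
# Keep the real root of smallest magnitude, i.e. the smallest perturbation.
u_star = real_roots[np.argmin(np.abs(real_roots))]
```

Applying `u_star` to the hypothetical system and comparing `response(u_star)` with `target` gives a quick check of how well the polynomial surrogate stands in for the true response within the sampled range.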
|
181
|
Chang YH, Gray JW, Tomlin CJ. Exact reconstruction of gene regulatory networks using compressive sensing. BMC Bioinformatics 2014; 15:400. [PMID: 25495633 PMCID: PMC4308013 DOI: 10.1186/s12859-014-0400-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2014] [Accepted: 11/27/2014] [Indexed: 02/04/2023] Open
Abstract
Background: We consider the problem of reconstructing a gene regulatory network structure from limited time series gene expression data, without any a priori knowledge of connectivity. We assume that the network is sparse, meaning the connectivity among genes is much less than full connectivity. We develop a method for network reconstruction based on compressive sensing, which takes advantage of the network’s sparseness. Results: For the case in which all genes are accessible for measurement, and there is no measurement noise, we show that our method can be used to exactly reconstruct the network. For the more general problem, in which hidden genes exist and all measurements are contaminated by noise, we show that our method leads to reliable reconstruction. In both cases, coherence of the model is used to assess the ability to reconstruct the network and to design new experiments. We demonstrate that it is possible to use the coherence distribution to guide biological experiment design effectively. By collecting a more informative dataset, the proposed method helps reduce the cost of experiments. For each problem, a set of numerical examples is presented. Conclusions: The method provides a guarantee on how well the inferred graph structure represents the underlying system, reveals deficiencies in the data and model, and suggests experimental directions to remedy the deficiencies.
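The sparse-recovery idea behind compressive sensing can be sketched with a generic L1-regularized least-squares solver. The code below is not the method of the paper: it uses plain iterative soft-thresholding (ISTA), and the matrix sizes, the regularization weight `lam`, and the three-regulator ground truth are hypothetical values chosen for illustration. It recovers the sparse incoming edges of one target gene from fewer noise-free measurements than unknowns.

```python
import numpy as np

def ista(A, y, lam=0.1, n_iter=20000):
    """Iterative soft-thresholding for min_x 0.5*||A x - y||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
n_samples, n_genes = 30, 40                    # fewer measurements than unknowns
A = rng.standard_normal((n_samples, n_genes))  # hypothetical expression measurements
x_true = np.zeros(n_genes)
x_true[[3, 17, 31]] = [1.5, -2.0, 1.0]         # three assumed regulators of the target
y = A @ x_true                                 # noise-free response of the target gene

x_hat = ista(A, y)
# Indices of the three largest recovered coefficients.
top3 = sorted(int(i) for i in np.argsort(-np.abs(x_hat))[:3])
```

In this noise-free, strongly sparse setting the largest recovered coefficients are expected to coincide with the true regulators; with noise or hidden variables, as the abstract notes, only approximate reconstruction can be expected.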
Affiliation(s)
- Young Hwan Chang
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, 94720, CA, USA.
| | - Joe W Gray
- Department of Biomedical Engineering and the Center for Spatial Systems Biomedicine, Oregon Health and Science University, Portland, OR, USA.
| | - Claire J Tomlin
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, 94720, CA, USA; Faculty Scientist, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
|
182
|
Sontag ED. A technique for determining the signs of sensitivities of steady states in chemical reaction networks. IET Syst Biol 2014; 8:251-67. [PMID: 25478700 DOI: 10.1049/iet-syb.2014.0025] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
This paper studies the direction of change of steady states to parameter perturbations in chemical reaction networks, and, in particular, to changes in conserved quantities. Theoretical considerations lead to the formulation of a computational procedure that provides a set of possible signs of such sensitivities. The procedure is purely algebraic and combinatorial, only using information on stoichiometry, and is independent of the values of kinetic constants. Three examples of important intracellular signal transduction models are worked out as an illustration. In these examples, the set of signs found is minimal, but there is no general guarantee that the set found will always be minimal in other examples. The paper also briefly discusses the relationship of the sign problem to the question of uniqueness of steady states in stoichiometry classes.
Affiliation(s)
- Eduardo D Sontag
- Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA.
|
183
|
Bag S, Anbarasu A. Revealing the Strong Functional Association of adipor2 and cdh13 with adipoq: A Gene Network Study. Cell Biochem Biophys 2014; 71:1445-56. [PMID: 25388841 DOI: 10.1007/s12013-014-0367-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
|
184
|
Bouffier AM, Arnold J, Schüttler HB. A MINE alternative to D-optimal designs for the linear model. PLoS One 2014; 9:e110234. [PMID: 25356931 PMCID: PMC4214713 DOI: 10.1371/journal.pone.0110234] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2014] [Accepted: 09/16/2014] [Indexed: 12/04/2022] Open
Abstract
Doing large-scale genomics experiments can be expensive, and so experimenters want to get the most information out of each experiment. To this end the Maximally Informative Next Experiment (MINE) criterion for experimental design was developed. Here we explore this idea in a simplified context, the linear model. Four variations of the MINE method for the linear model were created: MINE-like, MINE, MINE with random orthonormal basis, and MINE with random rotation. Each method varies in how it maximizes the MINE criterion. Theorem 1 establishes sufficient conditions for the maximization of the MINE criterion under the linear model. Theorem 2 establishes when the MINE criterion is equivalent to the classic design criterion of D-optimality. By simulation under the linear model, we establish that the MINE with random orthonormal basis and MINE with random rotation are faster to discover the true linear relation with regression coefficients and observations when . We also establish in simulations with , , and 1000 replicates that these two variations of MINE also display a lower false positive rate than the MINE-like method and additionally, for a majority of the experiments, for the MINE method.
Affiliation(s)
- Amanda M. Bouffier
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, United States of America
| | - Jonathan Arnold
- Genetics Department, University of Georgia, Athens, Georgia, United States of America
| | - H. Bernd Schüttler
- Physics and Astronomy Department, University of Georgia, Athens, Georgia, United States of America
|
185
|
Ghasemi O, Ma Y, Lindsey ML, Jin YF. Using systems biology approaches to understand cardiac inflammation and extracellular matrix remodeling in the setting of myocardial infarction. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2014; 6:77-91. [PMID: 24741709 DOI: 10.1002/wsbm.1248] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
Inflammation and extracellular matrix (ECM) remodeling are important components regulating the response of the left ventricle to myocardial infarction (MI). Significant cellular- and molecular-level contributors can be identified by analyzing data acquired through high-throughput genomic and proteomic technologies that provide expression levels for thousands of genes and proteins. Large-scale data provide both temporal and spatial information that need to be analyzed and interpreted using systems biology approaches in order to integrate this information into dynamic models that predict and explain mechanisms of cardiac healing post-MI. In this review, we summarize the systems biology approaches needed to computationally simulate post-MI remodeling, including data acquisition, data analysis for biomarker classification and identification, data integration to build dynamic models, and data interpretation for biological functions. An example for applying a systems biology approach to ECM remodeling is presented as a reference illustration.
|
186
|
Cai C, Chen L, Jiang X, Lu X. Modeling signal transduction from protein phosphorylation to gene expression. Cancer Inform 2014; 13:59-67. [PMID: 25392684 PMCID: PMC4216050 DOI: 10.4137/cin.s13883] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2014] [Revised: 05/04/2014] [Accepted: 05/04/2014] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Signaling networks are of great importance for understanding the cell’s regulatory mechanisms. The rise of large-scale genomic and proteomic data, together with prior biological knowledge, has paved the way for the reconstruction and discovery of novel signaling pathways in a data-driven manner. In this study, we investigate computational methods that integrate proteomic and transcriptomic data to identify signaling pathways transmitting signals in response to specific stimuli. Such methods can be applied to cancer genomic data to infer perturbed signaling pathways. METHOD We proposed a novel Bayesian Network (BN) framework to integrate transcriptomic data with proteomic data reflecting protein phosphorylation states for the purpose of identifying the pathways transmitting the signal of diverse stimuli in rat and human cells. We represented the proteins and genes as nodes in a BN in which edges reflect the regulatory relationship between signaling proteins. We designed an efficient inference algorithm that incorporated the prior knowledge of pathways and searched for a network structure in a data-driven manner. RESULTS We applied our method to infer rat- and human-specific networks given gene expression and proteomic datasets. We were able to effectively identify sparse signaling networks that modeled the observed transcriptomic and proteomic data. Our methods were able to identify distinct signaling pathways for rat and human cells in a data-driven manner, based on the fact that rat and human cells exhibited distinct transcriptomic and proteomic responses to a common set of stimuli. Our model performed well in the SBV IMPROVER challenge in comparison to other models addressing the same task. The capability of inferring signaling pathways in a data-driven fashion may contribute to cancer research by identifying distinct aberrations in signaling pathways underlying heterogeneous cancer subtypes.
Affiliation(s)
- Chunhui Cai
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Lujia Chen
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Xia Jiang
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Xinghua Lu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
|
187
|
Studham ME, Tjärnberg A, Nordling TEM, Nelander S, Sonnhammer ELL. Functional association networks as priors for gene regulatory network inference. Bioinformatics 2014; 30:i130-8. [PMID: 24931976 PMCID: PMC4058914 DOI: 10.1093/bioinformatics/btu285] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Motivation: Gene regulatory network (GRN) inference reveals the influences genes have on one another in cellular regulatory systems. If the experimental data are inadequate for reliable inference of the network, informative priors have been shown to improve the accuracy of inferences. Results: This study explores the potential of undirected, confidence-weighted networks, such as those in functional association databases, as a prior source for GRN inference. Such networks often erroneously indicate symmetric interaction between genes and may contain mostly correlation-based interaction information. Despite these drawbacks, our testing on synthetic datasets indicates that even noisy priors reflect some causal information that can improve GRN inference accuracy. Our analysis on yeast data indicates that using the functional association databases FunCoup and STRING as priors can give a small improvement in GRN inference accuracy with biological data. Contact: matthew.studham@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matthew E Studham
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
| | - Andreas Tjärnberg
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
| | - Torbjörn E M Nordling
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
| | - Sven Nelander
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
| | - Erik L L Sonnhammer
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
|
188
|
Mostafavi S, Ortiz-Lopez A, Bogue MA, Hattori K, Pop C, Koller D, Mathis D, Benoist C. Variation and genetic control of gene expression in primary immunocytes across inbred mouse strains. THE JOURNAL OF IMMUNOLOGY 2014; 193:4485-96. [PMID: 25267973 DOI: 10.4049/jimmunol.1401280] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
To determine the breadth and underpinning of changes in immunocyte gene expression due to genetic variation in mice, we performed, as part of the Immunological Genome Project, gene expression profiling for CD4(+) T cells and neutrophils purified from 39 inbred strains of the Mouse Phenome Database. Considering both cell types, a large number of transcripts showed significant variation across the inbred strains, with 22% of the transcriptome varying by 2-fold or more. These included 119 loci with apparent complete loss of function, where the corresponding transcript was not expressed in some of the strains, representing a useful resource of "natural knockouts." We identified 1222 cis-expression quantitative trait loci (cis-eQTL) that control some of this variation. Most (60%) cis-eQTLs were shared between T cells and neutrophils, but a significant portion uniquely impacted one of the cell types, suggesting cell type-specific regulatory mechanisms. Using a conditional regression algorithm, we predicted regulatory interactions between transcription factors and potential targets, and we demonstrated that these predictions overlap with regulatory interactions inferred from transcriptional changes during immunocyte differentiation. Finally, comparison of these and parallel data from CD4(+) T cells of healthy humans demonstrated intriguing similarities in variability of a gene's expression: the most variable genes tended to be the same in both species, and there was an overlap in genes subject to strong cis-acting genetic variants. We speculate that this "conservation of variation" reflects a differential constraint on intraspecies variation in expression levels of different genes, either through lower pressure for some genes, or by favoring variability for others.
Affiliation(s)
- Sara Mostafavi
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | - Adriana Ortiz-Lopez
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115; and
| | | | - Kimie Hattori
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115; and
| | - Cristina Pop
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | - Daphne Koller
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | - Diane Mathis
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115; and
| | - Christophe Benoist
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115; and
|
189
|
Ravindranath AC, Perualila-Tan N, Kasim A, Drakakis G, Liggi S, Brewerton SC, Mason D, Bodkin MJ, Evans DA, Bhagwat A, Talloen W, Göhlmann HWH, Shkedy Z, Bender A. Connecting gene expression data from connectivity map and in silico target predictions for small molecule mechanism-of-action analysis. MOLECULAR BIOSYSTEMS 2014; 11:86-96. [PMID: 25254964 DOI: 10.1039/c4mb00328d] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]
Abstract
Integrating gene expression profiles with certain proteins can improve our understanding of the fundamental mechanisms in protein-ligand binding. This paper spotlights the integration of gene expression data and target prediction scores, providing insight into mechanism of action (MoA). Compounds are clustered based upon the similarity of their predicted protein targets and each cluster is linked to gene sets using Linear Models for Microarray Data. MLP analysis is used to generate gene sets based upon their biological processes and a qualitative search is performed on the homogeneous target-based compound clusters to identify pathways. Genes and proteins were linked through pathways for 6 of the 8 MCF7 and 6 of the 11 PC3 clusters. Three compound clusters are studied. (i) The target-driven cluster involving the HSP90 inhibitors geldanamycin and tanespimycin induces differential expression of HSP90-related genes and overlaps with the pathway response to unfolded protein. Gene expression results are in agreement with target prediction, and pathway annotations add information to enable understanding of MoA. (ii) The antipsychotic cluster shows differential expression of the genes LDLR and INSIG-1 and is predicted to target CYP2D6. The pathway steroid metabolic process links the protein and respective genes, generating a hypothesis for the MoA of antipsychotics. A sub-cluster (verapamil and dexverapamil), although sharing similar protein targets with the antipsychotic drug cluster, has a lower intensity of expression profile on related genes, indicating that this method distinguishes close sub-clusters and suggests differences in their MoA. (iii) The thiazolidinedione drug cluster is predicted to target the peroxisome proliferator-activated receptors (PPAR) PPAR-alpha and PPAR-gamma and acyl-CoA desaturase, and shows significant differential expression of the genes ANGPTL4, FABP4 and PRKCD. The targets and genes are linked via the PPAR signalling pathway and induction of apoptosis, generating a hypothesis for the MoA of thiazolidinediones. Our analysis shows one or more underlying MoA for the compounds, and the findings are well substantiated by the literature.
Affiliation(s)
- Aakash Chavan Ravindranath
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK.
|
190
|
Oates CJ, Dondelinger F, Bayani N, Korkola J, Gray JW, Mukherjee S. Causal network inference using biochemical kinetics. Bioinformatics 2014; 30:i468-74. [PMID: 25161235 PMCID: PMC4147905 DOI: 10.1093/bioinformatics/btu452] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Networks are widely used as structural summaries of biochemical systems. Statistical estimation of networks is usually based on linear or discrete models. However, the dynamics of biochemical systems are generally non-linear, suggesting that suitable non-linear formulations may offer gains with respect to causal network inference and aid in associated prediction problems. RESULTS We present a general framework for network inference and dynamical prediction using time course data that is rooted in non-linear biochemical kinetics. This is achieved by considering a dynamical system based on a chemical reaction graph with associated kinetic parameters. Both the graph and kinetic parameters are treated as unknown; inference is carried out within a Bayesian framework. This allows prediction of dynamical behavior even when the underlying reaction graph itself is unknown or uncertain. Results, based on (i) data simulated from a mechanistic model of mitogen-activated protein kinase signaling and (ii) phosphoproteomic data from cancer cell lines, demonstrate that non-linear formulations can yield gains in causal network inference and permit dynamical prediction and uncertainty quantification in the challenging setting where the reaction graph is unknown. AVAILABILITY AND IMPLEMENTATION MATLAB R2014a software is available to download from warwick.ac.uk/chrisoates. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chris J Oates
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
| | - Frank Dondelinger
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
| | - Nora Bayani
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
| | - James Korkola
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
| | - Joe W Gray
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
| | - Sach Mukherjee
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, MRC Biostatistics Unit, Cambridge, CB2 0SR, UK, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239-3098, USA and School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
|
191
|
Adabor ES, Acquaah-Mensah GK, Oduro FT. SAGA: a hybrid search algorithm for Bayesian Network structure learning of transcriptional regulatory networks. J Biomed Inform 2014; 53:27-35. [PMID: 25181467 DOI: 10.1016/j.jbi.2014.08.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2014] [Revised: 08/17/2014] [Accepted: 08/22/2014] [Indexed: 11/16/2022]
Abstract
Bayesian Networks have been used for the inference of transcriptional regulatory relationships among genes, and are valuable for obtaining biological insights. However, finding an optimal Bayesian Network (BN) is NP-hard. Thus, heuristic approaches have sought to effectively solve this problem. In this work, we develop a hybrid search method combining Simulated Annealing with a Greedy Algorithm (SAGA). SAGA explores most of the search space by undergoing a two-phase search: first with a Simulated Annealing search and then with a Greedy search. Three background-corrected and normalized microarray datasets were used to test the algorithm. BN structure learning was also conducted on the same datasets using other established search methods as implemented in BANJO (Bayesian Network Inference with Java Objects). The Bayesian Dirichlet Equivalence (BDe) metric was used to score the networks produced with SAGA. SAGA predicted transcriptional regulatory relationships among genes in networks that evaluated to higher BDe scores with high sensitivities and specificities. Thus, the proposed method competes well with existing search algorithms for Bayesian Network structure learning of transcriptional regulatory networks.
Affiliation(s)
- Emmanuel S Adabor
- Department of Mathematics, Kwame Nkrumah University of Science and Technology, PMB, Kumasi, Ghana.
| | - George K Acquaah-Mensah
- Pharmaceutical Sciences Department, Massachusetts College of Pharmacy and Health Sciences (MCPHS University), 19 Foster Street, Worcester, MA, USA
| | - Francis T Oduro
- Department of Mathematics, Kwame Nkrumah University of Science and Technology, PMB, Kumasi, Ghana
| |
|
192
|
Reconstructing propagation networks with natural diversity and identifying hidden sources. Nat Commun 2014; 5:4323. [PMID: 25014310 PMCID: PMC4104449 DOI: 10.1038/ncomms5323] [Citation(s) in RCA: 134] [Impact Index Per Article: 12.2] [Received: 10/05/2013] [Accepted: 06/06/2014] [Indexed: 11/26/2022] Open
Abstract
Our ability to uncover complex network structure and dynamics from data is fundamental to understanding and controlling collective dynamics in complex systems. Despite recent progress in this area, reconstructing networks with stochastic dynamical processes from limited time series remains an outstanding problem. Here we develop a framework based on compressed sensing to reconstruct complex networks on which stochastic spreading dynamics take place. We apply the methodology to a large number of model and real networks, finding that a full reconstruction of inhomogeneous interactions can be achieved from small amounts of polarized (binary) data, a virtue of compressed sensing. Further, we demonstrate that a hidden source that triggers the spreading process but is externally inaccessible can be ascertained and located with high confidence in the absence of direct routes of propagation from it. Our approach thus establishes a paradigm for tracing and controlling epidemic invasion and information diffusion in complex networked systems. The structure of many complex systems is usually difficult to determine. Zhesi Shen et al. adapt a signal-processing technique known as compressed sensing to reconstruct the dynamics and structure of a complex propagation network from a small amount of time series data.
|
193
|
Zhu F, Guan Y. Predicting dynamic signaling network response under unseen perturbations. Bioinformatics 2014; 30:2772-8. [PMID: 24919880 DOI: 10.1093/bioinformatics/btu382] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
MOTIVATION: Predicting trajectories of signaling networks under complex perturbations is one of the most valuable, but challenging, tasks in systems biology. Signaling networks are involved in most biological pathways, and modeling their dynamics has wide applications including drug design and treatment outcome prediction. RESULTS: In this paper, we report a novel model for predicting the cell type-specific time course response of signaling proteins under unseen perturbations. This algorithm achieved the top performance in the 2013 8th Dialogue for Reverse Engineering Assessments and Methods (DREAM 8) subchallenge: time course prediction in breast cancer cell lines. We formulate trajectory prediction as a standard regularization problem, so that the task reduces to solving a discrete ill-posed problem. The algorithm includes three steps: denoising, estimating regression coefficients and modeling trajectories under unseen perturbations. We further validated the accuracy of this method against simulation and experimental data. Furthermore, this method reduces computational time by orders of magnitude compared to state-of-the-art methods, allowing genome-wide modeling of signaling pathways and time course trajectories to be carried out in a practical time. AVAILABILITY AND IMPLEMENTATION: Source code is available at http://guanlab.ccmb.med.umich.edu/DREAM/code.html and as a supplementary file online.
Affiliation(s)
- Fan Zhu
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
194
|
Santra T. A Bayesian framework that integrates heterogeneous data for inferring gene regulatory networks. Front Bioeng Biotechnol 2014; 2:13. [PMID: 25152886 PMCID: PMC4126456 DOI: 10.3389/fbioe.2014.00013] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 01/14/2014] [Accepted: 04/28/2014] [Indexed: 11/29/2022] Open
Abstract
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein-protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.
Affiliation(s)
- Tapesh Santra
- Systems Biology Ireland, University College Dublin, Dublin, Ireland
| |
|
195
|
Li X, Wu L, Liu W, Jin Y, Chen Q, Wang L, Fan X, Li Z, Cheng Y. A network pharmacology study of Chinese medicine QiShenYiQi to reveal its underlying multi-compound, multi-target, multi-pathway mode of action. PLoS One 2014; 9:e95004. [PMID: 24817581 PMCID: PMC4015902 DOI: 10.1371/journal.pone.0095004] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Received: 09/21/2013] [Accepted: 03/21/2014] [Indexed: 01/04/2023] Open
Abstract
Chinese medicine is a complex system guided by traditional Chinese medicine (TCM) theories, which has proven to be especially effective in treating chronic and complex diseases. However, the underlying modes of action (MOA) are not always systematically investigated. Herein, a systematic study was designed to elucidate the multi-compound, multi-target and multi-pathway MOA of a Chinese medicine, QiShenYiQi (QSYQ), on myocardial infarction. QSYQ is composed of Astragalus membranaceus (Huangqi), Salvia miltiorrhiza (Danshen), Panax notoginseng (Sanqi), and Dalbergia odorifera (Jiangxiang). Male Sprague Dawley rat models of myocardial infarction were administered QSYQ intragastrically for 7 days, while the control group was not treated. The differentially expressed genes (DEGs) were identified from the myocardial infarction rat model treated with QSYQ, followed by construction of a cardiovascular disease (CVD)-related multilevel compound-target-pathway network connecting the main compounds to those DEGs supported by literature evidence and to the pathways that are functionally enriched in ArrayTrack. Fifty-five potential targets of QSYQ were identified, of which 14 were confirmed in the CVD-related literature with supporting experimental evidence. Furthermore, three sesquiterpene components of QSYQ, Trans-nerolidol, (3S,6S,7R)-3,7,11-trimethyl-3,6-epoxy-1,10-dodecadien-7-ol and (3S,6R,7R)-3,7,11-trimethyl-3,6-epoxy-1,10-dodecadien-7-ol from Dalbergia odorifera T. Chen, were validated experimentally in this study. Their anti-inflammatory effects and potential targets including extracellular signal-regulated kinase-1/2, peroxisome proliferator-activated receptor-gamma and heme oxygenase-1 were identified. Finally, through a three-level compound-target-pathway network with experimental analysis, our study depicts a complex MOA of QSYQ on myocardial infarction.
Affiliation(s)
- Xiang Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Leihong Wu
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Wei Liu
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Yecheng Jin
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Qian Chen
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Linli Wang
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Zheng Li
- State Key Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, China
| | - Yiyu Cheng
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| |
|
196
|
Wu S, Liu ZP, Qiu X, Wu H. Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations. PLoS One 2014; 9:e95276. [PMID: 24802016 PMCID: PMC4011728 DOI: 10.1371/journal.pone.0095276] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/29/2013] [Accepted: 03/26/2014] [Indexed: 12/20/2022] Open
Abstract
The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline (detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis) is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data illustrates the potential of our pipeline to provide valuable insights into systematic modeling of complicated biological processes.
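As a rough illustration of the linear ODE model named in the abstract above, the sketch below integrates dx/dt = A x with forward Euler for a made-up two-module network. It is not the authors' pipeline: the interaction matrix `A`, the step size, and the function name `simulate_linear_ode` are all assumptions, and the structure identification and parameter refinement steps are not shown.

```python
def simulate_linear_ode(A, x0, dt, n_steps):
    """Forward-Euler integration of dx/dt = A x; returns the list of states."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(n_steps):
        # Each module's rate of change is a weighted sum of all module levels.
        dx = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
        x = [x_i + dt * dx_i for x_i, dx_i in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical two-module network: module 0 decays on its own,
# module 1 is activated by module 0 and also decays.
A = [[-1.0, 0.0],
     [0.5, -0.5]]
traj = simulate_linear_ode(A, x0=[1.0, 0.0], dt=0.1, n_steps=50)
```

In an inference setting the direction is reversed: the trajectory is observed and the nonzero entries of `A` are what the pipeline estimates.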
Affiliation(s)
- Shuang Wu
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, United States of America
| | - Zhi-Ping Liu
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, United States of America
| | - Xing Qiu
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, United States of America
| | - Hulin Wu
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, United States of America
| |
|
197
|
Lee J, Tiwari A, Shum V, Mills GB, Mancini MA, Igoshin OA, Balázsi G. Unraveling the regulatory connections between two controllers of breast cancer cell fate. Nucleic Acids Res 2014; 42:6839-49. [PMID: 24792166 PMCID: PMC4066784 DOI: 10.1093/nar/gku360] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023] Open
Abstract
Estrogen receptor alpha (ERα) expression is critical for breast cancer classification, high ERα expression being associated with better prognosis. ERα levels strongly correlate with those of GATA binding protein 3 (GATA3), a major regulator of ERα expression. However, the mechanistic details of ERα-GATA3 regulation remain incompletely understood. Here we combine mathematical modeling with perturbation experiments to unravel the nature of regulatory connections in the ERα-GATA3 network. Through cell population-average, single-cell and single-nucleus measurements, we show that the cross-regulation between ERα and GATA3 amounts to overall negative feedback. Further, mathematical modeling reveals that GATA3 positively regulates its own expression and that ERα autoregulation is most likely absent. Lastly, we show that the two cross-regulatory connections in the ERα-GATA3 negative feedback network decrease the noise in ERα or GATA3 expression. This may ensure robust cell fate maintenance in the face of intracellular and environmental fluctuations, contributing to tissue homeostasis in normal conditions, but also to the maintenance of pathogenic states during cancer progression.
Affiliation(s)
- Jinho Lee
- Department of Systems Biology - Unit 950, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA; Department of Bioengineering, Rice University, Houston, TX 77005, USA
| | - Abhinav Tiwari
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
| | - Victor Shum
- Department of Systems Biology - Unit 950, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA; Department of Physics, University of Houston, Houston, TX 77004, USA
| | - Gordon B Mills
- Department of Systems Biology - Unit 950, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
| | - Michael A Mancini
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
| | - Oleg A Igoshin
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
| | - Gábor Balázsi
- Department of Systems Biology - Unit 950, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
| |
|
198
|
Idowu MA. Cyclin-Dependent Kinases as Drug Targets for Cell Growth and Proliferation Disorders. A Role for Systems Biology Approach in Drug Development. Part II—CDKs as Drug Targets in Hypertrophic Cell Growth. Modelling of Drugs Targeting CDKs. Biotechnol Biotechnol Equip 2014. [DOI: 10.5504/bbeq.2011.0142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022] Open
|
199
|
Wang YXR, Huang H. Review on statistical methods for gene network reconstruction using expression data. J Theor Biol 2014; 362:53-61. [PMID: 24726980 DOI: 10.1016/j.jtbi.2014.03.040] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Received: 01/26/2014] [Revised: 03/29/2014] [Accepted: 03/31/2014] [Indexed: 12/16/2022]
Abstract
Network modeling has proven to be a fundamental tool in analyzing the inner workings of a cell. It has revolutionized our understanding of biological processes and made significant contributions to the discovery of disease biomarkers. Much effort has been devoted to reconstructing various types of biochemical networks using functional genomic datasets generated by high-throughput technologies. This paper discusses statistical methods used to reconstruct gene regulatory networks using gene expression data. In particular, we highlight progress made and challenges yet to be met in the problems involved in estimating gene interactions, inferring causality and modeling temporal changes of regulation behaviors. As rapid advances in technologies have made available diverse, large-scale genomic data, we also survey methods of incorporating all these additional data to achieve better, more accurate inference of gene networks.
Affiliation(s)
- Y X Rachel Wang
- Department of Statistics, University of California, Berkeley, CA 94720, USA.
| | - Haiyan Huang
- Department of Statistics, University of California, Berkeley, CA 94720, USA.
| |
|
200
|
Jia B, Wang X. Regularized EM algorithm for sparse parameter estimation in nonlinear dynamic systems with application to gene regulatory network inference. EURASIP J Bioinform Syst Biol 2014; 2014:5. [PMID: 24708632 PMCID: PMC3998071 DOI: 10.1186/1687-4153-2014-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/26/2013] [Accepted: 02/26/2014] [Indexed: 11/10/2022]
Abstract
Parameter estimation in dynamic systems finds applications in various disciplines, including systems biology. The well-known expectation-maximization (EM) algorithm is a popular method and has been widely used to solve system identification and parameter estimation problems. However, the conventional EM algorithm cannot exploit sparsity, whereas in gene regulatory network inference problems the parameters to be estimated often exhibit sparse structure. In this paper, a regularized expectation-maximization (rEM) algorithm for sparse parameter estimation in nonlinear dynamic systems is proposed that is based on maximum a posteriori (MAP) estimation and can incorporate a sparse prior. The expectation step involves forward Gaussian approximation filtering and backward Gaussian approximation smoothing. The maximization step employs a re-weighted iterative thresholding method. The proposed algorithm is then applied to gene regulatory network inference. Results based on both synthetic and real data show the effectiveness of the proposed algorithm.
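The general flavor of a re-weighted thresholding step, as mentioned for the maximization step in the abstract above, can be sketched as follows. This is an assumption-laden illustration, not the rEM algorithm from the paper: the weighting rule `lam / (|current| + eps)`, the function names, and the input values are hypothetical, and the expectation step (Gaussian approximation filtering and smoothing) is omitted entirely.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_thresholding(raw, lam, n_iter=10, eps=1e-3):
    """Repeatedly soft-threshold the raw estimates with per-entry thresholds.

    Entries that are currently small get a large threshold (pushed to zero),
    while entries that are currently large are shrunk only slightly.
    """
    current = list(raw)
    for _ in range(n_iter):
        thresholds = [lam / (abs(c) + eps) for c in current]
        current = [soft_threshold(r, t) for r, t in zip(raw, thresholds)]
    return current


# Hypothetical unregularized parameter estimates for four candidate interactions.
raw_estimates = [2.0, 0.05, -1.5, 0.0]
sparse_estimates = reweighted_thresholding(raw_estimates, lam=0.1)
```

Here the two weak entries are driven exactly to zero while the two strong ones keep most of their magnitude, which is the sparsity-promoting behavior the abstract refers to.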
Affiliation(s)
- Bin Jia
- Intelligent Fusion Technology, Germantown, Inc., MD 20876, USA
| | - Xiaodong Wang
- Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
| |
|