1
Magaña-López G, Calzone L, Zinovyev A, Paulevé L. scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics. PLoS Comput Biol 2024; 20:e1011620. [PMID: 38976751 PMCID: PMC11257695 DOI: 10.1371/journal.pcbi.1011620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 07/18/2024] [Accepted: 06/24/2024] [Indexed: 07/10/2024] Open
Abstract
Boolean networks are widely employed to model the qualitative dynamics of cell fate processes by describing the change of binary activation states of genes and transcription factors with time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-seq, is a cornerstone for data-driven model construction and validation. On one hand, scRNA-seq binarisation is a key step for inferring and validating Boolean models. On the other hand, the generation of synthetic scRNA-seq data from baseline Boolean models provides an important asset to benchmark inference methods. However, linking characteristics of scRNA-seq datasets, including dropout events, with Boolean states is a challenging task. We present scBoolSeq, a method for the bidirectional linking of scRNA-seq data and Boolean activation state of genes. Given a reference scRNA-seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as either unimodal, bimodal, or zero-inflated, and fits a probabilistic model of dropouts, with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-seq datasets and generate synthetic scRNA-seq datasets from Boolean traces, as issued from Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq's binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-seq data generated by scBoolSeq with BoolODE's data for the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in two-dimensional projections of the data.
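To make the notion of a "Boolean trace" concrete, here is a minimal Python sketch of a synchronous Boolean network update. It is only an illustration under stated assumptions: the two-gene network, the names `step`, `trace`, and `TOY_RULES`, and the synchronous update scheme are all hypothetical and are not taken from scBoolSeq or its API.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rule = Callable[[State], bool]


def step(state: State, rules: Dict[str, Rule]) -> State:
    """Synchronous update: every gene is recomputed from the previous state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trace(initial: State, rules: Dict[str, Rule], n_steps: int) -> List[State]:
    """Return the initial state followed by n_steps successive states."""
    states = [dict(initial)]
    for _ in range(n_steps):
        states.append(step(states[-1], rules))
    return states


# Hypothetical toy network (an assumption for illustration only):
# gene A is repressed by B, gene B is activated by A.
TOY_RULES: Dict[str, Rule] = {
    "A": lambda s: not s["B"],
    "B": lambda s: s["A"],
}

TOY_TRACE = trace({"A": True, "B": False}, TOY_RULES, 4)

if __name__ == "__main__":
    for t, state in enumerate(TOY_TRACE):
        print(t, state)
```

A trace of this kind, a time-ordered list of binary gene states, is the sort of object from which synthetic expression values could then be sampled; the sampling step itself is not sketched here.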
Affiliation(s)
- Laurence Calzone
  - Institut Curie, Université PSL, Paris, France
  - INSERM, U900, Paris, France
  - Mines ParisTech, Université PSL, Paris, France
- Loïc Paulevé
  - Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France

2
Llanos CD, Xie T, Lim HE, Segatori L. A Computational Modeling Approach for the Design of Genetic Control Systems that Respond to Transcriptional Activity. Methods Mol Biol 2024; 2774:99-117. [PMID: 38441761 DOI: 10.1007/978-1-0716-3718-0_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
Recent progress in synthetic biology has enabled the design of complex genetic circuits that interface with innate cellular functions, such as gene transcription, and control user-defined outputs. Implementing these genetic networks in mammalian cells, however, is a cumbersome process that requires several steps of optimization and benefits from the use of predictive modeling. Combining deterministic mathematical models with software-based numerical computing platforms allows researchers to quickly design, evaluate, and optimize multiple circuit topologies to establish experimental constraints that generate the desired control systems. In this chapter, we present a systematic approach based on predictive mathematical modeling to guide the design and construction of gene activity-based sensors. This approach enables user-driven circuit optimization through iterations of sensitivity analyses and parameter scans, providing a universal method to engineer sense and respond cells for diverse applications.
Affiliation(s)
- Carlos D Llanos
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- Tianyi Xie
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Ha Eun Lim
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Laura Segatori
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA.
  - Department of Bioengineering, Rice University, Houston, TX, USA.
  - Department of Chemical and Biochemical Engineering, Rice University, Houston, TX, USA.
  - Department of Biosciences, Rice University, Houston, TX, USA.

3
A Platform Technology for Monitoring the Unfolded Protein Response. Methods Mol Biol 2022; 2378:45-67. [PMID: 34985693 PMCID: PMC10053305 DOI: 10.1007/978-1-0716-1732-8_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
The unfolded protein response (UPR) is a complex signal transduction pathway that remodels gene expression in response to proteotoxic stress in the endoplasmic reticulum (ER) and is linked to the development of a range of diseases, including Alzheimer's disease, diabetes, and several types of cancer. UPR induction is typically monitored by measuring the expression level of UPR marker genes. Most tools for quantifying gene expression, including DNA microarrays and quantitative PCR with reverse transcription (RT-PCR), produce snapshots of the cell transcriptome, but are not ideal for measurements requiring temporal resolution of gene expression dynamics. Reporter assays for indirect detection of the UPR typically rely on extrachromosomal expression of reporters under the control of minimal or synthetic regulatory sequences that do not recapitulate the native chromosomal context of the UPR target genes. To address the need for tools to monitor chromosomal gene expression that recapitulate gene expression dynamics from the native chromosomal context and generate a readily detectable signal output, we developed a gene signal amplifier platform that links transcriptional and post-translational regulation of a fluorescent output to the expression of a chromosomal gene marker of the UPR. The platform is based on a genetic circuit that amplifies the output signal with high sensitivity and dynamic resolution and is implemented through chromosomal integration of the gene encoding the main control element of the genetic circuit to link its expression to that of the target gene, thereby generating a platform that can be easily adapted to monitor any UPR target through integration of the main control element at the appropriate chromosomal locus. 
By recapitulating the transcriptional and translational control mechanisms underlying the expression of UPR targets with high sensitivity, this platform provides a novel technology for monitoring the UPR with superior sensitivity and dynamic resolution.

4
Kim JK, Tyson JJ. Misuse of the Michaelis-Menten rate law for protein interaction networks and its remedy. PLoS Comput Biol 2020; 16:e1008258. [PMID: 33090989 PMCID: PMC7581366 DOI: 10.1371/journal.pcbi.1008258] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/14/2022] Open
Abstract
For over a century, the Michaelis-Menten (MM) rate law has been used to describe the rates of enzyme-catalyzed reactions and gene expression. Despite the ubiquity of the MM rate law, it accurately captures the dynamics of underlying biochemical reactions only so long as it is applied under the right condition, namely, that the substrate is in large excess over the enzyme-substrate complex. Unfortunately, in circumstances where its validity condition is not satisfied, especially so in protein interaction networks, the MM rate law has frequently been misused. In this review, we illustrate how inappropriate use of the MM rate law distorts the dynamics of the system, provides mistaken estimates of parameter values, and makes false predictions of dynamical features such as ultrasensitivity, bistability, and oscillations. We describe how these problems can be resolved with a slightly modified form of the MM rate law, based on the total quasi-steady state approximation (tQSSA). Furthermore, we show that the tQSSA can be used for accurate stochastic simulations at a lower computational cost than using the full set of mass-action rate laws. This review describes how to use quasi-steady state approximations in the right context, to prevent drawing erroneous conclusions from in silico simulations.
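The contrast between the two rate laws can be sketched numerically. The Python snippet below is a minimal illustration, not the authors' code: the function names and parameter values are hypothetical, the tQSSA expression is the commonly used quadratic-root form from the enzyme kinetics literature (it is not written out in the abstract), and the "misuse" shown is the assumed case of plugging total substrate into the standard MM formula when enzyme is not scarce.

```python
import math


def mm_rate(kcat: float, e_total: float, s: float, km: float) -> float:
    """Standard Michaelis-Menten rate: kcat * E_T * S / (K_M + S)."""
    return kcat * e_total * s / (km + s)


def tqssa_rate(kcat: float, e_total: float, s_total: float, km: float) -> float:
    """tQSSA rate: kcat * C, where C is the quasi-steady-state complex level.

    C is the smaller root of C^2 - (E_T + S_T + K_M) * C + E_T * S_T = 0,
    written in a form that avoids subtractive cancellation.
    """
    b = e_total + s_total + km
    disc = math.sqrt(b * b - 4.0 * e_total * s_total)
    complex_level = 2.0 * e_total * s_total / (b + disc)
    return kcat * complex_level


# Hypothetical parameter sets (assumptions for illustration only).
# Enzyme scarce relative to substrate: the two laws nearly coincide.
LOW_E = (mm_rate(1.0, 0.01, 10.0, 1.0), tqssa_rate(1.0, 0.01, 10.0, 1.0))
# Enzyme in excess, as can happen in protein interaction networks:
# MM with total substrate overshoots, tQSSA stays bounded.
HIGH_E = (mm_rate(1.0, 10.0, 1.0, 1.0), tqssa_rate(1.0, 10.0, 1.0, 1.0))

if __name__ == "__main__":
    print("low enzyme  (MM, tQSSA):", LOW_E)
    print("high enzyme (MM, tQSSA):", HIGH_E)
```

In the high-enzyme case the tQSSA rate cannot exceed `kcat * min(E_T, S_T)`, since the complex cannot outnumber either partner, whereas the naive MM expression has no such bound.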
Affiliation(s)
- Jae Kyoung Kim
  - Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- John J. Tyson
  - Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
  - Division of Systems Biology, Virginia Tech, Blacksburg, Virginia, United States of America

5
Clement EJ, Schulze TT, Soliman GA, Wysocki BJ, Davis PH, Wysocki TA. Stochastic Simulation of Cellular Metabolism. IEEE Access: Practical Innovations, Open Solutions 2020; 8:79734-79744. [PMID: 33747671 PMCID: PMC7971159 DOI: 10.1109/access.2020.2986833] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/12/2023]
Abstract
Advances in technological methods have enabled the investigation of biology at nanoscale levels. Such systems require the use of computational methods to comprehend the complex interactions that occur. The dynamics of metabolic systems have traditionally been described using differential equations, without fully capturing the heterogeneity of biological systems. Stochastic modeling approaches have recently emerged with the capacity to incorporate the statistical properties of such systems. However, the processing of stochastic algorithms is a computationally intensive task with intrinsic limitations. Alternatively, the queueing theory approach, historically used in the evaluation of telecommunication networks, can significantly reduce the computational power required to generate simulated results while simultaneously reducing the expansion of errors. We present here the application of queueing theory to simulate stochastic metabolic networks with high efficiency. With the use of glycolysis as a well-understood biological model, we demonstrate the power of the proposed modeling methods. Furthermore, we describe the simulation and pharmacological inhibition of glycolysis to provide an example of modeling capabilities.
Affiliation(s)
- Emalie J. Clement
  - Department of Biology, University of Nebraska at Omaha, Omaha, Nebraska, USA
- Thomas T. Schulze
  - Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, Nebraska, USA
- Ghada A. Soliman
  - Graduate School of Public Health and Health Policy, City University of New York, New York, USA
- Beata J. Wysocki
  - Department of Biology, University of Nebraska at Omaha, Omaha, Nebraska, USA
- Paul H. Davis
  - Department of Biology, University of Nebraska at Omaha, Omaha, Nebraska, USA
- Tadeusz A. Wysocki
  - Department of Electrical and Computer Engineering, University of Nebraska – Lincoln, Omaha, Nebraska, USA
  - UTP University, Bydgoszcz, Poland

6
Serrano W. Genetic and deep learning clusters based on neural networks for management decision structures. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04231-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]

7
Quasi-Steady-State Approximations Derived from the Stochastic Model of Enzyme Kinetics. Bull Math Biol 2019; 81:1303-1336. [DOI: 10.1007/s11538-019-00574-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/29/2017] [Accepted: 01/29/2019] [Indexed: 10/27/2022]

8
Grenet I, Yin Y, Comet JP. G-Networks to Predict the Outcome of Sensing of Toxicity. Sensors 2018; 18:s18103483. [PMID: 30332807 PMCID: PMC6210391 DOI: 10.3390/s18103483] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/14/2018] [Revised: 10/05/2018] [Accepted: 10/12/2018] [Indexed: 01/09/2023]
Abstract
G-Networks and their simplified version known as the Random Neural Network have often been used to classify data. In this paper, we present an application of the Random Neural Network to the early detection of the potential toxicity of chemical compounds through the prediction of their bioactivity from the compounds' physico-chemical structure, and propose that it be automated using machine learning (ML) techniques. Specifically, the Random Neural Network is shown to be an effective analytical tool to this effect, and the approach is illustrated and compared with several ML techniques.
Affiliation(s)
- Ingrid Grenet
  - University Côte d'Azur, I3S laboratory, UMR CNRS 7271, CS 40121, 06903 Sophia Antipolis CEDEX, France.
- Yonghua Yin
  - Intelligent Systems and Networks Group, Department of Electrical and Electronic Engineering, Imperial College, London SW7 2AZ, UK.
- Jean-Paul Comet
  - University Côte d'Azur, I3S laboratory, UMR CNRS 7271, CS 40121, 06903 Sophia Antipolis CEDEX, France.

9
Lipan O, Ferwerda C. Hill functions for stochastic gene regulatory networks from master equations with split nodes and time-scale separation. Phys Rev E 2018; 97:022413. [PMID: 29548212 DOI: 10.1103/physreve.97.022413] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/14/2017] [Indexed: 06/08/2023]
Abstract
The deterministic Hill function depends only on the average values of molecule numbers. To account for the fluctuations in the molecule numbers, the argument of the Hill function needs to contain the means, the standard deviations, and the correlations. Here we present a method that allows for stochastic Hill functions to be constructed from the dynamical evolution of stochastic biocircuits with specific topologies. These stochastic Hill functions are presented in a closed analytical form so that they can be easily incorporated in models for large genetic regulatory networks. Using a repressive biocircuit as an example, we show by Monte Carlo simulations that the traditional deterministic Hill function mispredicts the time of repression by two orders of magnitude. However, the stochastic Hill function was able to capture the fluctuations and thus accurately predicted the time of repression.
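Why averages alone can mislead is easy to see with a small numerical sketch. The Python snippet below is an illustration only: it uses the textbook repressive Hill form with hypothetical parameter values and a hypothetical two-point distribution of molecule numbers, and it does not reproduce the paper's closed-form stochastic Hill function.

```python
from statistics import mean
from typing import Sequence


def hill_repression(x: float, k: float, n: float) -> float:
    """Textbook repressive Hill function: 1 / (1 + (x / K)^n)."""
    return 1.0 / (1.0 + (x / k) ** n)


def hill_at_mean(samples: Sequence[float], k: float, n: float) -> float:
    """Deterministic view: evaluate the Hill function at the average level."""
    return hill_repression(mean(samples), k, n)


def mean_of_hill(samples: Sequence[float], k: float, n: float) -> float:
    """Fluctuation-aware view: average the Hill function over the samples."""
    return mean(hill_repression(x, k, n) for x in samples)


# Hypothetical repressor levels fluctuating between 0 and 2K (assumed values).
SAMPLES = [0.0, 2.0]
K, N = 1.0, 2.0

AT_MEAN = hill_at_mean(SAMPLES, K, N)
AVERAGED = mean_of_hill(SAMPLES, K, N)

if __name__ == "__main__":
    print("Hill at mean level:", AT_MEAN)
    print("Mean of Hill      :", AVERAGED)
```

Because the Hill function is nonlinear, the two quantities differ whenever the molecule number fluctuates, which is the basic reason the argument of a stochastic Hill function has to carry information beyond the mean.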
Affiliation(s)
- Ovidiu Lipan
  - Department of Physics, University of Richmond, 28 Westhampton Way, Richmond, Virginia 23173, USA
- Cameron Ferwerda
  - Department of Mathematics, King's College London, Strand, London WC2R 2LS, United Kingdom

10
Cao Y, Terebus A, Liang J. Accurate Chemical Master Equation Solution Using Multi-Finite Buffers. Multiscale Modeling & Simulation: A SIAM Interdisciplinary Journal 2016; 14:923-963. [PMID: 27761104 PMCID: PMC5066912 DOI: 10.1137/15m1034180] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 06/06/2023]
Abstract
The discrete chemical master equation (dCME) provides a fundamental framework for studying stochasticity in mesoscopic networks. Because of the multi-scale nature of many networks where reaction rates have large disparity, directly solving dCMEs is intractable due to the exploding size of the state space. It is important to truncate the state space effectively with quantified errors, so accurate solutions can be computed. It is also important to know if all major probabilistic peaks have been computed. Here we introduce the Accurate CME (ACME) algorithm for obtaining direct solutions to dCMEs. With multi-finite buffers for reducing the state space by O(n!), exact steady-state and time-evolving network probability landscapes can be computed. We further describe a theoretical framework of aggregating microstates into a smaller number of macrostates by decomposing a network into independent aggregated birth and death processes, and give an a priori method for rapidly determining steady-state truncation errors. The maximal sizes of the finite buffers for a given error tolerance can also be pre-computed without costly trial solutions of dCMEs. We show exactly computed probability landscapes of three multi-scale networks, namely, a 6-node toggle switch, 11-node phage-lambda epigenetic circuit, and 16-node MAPK cascade network, the latter two with no known solutions. We also show how probabilities of rare events can be computed from first-passage times, another class of unsolved problems challenging for simulation-based techniques due to large separations in time scales. Overall, the ACME method enables accurate and efficient solutions of the dCME for a large class of networks.

11
Ben-Hamo R, Gidoni M, Efroni S. PhenoNet: identification of key networks associated with disease phenotype. Bioinformatics 2014; 30:2399-405. [PMID: 24812342 DOI: 10.1093/bioinformatics/btu199] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION At the core of transcriptome analyses of cancer is a challenge to detect molecular differences affiliated with disease phenotypes. This approach has led to remarkable progress in identifying molecular signatures and in stratifying patients into clinical groups. Yet, despite this progress, many of the identified signatures are not robust enough to be clinically used and not consistent enough to provide a follow-up on molecular mechanisms. RESULTS To address these issues, we introduce PhenoNet, a novel algorithm for the identification of pathways and networks associated with different phenotypes. PhenoNet uses two types of input data: gene expression data (RMA, RPKM, FPKM, etc.) and phenotypic information, and integrates these data with curated pathways and protein-protein interaction information. Comprehensive iterations across all possible pathways and subnetworks result in the identification of key pathways or subnetworks that distinguish between the two phenotypes. AVAILABILITY AND IMPLEMENTATION Matlab code is available upon request. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rotem Ben-Hamo
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Moriah Gidoni
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Sol Efroni
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel

12
Ooi HK, Ma L. Modeling heterogeneous responsiveness of intrinsic apoptosis pathway. BMC Systems Biology 2013; 7:65. [PMID: 23875784 PMCID: PMC3733900 DOI: 10.1186/1752-0509-7-65] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 12/19/2012] [Accepted: 07/19/2013] [Indexed: 12/22/2022]
Abstract
BACKGROUND Apoptosis is a cell suicide mechanism that enables multicellular organisms to maintain homeostasis and to eliminate individual cells that threaten the organism's survival. Dependent on the type of stimulus, apoptosis can be propagated by extrinsic pathway or intrinsic pathway. The comprehensive understanding of the molecular mechanism of apoptotic signaling allows for development of mathematical models, aiming to elucidate dynamical and systems properties of apoptotic signaling networks. There have been extensive efforts in modeling deterministic apoptosis network accounting for average behavior of a population of cells. Cellular networks, however, are inherently stochastic and significant cell-to-cell variability in apoptosis response has been observed at single cell level. RESULTS To address the inevitable randomness in the intrinsic apoptosis mechanism, we develop a theoretical and computational modeling framework of intrinsic apoptosis pathway at single-cell level, accounting for both deterministic and stochastic behavior. Our deterministic model, adapted from the well-accepted Fussenegger model, shows that an additional positive feedback between the executioner caspase and the initiator caspase plays a fundamental role in yielding the desired property of bistability. We then examine the impact of intrinsic fluctuations of biochemical reactions, viewed as intrinsic noise, and natural variation of protein concentrations, viewed as extrinsic noise, on behavior of the intrinsic apoptosis network. Histograms of the steady-state output at varying input levels show that the intrinsic noise could elicit a wider region of bistability over that of the deterministic model. 
However, the system stochasticity due to intrinsic fluctuations, such as the noise of steady-state response and the randomness of response delay, shows that the intrinsic noise in general is insufficient to produce significant cell-to-cell variations at physiologically relevant level of molecular numbers. Furthermore, the extrinsic noise represented by random variations of two key apoptotic proteins, namely Cytochrome C and inhibitor of apoptosis proteins (IAP), is modeled separately or in combination with intrinsic noise. The resultant stochasticity in the timing of intrinsic apoptosis response shows that the fluctuating protein variations can induce cell-to-cell stochastic variability at a quantitative level agreeing with experiments. Finally, simulations illustrate that the mean abundance of fluctuating IAP protein is positively correlated with the degree of cellular stochasticity of the intrinsic apoptosis pathway. CONCLUSIONS Our theoretical and computational study shows that the pronounced non-genetic heterogeneity in intrinsic apoptosis responses among individual cells plausibly arises from extrinsic rather than intrinsic origin of fluctuations. In addition, it predicts that the IAP protein could serve as a potential therapeutic target for suppression of the cell-to-cell variation in the intrinsic apoptosis responsiveness.
Affiliation(s)
- Hsu Kiang Ooi
  - Department of Bioengineering, The University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX 75080, USA
- Lan Ma
  - Department of Bioengineering, The University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX 75080, USA

13
Dresch JM, Thompson MA, Arnosti DN, Chiu C. Two-Layer Mathematical Modeling of Gene Expression: Incorporating DNA-Level Information and System Dynamics. SIAM Journal on Applied Mathematics 2013; 73:804-826. [PMID: 25328249 PMCID: PMC4198071 DOI: 10.1137/120887588] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 06/04/2023]
Abstract
High-throughput genome sequencing and transcriptome analysis have provided researchers with a quantitative basis for detailed modeling of gene expression using a wide variety of mathematical models. Two of the most commonly employed approaches used to model eukaryotic gene regulation are systems of differential equations, which describe time-dependent interactions of gene networks, and thermodynamic equilibrium approaches that can explore DNA-level transcriptional regulation. To combine the strengths of these approaches, we have constructed a new two-layer mathematical model that provides a dynamical description of gene regulatory systems, using detailed DNA-based information, as well as spatial and temporal transcription factor concentration data. We also developed a semi-implicit numerical algorithm for solving the model equations and demonstrate here the efficiency of this algorithm through stability and convergence analyses. To test the model, we used it together with the semi-implicit algorithm to simulate a Drosophila gene regulatory circuit that drives development in the dorsal-ventral axis of the blastoderm-stage embryo, involving three genes. For model validation, we have done both mathematical and statistical comparisons between the experimental data and the model's simulated data. Where protein and cis-regulatory information is available, our two-layer model provides a method for recapitulating and predicting dynamic aspects of eukaryotic transcriptional systems that will greatly improve our understanding of gene regulation at a global level.
Affiliation(s)
- Jacqueline M. Dresch
  - Department of Mathematics, Harvey Mudd College, Claremont, CA 91711. This author’s work was partly supported by a Teaching and Research Postdoctoral Fellowship at Harvey Mudd College
- Marc A. Thompson
  - Department of Bioengineering, North Carolina Agricultural and Technical State University, Greensboro, NC 27411
- David N. Arnosti
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824. This author’s work was partly supported by NIH grant GM056976
- Chichia Chiu
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824

14
Kim H, Gelenbe E. Reconstruction of large-scale gene regulatory networks using Bayesian model averaging. IEEE Trans Nanobioscience 2013; 11:259-65. [PMID: 22987132 DOI: 10.1109/tnb.2012.2214233] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
Abstract
Gene regulatory networks provide a systematic view of molecular interactions in a complex living system. However, constructing large-scale gene regulatory networks is one of the most challenging problems in systems biology. Also, large burst sets of biological data require a proper integration technique for reliable gene regulatory network construction. Here we present a new reverse engineering approach based on Bayesian model averaging which attempts to combine all the appropriate models describing interactions among genes. This Bayesian approach with a prior based on the Gibbs distribution provides an efficient means to integrate multiple sources of biological data. In a simulation study with a maximum of 2000 genes, our method shows better sensitivity than previous elastic-net and Gaussian graphical models, with a fixed specificity of 0.99. The study also shows that the proposed method outperforms the other standard methods for a DREAM dataset generated by nonlinear stochastic models. In brain tumor data analysis, three large-scale networks consisting of 4422 genes were built using the gene expression of non-tumor, low and high grade tumor mRNA expression samples, along with DNA-protein binding affinity information. We found that genes having a large variation of degree distribution among the three tumor networks are the ones that seem most involved in regulatory and developmental processes, which possibly gives a novel insight concerning conventional differentially expressed gene analysis.
Affiliation(s)
- Haseong Kim
  - Intelligent Systems and Networks Group, Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK.