1
Fajiculay E, Hsu CP. Localization of Noise in Biochemical Networks. ACS Omega 2023; 8:3043-3056. [PMID: 36713703 PMCID: PMC9878546 DOI: 10.1021/acsomega.2c06113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 12/27/2022] [Indexed: 06/18/2023]
Abstract
Noise, or uncertainty in biochemical networks, has become an important aspect of many biological problems. Noise can arise and propagate from external factors and probabilistic chemical reactions occurring in small cellular compartments. For species survival, it is important to regulate such uncertainties in executing vital cell functions. Regulated noise can improve adaptability, whereas uncontrolled noise can cause diseases. Simulation can provide a detailed analysis of uncertainties, but parameters such as rate constants and initial conditions are usually unknown. A general understanding of noise dynamics from the perspective of network structure is highly desirable. In this study, we extended the previously developed law of localization for characterizing noise in terms of (co)variances and developed noise localization theory. With linear noise approximation, we can expand a biochemical network into an extended set of differential equations representing a fictitious network for pseudo-components consisting of variances and covariances, together with chemical species. Through localization analysis, perturbation responses at the steady state of pseudo-components can be summarized into a sensitivity matrix that only requires knowledge of network topology. Our work allows identification of buffering structures at the level of species, variances, and covariances and can provide insights into noise flow under non-steady-state conditions in the form of a pseudo-chemical reaction. We tested noise localization in various systems, and here we discuss its implications and potential applications. Results show that this theory is potentially applicable in discriminating models, scanning network topologies with interesting noise behavior, and designing and perturbing networks with the desired response.
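The linear noise approximation mentioned in the abstract can be illustrated on a textbook birth-death process. The following is a minimal sketch, not the authors' code: the network (0 -> X at rate `k`, X -> 0 at rate `g*x`), the function name `lna_birth_death`, and all parameter values are assumptions chosen for illustration. It integrates the mean together with the variance as an extra "pseudo-component", using the standard LNA variance equation dV/dt = 2JV + D with J = -g and D = k + g*x.

```python
def lna_birth_death(k, g, x0=0.0, v0=0.0, t_end=50.0, dt=1e-3):
    """Euler-integrate the LNA mean and variance for 0 -> X (rate k), X -> 0 (rate g*x)."""
    x, v = x0, v0
    for _ in range(int(t_end / dt)):
        dx = k - g * x                   # macroscopic rate equation for the mean
        dv = -2.0 * g * v + (k + g * x)  # J*V + V*J + D, with J = -g and D = k + g*x
        x += dt * dx
        v += dt * dv
    return x, v


# Hypothetical parameter values, chosen only for illustration.
k, g = 10.0, 0.5
mean, var = lna_birth_death(k, g)
fano = var / mean  # equals 1 at steady state for this Poisson-like process
```

At steady state both the mean and the variance settle at `k / g`, so the variance behaves like one more species in an extended set of rate equations, which is the kind of expanded system the abstract describes.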
Affiliation(s)
- Erickson Fajiculay
  - Institute of Chemistry, Academia Sinica, Taipei 115201, Taiwan
  - Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei 115201, Taiwan
  - Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu 300044, Taiwan
- Chao-Ping Hsu
  - Institute of Chemistry, Academia Sinica, Taipei 115201, Taiwan
  - Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei 115201, Taiwan
  - Physics Division, National Center for Theoretical Sciences, Taipei 106319, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University, Taipei 106319, Taiwan
2
Gupta A, Martin-Rufino JD, Jones TR, Subramanian V, Qiu X, Grody EI, Bloemendal A, Weng C, Niu SY, Min KH, Mehta A, Zhang K, Siraj L, Al' Khafaji A, Sankaran VG, Raychaudhuri S, Cleary B, Grossman S, Lander ES. Inferring gene regulation from stochastic transcriptional variation across single cells at steady state. Proc Natl Acad Sci U S A 2022; 119:e2207392119. [PMID: 35969771 PMCID: PMC9407670 DOI: 10.1073/pnas.2207392119] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 05/09/2022] [Accepted: 07/20/2022] [Indexed: 12/24/2022]
Abstract
Regulatory relationships between transcription factors (TFs) and their target genes lie at the heart of cellular identity and function; however, uncovering these relationships is often labor-intensive and requires perturbations. Here, we propose a principled framework to systematically infer gene regulation for all TFs simultaneously in cells at steady state by leveraging the intrinsic variation in the transcriptional abundance across single cells. Through modeling and simulations, we characterize how transcriptional bursts of a TF gene are propagated to its target genes, including the expected ranges of time delay and magnitude of maximum covariation. We distinguish these temporal trends from the time-invariant covariation arising from cell states, and we delineate the experimental and technical requirements for leveraging these small but meaningful cofluctuations in the presence of measurement noise. While current technology does not yet allow adequate power for definitively detecting regulatory relationships for all TFs simultaneously in cells at steady state, we investigate a small-scale dataset to inform future experimental design. This study supports the potential value of mapping regulatory connections through stochastic variation, and it motivates further technological development to achieve its full potential.
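The idea of reading a regulatory time delay from covariation can be sketched with a time-lagged covariance on synthetic data. This is a toy illustration under stated assumptions, not the authors' framework: the "TF" signal is an AR(1) process rather than a transcriptional burst model, the "target" is simply a delayed noisy copy, and the names `lagged_covariance`, `tf`, `target`, and `true_delay` are hypothetical.

```python
import numpy as np


def lagged_covariance(x, y, max_lag):
    """Return cov(x[t], y[t + lag]) for lag = 0..max_lag."""
    x = x - x.mean()
    y = y - y.mean()
    n = len(x)
    return np.array([np.mean(x[: n - lag] * y[lag:]) for lag in range(max_lag + 1)])


rng = np.random.default_rng(0)
n, true_delay, phi = 5000, 3, 0.5

# Toy "TF" signal: AR(1) fluctuations around a steady state (an assumption, not a burst model).
tf = np.zeros(n)
for t in range(1, n):
    tf[t] = phi * tf[t - 1] + rng.normal()

# Toy "target": the TF signal delayed by true_delay steps, plus measurement noise.
target = np.roll(tf, true_delay) + 0.5 * rng.normal(size=n)
target[:true_delay] = 0.0  # discard the samples that np.roll wrapped around

cov = lagged_covariance(tf, target, max_lag=10)
best_lag = int(np.argmax(cov))  # lag at which covariation is largest
```

In this idealized setting the covariance peaks at the imposed delay. Real single-cell snapshots lack a shared time axis and carry far more measurement noise, which is why the abstract stresses the experimental requirements for detecting such cofluctuations.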
Affiliation(s)
- Anika Gupta
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Jorge D. Martin-Rufino
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115
  - Dana-Farber Cancer Institute, Boston, MA 02215
- Xiaojie Qiu
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142
  - HHMI, Massachusetts Institute of Technology, Cambridge, MA 02139
- Chen Weng
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115
  - Dana-Farber Cancer Institute, Boston, MA 02215
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142
- Kyung Hoi Min
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
- Arnav Mehta
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Dana-Farber Cancer Institute, Boston, MA 02215
  - Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
- Kaite Zhang
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Layla Siraj
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Vijay G. Sankaran
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115
  - Dana-Farber Cancer Institute, Boston, MA 02215
- Soumya Raychaudhuri
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA 02115
- Brian Cleary
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Eric S. Lander
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115
3
Jiang Q, Zhang S, Wan L. Dynamic inference of cell developmental complex energy landscape from time series single-cell transcriptomic data. PLoS Comput Biol 2022; 18:e1009821. [PMID: 35073331 PMCID: PMC8812873 DOI: 10.1371/journal.pcbi.1009821] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/22/2021] [Revised: 02/03/2022] [Accepted: 01/10/2022] [Indexed: 12/27/2022]
Abstract
Time series single-cell RNA sequencing (scRNA-seq) data are emerging. However, dynamic inference of an evolving cell population from time series scRNA-seq data is challenging owing to the stochasticity and nonlinearity of the underlying biological processes. This calls for the development of mathematical models and methods capable of reconstructing cellular dynamic transition processes and uncovering the nonlinear cell-cell interactions. In this study, we present GraphFP, a model and dynamic inference framework based on a nonlinear Fokker-Planck equation on a graph, with the aim of reconstructing the cell state-transition complex potential energy landscape from time series single-cell transcriptomic data. The free energy of our model explicitly takes into account the cell-cell interactions in a nonlinear quadratic term. We then recast the model inference problem in the form of a dynamic optimal transport framework and solve it efficiently with the adjoint method of optimal control. We evaluated GraphFP on the time series scRNA-seq data set of embryonic murine cerebral cortex development. We illustrated that it 1) reconstructs cell state potential energy, which is a measure of cellular differentiation potency, 2) faithfully charts the probability flows between paired cell states over the dynamic processes of cell differentiation, and 3) accurately quantifies the stochastic dynamics of cell type frequencies on the probability simplex in continuous time. We also illustrated that GraphFP is robust in terms of cluster labelling with different resolutions, as well as parameter choices. Meanwhile, GraphFP provides a model-based approach to delineate the cell-cell interactions that drive cell differentiation. GraphFP software is available at https://github.com/QiJiang-QJ/GraphFP.
Dynamic inference of cell development processes from time series scRNA-seq data is a major challenge. Here, we present GraphFP, a coherent computational framework that simultaneously reconstructs the cell state-transition complex potential energy landscape and infers cell-cell interactions from time series single-cell transcriptomic data. Based on the mathematical framework of the nonlinear Fokker-Planck equation on a graph, GraphFP models the stochastic dynamics of the cell state/type frequencies on the probability simplex in continuous time, where the free energy with a nonlinear quadratic interaction term is employed to characterize cell-cell interactions. We formulate the model inference problem in the form of a dynamic optimal transport framework and solve it efficiently with the celebrated adjoint method. GraphFP allows for 1) reconstructing cell state potential energy, which is a measure of cellular differentiation potency, 2) charting the probability flows between paired cell states over dynamic processes, 3) quantifying the stochastic dynamics of cell type frequencies on the probability simplex in continuous time, and 4) delineating cell-cell interactions that drive cell differentiation. We show how GraphFP can be used to faithfully reveal and accurately quantify the cell development processes using the embryonic murine cerebral cortex development time series scRNA-seq dataset.
Affiliation(s)
- Qi Jiang
  - NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Shuo Zhang
  - NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Lin Wan
  - NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
4
Stein-O'Brien GL, Ainsile MC, Fertig EJ. Forecasting cellular states: from descriptive to predictive biology via single-cell multiomics. Current Opinion in Systems Biology 2021; 26:24-32. [PMID: 34660940 PMCID: PMC8516130 DOI: 10.1016/j.coisb.2021.03.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/17/2022]
Abstract
As the single-cell field races to characterize each cell type, state, and behavior, the complexity of the computational analysis approaches the complexity of the biological systems. Single-cell and imaging technologies now enable unprecedented measurements of state transitions in biological systems, providing high-throughput data that capture tens of thousands of measurements on hundreds of thousands of samples. Thus, the definition of cell type and state is evolving to encompass the broad range of biological questions now attainable. Answering these questions requires the development of computational tools for integrated multi-omics analysis. Merged with mathematical models, these algorithms will be able to forecast future states of biological systems, going from statistical inferences of phenotypes to time course predictions of the biological systems with dynamic maps analogous to weather systems. Thus, systems biology for forecasting biological system dynamics from multi-omic data represents the future of cell biology, empowering a new generation of technology-driven predictive medicine.
Affiliation(s)
- Genevieve L Stein-O'Brien
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD
  - McKusick-Nathans Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD
  - Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
- Michaela C Ainsile
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
- Elana J Fertig
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
  - Department of Applied Mathematics & Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD
5
Turki T, Taguchi YH. Discriminating the single-cell gene regulatory networks of human pancreatic islets: A novel deep learning application. Comput Biol Med 2021; 132:104257. [PMID: 33740535 DOI: 10.1016/j.compbiomed.2021.104257] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/01/2020] [Revised: 02/01/2021] [Accepted: 02/03/2021] [Indexed: 12/24/2022]
Abstract
Analysis of single-cell pancreatic data can play an important role in understanding various metabolic diseases and health conditions. Due to the sparsity and noise present in such single-cell gene expression data, inference of single-cell gene regulatory networks remains a challenge. Since recent studies have reported the reliable inference of single-cell gene regulatory networks (SCGRNs), the current study focused on discriminating the SCGRNs of T2D patients from those of healthy controls. By accurately distinguishing SCGRNs of healthy pancreas from those of T2D pancreas, it would be possible to annotate, organize, visualize, and identify common patterns of SCGRNs in metabolic diseases. Such annotated SCGRNs could play an important role in accelerating the process of building large data repositories. This study aimed to contribute to the development of a novel deep learning (DL) application. First, we generated a dataset consisting of 224 SCGRNs belonging to both T2D and healthy pancreas and made it freely available. Next, we chose seven DL architectures (VGG16, VGG19, Xception, ResNet50, ResNet101, DenseNet121, and DenseNet169), trained each of them on the dataset, and evaluated their predictions on a test set. Of note, we evaluated the DL architectures on a single NVIDIA GeForce RTX 2080Ti GPU. Experimental results on the whole dataset, using several performance measures, demonstrated the superiority of the VGG19 DL model in the automatic classification of SCGRNs derived from the single-cell pancreatic data.
Affiliation(s)
- Turki Turki
  - Department of Computer Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Y-H Taguchi
  - Department of Physics, Chuo University, Tokyo, 112-8551, Japan
6
Osorio D, Zhong Y, Li G, Huang JZ, Cai JJ. scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data. Patterns (New York, N.Y.) 2020; 1:100139. [PMID: 33336197 PMCID: PMC7733883 DOI: 10.1016/j.patter.2020.100139] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/15/2020] [Revised: 09/29/2020] [Accepted: 10/12/2020] [Indexed: 02/02/2023]
Abstract
We present scTenifoldNet, a machine learning workflow built upon principal-component regression, low-rank tensor approximation, and manifold alignment, for constructing and comparing single-cell gene regulatory networks (scGRNs) using data from single-cell RNA sequencing. scTenifoldNet reveals regulatory changes in gene expression between samples by comparing the constructed scGRNs. With real data, scTenifoldNet identifies specific gene expression programs associated with different biological processes, providing critical insights into the underlying mechanism of regulatory networks governing cellular transcriptional activities.
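Principal-component regression, the first building block named in the abstract, can be sketched generically as follows. This is a minimal illustration on synthetic data, not the scTenifoldNet implementation: the function name `pc_regression`, the matrix sizes, and the coefficients in `true_beta` are all assumptions, and the abstract does not specify how the workflow applies the regression.

```python
import numpy as np


def pc_regression(X, y, n_components):
    """Regress y on the leading principal components of X.

    Returns coefficients expressed on the original columns of X.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                # shape (n_features, n_components)
    scores = Xc @ V                        # principal-component scores
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    return V @ gamma                       # map back to feature space


# Synthetic data: rows play the role of cells, columns the role of genes (hypothetical).
rng = np.random.default_rng(1)
n_cells, n_genes = 200, 5
X = rng.normal(size=(n_cells, n_genes))
true_beta = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
y = X @ true_beta + 0.01 * rng.normal(size=n_cells)

beta_full = pc_regression(X, y, n_components=n_genes)  # same as ordinary least squares
beta_low = pc_regression(X, y, n_components=2)         # regularized, lower-variance fit
```

Keeping all components reproduces ordinary least squares, while truncating to a few components trades some bias for stability, which is the usual motivation for the technique on sparse, noisy expression matrices.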
Affiliation(s)
- Daniel Osorio
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Yan Zhong
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Guanxun Li
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Jianhua Z. Huang
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- James J. Cai
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
  - Interdisciplinary Program of Genetics, Texas A&M University, College Station, TX 77843, USA