1
Choi Y, Li R, Quon G. siVAE: interpretable deep generative models for single-cell transcriptomes. Genome Biol 2023; 24:29. [PMID: 36803416 PMCID: PMC9940350 DOI: 10.1186/s13059-023-02850-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/08/2022] [Accepted: 01/06/2023] [Indexed: 02/22/2023]
Abstract
Neural networks such as variational autoencoders (VAE) perform dimensionality reduction for the visualization and analysis of genomic data, but are limited in their interpretability: it is unknown which data features are represented by each embedding dimension. We present siVAE, a VAE that is interpretable by design, thereby enhancing downstream analysis tasks. Through interpretation, siVAE also identifies gene modules and hubs without explicit gene network inference. We use siVAE to identify gene modules whose connectivity is associated with diverse phenotypes such as iPSC neuronal differentiation efficiency and dementia, showcasing the wide applicability of interpretable generative models for genomic data analysis.
Affiliation(s)
- Yongin Choi
  - Graduate Group in Biomedical Engineering, University of California, Davis, Davis, CA, USA
  - Genome Center, University of California, Davis, Davis, CA, USA
- Ruoxin Li
  - Genome Center, University of California, Davis, Davis, CA, USA
  - Graduate Group in Biostatistics, University of California, Davis, Davis, CA, USA
- Gerald Quon
  - Graduate Group in Biomedical Engineering, University of California, Davis, Davis, CA, USA
  - Genome Center, University of California, Davis, Davis, CA, USA
  - Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA
2
Song Q, Ruffalo M, Bar-Joseph Z. Using single cell atlas data to reconstruct regulatory networks. Nucleic Acids Res 2023; 51:e38. [PMID: 36762475 PMCID: PMC10123116 DOI: 10.1093/nar/gkad053] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/10/2022] [Revised: 12/16/2022] [Accepted: 01/19/2023] [Indexed: 02/11/2023]
Abstract
Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods leading to more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.
Affiliation(s)
- Qi Song
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Matthew Ruffalo
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ziv Bar-Joseph
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
3
Galindez G, Sadegh S, Baumbach J, Kacprowski T, List M. Network-based approaches for modeling disease regulation and progression. Comput Struct Biotechnol J 2022; 21:780-795. [PMID: 36698974 PMCID: PMC9841310 DOI: 10.1016/j.csbj.2022.12.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/15/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022]
Abstract
Molecular interaction networks lay the foundation for studying how biological functions are controlled by the complex interplay of genes and proteins. Investigating perturbed processes using biological networks has been instrumental in uncovering mechanisms that underlie complex disease phenotypes. Rapid advances in omics technologies have prompted the generation of high-throughput datasets, enabling large-scale, network-based analyses. Consequently, various modeling techniques, including network enrichment, differential network extraction, and network inference, have proven to be useful for gaining new mechanistic insights. We provide an overview of recent network-based methods and their core ideas to facilitate the discovery of disease modules or candidate mechanisms. Knowledge generated from these computational efforts will benefit biomedical research, especially drug development and precision medicine. We further discuss current challenges and provide perspectives in the field, highlighting the need for more integrative and dynamic network approaches to model disease development and progression.
Affiliation(s)
- Gihanna Galindez
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
4
Fang X, Yan P, Luo F, Han S, Lin T, Li S, Li S, Zhu T. Functional Identification of Arthrinium phaeospermum Effectors Related to Bambusa pervariabilis × Dendrocalamopsis grandis Shoot Blight. Biomolecules 2022; 12:1264. [PMID: 36139102 PMCID: PMC9496123 DOI: 10.3390/biom12091264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 09/02/2022] [Accepted: 09/06/2022] [Indexed: 12/03/2022]
Abstract
The shoot blight of Bambusa pervariabilis × Dendrocalamopsis grandis caused by Arthrinium phaeospermum made bamboo die in a large area, resulting in serious ecological and economic losses. Dual RNA-seq was used to sequence and analyze the transcriptome data of A. phaeospermum and B. pervariabilis × D. grandis in the four periods after the pathogen infected the host and to screen the candidate effectors of the pathogen related to the infection. After the identification of the effectors by the tobacco transient expression system, the functions of these effectors were verified by gene knockout. Fifty-three differentially expressed candidate effectors were obtained by differential gene expression analysis and effector prediction. Among them, the effectors ApCE12 and ApCE22 can cause programmed cell death in tobacco. The disease index of B. pervariabilis × D. grandis inoculated with mutant ΔApCE12 and mutant ΔApCE22 strains were 52.5% and 47.5%, respectively, which was significantly lower than that of the wild-type strains (80%), the ApCE12 complementary strain (77.5%), and the ApCE22 complementary strain (75%). The tolerance of the mutant ΔApCE12 and mutant ΔApCE22 strains to H2O2 and NaCl stress was significantly lower than that of the wild-type strain and the ApCE12 complementary and ApCE22 complementary strains, but there was no difference in their tolerance to Congo red. Therefore, this study shows that the effectors ApCE12 and ApCE22 play an important role in A. phaeospermum virulence and response to H2O2 and NaCl stress.
Affiliation(s)
- Xinmei Fang
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
  - Faculty of Mathematics and Natural Sciences, University of Cologne, 50674 Köln, Germany
- Peng Yan
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
- Fengying Luo
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
- Shan Han
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
- Tiantian Lin
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
- Shuying Li
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
- Shujiang Li
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
  - National Forestry and Grassland Administration Key Laboratory of Forest Resources Conservation and Ecological Safety on the Upper Reaches of the Yangtze River, Chengdu 611130, China
  - Correspondence: (S.L.); (T.Z.); Tel.: +86-17761264491 (T.Z.)
- Tianhui Zhu
  - College of Forestry, Sichuan Agricultural University, Chengdu 611130, China
  - Correspondence: (S.L.); (T.Z.); Tel.: +86-17761264491 (T.Z.)
5
Freyre-González JA, Escorcia-Rodríguez JM, Gutiérrez-Mondragón LF, Martí-Vértiz J, Torres-Franco CN, Zorro-Aranda A. System Principles Governing the Organization, Architecture, Dynamics, and Evolution of Gene Regulatory Networks. Front Bioeng Biotechnol 2022; 10:888732. [PMID: 35646858 PMCID: PMC9135355 DOI: 10.3389/fbioe.2022.888732] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/03/2022] [Accepted: 04/27/2022] [Indexed: 11/21/2022]
Abstract
Synthetic biology aims to apply engineering principles for the rational, systematical design and construction of biological systems displaying functions that do not exist in nature or even building a cell from scratch. Understanding how molecular entities interconnect, work, and evolve in an organism is pivotal to this aim. Here, we summarize and discuss some historical organizing principles identified in bacterial gene regulatory networks. We propose a new layer, the concilion, which is the group of structural genes and their local regulators responsible for a single function that, organized hierarchically, coordinate a response in a way reminiscent of the deliberation and negotiation that take place in a council. We then highlight the importance that the network structure has, and discuss that the natural decomposition approach has unveiled the system-level elements shaping a common functional architecture governing bacterial regulatory networks. We discuss the incompleteness of gene regulatory networks and the need for network inference and benchmarking standardization. We point out the importance that using the network structural properties showed to improve network inference. We discuss the advances and controversies regarding the consistency between reconstructions of regulatory networks and expression data. We then discuss some perspectives on the necessity of studying regulatory networks, considering the interactions’ strength distribution, the challenges to studying these interactions’ strength, and the corresponding effects on network structure and dynamics. Finally, we explore the ability of evolutionary systems biology studies to provide insights into how evolution shapes functional architecture despite the high evolutionary plasticity of regulatory networks.
Affiliation(s)
- Julio A Freyre-González
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
- Juan M Escorcia-Rodríguez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
- Luis F Gutiérrez-Mondragón
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
  - Undergraduate Program in Genomic Sciences, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
- Jerónimo Martí-Vértiz
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
- Camila N Torres-Franco
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
- Andrea Zorro-Aranda
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, México
  - Department of Chemical Engineering, Universidad de Antioquia, Medellín, Colombia
6
Srinivasan K, Coble N, Hamlin J, Antonsen T, Ott E, Girvan M. Parallel Machine Learning for Forecasting the Dynamics of Complex Networks. Phys Rev Lett 2022; 128:164101. [PMID: 35522516 DOI: 10.1103/physrevlett.128.164101] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/27/2021] [Accepted: 03/28/2022] [Indexed: 06/14/2023]
Abstract
Forecasting the dynamics of large, complex, sparse networks from previous time series data is important in a wide range of contexts. Here we present a machine learning scheme for this task using a parallel architecture that mimics the topology of the network of interest. We demonstrate the utility and scalability of our method implemented using reservoir computing on a chaotic network of oscillators. Two levels of prior knowledge are considered: (i) the network links are known, and (ii) the network links are unknown and inferred via a data-driven approach to approximately optimize prediction.
Affiliation(s)
- Nolan Coble
  - University of Maryland, College Park, Maryland 20742, USA
  - SUNY Brockport, Brockport, New York 14420, USA
- Joy Hamlin
  - Stony Brook University, Long Island, New York 11794, USA
- Edward Ott
  - University of Maryland, College Park, Maryland 20742, USA
7
Three topological features of regulatory networks control life-essential and specialized subsystems. Sci Rep 2021; 11:24209. [PMID: 34930908 PMCID: PMC8688434 DOI: 10.1038/s41598-021-03625-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/10/2021] [Accepted: 12/07/2021] [Indexed: 11/08/2022]
Abstract
Gene regulatory networks (GRNs) play key roles in development, phenotype plasticity, and evolution. Although graph theory has been used to explore GRNs, associations amongst topological features, transcription factors (TFs), and systems essentiality are poorly understood. Here we sought the relationship amongst the main GRN topological features that influence the control of essential and specific subsystems. We found that Knn, page rank, and degree are the most relevant GRN features: they are conserved throughout evolution and are also relevant in pluripotent cells. Interestingly, life-essential subsystems are governed mainly by TFs with intermediate Knn and high page rank or degree, whereas specialized subsystems are mainly regulated by TFs with low Knn. Hence, we suggest that the high probability of a TF being visited by a random signal, and the high probability of signal propagation to target genes, ensure the robustness of life-essential subsystems. Gene/genome duplication is the main evolutionary process that raised Knn to the most relevant feature. Herein, we shed light on unexplored topological GRN features to assess how they relate to subsystems and how duplications shaped regulatory systems throughout evolution. The classification model generated can be found here: https://github.com/ivanrwolf/NoC/.
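The three features named in this abstract can be illustrated on a toy graph. The sketch below is not the authors' pipeline: it assumes that Knn denotes the average degree of a node's nearest neighbors, treats the network as undirected for simplicity, and uses a hypothetical helper `topology_features` with a plain power-iteration PageRank.

```python
import numpy as np

def topology_features(adj, damping=0.85, tol=1e-12, max_iter=1000):
    """Return (degree, knn, pagerank) for an undirected adjacency matrix.

    Assumption: knn is the mean degree of each node's neighbors.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    degree = adj.sum(axis=1)
    # Avoid division by zero for isolated nodes.
    safe_deg = np.where(degree > 0, degree, 1.0)
    # Sum of neighbor degrees divided by own degree.
    knn = (adj @ degree) / safe_deg
    # Row-stochastic transition matrix; isolated nodes jump uniformly.
    trans = adj / safe_deg[:, None]
    trans[degree == 0] = 1.0 / n
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = (1.0 - damping) / n + damping * (trans.T @ rank)
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return degree, knn, rank

# Toy star graph: node 0 is a hub connected to three leaves.
star = np.array([[0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]])
degree, knn, rank = topology_features(star)
```

On this star graph the hub combines high degree and high PageRank with low Knn, while the leaves show the opposite pattern, which gives a feel for how the three measures capture different aspects of a node's position.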
8
Naseri A, Sharghi M, Hasheminejad SMH. Enhancing gene regulatory networks inference through hub-based data integration. Comput Biol Chem 2021; 95:107589. [PMID: 34673384 DOI: 10.1016/j.compbiolchem.2021.107589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Revised: 08/11/2021] [Accepted: 10/04/2021] [Indexed: 12/09/2022]
Abstract
One of the main research topics in computational biology is Gene Regulatory Network (GRN) reconstruction, which refers to inferring the relationships between genes involved in regulating cell conditions in response to internal or external stimuli. To this end, most computational methods use only transcriptional gene expression data to reconstruct gene regulatory networks, but recent studies suggest that gene expression data must be integrated with other types of data to obtain more accurate models predicting real relationships between genes. In this study, a diffusion-based method is enhanced to integrate biological data of network types besides structural prior knowledge. The Random Walk with Restart algorithm (RWR) with an emphasis on hub nodes is executed separately on each network, and low-dimensional feature vectors for network nodes are then jointly optimized by diffusion component analysis. Next, these feature vectors are used to infer gene regulatory networks. Fourteen centrality measures are studied for the detection of hub nodes to be used in the RWR algorithm, and the centrality measure having the greatest effect on the improvement of gene network inference is selected. A case study for the Saccharomyces cerevisiae and E. coli networks shows that using the proposed features, in comparison with gene expression data alone, yields an improvement of 0.02-0.08 in the Area Under the Receiver Operating Characteristic curve (AUROC) across different gene regulatory network inference methods. Furthermore, the proposed method was applied to esophageal cancer data to infer its gene regulatory network. The proposed framework substantially improves accuracy and scalability of GRN inference. The fused features and the best centrality measure detected can be used to provide functional insights about genes or proteins in various biological applications. Moreover, it can serve as a general framework for network data and structural data integration and analysis problems in various scientific disciplines including biology.
Affiliation(s)
- Atefeh Naseri
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
- Mehran Sharghi
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
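The Random Walk with Restart step mentioned in the abstract above can be sketched as follows. This is a plain RWR on a single network, written as an illustration only: it does not reproduce the hub emphasis or the diffusion component analysis described by the authors, and the function name `random_walk_with_restart` and the parameter `restart` are hypothetical.

```python
import numpy as np

def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`."""
    adj = np.asarray(adj, dtype=float)
    # Column-normalize so that each column is a transition distribution.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    trans = adj / col_sums
    start = np.zeros(adj.shape[0])
    start[seed] = 1.0
    prob = start.copy()
    for _ in range(max_iter):
        # With probability `restart` jump back to the seed, otherwise step.
        new_prob = (1.0 - restart) * (trans @ prob) + restart * start
        converged = np.abs(new_prob - prob).sum() < tol
        prob = new_prob
        if converged:
            break
    return prob

# Toy path graph 0 - 1 - 2 with the walker seeded at node 0.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
profile = random_walk_with_restart(path, seed=0)
```

Each node's resulting probability vector can serve as a diffusion profile; nodes closer to the seed receive more probability mass than distant ones.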
9
Abstract
Cancer is a genetic disease in which multiple genes are perturbed. Thus, information about the regulatory relationships between genes is necessary for the identification of biomarkers and therapeutic targets. In this review, methods for inference of gene regulatory networks (GRNs) from transcriptomics data that are used in cancer research are introduced. The methods are classified into three categories according to the analysis model. The first category includes methods that use pair-wise measures between genes, including correlation coefficient and mutual information. The second category includes methods that determine the genetic regulatory relationship using multivariate measures, which consider the expression profiles of all genes concurrently. The third category includes methods using supervised and integrative approaches. The supervised approach estimates the regulatory relationship using a supervised learning method that constructs a regression or classification model for predicting whether there is a regulatory relationship between genes with input data of gene expression profiles and class labels of prior biological knowledge. The integrative method is an expansion of the supervised method and uses more data and biological knowledge for predicting the regulatory relationship. Furthermore, simulation and experimental validation of the estimated GRNs are also discussed in this review. This review identified that most GRN inference methods are not specific for cancer transcriptome data, and such methods are required for better understanding of cancer pathophysiology. In addition, more systematic methods for validation of the estimated GRNs need to be developed in the context of cancer biology.
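The first category in this review, pair-wise measures between genes, can be illustrated with a correlation-based network. The sketch below assumes a genes-by-samples expression matrix and an arbitrary cutoff; the helper `correlation_network` and its `threshold` parameter are hypothetical and do not correspond to any specific method covered by the review.

```python
import numpy as np

def correlation_network(expr, threshold=0.8):
    """Return edges (i, j, r) between genes whose |Pearson r| >= threshold.

    `expr` is a genes-by-samples matrix; rows are treated as variables.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float))
    n_genes = corr.shape[0]
    edges = []
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            if abs(corr[i, j]) >= threshold:
                edges.append((i, j, float(corr[i, j])))
    return edges

# Toy data: gene 1 is a scaled copy of gene 0; gene 2 is weakly related.
expression = [[1, 2, 3, 4],
              [2, 4, 6, 8],
              [4, 1, 3, 2]]
edges = correlation_network(expression)
```

Correlation is symmetric, so the resulting edges are undirected; inferring the direction of regulation requires additional information such as TF annotations or the supervised and integrative approaches the review describes.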
10
Gupta C, Ramegowda V, Basu S, Pereira A. Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance. Front Genet 2021; 12:652189. [PMID: 34249082 PMCID: PMC8264776 DOI: 10.3389/fgene.2021.652189] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/11/2021] [Accepted: 05/13/2021] [Indexed: 12/13/2022]
Abstract
Gene regulatory networks underpin stress response pathways in plants. However, parsing these networks to prioritize key genes underlying a particular trait is challenging. Here, we have built the Gene Regulation and Association Network (GRAiN) of rice (Oryza sativa). GRAiN is an interactive query-based web-platform that allows users to study functional relationships between transcription factors (TFs) and genetic modules underlying abiotic-stress responses. We built GRAiN by applying a combination of different network inference algorithms to publicly available gene expression data. We propose a supervised machine learning framework that complements GRAiN in prioritizing genes that regulate stress signal transduction and modulate gene expression under drought conditions. Our framework converts intricate network connectivity patterns of 2160 TFs into a single drought score. We observed that TFs with the highest drought scores define the functional, structural, and evolutionary characteristics of drought resistance in rice. Our approach accurately predicted the function of OsbHLH148 TF, which we validated using in vitro protein-DNA binding assays and mRNA sequencing loss-of-function mutants grown under control and drought stress conditions. Our network and the complementary machine learning strategy lends itself to predicting key regulatory genes underlying other agricultural traits and will assist in the genetic engineering of desirable rice varieties.
Affiliation(s)
- Chirag Gupta
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
- Venkategowda Ramegowda
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
- Supratim Basu
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
- Andy Pereira
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
11
12
Gupta C, Ramegowda V, Basu S, Pereira A. Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance. Front Genet 2021. [PMID: 34249082 DOI: 10.1101/2020.04.29.068379] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/13/2023]
13
Musungu B, Bhatnagar D, Quiniou S, Brown RL, Payne GA, O’Brian G, Fakhoury AM, Geisler M. Use of Dual RNA-seq for Systems Biology Analysis of Zea mays and Aspergillus flavus Interaction. Front Microbiol 2020; 11:853. [PMID: 32582038 PMCID: PMC7285840 DOI: 10.3389/fmicb.2020.00853] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/11/2019] [Accepted: 04/09/2020] [Indexed: 11/18/2022]
Abstract
The interaction between Aspergillus flavus and Zea mays is complex, and the identification of plant genes and pathways conferring resistance to the fungus has been challenging. Therefore, the authors undertook a systems biology approach involving dual RNA-seq to determine the simultaneous response from the host and the pathogen. What was dramatically highlighted in the analysis is the uniformity in the development patterns of gene expression of the host and the pathogen during infection. This led to the development of a "stage of infection index" that was subsequently used to categorize the samples before down-stream system biology analysis. Additionally, we were able to ascertain that key maize genes in pathways such as the jasmonate, ethylene and ROS pathways, were up-regulated in the study. The stage of infection index used for the transcriptomic analysis revealed that A. flavus produces a relatively limited number of transcripts during the early stages (0 to 12 h) of infection. At later stages, in A. flavus, transcripts and pathways involved in endosomal transport, aflatoxin production, and carbohydrate metabolism were up-regulated. Multiple WRKY genes targeting the activation of the resistance pathways (i.e., jasmonate, phenylpropanoid, and ethylene) were detected using causal inference analysis. This analysis also revealed, for the first time, the activation of Z. mays resistance genes influencing the expression of specific A. flavus genes. Our results show that A. flavus seems to be reacting to a hostile environment resulting from the activation of resistance pathways in Z. mays. This study revealed the dynamic nature of the interaction between the two organisms.
Affiliation(s)
- Bryan Musungu
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, United States
- Deepak Bhatnagar
  - Southern Regional Research Center, USDA-ARS, New Orleans, LA, United States
- Sylvie Quiniou
  - Warm Water Aquaculture Research Unit, USDA-ARS, Stoneville, MS, United States
- Robert L. Brown
  - Southern Regional Research Center, USDA-ARS, New Orleans, LA, United States
- Gary A. Payne
  - Department of Plant Pathology, North Carolina State University, Raleigh, NC, United States
- Greg O’Brian
  - Department of Plant Pathology, North Carolina State University, Raleigh, NC, United States
- Ahmad M. Fakhoury
  - Department of Plant Soil and Agriculture Systems, Southern Illinois University, Carbondale, IL, United States
- Matt Geisler
  - Department of Plant Biology, Southern Illinois University, Carbondale, IL, United States
14
Staunton PM, Miranda-CasoLuengo AA, Loftus BJ, Gormley IC. BINDER: computationally inferring a gene regulatory network for Mycobacterium abscessus. BMC Bioinformatics 2019; 20:466. [PMID: 31500560 PMCID: PMC6734328 DOI: 10.1186/s12859-019-3042-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/03/2019] [Accepted: 08/21/2019] [Indexed: 11/12/2022]
Abstract
BACKGROUND: Although many of the genic features in Mycobacterium abscessus have been fully validated, a comprehensive understanding of the regulatory elements remains lacking. Moreover, there is little understanding of how the organism regulates its transcriptomic profile, enabling cells to survive in hostile environments. Here, to computationally infer the gene regulatory network for Mycobacterium abscessus we propose a novel statistical computational modelling approach: BayesIan gene regulatory Networks inferreD via gene coExpression and compaRative genomics (BINDER). In tandem with derived experimental coexpression data, the property of genomic conservation is exploited to probabilistically infer a gene regulatory network in Mycobacterium abscessus. Inference on regulatory interactions is conducted by combining 'primary' and 'auxiliary' data strata. The data forming the primary and auxiliary strata are derived from RNA-seq experiments and sequence information in the primary organism Mycobacterium abscessus as well as ChIP-seq data extracted from a related proxy organism Mycobacterium tuberculosis. The primary and auxiliary data are combined in a hierarchical Bayesian framework, informing the apposite bivariate likelihood function and prior distributions respectively. The inferred relationships provide insight to regulon groupings in Mycobacterium abscessus. RESULTS: We implement BINDER on data relating to a collection of 167,280 regulator-target pairs resulting in the identification of 54 regulator-target pairs, across 5 transcription factors, for which there is strong probability of regulatory interaction. CONCLUSIONS: The inferred regulatory interactions provide insight to, and a valuable resource for further studies of, transcriptional control in Mycobacterium abscessus, and in the family of Mycobacteriaceae more generally. Further, the developed BINDER framework has broad applicability, useable in settings where computational inference of a gene regulatory network requires integration of data sources derived from both the primary organism of interest and from related proxy organisms.
Affiliation(s)
- Patrick M. Staunton
- School of Medicine, Conway Institute, University College Dublin, Dublin, Ireland
| | | | - Brendan J. Loftus
- School of Medicine, Conway Institute, University College Dublin, Dublin, Ireland
| | - Isobel Claire Gormley
- School of Mathematics and Statistics, Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
| |
|
15
|
Liu W, Rajapakse JC. Fusing gene expressions and transitive protein-protein interactions for inference of gene regulatory networks. BMC Systems Biology 2019; 13:37. [PMID: 30953534 PMCID: PMC6449891 DOI: 10.1186/s12918-019-0695-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/02/2022]
Abstract
BACKGROUND Systematic fusion of multiple data sources for Gene Regulatory Network (GRN) inference remains a key challenge in systems biology. We incorporate information from protein-protein interaction networks (PPIN) into the process of GRN inference from gene expression (GE) data. However, existing PPIN remain sparse, and transitive protein interactions can help predict missing protein interactions. We therefore propose a systematic probabilistic framework for fusing GE data and transitive protein interaction data to coherently build GRN. RESULTS We use a Gaussian Mixture Model (GMM) to soft-cluster GE data, allowing overlapping cluster memberships. Next, a heuristic method is proposed to extend sparse PPIN by incorporating transitive linkages. We then propose a novel way to score extended protein interactions by combining topological properties of PPIN and correlations of GE. Following this, GE data and extended PPIN are fused using a Gaussian Hidden Markov Model (GHMM) in order to identify gene regulatory pathways and refine interaction scores that are then used to constrain the GRN structure. We employ a Bayesian Gaussian Mixture (BGM) model to refine the GRN derived from GE data by using the structural priors derived from GHMM. Experiments on real yeast regulatory networks demonstrate both the feasibility of the extended PPIN in predicting transitive protein interactions and its effectiveness in improving the coverage and accuracy of the proposed method of fusing PPIN and GE to build GRN. CONCLUSION The GE and PPIN fusion model outperforms both the state-of-the-art single data source models (CLR, GENIE3, TIGRESS) as well as existing fusion models under various constraints.
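The idea of extending a sparse PPIN with transitive linkages can be illustrated with a minimal sketch. The abstract does not describe the authors' heuristic or their scoring scheme, so the code below is only an assumption: it takes a single transitive step, linking two proteins whenever they share at least one direct interaction partner. The function name `extend_ppin` and the toy four-protein chain are hypothetical.

```python
import numpy as np

def extend_ppin(adj):
    """Return an adjacency matrix with one-step transitive links added.

    Assumption (not from the paper): two proteins are linked if they
    already interact or share at least one direct interaction partner.
    """
    adj = np.asarray(adj, dtype=int)
    two_step = adj @ adj                      # entry (i, j) counts shared partners of i and j
    extended = ((adj + two_step) > 0).astype(int)
    np.fill_diagonal(extended, 0)             # no self-interactions
    return extended

# Toy chain of four proteins: 0-1, 1-2, 2-3.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
extended = extend_ppin(chain)
```

In this toy chain the step adds links 0-2 and 1-3 but not 0-3, since proteins 0 and 3 share no direct partner. A fuller treatment would also score each added link, which the abstract attributes to PPIN topology combined with GE correlations.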
Affiliation(s)
- Wenting Liu
- School of Public Health and Management, Hubei University of Medicine, Shiyan, Hubei China
- Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA USA
| | - Jagath C. Rajapakse
- School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
| |
|
16
|
Fernandez-Valverde SL, Aguilera F, Ramos-Díaz RA. Inference of Developmental Gene Regulatory Networks Beyond Classical Model Systems: New Approaches in the Post-genomic Era. Integr Comp Biol 2019; 58:640-653. [PMID: 29917089 DOI: 10.1093/icb/icy061] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/19/2022]
Abstract
The advent of high-throughput sequencing (HTS) technologies has revolutionized the way we understand the transformation of genetic information into morphological traits. Elucidating the network of interactions between genes that govern cell differentiation through development is one of the core challenges in genome research. These networks are known as developmental gene regulatory networks (dGRNs) and consist largely of the functional linkage between developmental control genes, cis-regulatory modules, and differentiation genes, which generate spatially and temporally refined patterns of gene expression. Over the last 20 years, great advances have been made in determining these gene interactions mainly in classical model systems, including human, mouse, sea urchin, fruit fly, and worm. This has brought about a radical transformation in the fields of developmental biology and evolutionary biology, allowing the generation of high-resolution gene regulatory maps to analyze cell differentiation during animal development. Such maps have enabled the identification of gene regulatory circuits and have led to the development of network inference methods that can recapitulate the differentiation of specific cell-types or developmental stages. In contrast, dGRN research in non-classical model systems has been limited to the identification of developmental control genes via the candidate gene approach and the characterization of their spatiotemporal expression patterns, as well as to the discovery of cis-regulatory modules via patterns of sequence conservation and/or predicted transcription-factor binding sites. However, thanks to the continuous advances in HTS technologies, this scenario is rapidly changing. Here, we give a historical overview on the architecture and elucidation of the dGRNs. 
Subsequently, we summarize the approaches available to unravel these regulatory networks, highlighting the vast range of possibilities for integrating multiple technical advances and theoretical approaches to expand our understanding of global gene regulation during animal development in non-classical model systems. Such new knowledge will lead to greater insights not only into the evolution of molecular mechanisms underlying cell identity and animal body plans, but also into the evolution of key morphological innovations in animals.
Affiliation(s)
- Selene L Fernandez-Valverde
- CONACYT, Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico
| | - Felipe Aguilera
- Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Chile
| | - René Alexander Ramos-Díaz
- CONACYT, Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico
| |
|
17
|
Zhang W, Li W, Zhang J, Wang N. Data Integration of Hybrid Microarray and Single Cell Expression Data to Enhance Gene Network Inference. Curr Bioinform 2019. [DOI: 10.2174/1574893614666190104142228] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Background:
Gene Regulatory Network (GRN) inference algorithms aim to explore causal interactions between genes and transcription factors. High-throughput transcriptomics data, including DNA microarray and single cell expression data, contain complementary information for network inference.
Objective:
To enhance GRN inference, data integration across various types of expression data is an economical and efficient solution.
Method:
In this paper, a novel E-alpha integration rule-based ensemble inference algorithm is proposed to merge complementary information from microarray and single cell expression data. This paper implements a Gradient Boosting Tree (GBT) inference algorithm to compute importance scores for candidate gene-gene pairs. The proposed E-alpha rule quantitatively evaluates the credibility level of each information source and determines the final ranked list.
Results:
Two groups of in silico gene networks are applied to illustrate the effectiveness of the proposed E-alpha integration. Experimental outcomes with size50 and size100 in silico gene networks suggest that the proposed E-alpha rule significantly improves performance metrics compared with a single information source.
Conclusion:
In GRN inference, the integration of hybrid expression data using the E-alpha rule provides a feasible and efficient way to enhance performance metrics, rather than solely increasing sample sizes.
Affiliation(s)
- Wei Zhang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Wenchao Li
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Jianming Zhang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| | - Ning Wang
- Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, 310013, China
| |
|
18
|
Haque S, Ahmad JS, Clark NM, Williams CM, Sozzani R. Computational prediction of gene regulatory networks in plant growth and development. Current Opinion in Plant Biology 2019; 47:96-105. [PMID: 30445315 DOI: 10.1016/j.pbi.2018.10.005] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 08/15/2018] [Revised: 10/05/2018] [Accepted: 10/18/2018] [Indexed: 05/22/2023]
Abstract
Plants integrate a wide range of cellular, developmental, and environmental signals to regulate complex patterns of gene expression. Recent advances in genomic technologies enable differential gene expression analysis at a systems level, allowing for improved inference of the network of regulatory interactions between genes. These gene regulatory networks, or GRNs, are used to visualize the causal regulatory relationships between regulators and their downstream target genes. Accordingly, these GRNs can represent spatial, temporal, and/or environmental regulations and can identify functional genes. This review summarizes recent computational approaches applied to different types of gene expression data to infer GRNs in the context of plant growth and development. Three stages of GRN inference are described: first, data collection and analysis based on the dataset type; second, network inference application based on data availability and proposed hypotheses; and third, validation based on in silico, in vivo, and in planta methods. In addition, this review relates data collection strategies to biological questions, organizes inference algorithms based on statistical methods and data types, discusses experimental design considerations, and provides guidelines for GRN inference with an emphasis on the benefits of integrative approaches, especially when a priori information is limited. Finally, this review concludes that computational frameworks integrating large-scale heterogeneous datasets are needed for a more accurate (e.g. fewer false interactions), detailed (e.g. discrimination between direct versus indirect interactions), and comprehensive (e.g. genetic regulation under various conditions and spatial locations) inference of GRNs.
Affiliation(s)
- Samiul Haque
- Electrical and Computer Engineering, North Carolina State University, Raleigh, USA
| | - Jabeen S Ahmad
- Plant and Microbial Biology, North Carolina State University, Raleigh, USA
| | - Natalie M Clark
- Plant and Microbial Biology, North Carolina State University, Raleigh, USA
| | - Cranos M Williams
- Electrical and Computer Engineering, North Carolina State University, Raleigh, USA.
| | - Rosangela Sozzani
- Plant and Microbial Biology, North Carolina State University, Raleigh, USA.
| |
|
19
|
Jurman G, Filosi M, Visintainer R, Riccadonna S, Furlanello C. Stability in GRN Inference. Methods Mol Biol 2019; 1883:323-346. [PMID: 30547407 DOI: 10.1007/978-1-4939-8882-2_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Reconstructing a gene regulatory network from one or more sets of omics measurements has been a major task of computational biology in the last 20 years. Despite an overwhelming number of algorithms proposed to solve the network inference problem either in the general scenario or in an ad-hoc tailored situation, assessing the stability of reconstruction is still an uncharted territory and exploratory studies mainly tackled theoretical aspects. We introduce here empirical stability, which is induced by variability of reconstruction as a function of data subsampling. By evaluating differences between networks that are inferred using different subsets of the same data we obtain quantitative indicators of the robustness of the algorithm, of the noise level affecting the data, and, overall, of the reliability of the reconstructed graph. We show that empirical stability can be used whenever no ground truth is available to compute a direct measure of the similarity between the inferred structure and the true network. The main ingredient here is a suite of indicators, called NetSI, providing statistics of distances between graphs generated by a given algorithm fed with different data subsets, where the chosen metric is the Hamming-Ipsen-Mikhailov (HIM) distance evaluating dissimilarity of graph topologies with shared nodes. Operatively, the NetSI family is demonstrated here on synthetic and high-throughput datasets, inferring graphs at different resolution levels (topology, direction, weight), showing how the stability indicators can be effectively used for the quantitative comparison of the stability of different reconstruction algorithms.
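The subsampling idea behind empirical stability can be sketched as follows. This is a rough illustration under stated assumptions, not the NetSI implementation: the inference step is a toy correlation threshold, and a normalized Hamming distance stands in for the HIM distance named in the abstract. All function names and parameter defaults are hypothetical.

```python
import numpy as np

def infer_network(expr, threshold=0.5):
    """Toy inference (an assumption): link genes whose |Pearson r| exceeds a threshold."""
    corr = np.corrcoef(expr, rowvar=False)    # rows are samples, columns are genes
    adj = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

def normalized_hamming(a, b):
    """Fraction of possible undirected edges on which two networks disagree."""
    upper = np.triu_indices(a.shape[0], k=1)
    return float(np.mean(a[upper] != b[upper]))

def empirical_stability(expr, n_resamples=20, fraction=0.8, threshold=0.5, seed=0):
    """Mean and standard deviation of pairwise distances between networks
    inferred from random subsets of the samples."""
    rng = np.random.default_rng(seed)
    n_samples = expr.shape[0]
    subset_size = max(2, int(fraction * n_samples))
    networks = []
    for _ in range(n_resamples):
        idx = rng.choice(n_samples, size=subset_size, replace=False)
        networks.append(infer_network(expr[idx], threshold))
    distances = [
        normalized_hamming(networks[i], networks[j])
        for i in range(len(networks))
        for j in range(i + 1, len(networks))
    ]
    return float(np.mean(distances)), float(np.std(distances))

# Synthetic data: 60 samples, 5 genes, with gene 1 closely tracking gene 0.
rng = np.random.default_rng(1)
expr = rng.normal(size=(60, 5))
expr[:, 1] = expr[:, 0] + 0.1 * rng.normal(size=60)
mean_distance, std_distance = empirical_stability(expr)
```

A mean distance near zero suggests that the reconstruction changes little when samples are dropped. Larger values point to an unstable algorithm, noisy data, or both, and no ground-truth network is needed to compute them.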
Affiliation(s)
| | | | - Roberto Visintainer
- The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
| | | | | |
|
20
|
Gupta P, Singh SK. Gene Regulatory Networks: Current Updates and Applications in Plant Biology. Energy, Environment, and Sustainability 2019. [DOI: 10.1007/978-981-15-0690-1_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
|
21
|
Wang Y, Cho DY, Lee H, Fear J, Oliver B, Przytycka TM. Reprogramming of regulatory network using expression uncovers sex-specific gene regulation in Drosophila. Nat Commun 2018; 9:4061. [PMID: 30283019 PMCID: PMC6170494 DOI: 10.1038/s41467-018-06382-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/24/2018] [Accepted: 08/13/2018] [Indexed: 02/07/2023]
Abstract
Gene regulatory networks (GRNs) describe regulatory relationships between transcription factors (TFs) and their target genes. Computational methods to infer GRNs typically combine evidence across different conditions to infer context-agnostic networks. We develop a method, Network Reprogramming using EXpression (NetREX), that constructs a context-specific GRN given context-specific expression data and a context-agnostic prior network. NetREX remodels the prior network to obtain the topology that provides the best explanation for expression data. Because NetREX utilizes prior network topology, we also develop PriorBoost, a method that evaluates a prior network in terms of its consistency with the expression data. We validate NetREX and PriorBoost using the "gold standard" E. coli GRN from the DREAM5 network inference challenge and apply them to construct sex-specific Drosophila GRNs. NetREX constructed sex-specific Drosophila GRNs that, on all applied measures, outperform networks obtained from other methods indicating that NetREX is an important milestone toward building more accurate GRNs.
Affiliation(s)
- Yijie Wang
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Dong-Yeon Cho
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Hangnoh Lee
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Justin Fear
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Brian Oliver
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA.
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA.
| |
|
22
|
Huang J, Zheng J, Yuan H, McGinnis K. Distinct tissue-specific transcriptional regulation revealed by gene regulatory networks in maize. BMC Plant Biology 2018; 18:111. [PMID: 29879919 PMCID: PMC6040155 DOI: 10.1186/s12870-018-1329-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 01/31/2018] [Accepted: 05/24/2018] [Indexed: 05/22/2023]
Abstract
BACKGROUND Transcription factors (TFs) are proteins that can bind to DNA sequences and regulate gene expression. Many TFs are master regulators in cells that contribute to tissue-specific and cell-type-specific gene expression patterns in eukaryotes. Maize has been a model organism for over one hundred years, but little is known about its tissue-specific gene regulation through TFs. In this study, we used a network approach to elucidate gene regulatory networks (GRNs) in four tissues (leaf, root, SAM and seed) in maize. We utilized GENIE3, a machine-learning algorithm, combined with a large quantity of RNA-Seq expression data to construct four tissue-specific GRNs. Unlike some other techniques, this approach does not depend on the availability of high-quality position weight matrices (PWMs), and can therefore predict GRNs for over 2000 TFs in maize. RESULTS Although many TFs were expressed across multiple tissues, a multi-tiered analysis predicted tissue-specific regulatory functions for many transcription factors. Some well-studied TFs emerged within the four tissue-specific GRNs, and the GRN predictions matched expectations based upon published results for many of these examples. Our GRNs were also validated by ChIP-Seq datasets (KN1, FEA4 and O2). Key TFs were identified for each tissue and matched expectations for key regulators in each tissue, including GO enrichment and identity with known regulatory factors for that tissue. We also found functional modules in each network by clustering analysis with the MCL algorithm. CONCLUSIONS By combining publicly available genome-wide expression data and network analysis, we can uncover GRNs at tissue-level resolution in maize. Since ChIP-Seq and PWMs are still limited in several model organisms, our study provides a uniform platform that can be adapted to any species with genome-wide expression data to construct GRNs. We also present a publicly available database, maize tissue-specific GRN (mGRN, https://www.bio.fsu.edu/mcginnislab/mgrn/), for easy querying. All source code and data are available on GitHub (https://github.com/timedreamer/maize_tissue-specific_GRN).
Affiliation(s)
- Ji Huang
- Department of Biological Science, Florida State University, Tallahassee, Florida, 32306, USA
| | - Juefei Zheng
- School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Hui Yuan
- Department of Statistics, Florida State University, Tallahassee, Florida, 32306, USA
| | - Karen McGinnis
- Department of Biological Science, Florida State University, Tallahassee, Florida, 32306, USA.
| |
|
23
|
Pirayre A, Couprie C, Duval L, Pesquet JC. BRANE Clust: Cluster-Assisted Gene Regulatory Network Inference Refinement. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:850-860. [PMID: 28368827 DOI: 10.1109/tcbb.2017.2688355] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/07/2023]
Abstract
Discovering meaningful gene interactions is crucial for the identification of novel regulatory processes in cells. Accurately building the related graphs remains challenging due to the large number of possible solutions from available data. Nonetheless, enforcing a priori constraints on the graph structure, such as modularity, may reduce network indeterminacy issues. BRANE Clust (Biologically-Related A priori Network Enhancement with Clustering) refines gene regulatory network (GRN) inference using cluster information. It works as a post-processing tool for inference methods (e.g., CLR, GENIE3). In BRANE Clust, the clustering is based on the inversion of a system of linear equations involving a graph-Laplacian matrix promoting a modular structure. Our approach is validated on DREAM4 and DREAM5 datasets with objective measures, showing significant comparative improvements. We provide additional insights on the discovery of novel regulatory or co-expressed links in the inferred Escherichia coli network evaluated using the STRING database. The comparative pertinence of clustering is discussed computationally (SIMoNe, WGCNA, X-means) and biologically (RegulonDB). BRANE Clust software is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-clust.html.
|
24
|
Mochida K, Koda S, Inoue K, Nishii R. Statistical and Machine Learning Approaches to Predict Gene Regulatory Networks From Transcriptome Datasets. Frontiers in Plant Science 2018; 9:1770. [PMID: 30555503 PMCID: PMC6281826 DOI: 10.3389/fpls.2018.01770] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 08/25/2018] [Accepted: 11/14/2018] [Indexed: 05/20/2023]
Abstract
Statistical and machine learning (ML)-based methods have recently advanced the construction of gene regulatory networks (GRNs) based on high-throughput biological datasets. GRNs underlie almost all cellular phenomena; hence, comprehensive GRN maps are essential tools to elucidate gene function, thereby facilitating the identification and prioritization of candidate genes for functional analysis. High-throughput gene expression datasets have yielded various statistical and ML-based algorithms to infer causal relationships between genes and decipher GRNs. This review summarizes the recent advancements in the computational inference of GRNs, based on large-scale transcriptome sequencing datasets of model plants and crops. We highlight strategies to select contextual genes for GRN inference, and statistical and ML-based methods for inferring GRNs based on transcriptome datasets from plants. Furthermore, we discuss the challenges and opportunities for the elucidation of GRNs based on large-scale datasets obtained from emerging transcriptomic applications, such as population-scale, single-cell level, and life-course transcriptome analyses.
Affiliation(s)
- Keiichi Mochida
- Bioproductivity Informatics Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Microalgae Production Control Technology Laboratory, RIKEN Baton Zone Program, RIKEN Cluster for Science, Technology and Innovation Hub, Yokohama, Japan
- Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
- Kihara Institute for Biological Research, Yokohama City University, Yokohama, Japan
- *Correspondence: Keiichi Mochida, Ryuei Nishii,
| | - Satoru Koda
- Graduate School of Mathematics, Kyushu University, Fukuoka, Japan
| | - Komaki Inoue
- Bioproductivity Informatics Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
| | - Ryuei Nishii
- Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan
- *Correspondence: Keiichi Mochida, Ryuei Nishii,
| |
|
25
|
Ma T, Zhang A. Reconstructing context-specific gene regulatory network and identifying modules and network rewiring through data integration. Methods 2017; 124:36-45. [PMID: 28529066 DOI: 10.1016/j.ymeth.2017.05.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/02/2017] [Accepted: 05/05/2017] [Indexed: 12/01/2022]
Abstract
Reconstructing context-specific transcriptional regulatory networks is crucial for deciphering the principles of regulatory mechanisms underlying various conditions. Recent studies that reconstructed transcriptional networks have focused on individual organisms or cell types and relied on data repositories of context-free regulatory relationships. Here we present a comprehensive framework to systematically derive putative regulator-target pairs in any given context by integrating context-specific transcriptional profiling and public data repositories of gene regulatory networks. Moreover, our framework can identify core regulatory modules and signature genes underlying global regulatory circuitry, and detect network rewiring and core rewired modules in different contexts by considering gene modules and edge (gene interaction) modules collaboratively. We applied our methods to autism RNA-seq data and produced biologically meaningful results. In particular, all 11 hub genes in a predicted rewired autistic regulatory subnetwork have been linked to autism based on literature review. The predicted rewired autistic regulatory network may shed new light on the disease mechanism.
Affiliation(s)
- Tianle Ma
- Department of Computer Science and Engineering, University at Buffalo (SUNY), Buffalo, NY 14260-2500, United States.
| | - Aidong Zhang
- Department of Computer Science and Engineering, University at Buffalo (SUNY), Buffalo, NY 14260-2500, United States.
| |
|