1
Gill JK, Chetty M, Lim S, Hallinan J. BioBERT based text mining for incorporating prior knowledge in the inference of genetic network models. Comput Biol Med 2025; 186:109623. [PMID: 39753024 DOI: 10.1016/j.compbiomed.2024.109623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2024] [Revised: 12/03/2024] [Accepted: 12/23/2024] [Indexed: 02/20/2025]
Abstract
Reconstruction of Gene Regulatory Networks (GRNs) is essential for understanding gene interactions, their impact on cellular processes, and the manifestation of diseases, including drug discovery. Among the various mathematical and dynamic models used for GRN reconstruction, the S-system model, comprising non-linear differential equations, is widely utilised to capture the behaviour of complex biological systems with non-linear and time-dependent interactions. However, as the network size increases, the computational demand for network inference grows due to a greater number of parameters to estimate, significantly impacting the performance of optimisation algorithms. Incorporating biologically relevant prior knowledge using advanced Natural Language Processing methods can effectively address this limitation by reducing the number of parameters that must be computed, thereby enhancing speed and accuracy. In this study, we introduce PRESS, an integrated Prior Knowledge Enhanced S-system model for accurate GRN reconstructions, which seamlessly automates the incorporation of prior knowledge obtained through systematic extraction from published literature. PRESS exploits our recently reported BioBERT-based Gene Interaction Extraction Framework with enhanced targeted genetic relation extraction and the prediction of regulatory genes. The effectiveness of the optimisation algorithm in learning model parameters is further enhanced through a novel fitness evaluation, which limits the maximum number of regulatory genes to mimic real GRNs. This integrated method, combining a robust relation extraction framework for automated prior knowledge with a GRN reconstruction model, is novel and has not been reported previously. Experimental results obtained using Escherichia coli subnetworks and the benchmark SOS dataset demonstrate substantial reductions in computational cost while simultaneously improving prediction accuracy.
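To make the parameter-growth argument in this abstract concrete, the following is a minimal sketch of the standard S-system form, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, which has 2n(n+1) parameters for n genes. It is not the PRESS implementation; the function names, the RK4 integrator, and the toy parameter values are illustrative assumptions.

```python
import numpy as np


def s_system_rhs(x, alpha, beta, g, h):
    """Standard S-system right-hand side.

    dx_i/dt = alpha_i * prod_j x_j**g[i, j] - beta_i * prod_j x_j**h[i, j]
    x, alpha, beta have shape (n,); g and h have shape (n, n).
    """
    production = alpha * np.prod(x ** g, axis=1)   # x broadcasts across rows of g
    degradation = beta * np.prod(x ** h, axis=1)
    return production - degradation


def count_parameters(n):
    """Number of S-system parameters for n genes: 2n rate constants + 2n^2 kinetic orders."""
    return 2 * n * (n + 1)


def simulate(x0, alpha, beta, g, h, dt=0.01, steps=1000):
    """Integrate the S-system with a fixed-step RK4 scheme (hypothetical helper)."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(steps):
        k1 = s_system_rhs(x, alpha, beta, g, h)
        k2 = s_system_rhs(x + 0.5 * dt * k1, alpha, beta, g, h)
        k3 = s_system_rhs(x + 0.5 * dt * k2, alpha, beta, g, h)
        k4 = s_system_rhs(x + dt * k3, alpha, beta, g, h)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        trajectory.append(x.copy())
    return np.array(trajectory)
```

Because the parameter count grows quadratically with the number of genes, any prior knowledge that fixes kinetic orders to zero (no regulation) shrinks the search space the optimiser has to explore.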
Affiliation(s)
- Jaskaran Kaur Gill
  - Health Innovation and Transformation Centre, Federation University, Victoria, 3842, Australia
- Madhu Chetty
  - Health Innovation and Transformation Centre, Federation University, Victoria, 3842, Australia
- Suryani Lim
  - Health Innovation and Transformation Centre, Federation University, Victoria, 3842, Australia
- Jennifer Hallinan
  - Health Innovation and Transformation Centre, Federation University, Victoria, 3842, Australia; BioThink, Queensland, 4020, Australia
2
Azhar HMF, Saeed MT, Jabeen I. Dynamics simulations of hypoxia inducible factor-1 regulatory network in cancer using formal verification techniques. Front Mol Biosci 2024; 11:1386930. [PMID: 39606028 PMCID: PMC11599740 DOI: 10.3389/fmolb.2024.1386930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 10/28/2024] [Indexed: 11/29/2024] Open
Abstract
Hypoxia-inducible factor-1 (HIF-1) regulates cell growth, protein translation, and metabolic pathways and has therefore been advocated as a promising biological target for therapeutic interventions against cancer. In general, hyperactivation of HIF-1 in cancer has been associated with increased expression of glucose transporter type-1 (GLUT-1), thus enhancing glucose consumption and hyperactivating metabolic pathways. The collective behavior of GLUT-1 along with the previously known key players AKT, OGT, and VEGF is not fully characterized, and it remains unclear how glucose uptake through this pathway (HIF-1) relates to cancer progression. This study uses a René Thomas qualitative modeling framework to comprehend the signaling dynamics of HIF-1 and its interlinked proteins, including VEGF, ERK, AKT, GLUT-1, β-catenin, C-MYC, OGT, and p53, to elucidate the regulatory mechanisms of HIF-1 in cancer. Our dynamic model reveals that continuous activation of p53, β-catenin, and AKT in cyclic conditions leads to oscillations representing homeostasis or a stable recovery state. Any deviation from this cycle results in a cancerous or pathogenic state. The model shows that overexpression of VEGF activates ERK and GLUT-1, leading to more aggressive tumor growth in a cancerous state. Moreover, it is observed that collective modulation of VEGF, ERK, and β-catenin is required for therapeutic intervention because these genes enhance the expression of GLUT-1 and play a significant role in cancer progression and angiogenesis. Additionally, SimBiology simulation unveils dynamic molecular interactions, emphasizing the need for targeted therapeutics to effectively regulate VEGF and ERK concentrations to modulate cancer cell proliferation.
Affiliation(s)
- Ishrat Jabeen
  - School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
3
Panahi B, Khalilpour Shadbad R. Navigating the microalgal maze: a comprehensive review of recent advances and future perspectives in biological networks. Planta 2024; 260:114. [PMID: 39367989 DOI: 10.1007/s00425-024-04543-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 09/28/2024] [Indexed: 10/07/2024]
Abstract
MAIN CONCLUSION PPI analysis deepens our knowledge in critical processes like carbon fixation and nutrient sensing. Moreover, signaling networks, including pathways like MAPK/ERK and TOR, provide valuable information in how microalgae respond to environmental changes and stress. Additionally, species-species interaction networks for microalgae provide a comprehensive understanding of how different species interact within their environments. This review examines recent advancements in the study of biological networks within microalgae, with a focus on the intricate interactions that define these organisms. It emphasizes how network biology, an interdisciplinary field, offers valuable insights into microalgae functions through various methodologies. Crucial approaches, such as protein-protein interaction (PPI) mapping utilizing yeast two-hybrid screening and mass spectrometry, are essential for comprehending cellular processes and optimizing functions, such as photosynthesis and fatty acid biosynthesis. The application of advanced computational methods and information mining has significantly improved PPI analysis, revealing networks involved in critical processes like carbon fixation and nutrient sensing. The review also encompasses transcriptional networks, which play a role in gene regulation and stress responses, as well as metabolic networks represented by genome-scale metabolic models (GEMs), which aid in strain optimization and the prediction of metabolic outcomes. Furthermore, signaling networks, including pathways like MAPK/ERK and TOR, are crucial for understanding how microalgae respond to environmental changes and stress. Additionally, species-species interaction networks for microalgae provide a comprehensive understanding of how different species interact within their environments. 
The integration of these network biology approaches has deepened our understanding of microalgal interactions, paving the way for more efficient cultivation and new industrial applications.
Affiliation(s)
- Bahman Panahi
  - Department of Genomics, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, 5156915-598, Iran
- Robab Khalilpour Shadbad
  - Department of Cellular and Molecular Biology, Faculty of Science, Azarbaijan Shahid Madani University, Tabriz, Iran
4
Wu Y, Zhou D, Hu J. Reconstruction of gene regulatory networks for Caenorhabditis elegans using tree-shaped gene expression data. Brief Bioinform 2024; 25:bbae396. [PMID: 39133097 PMCID: PMC11318059 DOI: 10.1093/bib/bbae396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 06/11/2024] [Accepted: 08/07/2024] [Indexed: 08/13/2024] Open
Abstract
Constructing gene regulatory networks is a widely adopted approach for investigating gene regulation, offering diverse applications in biology and medicine. A great deal of research focuses on using time series data or single-cell RNA-sequencing data to infer gene regulatory networks. However, such gene expression data lack either cellular or temporal information. Fortunately, the advent of time-lapse confocal laser microscopy enables biologists to obtain tree-shaped gene expression data of Caenorhabditis elegans, achieving both cellular and temporal resolution. Although such tree-shaped data provide abundant knowledge, they pose challenges such as non-pairwise time series, leading to inaccuracy in downstream analysis. To address this issue, a comprehensive framework for data integration and a novel Bayesian approach based on a Boolean network with time delay are proposed. The pre-screening process and a Markov Chain Monte Carlo algorithm are applied to obtain the parameter estimates. Simulation studies show that our method outperforms existing Boolean network inference algorithms. Leveraging the proposed approach, gene regulatory networks for five subtrees are reconstructed based on the real tree-shaped datasets of Caenorhabditis elegans, where some gene regulatory relationships confirmed in previous genetic studies are recovered. Also, heterogeneity of regulatory relationships in different cell lineage subtrees is detected. Furthermore, the exploration of potential gene regulatory relationships that bear importance in human diseases is undertaken. All source code is available at the GitHub repository https://github.com/edawu11/BBTD.git.
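As an illustration of what a Boolean network with time delay means in general, the sketch below simulates a synchronous Boolean network in which each regulator acts on its target after its own delay. It is a generic toy model, not the BBTD code from the repository above; the function name, the rule encoding, and the two-gene example are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A rule pairs a Boolean function with its inputs, each given as (regulator, delay).
# A delay of d >= 1 means the target at time t+1 reads the regulator at time t+1-d.
Rule = Tuple[Callable[..., bool], Sequence[Tuple[str, int]]]


def simulate_delayed_boolean(
    initial: Dict[str, bool], rules: Dict[str, Rule], steps: int
) -> List[Dict[str, bool]]:
    """Simulate a synchronous Boolean network with per-edge time delays."""
    history = [dict(initial)]
    for t in range(steps):
        next_state = {}
        for gene, current in history[-1].items():
            if gene not in rules:
                next_state[gene] = current  # unregulated genes keep their state
                continue
            func, inputs = rules[gene]
            # States before time 0 are approximated by the initial state.
            args = [history[max(0, t + 1 - delay)][reg] for reg, delay in inputs]
            next_state[gene] = bool(func(*args))
        history.append(next_state)
    return history


# Toy example: A toggles every step; B copies A with a delay of two steps.
toy_rules: Dict[str, Rule] = {
    "A": (lambda a: not a, [("A", 1)]),
    "B": (lambda a: a, [("A", 2)]),
}
toy_history = simulate_delayed_boolean({"A": True, "B": False}, toy_rules, steps=4)
```

In an inference setting the delays and Boolean functions are the unknowns; this sketch only shows the forward simulation that such a model implies.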
Affiliation(s)
- Yida Wu
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
- Da Zhou
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
- Jie Hu
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
5
Park JH, Hothi P, de Lomana ALG, Pan M, Calder R, Turkarslan S, Wu WJ, Lee H, Patel AP, Cobbs C, Huang S, Baliga NS. Gene regulatory network topology governs resistance and treatment escape in glioma stem-like cells. Sci Adv 2024; 10:eadj7706. [PMID: 38848360 PMCID: PMC11160475 DOI: 10.1126/sciadv.adj7706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 05/03/2024] [Indexed: 06/09/2024]
Abstract
Poor prognosis and drug resistance in glioblastoma (GBM) can result from cellular heterogeneity and treatment-induced shifts in phenotypic states of tumor cells, including dedifferentiation into glioma stem-like cells (GSCs). This rare tumorigenic cell subpopulation resists temozolomide, undergoes proneural-to-mesenchymal transition (PMT) to evade therapy, and drives recurrence. Through inference of transcriptional regulatory networks (TRNs) of patient-derived GSCs (PD-GSCs) at single-cell resolution, we demonstrate how the topology of transcription factor interaction networks drives distinct trajectories of cell-state transitions in PD-GSCs resistant or susceptible to cytotoxic drug treatment. By experimentally testing predictions based on TRN simulations, we show that drug treatment drives surviving PD-GSCs along a trajectory of intermediate states, exposing vulnerability to potentiated killing by siRNA or a second drug targeting treatment-induced transcriptional programs governing nongenetic cell plasticity. Our findings demonstrate an approach to uncover TRN topology and use it to rationally predict combinatorial treatments that disrupt acquired resistance in GBM.
Affiliation(s)
- Parvinder Hothi
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA, USA
- Min Pan
  - Institute for Systems Biology, Seattle, WA, USA
- Wei-Ju Wu
  - Institute for Systems Biology, Seattle, WA, USA
- Hwahyung Lee
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA, USA
- Anoop P. Patel
  - Department of Neurosurgery, Preston Robert Tisch Brain Tumor Center, Duke University, Durham, NC, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Charles Cobbs
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA, USA
- Sui Huang
  - Institute for Systems Biology, Seattle, WA, USA
- Nitin S. Baliga
  - Institute for Systems Biology, Seattle, WA, USA
  - Departments of Microbiology, Biology, and Molecular Engineering Sciences, University of Washington, Seattle, WA, USA
6
Stock M, Popp N, Fiorentino J, Scialdone A. Topological benchmarking of algorithms to infer gene regulatory networks from single-cell RNA-seq data. Bioinformatics 2024; 40:btae267. [PMID: 38627250 PMCID: PMC11096270 DOI: 10.1093/bioinformatics/btae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 02/28/2024] [Accepted: 04/16/2024] [Indexed: 05/18/2024] Open
Abstract
MOTIVATION In recent years, many algorithms for inferring gene regulatory networks from single-cell transcriptomic data have been published. Several studies have evaluated their accuracy in estimating the presence of an interaction between pairs of genes. However, these benchmarking analyses do not quantify the algorithms' ability to capture structural properties of networks, which are fundamental, e.g., for studying the robustness of a gene network to external perturbations. Here, we devise a three-step benchmarking pipeline called STREAMLINE that quantifies the ability of algorithms to capture topological properties of networks and identify hubs. RESULTS To this aim, we use data simulated from different types of networks as well as experimental data from three different organisms. We apply our benchmarking pipeline to four inference algorithms and provide guidance on which algorithm should be used depending on the global network property of interest. AVAILABILITY AND IMPLEMENTATION STREAMLINE is available at https://github.com/ScialdoneLab/STREAMLINE. The data generated in this study are available at https://doi.org/10.5281/zenodo.10710444.
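To illustrate the kind of topological comparison this abstract describes, the sketch below scores how well an inferred network recovers the hubs of a reference network, using out-degree as the hub criterion and the Jaccard index as the overlap measure. It is not the STREAMLINE pipeline; the hub definition, the function names, and the toy edge lists are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def top_hubs(edges: Iterable[Edge], k: int) -> Set[str]:
    """Return the k regulators with the highest out-degree (ties broken by name)."""
    degree = Counter(src for src, _ in edges)
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets; defined as 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy reference and inferred networks (hypothetical data).
reference = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
inferred = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "B"), ("B", "D")]

hub_overlap = jaccard(top_hubs(reference, k=2), top_hubs(inferred, k=2))
```

An edge-level accuracy score could be high for the inferred network above while the hub overlap stays low, which is the gap between edge-wise and structural evaluation that the abstract points to.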
Affiliation(s)
- Marco Stock
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich 85354, Germany
- Niclas Popp
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Jonathan Fiorentino
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
- Antonio Scialdone
  - Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 81377, Germany
  - Institute of Functional Epigenetics, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich 85764, Germany
7
Murmu S, Sinha D, Chaurasia H, Sharma S, Das R, Jha GK, Archak S. A review of artificial intelligence-assisted omics techniques in plant defense: current trends and future directions. Front Plant Sci 2024; 15:1292054. [PMID: 38504888 PMCID: PMC10948452 DOI: 10.3389/fpls.2024.1292054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Accepted: 01/24/2024] [Indexed: 03/21/2024]
Abstract
Plants intricately deploy defense systems to counter diverse biotic and abiotic stresses. Omics technologies, spanning genomics, transcriptomics, proteomics, and metabolomics, have revolutionized the exploration of plant defense mechanisms, unraveling molecular intricacies in response to various stressors. However, the complexity and scale of omics data necessitate sophisticated analytical tools for meaningful insights. This review delves into the application of artificial intelligence algorithms, particularly machine learning and deep learning, as promising approaches for deciphering complex omics data in plant defense research. The overview encompasses key omics techniques and addresses the challenges and limitations inherent in current AI-assisted omics approaches. Moreover, it contemplates potential future directions in this dynamic field. In summary, AI-assisted omics techniques present a robust toolkit, enabling a profound understanding of the molecular foundations of plant defense and paving the way for more effective crop protection strategies amidst climate change and emerging diseases.
Affiliation(s)
- Sneha Murmu
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Dipro Sinha
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Himanshushekhar Chaurasia
  - Central Institute for Research on Cotton Technology, Indian Council of Agricultural Research (ICAR), Mumbai, India
- Soumya Sharma
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Ritwika Das
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Girish Kumar Jha
  - Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Sunil Archak
  - National Bureau of Plant Genetic Resources, Indian Council of Agricultural Research (ICAR), New Delhi, India
8
Park JH, Hothi P, Lopez Garcia de Lomana A, Pan M, Calder R, Turkarslan S, Wu WJ, Lee H, Patel AP, Cobbs C, Huang S, Baliga NS. Gene regulatory network topology governs resistance and treatment escape in glioma stem-like cells. bioRxiv 2024:2024.02.02.578510. [PMID: 38370784 PMCID: PMC10871280 DOI: 10.1101/2024.02.02.578510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Poor prognosis and drug resistance in glioblastoma (GBM) can result from cellular heterogeneity and treatment-induced shifts in phenotypic states of tumor cells, including dedifferentiation into glioma stem-like cells (GSCs). This rare tumorigenic cell subpopulation resists temozolomide, undergoes proneural-to-mesenchymal transition (PMT) to evade therapy, and drives recurrence. Through inference of transcriptional regulatory networks (TRNs) of patient-derived GSCs (PD-GSCs) at single-cell resolution, we demonstrate how the topology of transcription factor interaction networks drives distinct trajectories of cell state transitions in PD-GSCs resistant or susceptible to cytotoxic drug treatment. By experimentally testing predictions based on TRN simulations, we show that drug treatment drives surviving PD-GSCs along a trajectory of intermediate states, exposing vulnerability to potentiated killing by siRNA or a second drug targeting treatment-induced transcriptional programs governing non-genetic cell plasticity. Our findings demonstrate an approach to uncover TRN topology and use it to rationally predict combinatorial treatments that disrupt acquired resistance in GBM.
Affiliation(s)
- Parvinder Hothi
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA
- Min Pan
  - Institute for Systems Biology, Seattle, WA
- Wei-Ju Wu
  - Institute for Systems Biology, Seattle, WA
- Hwahyung Lee
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA
- Anoop P Patel
  - Department of Neurosurgery, Preston Robert Tisch Brain Tumor Center, Duke University, Durham, NC
  - Center for Advanced Genomic Technologies, Duke University, Durham, NC
- Charles Cobbs
  - Ivy Center for Advanced Brain Tumor Treatment, Swedish Neuroscience Institute, Seattle, WA
- Sui Huang
  - Institute for Systems Biology, Seattle, WA
- Nitin S Baliga
  - Institute for Systems Biology, Seattle, WA
  - Departments of Microbiology, Biology, and Molecular Engineering Sciences, University of Washington, Seattle, WA
9
Manosalva Pérez N, Ferrari C, Engelhorn J, Depuydt T, Nelissen H, Hartwig T, Vandepoele K. MINI-AC: inference of plant gene regulatory networks using bulk or single-cell accessible chromatin profiles. Plant J 2024; 117:280-301. [PMID: 37788349 DOI: 10.1111/tpj.16483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/13/2023] [Accepted: 09/16/2023] [Indexed: 10/05/2023]
Abstract
Gene regulatory networks (GRNs) represent the interactions between transcription factors (TF) and their target genes. Plant GRNs control transcriptional programs involved in growth, development, and stress responses, ultimately affecting diverse agricultural traits. While recent developments in accessible chromatin (AC) profiling technologies make it possible to identify context-specific regulatory DNA, learning the underlying GRNs remains a major challenge. We developed MINI-AC (Motif-Informed Network Inference based on Accessible Chromatin), a method that combines AC data from bulk or single-cell experiments with TF binding site (TFBS) information to learn GRNs in plants. We benchmarked MINI-AC using bulk AC datasets from different Arabidopsis thaliana tissues and showed that it outperforms other methods to identify correct TFBS. In maize, a crop with a complex genome and abundant distal AC regions, MINI-AC successfully inferred leaf GRNs with experimentally confirmed, both proximal and distal, TF-target gene interactions. Furthermore, we showed that both AC regions and footprints are valid alternatives to infer AC-based GRNs with MINI-AC. Finally, we combined MINI-AC predictions from bulk and single-cell AC datasets to identify general and cell-type specific maize leaf regulators. Focusing on C4 metabolism, we identified diverse regulatory interactions in specialized cell types for this photosynthetic pathway. MINI-AC represents a powerful tool for inferring accurate AC-derived GRNs in plants and identifying known and novel candidate regulators, improving our understanding of gene regulation in plants.
Affiliation(s)
- Nicolás Manosalva Pérez
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, 9052, Ghent, Belgium
- Camilla Ferrari
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, 9052, Ghent, Belgium
- Julia Engelhorn
  - Molecular Physiology Department, Heinrich-Heine University, 40225, Düsseldorf, Germany
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
- Thomas Depuydt
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, 9052, Ghent, Belgium
- Hilde Nelissen
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, 9052, Ghent, Belgium
- Thomas Hartwig
  - Molecular Physiology Department, Heinrich-Heine University, 40225, Düsseldorf, Germany
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
  - Cluster of Excellence on Plant Sciences, Düsseldorf, Germany
- Klaas Vandepoele
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, 9052, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, 9052, Ghent, Belgium
10
Feng K, Jiang H, Yin C, Sun H. Gene regulatory network inference based on causal discovery integrating with graph neural network. Quant Biol 2023; 11:434-450. [DOI: 10.1002/qub2.26] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2023] [Accepted: 04/04/2023] [Indexed: 01/06/2025]
Abstract
Gene regulatory network (GRN) inference from gene expression data is a significant approach to understanding aspects of the biological system. Compared with generalized correlation‐based methods, causality‐inspired ones seem more rational to infer regulatory relationships. We propose GRINCD, a novel GRN inference framework empowered by graph representation learning and causal asymmetric learning, considering both linear and non‐linear regulatory relationships. First, high‐quality representation of each gene is generated using graph neural network. Then, we apply the additive noise model to predict the causal regulation of each regulator‐target pair. Additionally, we design two channels and finally assemble them for robust prediction. Through comprehensive comparisons of our framework with state‐of‐the‐art methods based on different principles on numerous datasets of diverse types and scales, the experimental results show that our framework achieves superior or comparable performance under various evaluation metrics. Our work provides a new clue for constructing GRNs, and our proposed framework GRINCD also shows potential in identifying key factors affecting cancer development.
Affiliation(s)
- Ke Feng
  - School of Artificial Intelligence Jilin University Changchun China
- Hongyang Jiang
  - School of Artificial Intelligence Jilin University Changchun China
- Chaoyi Yin
  - School of Artificial Intelligence Jilin University Changchun China
- Huiyan Sun
  - School of Artificial Intelligence Jilin University Changchun China
  - International Center of Future Science Jilin University Changchun China
  - Engineering Research Center of Knowledge‐Driven Human‐Machine Intelligence Ministry of Education Changchun China
11
Li S, Yan B, Wu B, Su J, Lu J, Lam TW, Boheler KR, Poon ENY, Luo R. Integrated modeling framework reveals co-regulation of transcription factors, miRNAs and lncRNAs on cardiac developmental dynamics. Stem Cell Res Ther 2023; 14:247. [PMID: 37705079 PMCID: PMC10500942 DOI: 10.1186/s13287-023-03442-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 08/07/2023] [Indexed: 09/15/2023] Open
Abstract
AIMS Dissecting complex interactions among transcription factors (TFs), microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is central to understanding heart development and function. Although computational approaches and platforms have been described to infer relationships among regulatory factors and genes, current approaches do not adequately account for how highly diverse, interacting regulators that include noncoding RNAs (ncRNAs) control cardiac gene expression dynamics over time. METHODS To overcome this limitation, we devised an integrated framework, cardiac gene regulatory modeling (CGRM), that integrates the LogicTRN and regulatory component analysis bioinformatics modeling platforms to infer complex regulatory mechanisms. We then used CGRM to identify and compare the TF-ncRNA gene regulatory networks that govern early- and late-stage cardiomyocytes (CMs) generated by in vitro differentiation of human pluripotent stem cells (hPSC) and ventricular and atrial CMs isolated during in vivo human cardiac development. RESULTS Comparisons of in vitro versus in vivo derived CMs revealed conserved regulatory networks among TFs and ncRNAs in early cells that significantly diverged in late-stage cells. We report that cardiac genes ("heart targets") expressed in early-stage hPSC-CMs are primarily regulated by MESP1, miR-1, miR-23, lncRNAs NEAT1 and MALAT1, while GATA6, HAND2, miR-200c, NEAT1 and MALAT1 are critical for late hPSC-CMs. The inferred TF-miRNA-lncRNA networks regulating heart development and contraction were similar among early-stage CMs, among individual hPSC-CM datasets and between in vitro and in vivo samples. However, genes related to apoptosis, cell cycle and proliferation, and transmembrane transport showed a high degree of divergence between in vitro and in vivo derived late-stage CMs. Overall, late-, but not early-stage CMs diverged greatly in the expression of "heart target" transcripts and their regulatory mechanisms.
CONCLUSIONS In conclusion, we find that hPSC-CMs are regulated in a cell autonomous manner during early development that diverges significantly as a function of time when compared to in vivo derived CMs. These findings demonstrate the feasibility of using CGRM to reveal dynamic and complex transcriptional and posttranscriptional regulatory interactions that underlie cell directed versus environment-dependent CM development. These results with in vitro versus in vivo derived CMs thus establish this approach for detailed analyses of heart disease and for the analysis of cell regulatory systems in other biomedical fields.
Collapse
Affiliation(s)
- Shumin Li
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Bin Yan
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
- State Key Laboratory of Pharmaceutical Biotechnology, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Binbin Wu
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Centre for Cardiovascular Genomics and Medicine, Lui Che Woo Institute of Innovative Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Junhao Su
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Jianliang Lu
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Tak-Wah Lam
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Kenneth R Boheler
- The Division of Cardiology, Department of Medicine and The Whiting School of Engineering, Department of Biomedical Engineering, The Johns Hopkins University, Baltimore, MD, 21205, USA.
| | - Ellen Ngar-Yun Poon
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China.
- Centre for Cardiovascular Genomics and Medicine, Lui Che Woo Institute of Innovative Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong, China.
- Hong Kong Hub of Paediatric Excellence (HK HOPE), The Chinese University of Hong Kong, Kowloon Bay, Hong Kong, China.
| | - Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China.
|
12
|
Li R, Rozum JC, Quail MM, Qasim MN, Sindi SS, Nobile CJ, Albert R, Hernday AD. Inferring gene regulatory networks using transcriptional profiles as dynamical attractors. PLoS Comput Biol 2023; 19:e1010991. [PMID: 37607190 PMCID: PMC10473541 DOI: 10.1371/journal.pcbi.1010991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2023] [Revised: 09/01/2023] [Accepted: 07/19/2023] [Indexed: 08/24/2023] Open
Abstract
Genetic regulatory networks (GRNs) regulate the flow of genetic information from the genome to expressed messenger RNAs (mRNAs) and thus are critical to controlling the phenotypic characteristics of cells. Numerous methods exist for profiling mRNA transcript levels and identifying protein-DNA binding interactions at the genome-wide scale. These enable researchers to determine the structure and output of transcriptional regulatory networks, but uncovering the complete structure and regulatory logic of GRNs remains a challenge. The field of GRN inference aims to meet this challenge using computational modeling to derive the structure and logic of GRNs from experimental data and to encode this knowledge in Boolean networks, Bayesian networks, ordinary differential equation (ODE) models, or other modeling frameworks. However, most existing models do not incorporate dynamic transcriptional data since it has historically been less widely available in comparison to "static" transcriptional data. We report the development of an evolutionary algorithm-based ODE modeling approach (named EA) that integrates kinetic transcription data and the theory of attractor matching to infer GRN architecture and regulatory logic. Our method outperformed six leading GRN inference methods, none of which incorporate kinetic transcriptional data, in predicting regulatory connections among TFs when applied to a small-scale engineered synthetic GRN in Saccharomyces cerevisiae. Moreover, we demonstrate the potential of our method to predict unknown transcriptional profiles that would be produced upon genetic perturbation of the GRN governing a two-state cellular phenotypic switch in Candida albicans. We established an iterative refinement strategy to facilitate candidate selection for experimentation; the experimental results in turn provide validation or improvement for the model. 
In this way, our GRN inference approach can expedite the development of a sophisticated mathematical model that can accurately describe the structure and dynamics of the in vivo GRN.
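One simple way to read the attractor-matching idea is sketched below: a candidate ODE model is scored by how far each observed expression profile drifts when used as the model's initial state. This is an illustrative toy only, not the authors' evolutionary algorithm; the two-gene mutual-repression model, the Hill-type rate law, and the parameters `beta` and `n` are all assumptions introduced here.

```python
def simulate(x0, steps=5000, dt=0.01, beta=4.0, n=2):
    """Euler-integrate a hypothetical two-gene mutual-repression ODE model."""
    a, b = x0
    for _ in range(steps):
        da = beta / (1.0 + b ** n) - a  # gene A: repressed by B, linear decay
        db = beta / (1.0 + a ** n) - b  # gene B: repressed by A, linear decay
        a, b = a + dt * da, b + dt * db
    return a, b


def attractor_mismatch(observed_profiles, steps=5000):
    """Sum of squared drifts of each observed profile under the model.

    A profile that is a steady state (attractor) of the model barely moves,
    so a low score means the model reproduces the observed profiles.
    """
    total = 0.0
    for profile in observed_profiles:
        end = simulate(profile, steps=steps)
        total += sum((e - p) ** 2 for e, p in zip(end, profile))
    return total


# Profiles close to the model's two stable states score near zero,
# whereas a profile far from either state scores much higher.
good = attractor_mismatch([(3.7320508, 0.2679492), (0.2679492, 3.7320508)])
bad = attractor_mismatch([(2.0, 1.0)])
print(good, bad)
```

In a search over candidate models, a score of this kind could serve as one term of a fitness function; how the published method actually combines kinetic data with attractor matching is not specified in the abstract.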
Affiliation(s)
- Ruihao Li
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Jordan C. Rozum
- Department of Systems Science and Industrial Engineering, Binghamton University (State University of New York), Binghamton, New York, United States of America
| | - Morgan M. Quail
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Mohammad N. Qasim
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Suzanne S. Sindi
- Department of Applied Mathematics, University of California, Merced, Merced, California, United States of America
| | - Clarissa J. Nobile
- Department of Molecular Cell Biology, University of California, Merced, Merced, California, United States of America
- Health Sciences Research Institute, University of California, Merced, Merced, California, United States of America
| | - Réka Albert
- Department of Physics, Pennsylvania State University, University Park, University Park, Pennsylvania, United States of America
- Department of Biology, Pennsylvania State University, University Park, University Park, Pennsylvania, United States of America
| | - Aaron D. Hernday
- Department of Molecular Cell Biology, University of California, Merced, Merced, California, United States of America
- Health Sciences Research Institute, University of California, Merced, Merced, California, United States of America
|
13
|
Fang Z, Ford AJ, Hu T, Zhang N, Mantalaris A, Coskun AF. Subcellular spatially resolved gene neighborhood networks in single cells. CELL REPORTS METHODS 2023; 3:100476. [PMID: 37323566 PMCID: PMC10261906 DOI: 10.1016/j.crmeth.2023.100476] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/30/2022] [Revised: 02/18/2023] [Accepted: 04/18/2023] [Indexed: 06/17/2023]
Abstract
Image-based spatial omics methods such as fluorescence in situ hybridization (FISH) generate molecular profiles of single cells at single-molecule resolution. Current spatial transcriptomics methods focus on the distribution of single genes. However, the spatial proximity of RNA transcripts can play an important role in cellular function. We demonstrate a spatially resolved gene neighborhood network (spaGNN) pipeline for the analysis of subcellular gene proximity relationships. In spaGNN, machine-learning-based clustering of subcellular spatial transcriptomics data yields subcellular density classes of multiplexed transcript features. The nearest-neighbor analysis produces heterogeneous gene proximity maps in distinct subcellular regions. We illustrate the cell-type-distinguishing capability of spaGNN using multiplexed error-robust FISH data of fibroblast and U2-OS cells and sequential FISH data of mesenchymal stem cells (MSCs), revealing tissue-source-specific MSC transcriptomics and spatial distribution characteristics. Overall, the spaGNN approach expands the spatial features that can be used for cell-type classification tasks.
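The nearest-neighbor step mentioned in the abstract can be illustrated with a minimal sketch that counts, for each gene pair, how often a transcript of one gene has a transcript of the other as its closest neighbor. This is not the spaGNN pipeline itself; the function name, the input layout (gene label plus 2D coordinates), and the example transcripts are assumptions made for illustration.

```python
from collections import Counter
from math import dist


def gene_proximity_counts(transcripts):
    """Count nearest-neighbor gene pairs among (gene, (x, y)) transcript spots.

    Brute-force O(n^2) search; a spatial index would be needed for real data.
    """
    counts = Counter()
    for i, (gene_i, pos_i) in enumerate(transcripts):
        nearest = min(
            (j for j in range(len(transcripts)) if j != i),
            key=lambda j: dist(pos_i, transcripts[j][1]),
        )
        # Store pairs unordered so (A, B) and (B, A) share one counter.
        pair = tuple(sorted((gene_i, transcripts[nearest][0])))
        counts[pair] += 1
    return counts


# Hypothetical transcript spots within a single cell.
example_spots = [
    ("ACTB", (0.0, 0.0)),
    ("GAPDH", (1.0, 0.0)),
    ("ACTB", (10.0, 10.0)),
    ("MYC", (10.0, 11.0)),
]
proximity = gene_proximity_counts(example_spots)
print(proximity)
```

Running the same count separately within each subcellular region would give region-specific proximity maps, which is the kind of output the abstract describes.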
Affiliation(s)
- Zhou Fang
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Machine Learning Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
| | - Adam J. Ford
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
| | - Thomas Hu
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
| | - Nicholas Zhang
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
| | - Athanasios Mantalaris
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
| | - Ahmet F. Coskun
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
- Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
14
|
Alali M, Imani M. Reinforcement Learning Data-Acquiring for Causal Inference of Regulatory Networks. PROCEEDINGS OF THE ... AMERICAN CONTROL CONFERENCE. AMERICAN CONTROL CONFERENCE 2023; 2023:3957-3964. [PMID: 37521901 PMCID: PMC10382224 DOI: 10.23919/acc55779.2023.10155867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/01/2023]
Abstract
Gene regulatory networks (GRNs) consist of multiple interacting genes whose activities govern various cellular processes. The limitations in genomics data and the complexity of the interactions between components often pose huge uncertainties in the models of these biological systems. Meanwhile, inferring/estimating the interactions between components of the GRNs using data acquired from the normal condition of these biological systems is a challenging or, in some cases, an impossible task. Perturbation is a well-known genomics approach that aims to excite targeted components to gather useful data from these systems. This paper models GRNs using the Boolean network with perturbation, where the network uncertainty appears in terms of unknown interactions between genes. Unlike the existing heuristics and greedy data-acquiring methods, this paper provides an optimal Bayesian formulation of the data-acquiring process in the reinforcement learning context, where the actions are perturbations, and the reward measures step-wise improvement in the inference accuracy. We develop a semi-gradient reinforcement learning method with function approximation for learning near-optimal data-acquiring policy. The obtained policy yields near-exact Bayesian optimality with respect to the entire uncertainty in the regulatory network model, and allows learning the policy offline through planning. We demonstrate the performance of the proposed framework using the well-known p53-Mdm2 negative feedback loop gene regulatory network.
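A Boolean network with perturbation can be sketched as a deterministic Boolean update followed by independent random bit flips. The toy below uses a two-gene negative feedback loop loosely inspired by p53-Mdm2; the update rules, the two-gene reduction, and the `flip_prob` parameter are assumptions for illustration and are not taken from the paper's model.

```python
import random


def step(state, flip_prob, rng):
    """One update: Boolean rules, then each gene flips with prob. flip_prob."""
    p53, mdm2 = state
    # Assumed rules: Mdm2 represses p53; p53 activates Mdm2.
    nxt = (int(not mdm2), int(p53))
    return tuple(bit ^ (rng.random() < flip_prob) for bit in nxt)


def trajectory(state, n_steps, flip_prob=0.0, seed=0):
    """Simulate n_steps updates and return all visited states."""
    rng = random.Random(seed)
    states = [tuple(state)]
    for _ in range(n_steps):
        states.append(step(states[-1], flip_prob, rng))
    return states


# Without perturbation the loop oscillates with period 4.
clean = trajectory((1, 0), 4)
print(clean)
# With perturbation, random flips knock the system off that cycle.
noisy = trajectory((1, 0), 20, flip_prob=0.1, seed=1)
print(noisy)
```

In the data-acquisition setting the abstract describes, an action would correspond to deliberately forcing a gene's value before the update, and the observed transitions would then be used to narrow down which interactions are present.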
Affiliation(s)
- Mohammad Alali
- Department of Electrical and Computer Engineering at Northeastern University
| | - Mahdi Imani
- Department of Electrical and Computer Engineering at Northeastern University
|
15
|
Yan J, Wang X. Machine learning bridges omics sciences and plant breeding. TRENDS IN PLANT SCIENCE 2023; 28:199-210. [PMID: 36153276 DOI: 10.1016/j.tplants.2022.08.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/08/2022] [Revised: 08/15/2022] [Accepted: 08/23/2022] [Indexed: 06/16/2023]
Abstract
Some of the biological knowledge obtained from fundamental research will be implemented in applied plant breeding. To bridge basic research and breeding practice, machine learning (ML) holds great promise to translate biological knowledge and omics data into precision-designed plant breeding. Here, we review ML for multi-omics analysis in plants, including data dimensionality reduction, inference of gene-regulation networks, and gene discovery and prioritization. These applications will facilitate understanding trait regulation mechanisms and identifying target genes potentially applicable to knowledge-driven molecular design breeding. We also highlight applications of deep learning in plant phenomics and ML in genomic selection-assisted breeding, such as various ML algorithms that model the correlations among genotypes (genes), phenotypes (traits), and environments, to ultimately achieve data-driven genomic design breeding.
Affiliation(s)
- Jun Yan
- National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China; Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing 100094, China
| | - Xiangfeng Wang
- National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China; Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing 100094, China.
|
16
|
Inference of gene regulatory networks based on the Light Gradient Boosting Machine. Comput Biol Chem 2022; 101:107769. [DOI: 10.1016/j.compbiolchem.2022.107769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2022] [Revised: 08/12/2022] [Accepted: 09/06/2022] [Indexed: 11/23/2022]
|
17
|
Guan J, Wang Y, Wang Y, Zhuang Y, Ji G. SRGS: sparse partial least squares-based recursive gene selection for gene regulatory network inference. BMC Genomics 2022; 23:782. [PMID: 36451086 PMCID: PMC9710113 DOI: 10.1186/s12864-022-09020-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Accepted: 11/16/2022] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND The identification of gene regulatory networks (GRNs) facilitates the understanding of the underlying molecular mechanism of various biological processes and complex diseases. With the availability of single-cell RNA sequencing data, it is essential to infer GRNs from single-cell expression. Although some GRN methods originally developed for bulk expression data can be applied to single-cell data and several single-cell-specific GRN algorithms have been developed, recent benchmarking studies have emphasized the need to develop more accurate and robust GRN modeling methods that are compatible with single-cell expression data. RESULTS We present SRGS, SPLS (sparse partial least squares)-based recursive gene selection, to infer GRNs from bulk or single-cell expression data. SRGS recursively selects and scores the genes that may regulate the considered target gene based on SPLS. When dealing with gene expression data with dropouts, we randomly scramble samples, set some values in the expression matrix to zero, and generate multiple copies of the data through multiple iterations to make SRGS more robust. We test SRGS on different kinds of expression data, including simulated bulk data, simulated single-cell data without and with dropouts, and experimental single-cell data, and compare it with existing GRN methods, including those originally developed for bulk data, those developed specifically for single-cell data, and those recommended by recent benchmarking studies. CONCLUSIONS SRGS is competitive with the existing GRN methods and effective in gene regulatory network inference from bulk or single-cell gene expression data. SRGS is available on GitHub (https://github.com/JGuan-lab/SRGS).
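The dropout-handling step (scramble samples, zero out some values, repeat to produce multiple copies) can be sketched as follows. This is a rough illustration and not the SRGS code: the function name, the samples-as-rows layout, and the `zero_fraction` parameter are assumptions, and the abstract does not say how many values are zeroed or how the copies are combined.

```python
import random


def perturbed_copies(matrix, n_copies=5, zero_fraction=0.1, seed=0):
    """Return shuffled, randomly zeroed copies of a samples-by-genes matrix."""
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        rows = [list(row) for row in matrix]  # copy; input is left untouched
        rng.shuffle(rows)                     # scramble sample order
        for row in rows:
            for j in range(len(row)):
                if rng.random() < zero_fraction:
                    row[j] = 0.0              # simulate an extra dropout
        copies.append(rows)
    return copies


# Hypothetical expression matrix: 3 samples (rows) by 2 genes (columns).
expression = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
copies = perturbed_copies(expression, n_copies=3, zero_fraction=0.2)
print(len(copies), copies[0])
```

A network inference method would then be run on each copy and the per-edge scores aggregated, so that edges which survive the perturbations are ranked higher.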
Affiliation(s)
- Jinting Guan
- Department of Automation, Xiamen University, Xiamen, Fujian, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian, China
| | - Yang Wang
- Department of Automation, Xiamen University, Xiamen, Fujian, China
| | - Yongjie Wang
- Department of Automation, Xiamen University, Xiamen, Fujian, China
| | - Yan Zhuang
- Department of Automation, Xiamen University, Xiamen, Fujian, China
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian, China
|
18
|
An integrated transcriptome mapping the regulatory network of coding and long non-coding RNAs provides a genomics resource in chickpea. Commun Biol 2022; 5:1106. [PMID: 36261617 PMCID: PMC9581958 DOI: 10.1038/s42003-022-04083-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2021] [Accepted: 10/07/2022] [Indexed: 11/11/2022] Open
Abstract
Large-scale transcriptome analysis can provide a systems-level understanding of biological processes. To accelerate functional genomic studies in chickpea, we perform a comprehensive transcriptome analysis to generate a full-length transcriptome and expression atlas of protein-coding genes (PCGs) and long non-coding RNAs (lncRNAs) from 32 different tissues/organs via deep sequencing. The high-depth RNA-seq dataset reveals expression dynamics and tissue specificity, along with associated biological functions of PCGs and lncRNAs during development. The coexpression network analysis reveals modules associated with a particular tissue or a set of related tissues. The components of transcriptional regulatory networks (TRNs), including transcription factors, their cognate cis-regulatory motifs, and target PCGs/lncRNAs that determine developmental programs of different tissues/organs, are identified. Several candidate tissue-specific and abiotic stress-responsive transcripts associated with quantitative trait loci that determine important agronomic traits are also identified. These results provide an important resource to advance functional/translational genomic and genetic studies during chickpea development and in response to environmental conditions. A full-length transcriptome and expression atlas of protein-coding genes and long non-coding RNAs is generated in chickpea. Components of transcriptional regulatory networks and candidate tissue-specific transcripts associated with quantitative trait loci are identified.
|
19
|
Wang Y, Zhang C, Wang Y, Liu X, Zhang Z. Enhancer RNA (eRNA) in Human Diseases. Int J Mol Sci 2022; 23:11582. [PMID: 36232885 PMCID: PMC9569849 DOI: 10.3390/ijms231911582] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2022] [Revised: 09/22/2022] [Accepted: 09/24/2022] [Indexed: 11/16/2022] Open
Abstract
Enhancer RNAs (eRNAs), a class of non-coding RNAs (ncRNAs) transcribed from enhancer regions, serve as critical regulatory elements in gene expression. There is increasing evidence that aberrant expression of eRNAs can be broadly detected in various human diseases. Some studies have also revealed the potential clinical utility of eRNAs in these diseases. In this review, we summarize recent studies regarding the pathological mechanisms of eRNAs as well as their potential utility across human diseases, including cancers, neurodegenerative disorders, cardiovascular diseases and metabolic diseases. This summary could help us understand how eRNAs are engaged in disease processes and gain better insight into the role of eRNAs in diagnosis, prognosis or therapy. The studies reviewed here indicate the enormous therapeutic potential of eRNAs across human diseases.
Affiliation(s)
- Yunzhe Wang
- MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
| | - Chenyang Zhang
- Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
| | - Yuxiang Wang
- Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
| | - Xiuping Liu
- Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
| | - Zhao Zhang
- MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
|
20
|
Pomiès L, Brouard C, Duruflé H, Maigné É, Carré C, Gody L, Trösser F, Katsirelos G, Mangin B, Langlade NB, de Givry S. Gene regulatory network inference methodology for genomic and transcriptomic data acquired in genetically related heterozygote individuals. Bioinformatics 2022; 38:4127-4134. [PMID: 35792837 DOI: 10.1093/bioinformatics/btac445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2021] [Revised: 06/17/2022] [Accepted: 07/05/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Inferring gene regulatory networks in non-independent genetically related panels is a methodological challenge. This hampers evolutionary and biological studies using heterozygote individuals such as in wild sunflower populations or cultivated hybrids. RESULTS First, we simulated 100 datasets of gene expression and polymorphisms, displaying the same gene expression distributions, heterozygosities and heritabilities as in our dataset including 173 genes and 353 genotypes measured in sunflower hybrids. Second, we performed a meta-analysis based on six inference methods [least absolute shrinkage and selection operator (Lasso), Random Forests, Bayesian Networks, Markov Random Fields, Ordinary Least Squares and fast inference of networks from directed regulation (Findr)] and selected the minimal density networks for better accuracy, with 64 edges connecting 79 genes and a 0.35 area under precision and recall (AUPR) score on average. We identified that triangles and mutual edges are prone to errors in the inferred networks. Applied to classical datasets without heterozygotes, our strategy produced a 0.65 AUPR score for one dataset of the DREAM5 Systems Genetics Challenge. Finally, we applied our method to an experimental dataset from sunflower hybrids. We successfully inferred a network composed of 105 genes connected by 106 putative regulations with a major connected component. AVAILABILITY AND IMPLEMENTATION Our inference methodology dedicated to genomic and transcriptomic data is available at https://forgemia.inra.fr/sunrise/inference_methods. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
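The AUPR scores quoted above summarize how well a ranked list of predicted edges recovers a reference network. A minimal sketch of one common estimator, average precision, is given below; the abstract does not state which AUPR estimator the authors used, and the function name and example edges are hypothetical.

```python
def average_precision(scored_edges, true_edges):
    """Average precision of ranked edges against a reference edge set.

    scored_edges: iterable of ((regulator, target), score) pairs.
    true_edges: set of (regulator, target) pairs taken as ground truth.
    """
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    hits, total = 0, 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank  # precision at each recovered true edge
    return total / len(true_edges) if true_edges else 0.0


# Hypothetical predictions for a three-gene network.
predictions = [(("A", "B"), 0.9), (("A", "C"), 0.8), (("B", "C"), 0.3)]
reference = {("A", "B"), ("B", "C")}
ap = average_precision(predictions, reference)
print(round(ap, 4))  # (1/1 + 2/3) / 2 = 0.8333
```

Because true edges are usually a small fraction of all possible gene pairs, precision-recall summaries of this kind are more informative than accuracy for comparing inferred networks.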
Affiliation(s)
- Lise Pomiès
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
| | - Céline Brouard
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
| | - Harold Duruflé
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
| | - Élise Maigné
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
| | - Clément Carré
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
| | - Louise Gody
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
| | - Fulya Trösser
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
| | - George Katsirelos
- MIA-Paris, AgroParisTech, Université Paris-Saclay, INRAE, Paris 75231, France
| | - Brigitte Mangin
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
| | - Nicolas B Langlade
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
| | - Simon de Givry
- MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
|
21
|
Pinoli P, Ceddia G, Ceri S, Masseroli M. Predicting Drug Synergism by Means of Non-Negative Matrix Tri-Factorization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1956-1967. [PMID: 34166199 DOI: 10.1109/tcbb.2021.3091814] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Traditional drug experiments to find synergistic drug pairs are time-consuming and expensive due to the numerous possible combinations of drugs that have to be examined. Thus, computational methods that can give suggestions for synergistic drug investigations are of great interest. Here, we propose a Non-negative Matrix Tri-Factorization (NMTF) based approach that leverages the integration of different data types for predicting synergistic drug pairs in multiple specific cell lines. Our computational framework relies on a network-based representation of available data about drug synergism, which also allows integrating genomic information about cell lines. We computationally evaluate the performance of our method in finding missing relationships between synergistic drug pairs and cell lines and in computing synergy scores between drug pairs in a specific cell line, and we estimate the benefit of adding cell line genomic data to the network. Our approach performs well (Average Precision Score of 0.937, Pearson's correlation coefficient of 0.760) when cell line genomic data and rich data about synergistic drugs in a cell line are considered. Finally, we systematically searched for our top-scored predictions in the available literature and in the NCI ALMANAC, a well-known database of drug combination experiments, which supported our findings.
|
22
|
Gonçalves LO, Pulido AFV, Mathias FAS, Enes AES, Carvalho MGR, de Melo Resende D, Polak ME, Ruiz JC. Expression Profile of Genes Related to the Th17 Pathway in Macrophages Infected by Leishmania major and Leishmania amazonensis: The Use of Gene Regulatory Networks in Modeling This Pathway. Front Cell Infect Microbiol 2022; 12:826523. [PMID: 35774406 PMCID: PMC9239034 DOI: 10.3389/fcimb.2022.826523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Accepted: 03/09/2022] [Indexed: 11/13/2022] Open
Abstract
Leishmania amazonensis and Leishmania major are the causative agents of cutaneous and mucocutaneous diseases. The infections' outcome depends on host–parasite interactions and the Th1/Th2 response, and in the cutaneous form, regulation of Th17 cytokines has been reported to maintain inflammation in lesions. Despite that, the Th17 regulatory scenario remains unclear. To gain a better understanding of the transcription factors (TFs) and genes involved in Th17 induction, this study addressed the role of inducing factors of the Th17 pathway in Leishmania–macrophage infection through computational modeling of gene regulatory networks (GRNs). The Th17 GRN modeling integrated experimentally validated data available in the literature and gene expression data from a time-series RNA-seq experiment (4, 24, 48, and 72 h post-infection). The generated model comprises a total of 10 TFs, 22 coding genes, and 16 cytokines related to the Th17 immune modulation. Addressing the Th17 induction in infected and uninfected macrophages, an increase of 2- to 3-fold in 4–24 h was observed in the former. However, there was a decrease in basal levels at 48–72 h for both groups. In order to evaluate the possible outcomes triggered by GRN component modulation in the Th17 pathway. The generated GRN models promoted an integrative and dynamic view of Leishmania–macrophage interaction over time that extends beyond the analysis of single-gene expression.
Affiliation(s)
- Leilane Oliveira Gonçalves
- Programa de Pós-graduação em Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | - Andrés F. Vallejo Pulido
- Systems Immunology Group, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
| | | | - Alexandre Estevão Silvério Enes
- Programa de Pós-graduação em Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | | | - Daniela de Melo Resende
- Grupo Genômica Funcional de Parasitos, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | - Marta E. Polak
- Systems Immunology Group, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- *Correspondence: Jeronimo C. Ruiz; Marta E. Polak
| | - Jeronimo C. Ruiz
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
- *Correspondence: Jeronimo C. Ruiz; Marta E. Polak
|
23
|
Amit G, Vaknin Ben Porath D, Levy O, Hamdi O, Bashan A. Global coordination level in single-cell transcriptomic data. Sci Rep 2022; 12:7547. [PMID: 35534606 PMCID: PMC9085802 DOI: 10.1038/s41598-022-11507-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2022] [Accepted: 03/31/2022] [Indexed: 11/26/2022] Open
Abstract
Genes are linked by underlying regulatory mechanisms and by jointly implementing biological functions, working in coordination to perform different tasks in the cells. Assessing the coordination level between genes from single-cell transcriptomic data, without a priori knowledge of the map of gene regulatory interactions, is a challenge. A ‘top-down’ approach has recently been developed to analyze single-cell transcriptomic data by evaluating the global coordination level (GCL) between genes. Here, we systematically analyze the performance of the GCL in typical scenarios of single-cell RNA sequencing (scRNA-seq) data. We show that an individual anomalous cell can have a disproportionate effect on the GCL calculated over a cohort of cells. In addition, we demonstrate how the GCL is affected by the presence of clusters, which are very common in scRNA-seq data. Finally, we analyze the effect of the sampling size of the Jackknife procedure on the GCL statistics. The manuscript is accompanied by a description of a custom-built Python package for calculating the GCL. These results provide practical guidelines for properly pre-processing and applying the GCL measure in transcriptional data.
|
24
|
Dlamini Z, Skepu A, Kim N, Mkhabele M, Khanyile R, Molefi T, Mbatha S, Setlai B, Mulaudzi T, Mabongo M, Bida M, Kgoebane-Maseko M, Mathabe K, Lockhat Z, Kgokolo M, Chauke-Malinga N, Ramagaga S, Hull R. AI and precision oncology in clinical cancer genomics: From prevention to targeted cancer therapies-an outcomes based patient care. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100965] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open
|
25
|
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
|
26
|
Hillerton T, Seçilmiş D, Nelander S, Sonnhammer ELL. Fast and accurate gene regulatory network inference by normalized least squares regression. Bioinformatics 2022; 38:2263-2268. [PMID: 35176145 PMCID: PMC9004640 DOI: 10.1093/bioinformatics/btac103] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/13/2021] [Revised: 01/10/2022] [Accepted: 02/15/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Inferring an accurate gene regulatory network (GRN) has long been a key goal in the field of systems biology. To do this, it is important to find a suitable balance between the maximum number of true-positive and the minimum number of false-positive interactions. Another key feature is that the inference method can handle the large size of modern experimental data, meaning the method needs to be both fast and accurate. The Least Squares Cut-Off (LSCO) method can fulfill both these criteria; however, because it is based on least squares, it is vulnerable to the known issue of amplifying extreme values, small or large. In GRN inference this manifests as genes that are erroneously hyper-connected to a large fraction of all genes due to extremely low fold-change values. RESULTS: We developed a GRN inference method called Least Squares Cut-Off with Normalization (LSCON) that tackles this problem. LSCON extends the LSCO algorithm by regularization to avoid hyper-connected genes and thereby reduce false positives. The regularization used is based on normalization, which removes effects of extreme values on the fit. We benchmarked LSCON and compared it to Genie3, LASSO, LSCO and Ridge regression, in terms of accuracy, speed and tendency to predict hyper-connected genes. The results show that LSCON achieves better or equal accuracy compared to LASSO, the best existing method, especially for data with extreme values. Thanks to the speed of least squares regression, LSCON does this an order of magnitude faster than LASSO. AVAILABILITY AND IMPLEMENTATION: Data: https://bitbucket.org/sonnhammergrni/lscon; Code: https://bitbucket.org/sonnhammergrni/genespider. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
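The idea of a least-squares fit followed by a cut-off can be sketched as follows. This is a minimal illustration under stated assumptions, not the LSCO or LSCON code from the repositories above: it assumes a steady-state linear model in which the expression response Y and the perturbation design P satisfy A Y = -P, and it assumes a square, invertible Y so that the least-squares solution reduces to a matrix inverse. The function names (`invert`, `matmul`, `lsco`) and the toy matrices are hypothetical, and the normalization step that distinguishes LSCON is not reproduced because the abstract does not specify it.

```python
def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lsco(y, p, cutoff):
    """Estimate A from A Y = -P (assumed model), then zero out weak links."""
    a_est = [[-v for v in row] for row in matmul(p, invert(y))]
    return [[v if abs(v) >= cutoff else 0.0 for v in row] for row in a_est]


# Hypothetical two-gene example: gene 2 influences gene 1, not the reverse.
true_a = [[-1.0, 0.5], [0.0, -1.0]]
p = [[1.0, 0.0], [0.0, 1.0]]            # one perturbation per gene (assumed design)
noisy_y = [[1.0, 0.5], [0.02, 1.0]]     # noise-free response plus a small error

exact = lsco([[1.0, 0.5], [0.0, 1.0]], p, cutoff=0.1)
est = lsco(noisy_y, p, cutoff=0.1)
print(est)
```

With noise-free data the estimate equals `true_a`; with the small error in `noisy_y` a spurious weak link appears in the raw fit and is removed by the cut-off.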
Affiliation(s)
- Thomas Hillerton
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, 17121 Solna, Sweden
- Deniz Seçilmiş
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, 17121 Solna, Sweden
- Sven Nelander
  - Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, 75185 Uppsala, Sweden
|
27
|
Disentangling direct from indirect relationships in association networks. Proc Natl Acad Sci U S A 2022; 119:2109995119. [PMID: 34992138 PMCID: PMC8764688 DOI: 10.1073/pnas.2109995119] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Accepted: 11/30/2021] [Indexed: 11/18/2022] Open
Abstract
Networks are vital tools for understanding and modeling interactions in complex systems in science and engineering, and direct and indirect interactions are pervasive in all types of networks. However, quantitatively disentangling direct and indirect relationships in networks remains a formidable task. Here, we present a framework, called iDIRECT (Inference of Direct and Indirect Relationships with Effective Copula-based Transitivity), for quantitatively inferring direct dependencies in association networks. Using copula-based transitivity, iDIRECT eliminates/ameliorates several challenging mathematical problems, including ill-conditioning, self-looping, and interaction strength overflow. With simulation data as benchmark examples, iDIRECT showed high prediction accuracies. Application of iDIRECT to reconstruct gene regulatory networks in Escherichia coli also revealed considerably higher prediction power than the best-performing approaches in the DREAM5 (Dialogue on Reverse Engineering Assessment and Methods project, #5) Network Inference Challenge. In addition, applying iDIRECT to highly diverse grassland soil microbial communities in response to climate warming showed that the iDIRECT-processed networks were significantly different from the original networks, with considerably fewer nodes, links, and connectivity, but higher relative modularity. Further analysis revealed that the iDIRECT-processed network was more complex under warming than the control and more robust to both random and target species removal (P < 0.001). As a general approach, iDIRECT has great advantages for network inference, and it should be widely applicable to infer direct relationships in association networks across diverse disciplines in science and engineering.
|
28
|
Wani N, Barh D, Raza K. Modular network inference between miRNA-mRNA expression profiles using weighted co-expression network analysis. J Integr Bioinform 2021; 18:20210029. [PMID: 34800012 PMCID: PMC8709739 DOI: 10.1515/jib-2021-0029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Revised: 10/20/2021] [Accepted: 10/28/2021] [Indexed: 12/14/2022] Open
Abstract
Connecting transcriptional and post-transcriptional regulatory networks solves an important puzzle in the elucidation of gene regulatory mechanisms. To decipher the complexity of these connections, we build co-expression network modules for mRNA as well as miRNA expression profiles of breast cancer data. We construct gene and miRNA co-expression modules using the weighted gene co-expression network analysis (WGCNA) method and establish the significance of these modules (Genes/miRNAs) for cancer phenotype. This work also infers an interaction network between the genes of the turquoise module from mRNA expression data and hubs of the turquoise module from miRNA expression data. A pathway enrichment analysis using the miRsystem web tool for miRNA hubs and some of their targets reveals their enrichment in several important pathways associated with the progression of cancer.
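The weighting step that gives WGCNA its name can be sketched in a few lines. In the unsigned variant of the method, the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-thresholding power beta. The sketch below is not the WGCNA R package and not the authors' pipeline; the gene names, the expression values and the choice beta = 6 are assumptions made for illustration, and module detection is not shown.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency |cor|^beta for every unordered gene pair."""
    names = sorted(profiles)
    return {(g, h): abs(pearson(profiles[g], profiles[h])) ** beta
            for i, g in enumerate(names) for h in names[i + 1:]}


# Hypothetical profiles over four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with g1
    "g3": [1.0, -1.0, 1.0, -1.0],  # weakly anti-correlated with g1
}
adjacency = soft_threshold_adjacency(profiles)
for pair, weight in adjacency.items():
    print(pair, round(weight, 4))
```

Raising the correlation to a power keeps strong associations close to 1 while pushing weak ones towards 0, so the network stays weighted rather than being cut at a hard threshold.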
Affiliation(s)
- Nisar Wani
  - Computer Science and Engineering Department, Govt. College of Engineering and Technology Safapora, Ganderbal Kashmir, J&K, India
- Debmalya Barh
  - Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, WB, India
  - Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Khalid Raza
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India
|
29
|
DeMers LC, Raboy V, Li S, Saghai Maroof MA. Network Inference of Transcriptional Regulation in Germinating Low Phytic Acid Soybean Seeds. Frontiers in Plant Science 2021; 12:708286. [PMID: 34531883 PMCID: PMC8438133 DOI: 10.3389/fpls.2021.708286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 06/23/2021] [Indexed: 05/14/2023]
Abstract
The low phytic acid (lpa) trait in soybeans can be conferred by loss-of-function mutations in genes encoding myo-inositol phosphate synthase and two epistatically interacting genes encoding multidrug-resistance protein ATP-binding cassette (ABC) transporters. However, perturbations in phytic acid biosynthesis are associated with poor seed vigor. Since the benefits of the lpa trait, in terms of end-use quality and sustainability, far outweigh the negatives associated with poor seed performance, a fuller understanding of the molecular basis behind the negatives will assist crop breeders and engineers in producing varieties with lpa and a better germination rate. The gene regulatory network (GRN) for developing low and normal phytic acid soybean seeds was previously constructed, with genes modulating a variety of processes pertinent to phytic acid metabolism and seed viability being identified. In this study, a comparative time series analysis of low and normal phytic acid soybeans was carried out to investigate the transcriptional regulatory elements governing the transitional dynamics from dry seed to germinated seed. GRNs were reverse engineered from time series transcriptomic data of three distinct genotypic subsets composed of lpa soybean lines and their normal phytic acid sibling lines. Using a robust unsupervised network inference scheme, putative regulatory interactions were inferred for each subset of genotypes. These interactions were further validated by published regulatory interactions found in Arabidopsis thaliana and motif sequence analysis. Results indicate that lpa seeds have increased sensitivity to stress, which could be due to changes in phytic acid levels, disrupted inositol phosphate signaling, disrupted phosphate ion (Pi) homeostasis, and altered myo-inositol metabolism. Putative regulatory interactions were identified for the latter two processes. Changes in abscisic acid (ABA) signaling candidate transcription factors (TFs) putatively regulating genes in this process were identified as well. Analysis of the GRNs reveals altered regulation in processes that may be affecting the germination of lpa soybean seeds. Therefore, this work contributes to the ongoing effort to elucidate molecular mechanisms underlying altered seed viability, germination and field emergence of lpa crops, understanding of which is necessary in order to mitigate these problems.
Affiliation(s)
- Lindsay C. DeMers
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Victor Raboy
  - National Small Grains Germplasm Research Center, Agricultural Research Service (USDA), Aberdeen, ID, United States
- Song Li
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- M. A. Saghai Maroof
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
|
30
|
Ellery A. Are There Biomimetic Lessons from Genetic Regulatory Networks for Developing a Lunar Industrial Ecology? Biomimetics (Basel) 2021; 6:biomimetics6030050. [PMID: 34449537 PMCID: PMC8395472 DOI: 10.3390/biomimetics6030050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/29/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 11/21/2022] Open
Abstract
We examine the prospect for employing a bio-inspired architecture for a lunar industrial ecology based on genetic regulatory networks. The lunar industrial ecology resembles a metabolic system in that it comprises multiple chemical processes interlinked through waste recycling. Initially, we examine lessons from factory organisation which have evolved into a bio-inspired concept, the reconfigurable holonic architecture. We then examine genetic regulatory networks and their application in the biological cell cycle. There are numerous subtleties that would be challenging to implement in a lunar industrial ecology but much of the essence of biological circuitry (as implemented in synthetic biology, for example) is captured by traditional electrical engineering design with emphasis on feedforward and feedback loops to implement robustness.
Affiliation(s)
- Alex Ellery
  - Department of Mechanical & Aerospace Engineering, Carleton University, 1125 Colonel By Drive, Ottawa, ON K1S 5B6, Canada
|
31
|
Trinh HC, Kwon YK. A novel constrained genetic algorithm-based Boolean network inference method from steady-state gene expression data. Bioinformatics 2021; 37:i383-i391. [PMID: 34252959 PMCID: PMC8275338 DOI: 10.1093/bioinformatics/btab295] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Accepted: 04/24/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: It is a challenging problem in systems biology to infer both the network structure and dynamics of a gene regulatory network from steady-state gene expression data. Some methods based on Boolean or differential equation models have been proposed, but they were not efficient in the inference of large-scale networks. Therefore, it is necessary to develop a method to infer the network structure and dynamics accurately on large-scale networks using steady-state expression. RESULTS: In this study, we propose a novel constrained genetic algorithm-based Boolean network inference (CGA-BNI) method where a Boolean canalyzing update rule scheme was employed to capture coarse-grained dynamics. Given steady-state gene expression data as an input, CGA-BNI identifies a set of path consistency-based constraints by comparing the gene expression level between the wild-type and the mutant experiments. It then searches for Boolean networks which satisfy the constraints and induce attractors most similar to steady-state expressions. We devised a heuristic mutation operation for faster convergence and implemented a parallel evaluation routine for execution time reduction. Through extensive simulations on the artificial and the real gene expression datasets, CGA-BNI showed better performance than four other existing methods in terms of both structural and dynamics prediction accuracies. Taken together, CGA-BNI is a promising tool to predict both the structure and the dynamics of a gene regulatory network when the highest accuracy is needed at the cost of execution time. AVAILABILITY AND IMPLEMENTATION: Source code and data are freely available at https://github.com/csclab/CGA-BNI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
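The notion of a Boolean network whose attractors are compared with steady-state expression can be made concrete with a small simulation. The sketch below is not CGA-BNI and contains no genetic algorithm: it assumes a hypothetical three-gene network with synchronous updates, uses one canalyzing rule (gene b is switched off whenever gene c is on, regardless of gene a), and scores an attractor state against an observed steady state with a simple fraction-of-matching-genes measure. All names and rules are assumptions made for illustration.

```python
def step(state, rules):
    """Apply every update rule to the current state at once (synchronous update)."""
    return tuple(rule(state) for rule in rules)


def find_attractor(state, rules):
    """Iterate until a state repeats and return the cycle of states that was entered."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]


def similarity(state, observed):
    """Fraction of genes whose Boolean value matches the observed steady state."""
    return sum(s == o for s, o in zip(state, observed)) / len(state)


# Hypothetical network over genes (a, b, c).
rules = [
    lambda s: s[0],                      # a keeps its value (acts as an input)
    lambda s: int(s[0] and not s[2]),    # b = a AND NOT c (c = 1 canalyzes b to 0)
    lambda s: s[1],                      # c copies b
]

cycle = find_attractor((1, 0, 0), rules)        # input a switched on
fixed_point = find_attractor((0, 1, 1), rules)  # input a switched off
print(cycle)
print(fixed_point, similarity(fixed_point[0], (0, 0, 1)))
```

With the input on, the negative feedback between b and c produces a four-state cycle; with the input off, the network settles into a single fixed point, which is the kind of attractor that can be compared directly with a measured steady state.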
Affiliation(s)
- Hung-Cuong Trinh
  - Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh 758307, Vietnam
- Yung-Keun Kwon
  - Department of IT Convergence, University of Ulsan, Ulsan 680-749, Korea
|
32
|
Alvarez JM, Brooks MD, Swift J, Coruzzi GM. Time-Based Systems Biology Approaches to Capture and Model Dynamic Gene Regulatory Networks. Annual Review of Plant Biology 2021; 72:105-131. [PMID: 33667112 PMCID: PMC9312366 DOI: 10.1146/annurev-arplant-081320-090914] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 05/13/2023]
Abstract
All aspects of transcription and its regulation involve dynamic events. However, capturing these dynamic events in gene regulatory networks (GRNs) offers both a promise and a challenge. The promise is that capturing and modeling the dynamic changes in GRNs will allow us to understand how organisms adapt to a changing environment. The ability to mount a rapid transcriptional response to environmental changes is especially important in nonmotile organisms such as plants. The challenge is to capture these dynamic, genome-wide events and model them in GRNs. In this review, we cover recent progress in capturing dynamic interactions of transcription factors with their targets (at both the local and genome-wide levels) and how they are used to learn how GRNs operate as a function of time. We also discuss recent advances that employ time-based machine learning approaches to forecast gene expression at future time points, a key goal of systems biology.
Affiliation(s)
- Jose M Alvarez
  - Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
  - ANID-Millennium Science Initiative Program-Millennium Institute for Integrative Biology (iBio), Santiago, Chile
- Matthew D Brooks
  - Global Change and Photosynthesis Research Unit, US Department of Agriculture Agricultural Research Service, Urbana, Illinois 61801, USA
- Joseph Swift
  - Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Gloria M Coruzzi
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
|
33
|
Westerman EL, Bowman SEJ, Davidson B, Davis MC, Larson ER, Sanford CPJ. Deploying Big Data to Crack the Genotype to Phenotype Code. Integr Comp Biol 2021; 60:385-396. [PMID: 32492136 DOI: 10.1093/icb/icaa055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022] Open
Abstract
Mechanistically connecting genotypes to phenotypes is a longstanding and central mission of biology. Deciphering these connections will unite questions and datasets across all scales from molecules to ecosystems. Although high-throughput sequencing has provided a rich platform on which to launch this effort, tools for deciphering mechanisms further along the genome to phenome pipeline remain limited. Machine learning approaches and other emerging computational tools hold the promise of augmenting human efforts to overcome these obstacles. This vision paper is the result of a Reintegrating Biology Workshop, bringing together the perspectives of integrative and comparative biologists to survey challenges and opportunities in cracking the genotype to phenotype code and thereby generating predictive frameworks across biological scales. Key recommendations include promoting the development of minimum "best practices" for the experimental design and collection of data; fostering sustained and long-term data repositories; promoting programs that recruit, train, and retain a diversity of talent; and providing funding to effectively support these highly cross-disciplinary efforts. We follow this discussion by highlighting a few specific transformative research opportunities that will be advanced by these efforts.
Affiliation(s)
- Erica L Westerman
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR 72701, USA
- Sarah E J Bowman
  - High-Throughput Crystallization Screening Center, Hauptman-Woodward Medical Research Institute, Buffalo, NY 14203, USA
  - Department of Biochemistry, Jacobs School of Medicine & Biomedical Sciences at the University at Buffalo, Buffalo, NY 14203, USA
- Bradley Davidson
  - Department of Biology, Swarthmore College, Swarthmore, PA 19081, USA
- Marcus C Davis
  - Department of Biology, James Madison University, Harrisonburg, VA 22807, USA
- Eric R Larson
  - Department of Natural Resources and Environmental Sciences, University of Illinois, Urbana, IL 61801, USA
- Christopher P J Sanford
  - Department of Ecology, Evolution and Organismal Biology, Kennesaw State University, Kennesaw, GA 30144, USA
|
34
|
Vatsa D, Agarwal S. PEPN-GRN: A Petri net-based approach for the inference of gene regulatory networks from noisy gene expression data. PLoS One 2021; 16:e0251666. [PMID: 33989333 PMCID: PMC8121333 DOI: 10.1371/journal.pone.0251666] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/10/2020] [Accepted: 04/30/2021] [Indexed: 11/22/2022] Open
Abstract
The inference of gene regulatory networks (GRNs) from expression data is a challenging problem in systems biology. The stochasticity or fluctuations in the biochemical processes that regulate the transcription process pose one of the major challenges. In this paper, we propose a novel GRN inference approach, named the Probabilistic Extended Petri Net for Gene Regulatory Network (PEPN-GRN), for the inference of gene regulatory networks from noisy expression data. The proposed inference approach makes use of transitions of discrete gene expression levels across adjacent time points as different evidence types that relate to the production or decay of genes. The paper examines three variants of the PEPN-GRN method, which mainly differ in the way the scores of network edges are computed using evidence types. The proposed method is evaluated on the benchmark DREAM4 in silico data sets and a real time series data set of E. coli from the DREAM5 challenge. The PEPN-GRN_v3 variant (the third variant of the PEPN-GRN approach) sought to learn the weights of evidence types in accordance with their contribution to the activation and inhibition gene regulation process. The learned weights help understand the time-shifted and inverted time-shifted relationship between regulator and target gene. Thus, PEPN-GRN_v3, along with the inference of network edges, also provides a functional understanding of the gene regulation process.
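The use of discrete expression transitions across adjacent time points as evidence can be illustrated with a toy counter. The sketch below is not PEPN-GRN and involves no Petri net or learned weights: it assumes a two-level discretisation with a fixed threshold and a hypothetical evidence scheme in which a regulator that is "on" at time t counts as activation evidence if the target rises by time t+1 and as inhibition evidence if the target falls. The function names, the threshold and the time series are all assumptions.

```python
def discretise(series, threshold):
    """Map continuous expression values to two levels: 1 (high) or 0 (low)."""
    return [1 if value >= threshold else 0 for value in series]


def transition_evidence(regulator, target):
    """Count time-shifted evidence from discretised regulator and target series."""
    counts = {"activation": 0, "inhibition": 0}
    for t in range(len(target) - 1):
        change = target[t + 1] - target[t]   # +1 rise, -1 fall, 0 no change
        if regulator[t] == 1 and change > 0:
            counts["activation"] += 1
        elif regulator[t] == 1 and change < 0:
            counts["inhibition"] += 1
    return counts


# Hypothetical time series in which the target follows the regulator by one step.
regulator_raw = [0.1, 0.9, 0.8, 0.2, 0.1]
target_raw = [0.1, 0.2, 0.9, 0.9, 0.1]

regulator = discretise(regulator_raw, threshold=0.5)
target = discretise(target_raw, threshold=0.5)
evidence = transition_evidence(regulator, target)
print(regulator, target, evidence)
```

An edge score could then be built from such counts; how the evidence types are weighted is exactly where the three PEPN-GRN variants are said to differ, and that part is not reproduced here.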
Affiliation(s)
- Deepika Vatsa
  - Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India
- Sumeet Agarwal
  - Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India
|
35
|
Computational analysis of fused co-expression networks for the identification of candidate cancer gene biomarkers. NPJ Syst Biol Appl 2021; 7:17. [PMID: 33712625 PMCID: PMC7955132 DOI: 10.1038/s41540-021-00175-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/15/2020] [Accepted: 02/08/2021] [Indexed: 11/08/2022] Open
Abstract
The complexity of cancer has always been a huge issue in understanding the source of this disease. However, by appreciating its complexity, we can shed some light on crucial gene associations across and in specific cancer types. In this study, we develop a general framework to infer relevant gene biomarkers and their gene-to-gene associations using multiple gene co-expression networks for each cancer type. Specifically, we infer computationally and biologically interesting communities of genes from kidney renal clear cell carcinoma, liver hepatocellular carcinoma, and prostate adenocarcinoma data sets of The Cancer Genome Atlas (TCGA) database. The gene communities are extracted through a data-driven pipeline and then evaluated through both functional analyses and literature findings. Furthermore, we provide a computational validation of their relevance for each cancer type by comparing the performance of normal/cancer classification for our identified gene sets and other gene signatures, including the typically-used differentially expressed genes. The hallmark of this study is its approach based on gene co-expression networks from different similarity measures: using a combination of multiple gene networks and then fusing normal and cancer networks for each cancer type, we can have better insights on the overall structure of the cancer-type-specific network.
|
36
|
Brooks MD, Juang CL, Katari MS, Alvarez JM, Pasquino A, Shih HJ, Huang J, Shanks C, Cirrone J, Coruzzi GM. ConnecTF: A platform to integrate transcription factor-gene interactions and validate regulatory networks. Plant Physiology 2021; 185:49-66. [PMID: 33631799 PMCID: PMC8133578 DOI: 10.1093/plphys/kiaa012] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/07/2020] [Accepted: 10/27/2020] [Indexed: 05/08/2023]
Abstract
Deciphering gene regulatory networks (GRNs) is both a promise and challenge of systems biology. The promise lies in identifying key transcription factors (TFs) that enable an organism to react to changes in its environment. The challenge lies in validating GRNs that involve hundreds of TFs with hundreds of thousands of interactions with their genome-wide targets experimentally determined by high-throughput sequencing. To address this challenge, we developed ConnecTF, a species-independent, web-based platform that integrates genome-wide studies of TF-target binding, TF-target regulation, and other TF-centric omic datasets and uses these to build and refine validated or inferred GRNs. We demonstrate the functionality of ConnecTF by showing how integration within and across TF-target datasets uncovers biological insights. Case study 1 uses integration of TF-target gene regulation and binding datasets to uncover TF mode-of-action and identify potential TF partners for 14 TFs in abscisic acid signaling. Case study 2 demonstrates how genome-wide TF-target data and automated functions in ConnecTF are used in precision/recall analysis and pruning of an inferred GRN for nitrogen signaling. Case study 3 uses ConnecTF to chart a network path from NLP7, a master TF in nitrogen signaling, to direct secondary TF2s and to its indirect targets in a Network Walking approach. The public version of ConnecTF (https://ConnecTF.org) contains 3,738,278 TF-target interactions for 423 TFs in Arabidopsis, 839,210 TF-target interactions for 139 TFs in maize (Zea mays), and 293,094 TF-target interactions for 26 TFs in rice (Oryza sativa). The database and tools in ConnecTF will advance the exploration of GRNs in plant systems biology applications for model and crop species.
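The precision/recall analysis and pruning of an inferred GRN mentioned in case study 2 can be sketched generically. The code below does not use ConnecTF or its API: it assumes a hypothetical inferred network given as scored TF-target edges, a hypothetical set of experimentally validated edges, and a simple score cut-off as the pruning rule. All identifiers and values are made up for illustration.

```python
def prune(scored_edges, cutoff):
    """Keep only the predicted edges whose score reaches the cut-off."""
    return {edge for edge, score in scored_edges.items() if score >= cutoff}


def precision_recall(predicted, validated):
    """Precision and recall of a predicted edge set against validated edges."""
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Hypothetical inferred edges (TF, target) with confidence scores.
scored_edges = {
    ("TF1", "g1"): 0.9,
    ("TF1", "g2"): 0.8,
    ("TF1", "g3"): 0.4,
    ("TF1", "g4"): 0.2,
}
# Hypothetical validated TF-target edges for the same TF.
validated = {("TF1", "g1"), ("TF1", "g2"), ("TF1", "g5")}

unpruned = precision_recall(prune(scored_edges, 0.0), validated)
pruned = precision_recall(prune(scored_edges, 0.5), validated)
print(unpruned, pruned)
```

In this toy case pruning at 0.5 removes the two unvalidated edges, raising precision from 0.5 to 1.0 while recall stays at two thirds; in practice the cut-off is a trade-off between the two.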
Affiliation(s)
- Matthew D Brooks
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
  - USDA ARS Global Change and Photosynthesis Research Unit, Urbana, IL, USA
- Che-Lun Juang
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- Manpreet Singh Katari
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- José M Alvarez
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
  - Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
  - Millennium Institute for Integrative Biology (iBio), Santiago, Chile
- Angelo Pasquino
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- Hung-Jui Shih
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- Ji Huang
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- Carly Shanks
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
- Jacopo Cirrone
  - Courant Institute for Mathematical Sciences, Department of Computer Science, New York University, NY, USA
- Gloria M Coruzzi
  - Center for Genomics and Systems Biology, Department of Biology, New York University, NY, USA
  - Author for communication: G.C.
|
37
|
|
38
|
Long P, Zhang L, Huang B, Chen Q, Liu H. Integrating genome sequence and structural data for statistical learning to predict transcription factor binding sites. Nucleic Acids Res 2020; 48:12604-12617. [PMID: 33264415 PMCID: PMC7736823 DOI: 10.1093/nar/gkaa1134] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/22/2020] [Revised: 09/18/2020] [Accepted: 11/10/2020] [Indexed: 01/11/2023] Open
Abstract
We report an approach to predict DNA specificity of the tetracycline repressor (TetR) family transcription regulators (TFRs). First, a genome sequence-based method was streamlined with quantitative P-values defined to filter out reliable predictions. Then, a framework was introduced to incorporate structural data and to train a statistical energy function to score the pairing between TFR and TFR binding site (TFBS) based on sequences. When the predictions were benchmarked against experiments, TFBSs for 29 out of 30 TFRs were correctly predicted by either the genome sequence-based or the statistical energy-based method. Using P-values or Z-scores as indicators, we estimate that 59.6% of TFRs are covered with relatively reliable predictions by at least one of the two methods, while only 28.7% are covered by the genome sequence-based method alone. Our approach predicts a large number of new TFBSs which cannot be correctly retrieved from public databases such as FootprintDB. High-throughput experimental assays suggest that the statistical energy can model the TFBSs of a significant number of TFRs reliably. Thus the energy function may be applied to search for new TFBSs in the respective genomes. It is possible to extend our approach to other transcription factor families with sufficient structural information.
Affiliation(s)
- Pengpeng Long
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Lu Zhang
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Bin Huang
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Quan Chen
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
- Haiyan Liu
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
  - School of Data Science, University of Science and Technology of China, Hefei, Anhui 230026, China
|
39
|
Parkinson's Disease Master Regulators on Substantia Nigra and Frontal Cortex and Their Use for Drug Repositioning. Mol Neurobiol 2020; 58:1517-1534. [PMID: 33211252 DOI: 10.1007/s12035-020-02203-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/20/2020] [Accepted: 11/03/2020] [Indexed: 12/14/2022]
Abstract
Parkinson's disease (PD) is among the most prevalent neurodegenerative diseases. Available evidence supports the view of PD as a complex disease, the outcome of interactions between genetic and environmental factors. In the face of diagnostic and therapeutic challenges, and the elusive etiology of PD, alternative methodological approaches for elucidating the pathophysiological mechanisms of the disease and proposing novel potential therapeutic interventions have become increasingly necessary. In the present study, we first reconstructed the transcriptional regulatory networks (TN), centered on transcription factors (TF), of two brain regions affected in PD, the substantia nigra pars compacta (SNc) and the frontal cortex (FCtx). Then, we used case-control study data from these regions to identify TFs working as master regulators (MR) of the disease, based on region-specific TNs. Twenty-nine regulatory units enriched with differentially expressed genes were identified for the SNc, and twenty for the FCtx, all of which were considered MR candidates for PD. Three consensus MR candidates were found for SNc and FCtx, namely ATF2, SLC30A9, and ZFP69B. In order to search for novel potential therapeutic interventions, we used these consensus MR candidate signatures as input to the Connectivity Map (CMap), a computational drug repositioning webtool. This analysis identified four drugs that reverse the expression pattern of all three consensus MRs simultaneously (benperidol, harmaline, tubocurarine chloride, and vorinostat), which are thus suggested as novel potential PD therapeutic interventions.
|
40
|
Ko DK, Brandizzi F. Network-based approaches for understanding gene regulation and function in plants. The Plant Journal: for Cell and Molecular Biology 2020; 104:302-317. [PMID: 32717108 PMCID: PMC8922287 DOI: 10.1111/tpj.14940] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 04/12/2020] [Accepted: 07/14/2020] [Indexed: 05/03/2023]
Abstract
Expression reprogramming directed by transcription factors is a primary mode of gene regulation underlying most aspects of the biology of any organism. Our views of how gene regulation is coordinated are changing dramatically thanks to the advent and constant improvement of high-throughput profiling and transcriptional network inference methods: from activities of individual genes to functional interactions across genes. These technical and analytical advances can reveal the topology of transcriptional networks in which hundreds of genes are hierarchically regulated by multiple transcription factors at the systems level. Here we review the state of the art of experimental and computational methods used in plant biology research to obtain large-scale datasets and model transcriptional networks. Examples of direct use of these network models and perspectives on their limitations and future directions are also discussed.
Affiliation(s)
- Dae Kwan Ko
  - MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
  - Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
- Federica Brandizzi
  - MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
  - Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
|
41
|
Sleight VA, Antczak P, Falciani F, Clark MS. Computationally predicted gene regulatory networks in molluscan biomineralization identify extracellular matrix production and ion transportation pathways. Bioinformatics 2020; 36:1326-1332. [PMID: 31617561 PMCID: PMC7703775 DOI: 10.1093/bioinformatics/btz754] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/03/2019] [Revised: 09/07/2019] [Accepted: 10/07/2019] [Indexed: 01/09/2023]
Abstract
MOTIVATION: The molecular processes regulating molluscan shell production remain relatively uncharacterized, despite the clear evolutionary and societal importance of biomineralization. RESULTS: Here we built the first computationally predicted gene regulatory network (GRN) for molluscan biomineralization using Antarctic clam (Laternula elliptica) mantle gene expression data produced over an age-categorized shell damage-repair time-course. We used previously published in vivo in situ hybridization expression data to ground truth gene interactions predicted by the GRN and show that candidate biomineralization genes from different shell layers, and hence microstructures, were connected in unique modules. We characterized two biomineralization modules of the GRN and hypothesize that one module is responsible for translating the extracellular proteins required for growing, repairing or remodelling the nacreous shell layer, whereas the second module orchestrates the transport of both ions and proteins to the shell secretion site, which are required during normal shell growth and repair. Our findings demonstrate that unbiased computational methods are particularly valuable for studying fundamental biological processes and gene interactions in non-model species where rich sources of gene expression data exist, but annotation rates are poor and the ability to carry out true functional tests is still lacking. AVAILABILITY AND IMPLEMENTATION: The raw RNA-Seq data is freely available for download from NCBI SRA (Accession: PRJNA398984), the assembled and annotated transcriptome can be viewed and downloaded from molluscDB (ensembl.molluscdb.org) and in addition, the assembled transcripts, reconstructed GRN, modules and detailed annotations are all available as Supplementary Files. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Victoria A Sleight
  - Department of Zoology, University of Cambridge, Cambridge, UK
  - Biodiversity, Evolution and Adaptation Team, British Antarctic Survey, Cambridge, UK
- Philipp Antczak
  - Department of Functional and Comparative Genomics, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Francesco Falciani
  - Department of Functional and Comparative Genomics, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Melody S Clark
  - Biodiversity, Evolution and Adaptation Team, British Antarctic Survey, Cambridge, UK
|
42
|
Gao Z, Ding R, Zhai X, Wang Y, Chen Y, Yang CX, Du ZQ. Common Gene Modules Identified for Chicken Adiposity by Network Construction and Comparison. Front Genet 2020; 11:537. [PMID: 32547600 PMCID: PMC7272656 DOI: 10.3389/fgene.2020.00537] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/21/2019] [Accepted: 05/04/2020] [Indexed: 12/12/2022]
Abstract
Excessive fat deposition can cause health problems in chickens and reduce production efficiency, causing great economic losses to the industry. However, the molecular underpinnings of the complex adiposity trait remain elusive. In the current study, we constructed and compared the gene co-expression networks on four transcriptome profiling datasets, from two chicken lines under divergent selection for abdominal fat content, in an attempt to dissect network compositions underlying adipose tissue growth and development. After functional enrichment analysis, nine network modules important to adipogenesis were found to be involved in lipid metabolism and the PPAR and insulin signaling pathways, and to contain hub genes related to adipogenesis, cell cycle, inflammation, and protein synthesis. Moreover, after additional functional annotation and network module comparisons, common sub-modules of similar functionality for chicken fat deposition were identified across chicken lines, in addition to modules specific to each chicken line. We further validated the lysosome pathway, and found that TFEB and its downstream target genes showed similar expression patterns during chicken preadipocyte differentiation. Our findings could provide novel insights into the genetic basis of complex adiposity traits, as well as human obesity and related metabolic diseases.
Affiliation(s)
- Zhuoran Gao
  - College of Animal Science, Yangtze University, Jingzhou, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Ran Ding
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Xiangyun Zhai
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Yuhao Wang
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Yaofeng Chen
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Cai-Xia Yang
  - College of Animal Science, Yangtze University, Jingzhou, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Zhi-Qiang Du
  - College of Animal Science, Yangtze University, Jingzhou, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
|
43
|
DeMers LC, Redekar NR, Kachroo A, Tolin SA, Li S, Saghai Maroof MA. A transcriptional regulatory network of Rsv3-mediated extreme resistance against Soybean mosaic virus. PLoS One 2020; 15:e0231658. [PMID: 32315334 PMCID: PMC7173922 DOI: 10.1371/journal.pone.0231658] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/04/2019] [Accepted: 03/29/2020] [Indexed: 01/02/2023]
Abstract
Resistance genes are an effective means for disease control in plants. They predominantly function by inducing a hypersensitive reaction, which results in localized cell death restricting pathogen spread. Some resistance genes elicit an atypical response, termed extreme resistance, where resistance is not associated with a hypersensitive reaction and its standard defense responses. Unlike hypersensitive reaction, the molecular regulatory mechanism(s) underlying extreme resistance is largely unexplored. One of the few known, naturally occurring, instances of extreme resistance is resistance derived from the soybean Rsv3 gene, which confers resistance against the most virulent Soybean mosaic virus strains. To discern the regulatory mechanism underlying Rsv3-mediated extreme resistance, we generated a gene regulatory network using transcriptomic data from time course comparisons of Soybean mosaic virus-G7-inoculated resistant (L29, Rsv3-genotype) and susceptible (Williams82, rsv3-genotype) soybean cultivars. Our results show Rsv3 begins mounting a defense by 6 hpi via a complex phytohormone network, where abscisic acid, cytokinin, jasmonic acid, and salicylic acid pathways are suppressed. We identified putative regulatory interactions between transcription factors and genes in phytohormone regulatory pathways, which is consistent with the demonstrated involvement of these pathways in Rsv3-mediated resistance. One such transcription factor identified as a putative transcriptional regulator was MYC2 encoded by Glyma.07G051500. Known as a master regulator of abscisic acid and jasmonic acid signaling, MYC2 specifically recognizes the G-box motif ("CACGTG"), which was significantly enriched in our data among differentially expressed genes implicated in abscisic acid- and jasmonic acid-related activities. This suggests an important role for Glyma.07G051500 in abscisic acid- and jasmonic acid-derived defense signaling in Rsv3. 
The findings from our network thus offer insights into genes and biological pathways underlying the molecular defense mechanism of Rsv3-mediated extreme resistance against Soybean mosaic virus. The computational pipeline used to reconstruct the gene regulatory network in this study is freely available at https://github.com/LiLabAtVT/rsv3-network.
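As a rough illustration of the motif analysis mentioned in this abstract, the sketch below locates the G-box motif ("CACGTG") in a DNA sequence. The function name `find_motif` and the example sequence are hypothetical, and this is plain string matching rather than the enrichment analysis the authors performed. Because CACGTG is its own reverse complement, scanning a single strand is sufficient for this particular motif.

```python
def find_motif(sequence, motif="CACGTG"):
    """Return 0-based start positions of every occurrence of `motif`.

    Hypothetical helper for illustration only; overlapping matches are
    reported, and matching is case-insensitive.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume the search one base further on so overlaps are not missed.
        start = sequence.find(motif, start + 1)
    return positions


# Made-up promoter fragment containing two G-box sites.
promoter = "ttCACGTGaaCACGTG"
print(find_motif(promoter))  # [2, 10]
```

Counting such hits in differentially expressed genes versus a background set would be the starting point for an enrichment test, which is not shown here.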
Affiliation(s)
- Lindsay C. DeMers
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Neelam R. Redekar
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Aardra Kachroo
  - Department of Plant Pathology, University of Kentucky, Lexington, Kentucky, United States of America
- Sue A. Tolin
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Song Li
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- M. A. Saghai Maroof
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
|
44
|
Buldum G, Tsipa A, Mantalaris A. Linking Engineered Gene Circuit Kinetic Modeling to Cellulose Biosynthesis Prediction in Escherichia coli: Toward Bioprocessing of Microbial Cell Factories. Ind Eng Chem Res 2020. [DOI: 10.1021/acs.iecr.9b05847] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022]
Affiliation(s)
- Gizem Buldum
  - Biological Systems Engineering Laboratory (BSEL), Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Argyro Tsipa
  - Biological Systems Engineering Laboratory (BSEL), Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Athanasios Mantalaris
  - Biological Systems Engineering Laboratory (BSEL), Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30322, United States
|
45
|
Law SR, Kellgren TG, Björk R, Ryden P, Keech O. Centralization Within Sub-Experiments Enhances the Biological Relevance of Gene Co-expression Networks: A Plant Mitochondrial Case Study. Frontiers in Plant Science 2020; 11:524. [PMID: 32582224 PMCID: PMC7287149 DOI: 10.3389/fpls.2020.00524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2019] [Accepted: 04/07/2020] [Indexed: 05/07/2023]
Abstract
Gene co-expression networks (GCNs) can be prepared using a variety of mathematical approaches based on data sampled across diverse developmental processes, tissue types, pathologies, mutant backgrounds, and stress conditions. These networks are used to identify genes with similar expression dynamics but are prone to introducing false-positive and false-negative relationships, especially in the instance of large and heterogenous datasets. With the aim of optimizing the relevance of edges in GCNs and enhancing global biological insight, we propose a novel approach that involves a data-centering step performed simultaneously per gene and per sub-experiment, called centralization within sub-experiments (CSE). Using a gene set encoding the plant mitochondrial proteome as a case study, our results show that all CSE-based GCNs assessed had significantly more edges within the majority of the considered functional sub-networks, such as the mitochondrial electron transport chain and its complexes, than GCNs not using CSE; thus demonstrating that CSE-based GCNs are efficient at predicting canonical functions and associated pathways, here referred to as the core gene network. Furthermore, we show that correlation analyses using CSE-processed data can be used to fine-tune prediction of the function of uncharacterized genes; while its use in combination with analyses based on non-CSE data can augment conventional stress analyses with the innate connections underpinning the dynamic system being examined. Therefore, CSE is an effective alternative method to conventional batch correction approaches, particularly when dealing with large and heterogenous datasets. The method is easy to implement into a pre-existing GCN analysis pipeline and can provide enhanced biological relevance to conventional GCNs by allowing users to delineate a core gene network.
AUTHOR SUMMARY Gene co-expression networks (GCNs) are the product of a variety of mathematical approaches that identify causal relationships in gene expression dynamics but are prone to the misdiagnoses of false-positives and false-negatives, especially in the instance of large and heterogenous datasets. In light of the burgeoning output of next-generation sequencing projects performed on a variety of species, and developmental or clinical conditions; the statistical power and complexity of these networks will undoubtedly increase, while their biological relevance will be fiercely challenged. Here, we propose a novel approach to generate a "core" GCN with enhanced biological relevance. Our method involves a data-centering step that effectively removes all primary treatment/tissue effects, which is simple to employ and can be easily implemented into pre-existing GCN analysis pipelines. The gain in biological relevance resulting from the adoption of this approach was assessed using a plant mitochondrial case study.
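The data-centering step described in this abstract can be sketched as follows. This is a minimal illustration that assumes CSE amounts to subtracting, for each gene, its mean expression within each sub-experiment; the exact procedure in the paper may differ, and the function name, input layout, and example values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def centralize_within_subexperiments(expression, sub_experiment):
    """Mean-center each gene separately inside each sub-experiment.

    expression:     dict mapping gene name -> list of values, one per sample
    sub_experiment: list of sub-experiment labels, one per sample
    (Assumed layout for this sketch, not taken from the paper.)
    """
    # Collect the sample indices that belong to each sub-experiment.
    groups = defaultdict(list)
    for index, label in enumerate(sub_experiment):
        groups[label].append(index)

    centered = {}
    for gene, values in expression.items():
        adjusted = list(values)
        for indices in groups.values():
            group_mean = mean(values[i] for i in indices)
            for i in indices:
                adjusted[i] = values[i] - group_mean
        centered[gene] = adjusted
    return centered


# Made-up data: one gene measured in two sub-experiments of two samples each.
data = {"geneA": [1.0, 3.0, 10.0, 14.0]}
labels = ["s1", "s1", "s2", "s2"]
print(centralize_within_subexperiments(data, labels))
# {'geneA': [-1.0, 1.0, -2.0, 2.0]}
```

After centering, the large offset between the two sub-experiments no longer dominates, so a correlation computed across all samples reflects within-sub-experiment co-variation instead.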
Affiliation(s)
- Simon R. Law
  - Department of Plant Physiology, Umeå Plant Science Centre, Umeå Universitet, Umeå, Sweden
- Therese G. Kellgren
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
- Rafael Björk
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
- Patrik Ryden
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
- Olivier Keech
  - Department of Plant Physiology, Umeå Plant Science Centre, Umeå Universitet, Umeå, Sweden
- Correspondence: Patrik Ryden, Olivier Keech
|
46
|
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Laura Scalambra
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Luca Triboli
  - Centre for Integrative Biology (CIBIO), University of Trento, Italy
- Forest Ray
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
- Federico M Giorgi
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
|
47
|
Inference of plant gene regulatory networks using data-driven methods: A practical overview. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194447. [PMID: 31678628 DOI: 10.1016/j.bbagrm.2019.194447] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/11/2019] [Revised: 10/08/2019] [Accepted: 10/31/2019] [Indexed: 11/20/2022]
Abstract
Transcriptional regulation is a complex and dynamic process that plays a vital role in plant growth and development. A key component in the regulation of genes is transcription factors (TFs), which coordinate the transcriptional control of gene activity. A gene regulatory network (GRN) is a collection of regulatory interactions between TFs and their target genes. The accurate delineation of GRNs offers a significant contribution to our understanding about how plant cells are organized and function, and how individual genes are regulated in various conditions, organs or cell types. During the past decade, important progress has been made in the identification of GRNs using experimental and computational approaches. However, a detailed overview of available platforms supporting the analysis of GRNs in plants is missing. Here, we review current databases, platforms and tools that perform data-driven analyses of gene regulation in Arabidopsis. The platforms are categorized into two sections, 1) promoter motif analysis tools that use motif mapping approaches to find TF motifs in the regulatory sequences of genes of interest and 2) network analysis tools that identify potential regulators for a set of input genes using a range of data types in order to generate GRNs. We discuss the diverse datasets integrated and highlight the strengths and caveats of different platforms. Finally, we shed light on the limitations of the above approaches and discuss future perspectives, including the need for integrative approaches to unravel complex GRNs in plants.
|
48
|
Brown K, Takawira LT, O'Neill MM, Mizrachi E, Myburg AA, Hussey SG. Identification and functional evaluation of accessible chromatin associated with wood formation in Eucalyptus grandis. The New Phytologist 2019; 223:1937-1951. [PMID: 31063599 DOI: 10.1111/nph.15897] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/28/2018] [Accepted: 04/29/2019] [Indexed: 05/03/2023]
Abstract
Accessible chromatin changes dynamically during development and harbours functional regulatory regions which are poorly understood in the context of wood development. We explored the importance of accessible chromatin in Eucalyptus grandis in immature xylem generally, and MYB transcription factor-mediated transcriptional programmes specifically. We identified biologically reproducible DNase I Hypersensitive Sites (DHSs) and assessed their functional significance in immature xylem through their associations with gene expression, epigenomic data and DNA sequence conservation. We identified in vitro DNA binding sites for six secondary cell wall-associated Eucalyptus MYB (EgrMYB) transcription factors using DAP-seq, reconstructed protein-DNA networks of predicted targets based on binding sites within or outside DHSs and assessed biological enrichment of these networks with published datasets. 25 319 identified immature xylem DHSs were associated with increased transcription and significantly enriched for various epigenetic signatures (H3K4me3, H3K27me3, RNA pol II), conserved noncoding sequences and depleted single nucleotide variants. Predicted networks built from EgrMYB binding sites located in accessible chromatin were significantly enriched for systems biology datasets relevant to wood formation, whereas those occurring in inaccessible chromatin were not. Our study demonstrates that DHSs in E. grandis immature xylem, most of which are intergenic, are of functional significance to gene regulation in this tissue.
Affiliation(s)
- Katrien Brown
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
- Lazarus T Takawira
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
- Marja M O'Neill
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
- Eshchar Mizrachi
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
- Alexander A Myburg
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
- Steven G Hussey
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private Bag X28, Pretoria, 0002, South Africa
|
49
|
A novel analysis method for biomarker identification based on horizontal relationship: identifying potential biomarkers from large-scale hepatocellular carcinoma metabolomics data. Anal Bioanal Chem 2019; 411:6377-6386. [DOI: 10.1007/s00216-019-02011-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/16/2019] [Revised: 06/03/2019] [Accepted: 07/01/2019] [Indexed: 02/07/2023]
|
50
|
Glymour C, Zhang K, Spirtes P. Review of Causal Discovery Methods Based on Graphical Models. Front Genet 2019; 10:524. [PMID: 31214249 PMCID: PMC6558187 DOI: 10.3389/fgene.2019.00524] [Citation(s) in RCA: 165] [Impact Index Per Article: 27.5] [Received: 08/07/2018] [Accepted: 05/13/2019] [Indexed: 12/11/2022]
Abstract
A fundamental task in various disciplines of science, including biology, is to find underlying causal relations and make use of them. Causal relations can be seen if interventions are properly applied; however, in many cases such interventions are difficult or even impossible to conduct. It is then necessary to discover causal relations by analyzing statistical properties of purely observational data, which is known as causal discovery or causal structure search. This paper aims to give an introduction to and a brief review of the computational methods for causal discovery that were developed in the past three decades, including constraint-based and score-based methods and those based on functional causal models, supplemented by some illustrations and applications.
Affiliation(s)
- Clark Glymour
  - Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA, United States
- Kun Zhang
  - Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA, United States
- Peter Spirtes
  - Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA, United States
|