1
Chen Y, Wu T, Zhu Z, Huang H, Zhang L, Goel A, Yang M, Wang X. An integrated workflow for biomarker development using microRNAs in extracellular vesicles for cancer precision medicine. Semin Cancer Biol 2021; 74:134-155. [PMID: 33766650 DOI: 10.1016/j.semcancer.2021.03.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/07/2020] [Revised: 03/13/2021] [Accepted: 03/16/2021] [Indexed: 02/06/2023]
Abstract
EV-miRNAs are microRNA (miRNA) molecules encapsulated in extracellular vesicles (EVs), which play crucial roles in tumor pathogenesis, progression, and metastasis. Recent studies of EV-miRNAs have yielded novel insights into cancer biology and have demonstrated great potential for developing novel liquid biopsy assays for various applications. Notably, compared to conventional liquid biomarkers, EV-miRNAs better represent host-cell molecular architecture and exhibit higher stability and specificity. Despite the various techniques available for EV-miRNA separation, concentration, profiling, and data analysis, a standardized approach for EV-miRNA biomarker development is still lacking. In this review, we surveyed the literature extensively and distilled an integrated workflow encompassing the important steps of EV-miRNA biomarker development, including sample collection and EV isolation, EV-miRNA extraction and quantification, high-throughput data preprocessing, biomarker prioritization and model construction, functional analysis, and validation. With the rapid growth of "big data", we highlight the importance of efficiently mining high-throughput data for the discovery of EV-miRNA biomarkers and of integrating multiple independent datasets for in silico and experimental validation to increase robustness and reproducibility. Furthermore, as an efficient strategy in systems biology, network inference provides insights into regulatory mechanisms and can be used to select functionally important EV-miRNAs to refine the biomarker candidates. Despite encouraging developments in the field, a number of challenges still hinder clinical translation. Finally, we summarize several common challenges in biomarker studies and discuss potential opportunities emerging in related fields.
Affiliation(s)
- Yu Chen
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Tan Wu
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Zhongxu Zhu
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Hao Huang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Liang Zhang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
- Ajay Goel
  - Department of Molecular Diagnostics and Experimental Therapeutics, Beckman Research Institute of City of Hope Comprehensive Cancer Center, Duarte, CA, USA
- Mengsu Yang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
- Xin Wang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
2
Wang K, Dai J, Liu T, Wang Q, Pang Y. Retracted Article: LncRNA ZEB2-AS1 regulates the drug resistance of acute myeloid leukemia via the miR-142-3p/INPP4B axis. RSC Adv 2019; 9:39495-39504. [PMID: 35540690 PMCID: PMC9076093 DOI: 10.1039/c9ra07854a] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/27/2019] [Accepted: 11/22/2019] [Indexed: 12/14/2022] Open
Abstract
Dysregulation of long noncoding RNAs (lncRNAs) has been reported to participate in the process of chemoresistance in multiple cancers, including acute myeloid leukemia (AML). LncRNA zinc finger E-box binding homeobox 2 antisense RNA 1 (ZEB2-AS1) has been reported to be up-regulated in AML. However, the biological role of ZEB2-AS1 remains to be determined. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to detect the levels of ZEB2-AS1, miR-142-3p and inositol polyphosphate-4-phosphatase type II B (INPP4B). Cell viability and apoptosis were examined by 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay and flow cytometry, respectively. Western blotting was applied to analyze levels of BCL2 apoptosis regulator (Bcl-2), BCL2 associated X, apoptosis regulator (Bax), cleaved-caspase-3 and INPP4B. The interaction among ZEB2-AS1, miR-142-3p and INPP4B was verified by dual-luciferase reporter assay and RNA pull-down assay. The levels of ZEB2-AS1 and INPP4B were significantly elevated in AML and chemoresistant tissues, as well as in THP-1 and THP-1/ADR cells. ZEB2-AS1 elevated the IC50 of ADR and suppressed apoptosis of AML cells; it also increased Bcl-2 expression and decreased the levels of Bax and cleaved-caspase-3. ZEB2-AS1 could enhance the resistance in THP-1 and THP-1/ADR cells. ZEB2-AS1 could sponge miR-142-3p, and ZEB2-AS1 reduced the promoting effect of miR-142-3p on the sensitivity of AML cells. Furthermore, INPP4B was revealed as a target gene of miR-142-3p. More interestingly, suppression of INPP4B by shRNA reversed the inhibitory effect of ZEB2-AS1 on the sensitivity of AML cells. LncRNA ZEB2-AS1 promoted ADR resistance of AML by regulating INPP4B expression through sponging miR-142-3p, providing a novel therapeutic target for drug resistance of AML.
Affiliation(s)
- Kai Wang
  - Department of Hematology, Zhoukou Central Hospital No. 26, East Renmin Road Zhoukou 466000 Henan China +86-394-8521603
- Jing Dai
  - Department of Hematology, Zhoukou Central Hospital No. 26, East Renmin Road Zhoukou 466000 Henan China +86-394-8521603
- Tao Liu
  - Department of Hematology, Zhoukou Central Hospital No. 26, East Renmin Road Zhoukou 466000 Henan China +86-394-8521603
- Qiong Wang
  - Department of Hematology, Zhoukou Central Hospital No. 26, East Renmin Road Zhoukou 466000 Henan China +86-394-8521603
- Yingxu Pang
  - Department of Hematology, Zhoukou Central Hospital No. 26, East Renmin Road Zhoukou 466000 Henan China +86-394-8521603
3
Agrahari R, Foroushani A, Docking TR, Chang L, Duns G, Hudoba M, Karsan A, Zare H. Applications of Bayesian network models in predicting types of hematological malignancies. Sci Rep 2018; 8:6951. [PMID: 29725024 PMCID: PMC5934387 DOI: 10.1038/s41598-018-24758-5] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 10/04/2017] [Accepted: 04/05/2018] [Indexed: 12/17/2022] Open
Abstract
Network analysis is the preferred approach for the detection of subtle but coordinated changes in expression of an interacting and related set of genes. We introduce a novel method based on the analyses of coexpression networks and Bayesian networks, and we use this new method to classify two types of hematological malignancies, namely acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366), which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model performs remarkably well on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful in classification and in correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package is publicly available through Bioconductor and can be used to conveniently fit a Bayesian network to gene expression data.
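For readers less familiar with the three metrics quoted in this abstract, the following minimal sketch shows how accuracy, precision, and recall are derived from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the confusion matrix of the cited study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for 100 samples (illustrative only, not data from the paper).
acc, prec, rec = classification_metrics(tp=45, fp=5, fn=10, tn=40)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

Precision and recall can diverge considerably when the two classes are unbalanced, which is presumably why the abstract reports all three numbers rather than accuracy alone.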
Affiliation(s)
- Rupesh Agrahari
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA
- Amir Foroushani
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA
- T Roderick Docking
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Linda Chang
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Gerben Duns
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Monika Hudoba
  - Department of Pathology and Laboratory Medicine, Vancouver General Hospital, Vancouver, British Columbia, V5Z 1M9, Canada
- Aly Karsan
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Habil Zare
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA; Department of Cell Systems & Anatomy, The University of Texas Health Science Center, San Antonio, Texas, 78229, USA
4
Engelhardt B, Kschischo M, Fröhlich H. A Bayesian approach to estimating hidden variables as well as missing and wrong molecular interactions in ordinary differential equation-based mathematical models. J R Soc Interface 2018; 14:rsif.2017.0332. [PMID: 28615495 PMCID: PMC5493809 DOI: 10.1098/rsif.2017.0332] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/08/2017] [Accepted: 05/23/2017] [Indexed: 11/12/2022] Open
Abstract
Ordinary differential equations (ODEs) are a popular approach to quantitatively model molecular networks based on biological knowledge. However, such knowledge is typically limited. Wrongly modelled biological mechanisms, as well as relevant external influence factors that are not included in the model, are likely to manifest as major discrepancies between model predictions and experimental data. Finding the exact reasons for such observed discrepancies can be quite challenging in practice. To address this issue, we suggest a Bayesian approach to estimate hidden influences in ODE-based models. The method can distinguish between exogenous and endogenous hidden influences. Thus, we can detect wrongly specified as well as missed molecular interactions in the model. We demonstrate the performance of our Bayesian dynamic elastic-net with several ordinary differential equation models from the literature, such as human JAK-STAT signalling, information processing at the erythropoietin receptor, isomerization of liquid α-Pinene, G protein cycling in yeast and UV-B triggered signalling in plants. Moreover, we investigate a set of commonly known network motifs and a gene-regulatory network. Altogether, our method supports the modeller in an algorithmic manner in identifying possible sources of errors in ODE-based models on the basis of experimental data.
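The notion of a hidden influence can be made concrete with a toy example. The sketch below is not the Bayesian dynamic elastic-net described in the abstract; it is a deliberately naive residual estimate, assuming a one-variable decay model, a constant unmodelled input, and noise-free data. All names and parameter values are hypothetical.

```python
def simulate(rhs, x0, dt, steps):
    """Integrate dx/dt = rhs(x) with the forward Euler method."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * rhs(xs[-1]))
    return xs


K = 0.5       # assumed decay rate of the nominal model
HIDDEN = 0.2  # hypothetical constant input that the nominal model omits


def nominal_rhs(x):
    """Right-hand side the modeller wrote down."""
    return -K * x


def true_rhs(x):
    """Right-hand side of the system that actually generated the data."""
    return -K * x + HIDDEN


dt = 0.01
observed = simulate(true_rhs, x0=1.0, dt=dt, steps=500)

# The hidden influence appears as the gap between the finite-difference
# derivative of the data and what the nominal model predicts at each step.
residuals = [
    (observed[i + 1] - observed[i]) / dt - nominal_rhs(observed[i])
    for i in range(len(observed) - 1)
]
estimated_hidden = sum(residuals) / len(residuals)
print(round(estimated_hidden, 6))
```

With noisy, sparsely sampled measurements this direct differencing breaks down quickly, which is one motivation for a probabilistic treatment of the hidden inputs such as the one the authors propose.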
Affiliation(s)
- Benjamin Engelhardt
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Algorithmic Bioinformatics, Bonn, Germany; DFG Research Training Group 1873, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany
- Maik Kschischo
  - Department of Mathematics and Technology, University of Applied Sciences Koblenz, RheinAhrCampus, Remagen, Germany
- Holger Fröhlich
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Algorithmic Bioinformatics, Bonn, Germany; UCB Biosciences GmbH, Monheim, Germany
5
Trescher S, Münchmeyer J, Leser U. Estimating genome-wide regulatory activity from multi-omics data sets using mathematical optimization. BMC Syst Biol 2017; 11:41. [PMID: 28347313 PMCID: PMC5369021 DOI: 10.1186/s12918-017-0419-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/16/2016] [Accepted: 03/08/2017] [Indexed: 12/28/2022]
Abstract
Background: Gene regulation is one of the most important cellular processes, indispensable for the adaptability of organisms and closely interlinked with several classes of pathogenesis and their progression. Elucidation of regulatory mechanisms can be approached by a multitude of experimental methods, yet integration of the resulting heterogeneous, large, and noisy data sets into comprehensive and tissue- or disease-specific cellular models requires rigorous computational methods. Recently, several algorithms have been proposed which model genome-wide gene regulation as sets of (linear) equations over the activity and relationships of transcription factors, genes and other factors. Subsequent optimization finds those parameters that minimize the divergence of predicted and measured expression intensities. In various settings, these methods produced promising results in terms of estimating transcription factor activity and identifying key biomarkers for specific phenotypes. However, despite their common root in mathematical optimization, they vastly differ in the types of experimental data being integrated, the background knowledge necessary for their application, the granularity of their regulatory model, the concrete paradigm used for solving the optimization problem and the data sets used for evaluation. Results: Here, we review five recent methods of this class in detail and compare them with respect to several key properties. Furthermore, we quantitatively compare the results of four of the presented methods based on publicly available data sets. Conclusions: The results show that all methods seem to find biologically relevant information. However, we also observe that the mutual result overlaps are very low, which contradicts biological intuition. Our aim is to raise further awareness of the power of these methods, yet also to identify common shortcomings and necessary extensions enabling focused research on the critical points.
Electronic supplementary material: The online version of this article (doi:10.1186/s12918-017-0419-z) contains supplementary material, which is available to authorized users.
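The core idea described in this abstract, modelling expression as linear equations over regulator activities and then minimizing the gap between predicted and measured values, can be sketched in a few lines. The example below is a generic ordinary least-squares fit for two regulators, not any of the five reviewed methods; the weight matrix, the expression values, and the function name are hypothetical.

```python
def estimate_activities(weights, expression):
    """Least-squares fit of two regulator activities a, where
    expression[g] is modelled as weights[g][0] * a[0] + weights[g][1] * a[1]."""
    # Build the 2x2 normal equations (W^T W) a = W^T e.
    s11 = sum(w[0] * w[0] for w in weights)
    s12 = sum(w[0] * w[1] for w in weights)
    s22 = sum(w[1] * w[1] for w in weights)
    b1 = sum(w[0] * e for w, e in zip(weights, expression))
    b2 = sum(w[1] * e for w, e in zip(weights, expression))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("regulator columns are collinear; activities are not identifiable")
    # Solve the 2x2 system with Cramer's rule.
    return ((b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det)


# Hypothetical regulator-to-gene weights (rows: genes, columns: two regulators).
weights = [(1.0, 0.0), (0.5, -1.0), (0.0, 1.0), (1.0, 1.0)]
# Hypothetical measured expression, generated from activities (2.0, -1.0).
expression = [2.0, 2.0, -1.0, 1.0]

activities = estimate_activities(weights, expression)
print(activities)
```

Real methods of this class differ mainly in where the weights come from (prior knowledge of regulator-target links), which additional data types enter the equations, and which regularization or solver is used, as the abstract notes.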
Affiliation(s)
- Saskia Trescher
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
- Jannes Münchmeyer
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
- Ulf Leser
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
6
Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data. PLoS One 2017; 12:e0171240. [PMID: 28166542 PMCID: PMC5293552 DOI: 10.1371/journal.pone.0171240] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/21/2016] [Accepted: 01/17/2017] [Indexed: 11/19/2022] Open
Abstract
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
7
Walsh CJ, Hu P, Batt J, Dos Santos CC. Discovering MicroRNA-Regulatory Modules in Multi-Dimensional Cancer Genomic Data: A Survey of Computational Methods. Cancer Inform 2016; 15:25-42. [PMID: 27721651 PMCID: PMC5051584 DOI: 10.4137/cin.s39369] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/18/2016] [Revised: 08/14/2016] [Accepted: 08/16/2016] [Indexed: 12/20/2022] Open
Abstract
MicroRNAs (miRs) are small single-stranded noncoding RNAs that function in RNA silencing and post-transcriptional regulation of gene expression. An increasing number of studies have shown that miRs play an important role in tumorigenesis, and understanding the regulatory mechanism of miRs in this gene regulatory network will help elucidate the complex biological processes at play during malignancy. Despite advances, determination of miR–target interactions (MTIs) and identification of functional modules composed of miRs and their specific targets remain a challenge. A large amount of data generated by high-throughput methods from various sources is available to investigate MTIs. The development of data-driven tools to harness these multi-dimensional data has resulted in significant progress over the past decade. In parallel, large-scale cancer genomic projects are allowing new insights into the commonalities and disparities of miR–target regulation across cancers. In the first half of this review, we explore methods for identification of pairwise MTIs, and in the second half, we explore computational tools for discovery of miR-regulatory modules in a cancer-specific and pan-cancer context. We highlight strengths and limitations of each of these tools as a practical guide for computational biologists.
Affiliation(s)
- Christopher J Walsh
  - Keenan and Li Ka Shing Knowledge Institute of Saint Michael's Hospital, Toronto, ON, Canada; Institute of Medical Sciences and Department of Medicine, University of Toronto, Toronto, ON, Canada
- Pingzhao Hu
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada
- Jane Batt
  - Keenan and Li Ka Shing Knowledge Institute of Saint Michael's Hospital, Toronto, ON, Canada; Institute of Medical Sciences and Department of Medicine, University of Toronto, Toronto, ON, Canada
- Claudia C Dos Santos
  - Keenan and Li Ka Shing Knowledge Institute of Saint Michael's Hospital, Toronto, ON, Canada; Institute of Medical Sciences and Department of Medicine, University of Toronto, Toronto, ON, Canada
8
Zhang J, Le TD, Liu L, He J, Li J. A novel framework for inferring condition-specific TF and miRNA co-regulation of protein-protein interactions. Gene 2015; 577:55-64. [PMID: 26611531 DOI: 10.1016/j.gene.2015.11.023] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/10/2015] [Revised: 10/16/2015] [Accepted: 11/17/2015] [Indexed: 12/11/2022]
Abstract
Recent studies have shown that transcription factors (TFs) and microRNAs (miRNAs), while independently regulating their downstream targets, also collaborate with each other to regulate gene expression. However, their synergistic roles in protein-protein interactions (PPIs) remain mostly unknown. In this paper, we present a novel framework (called CoRePPI) for inferring TF and miRNA co-regulation of PPIs. In particular, CoRePPI aims to discover the co-regulation specific to a condition of interest by using heterogeneous data, including miRNA and messenger RNA (mRNA) expression profiles, putative miRNA targets, TF targets and PPIs. CoRePPI first finds the network motifs indicating the co-regulation of PPIs by TFs and miRNAs in tumor and normal conditions separately. Then, by identifying the differential motifs found in one condition but not in the other, it builds the networks consisting of TFs, miRNAs and their co-regulated PPIs specific to each condition. To validate CoRePPI, we apply it to the Pan-Cancer dataset, which includes the expression profiles of 12 cancer types from TCGA. Through network topology analysis, we found that the tumor and normal CoRePPI networks are scale-free. Furthermore, the results of differential and intersected network analysis between the tumor and normal CoRePPI networks suggest that only a small fraction of the regulatory relationships between TFs and miRNAs are conserved in both conditions, but they co-regulate different downstream PPIs in tumor and normal conditions; in different conditions, the majority of the regulatory relationships between TFs and miRNAs are different, although they may regulate the same PPIs in their respective conditions. The CoRePPI sub-networks constructed for the three types of cancers (breast cancer, lung cancer and ovarian cancer) are all scale-free, and the intersection of these CoRePPI sub-networks can be utilized as the biomarker CoRePPI sub-network of the three types of cancers.
The PPI enrichment analyses of the tumor and normal CoRePPI networks suggest that the co-regulating TFs and miRNAs are significantly associated with specific biological processes, diseases and pathways. In addition, compared with the two non-condition-specific approaches, the tumor CoRePPI network is found to have the most enriched cancer-related PPIs. Altogether, the results uncover the combined regulatory patterns of TFs and miRNAs on the PPIs, and may provide new insights for research in cancer-associated TFs and miRNAs.
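The differential and intersection steps mentioned in this abstract are, at their simplest, set operations over motifs. The sketch below illustrates only that bookkeeping, assuming each motif is represented as a (TF, miRNA, protein pair) tuple; it is not the CoRePPI implementation, and all motif names and the helper function are hypothetical.

```python
# Hypothetical motifs: (TF, miRNA, interacting protein pair). Names are illustrative only.
tumor_motifs = {
    ("TF_A", "miR_1", ("P1", "P2")),
    ("TF_B", "miR_2", ("P3", "P4")),
    ("TF_C", "miR_3", ("P5", "P6")),
}
normal_motifs = {
    ("TF_B", "miR_2", ("P3", "P4")),
    ("TF_D", "miR_4", ("P7", "P8")),
}

# Differential motifs: present in one condition but not in the other.
tumor_specific = tumor_motifs - normal_motifs
normal_specific = normal_motifs - tumor_motifs
# Motifs conserved across both conditions.
conserved = tumor_motifs & normal_motifs


def shared_subnetwork(*networks):
    """Intersect motif sets from several conditions (for example, cancer types)."""
    return set.intersection(*networks) if networks else set()


print(len(tumor_specific), len(normal_specific), len(conserved))
```

The same intersection helper could be applied to per-cancer motif sets to obtain a shared sub-network of the kind the abstract proposes as a biomarker, although how motifs are detected in the first place is the substantive part of the method and is not shown here.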
Affiliation(s)
- Junpeng Zhang
  - School of Engineering, Dali University, Dali, Yunnan 671003, China
- Thuc Duy Le
  - School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia
- Lin Liu
  - School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia
- Jianfeng He
  - School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China
- Jiuyong Li
  - School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia
9
Fröhlich H. biRte: Bayesian inference of context-specific regulator activities and transcriptional networks. Bioinformatics 2015; 31:3290-8. [PMID: 26112290 DOI: 10.1093/bioinformatics/btv379] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 02/03/2015] [Accepted: 06/15/2015] [Indexed: 11/14/2022] Open
Abstract
In recent years there has been an increasing effort to computationally model and predict the influence of regulators (transcription factors, miRNAs) on gene expression. Here we introduce biRte as a computationally attractive approach combining Bayesian inference of regulator activities with network reverse engineering. biRte integrates target gene predictions with different omics data entities (e.g. miRNA and mRNA data) into a joint probabilistic framework. The utility of our method is tested in extensive simulation studies and demonstrated with applications from prostate cancer and Escherichia coli growth control. The resulting regulatory networks generally show good agreement with the biological literature. Availability and implementation: biRte is available on Bioconductor (http://bioconductor.org). Contact: frohlich@bit.uni-bonn.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Holger Fröhlich
  - University of Bonn, Institute for Computer Science, Römerstr. 164, 53117 Bonn, Germany
10
Hasegawa T, Mori T, Yamaguchi R, Shimamura T, Miyano S, Imoto S, Akutsu T. Genomic data assimilation using a higher moment filtering technique for restoration of gene regulatory networks. BMC Syst Biol 2015; 9:14. [PMID: 25890175 PMCID: PMC4371723 DOI: 10.1186/s12918-015-0154-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/24/2014] [Accepted: 02/20/2015] [Indexed: 11/20/2022]
Abstract
Background: As a result of recent advances in biotechnology, many findings related to intracellular systems have been published, e.g., transcription factor (TF) information. Although we can reproduce biological systems by incorporating such findings and describing their dynamics as mathematical equations, simulation results can be inconsistent with data from biological observations if there are inaccurate or unknown parts in the constructed system. To complete such systems, relationships among genes have been inferred through several computational approaches, which typically apply several abstractions, e.g., linearization, to handle the heavy computational cost of evaluating biological systems. However, since these approximations can generate false regulations, computational methods that can infer regulatory relationships based on less abstract models incorporating existing knowledge are strongly needed. Results: We propose a new data assimilation algorithm that utilizes a simple nonlinear regulatory model and a state space representation to infer gene regulatory networks (GRNs) using time-course observation data. For the estimation of the hidden state variables and the parameter values, we developed a novel method termed a higher moment ensemble particle filter (HMEnPF) that can retain the first four moments of the conditional distributions through filtering steps. Starting from the original model, e.g., derived from the literature, the proposed algorithm can sequentially evaluate candidate models, which are generated by partially changing the current best model, to find the model that can best predict the data. For the performance evaluation, we generated six synthetic datasets based on two real biological networks and evaluated the effectiveness of the proposed algorithm by improving the networks inferred by previous methods. We then applied the algorithm to time-course observation data of rat skeletal muscle stimulated with corticosteroid.
Since a corticosteroid pharmacogenomic pathway, its kinetics/dynamics and TF candidate genes have been partially elucidated, we incorporated these findings and inferred an extended pathway of rat pharmacogenomics. Conclusions: In the simulation study, the proposed algorithm outperformed previous methods and successfully improved the regulatory structure inferred by those methods. Furthermore, the proposed algorithm could extend a corticosteroid-related pathway, which has been partially elucidated, by incorporating several information sources. Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0154-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Takanori Hasegawa
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
- Tomoya Mori
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
- Rui Yamaguchi
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Teppei Shimamura
  - Division of Systems Biology, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Nagoya, 466-8550 Showa-ku, Japan
- Satoru Miyano
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Seiya Imoto
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
11
Role of microRNAs in cancers of the female reproductive tract: insights from recent clinical and experimental discovery studies. Clin Sci (Lond) 2014; 128:153-80. [PMID: 25294164 DOI: 10.1042/cs20140087] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
Abstract
microRNAs (miRNAs) are small RNA molecules that represent the top of the pyramid of many tumorigenesis cascade pathways, as they have the ability to affect multiple, intricate, and still undiscovered downstream targets. Understanding how miRNA molecules serve as master regulators in these important networks involved in cancer initiation and progression opens up significant innovative areas for therapy and diagnosis that have been sadly lacking for deadly female reproductive tract cancers. This review will highlight the recent advances in the field of miRNAs in epithelial ovarian cancer, endometrioid endometrial cancer and squamous-cell cervical carcinoma, focusing on studies associated with actual clinical information in humans. Importantly, recent miRNA profiling studies have included well-characterized clinical specimens of female reproductive tract cancers, allowing for studies correlating miRNA expression with clinical outcomes. This review will summarize the current thoughts on the role of miRNA processing in unique miRNA species present in these cancers. In addition, this review will focus on current data regarding miRNA molecules as unique biomarkers associated with clinically significant outcomes such as overall survival and chemotherapy resistance. We will also discuss why specific miRNA molecules are not recapitulated across multiple studies of the same cancer type. Although the mechanistic contributions of miRNA molecules to these clinical phenomena have been confirmed using in vitro and pre-clinical mouse model systems, these studies are truly only the beginning of our understanding of the roles miRNAs play in cancers of the female reproductive tract. This review will also highlight useful areas for future research regarding miRNAs as therapeutic targets in cancers of the female reproductive tract.
|
12
|
Diez D, Agustí A, Wheelock CE. Network Analysis in the Investigation of Chronic Respiratory Diseases. From Basics to Application. Am J Respir Crit Care Med 2014; 190:981-8. [DOI: 10.1164/rccm.201403-0421pp] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 12/14/2022] Open
|
13
|
Zhang J, Le TD, Liu L, Liu B, He J, Goodall GJ, Li J. Identifying direct miRNA-mRNA causal regulatory relationships in heterogeneous data. J Biomed Inform 2014; 52:438-47. [PMID: 25181465 DOI: 10.1016/j.jbi.2014.08.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 12/26/2013] [Revised: 08/11/2014] [Accepted: 08/16/2014] [Indexed: 10/24/2022]
Abstract
Discovering the regulatory relationships between microRNAs (miRNAs) and mRNAs is an important problem that interests many biologists and medical researchers. A number of computational methods have been proposed to infer miRNA-mRNA regulatory relationships, and are mostly based on the statistical associations between miRNAs and mRNAs discovered in observational data. The miRNA-mRNA regulatory relationships identified by these methods can be both direct and indirect regulations. However, differentiating direct regulatory relationships from indirect ones is important for biologists in experimental designs. In this paper, we present a causal discovery based framework (called DirectTarget) to infer direct miRNA-mRNA causal regulatory relationships in heterogeneous data, including expression profiles of miRNAs and mRNAs, and miRNA target information. DirectTarget is applied to the Epithelial to Mesenchymal Transition (EMT) datasets. The validation by experimentally confirmed target databases suggests that the proposed method can effectively identify direct miRNA-mRNA regulatory relationships. To explore the upstream regulators of miRNA regulation, we further identify the causal feedforward patterns (CFFPs) of TF-miRNA-mRNA to provide insights into the miRNA regulation in EMT. DirectTarget has the potential to be applied to other datasets to elucidate the direct miRNA-mRNA causal regulatory relationships and to explore the regulatory patterns.
Affiliation(s)
- Junpeng Zhang
- Faculty of Engineering, Dali University, Dali, China.
- Thuc Duy Le
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia.
- Lin Liu
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia.
- Bing Liu
- Children's Cancer Institute Australia, Randwick, NSW 2301, Australia.
- Jianfeng He
- Kunming University of Science and Technology, Kunming, China.
- Jiuyong Li
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia.
|
14
|
Inference of gene regulatory networks incorporating multi-source biological knowledge via a state space model with L1 regularization. PLoS One 2014; 9:e105942. [PMID: 25162401 PMCID: PMC4146587 DOI: 10.1371/journal.pone.0105942] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 03/27/2014] [Accepted: 07/25/2014] [Indexed: 12/17/2022] Open
Abstract
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in the field of systems biology. Currently, there are two main approaches in GRN analysis using time-course observation data, namely an ordinary differential equation (ODE)-based approach and a statistical model-based approach. The ODE-based approach can generate complex dynamics of GRNs according to biologically validated nonlinear models. However, it cannot be applied to ten or more genes to simultaneously estimate system dynamics and regulatory relationships due to the computational difficulties. The statistical model-based approach uses highly abstract models to simply describe biological systems and to infer relationships among several hundreds of genes from the data. However, the high abstraction generates false regulations that are not permitted biologically. Thus, when dealing with several tens of genes of which the relationships are partially known, a method that can infer regulatory relationships based on a model with low abstraction and that can emulate the dynamics of ODE-based models while incorporating prior knowledge is urgently required. To accomplish this, we propose a method for inference of GRNs using a state space representation of a vector auto-regressive (VAR) model with L1 regularization. This method can estimate the dynamic behavior of genes based on linear time-series modeling constructed from an ODE-based model and can infer the regulatory structure among several tens of genes maximizing prediction ability for the observational data. Furthermore, the method is capable of incorporating various types of existing biological knowledge, e.g., drug kinetics and literature-recorded pathways. The effectiveness of the proposed method is shown through a comparison of simulation studies with several previous methods. 
For an application example, we evaluated mRNA expression profiles over time upon corticosteroid stimulation in rats, thus incorporating corticosteroid kinetics/dynamics, literature-recorded pathways and transcription factor (TF) information.
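As a rough illustration of the core idea in this abstract (sparse regulatory structure recovered from time-course data by L1-penalized vector autoregression), the sketch below fits a first-order VAR model with a lasso penalty using proximal gradient descent. It is not the authors' state space method: it omits the observation model, the ODE-derived dynamics, and the incorporation of prior knowledge. The function names, the penalty weight `lam`, and the simulated three-gene system are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def fit_sparse_var1(X, lam=0.1, n_iter=500):
    """Fit x[t+1] ~ A @ x[t] with an L1 penalty on A (hypothetical helper).

    X has shape (T, p): T time points, p genes. Minimizes
    (1 / 2n) * ||future - past @ A.T||_F^2 + lam * ||A||_1
    by proximal gradient descent (ISTA).
    """
    past, future = X[:-1], X[1:]
    n, p = past.shape
    # Lipschitz constant of the smooth part: largest eigenvalue of past.T @ past / n
    lipschitz = np.linalg.norm(past, 2) ** 2 / n
    step = 1.0 / lipschitz
    A = np.zeros((p, p))
    for _ in range(n_iter):
        residual = past @ A.T - future      # shape (n, p)
        grad = residual.T @ past / n        # shape (p, p)
        A = soft_threshold(A - step * grad, step * lam)
    return A


# Simulated example (assumed values): gene 1 regulates gene 0, nothing else crosses.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.4, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
X = np.zeros((2000, 3))
for t in range(1999):
    X[t + 1] = A_true @ X[t] + rng.normal(size=3)

A_hat = fit_sparse_var1(X, lam=0.05)
print(np.round(A_hat, 2))
```

A nonzero entry `A_hat[i, j]` is read as a candidate regulation of gene `i` by gene `j`. The penalty shrinks weak entries toward zero, which is what keeps the inferred network sparse.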
|
15
|
Zhang J, Le TD, Liu L, Liu B, He J, Goodall GJ, Li J. Inferring condition-specific miRNA activity from matched miRNA and mRNA expression data. Bioinformatics 2014; 30:3070-7. [PMID: 25061069 DOI: 10.1093/bioinformatics/btu489] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 01/22/2023]
Abstract
MOTIVATION: MicroRNAs (miRNAs) play crucial roles in complex cellular networks by binding to the messenger RNAs (mRNAs) of protein coding genes. It has been found that miRNA regulation is often condition-specific. A number of computational approaches have been developed to identify miRNA activity specific to a condition of interest using gene expression data. However, most of the methods only use the data in a single condition, and thus, the activity discovered may not be unique to the condition of interest. Additionally, these methods are based on statistical associations between the gene expression levels of miRNAs and mRNAs, so they may not be able to reveal real gene regulatory relationships, which are causal relationships. RESULTS: We propose a novel method to infer condition-specific miRNA activity by considering (i) the difference between the regulatory behavior that an miRNA has in the condition of interest and its behavior in the other conditions; (ii) the causal semantics of miRNA-mRNA relationships. The method is applied to the epithelial-mesenchymal transition (EMT) and multi-class cancer (MCC) datasets. The validation by the results of transfection experiments shows that our approach is effective in discovering significant miRNA-mRNA interactions. Functional and pathway analysis and literature validation indicate that the identified active miRNAs are closely associated with the specific biological processes, diseases and pathways. More detailed analysis of the activity of the active miRNAs implies that some active miRNAs show different regulation types in different conditions, but some have the same regulation types and their activity only differs in different conditions in the strengths of regulation. AVAILABILITY AND IMPLEMENTATION: The R and Matlab scripts are in the Supplementary materials.
Affiliation(s)
- Junpeng Zhang, Thuc Duy Le, Lin Liu, Bing Liu, Jianfeng He, Gregory J Goodall, Jiuyong Li
- Faculty of Engineering, Dali University, Dali, Yunnan 671003, China; School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide, SA 5095, Australia; Children's Cancer Institute Australia, Randwick, NSW 2301, Australia; Kunming University of Science and Technology, Kunming, Yunnan 650500, China; and Centre for Cancer Biology, SA Pathology, Adelaide, SA 5000, Australia
|
16
|
Le TD, Liu L, Zhang J, Liu B, Li J. From miRNA regulation to miRNA-TF co-regulation: computational approaches and challenges. Brief Bioinform 2014; 16:475-96. [DOI: 10.1093/bib/bbu023] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 03/14/2014] [Accepted: 06/10/2014] [Indexed: 12/14/2022] Open
|
17
|
Afshar AS, Xu J, Goutsias J. Integrative identification of deregulated miRNA/TF-mediated gene regulatory loops and networks in prostate cancer. PLoS One 2014; 9:e100806. [PMID: 24968068 PMCID: PMC4072696 DOI: 10.1371/journal.pone.0100806] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 01/20/2014] [Accepted: 05/28/2014] [Indexed: 01/07/2023] Open
Abstract
MicroRNAs (miRNAs) have attracted a great deal of attention in biology and medicine. It has been hypothesized that miRNAs interact with transcription factors (TFs) in a coordinated fashion to play key roles in regulating signaling and transcriptional pathways and in achieving robust gene regulation. Here, we propose a novel integrative computational method to infer certain types of deregulated miRNA-mediated regulatory circuits at the transcriptional, post-transcriptional and signaling levels. To reliably predict miRNA-target interactions from mRNA/miRNA expression data, our method collectively utilizes sequence-based miRNA-target predictions obtained from several algorithms, known information about mRNA and miRNA targets of TFs available in existing databases, certain molecular structures identified to be statistically over-represented in gene regulatory networks, available molecular subtyping information, and state-of-the-art statistical techniques to appropriately constrain the underlying analysis. In this way, the method exploits almost every aspect of extractable information in the expression data. We apply our procedure on mRNA/miRNA expression data from prostate tumor and normal samples and detect numerous known and novel miRNA-mediated deregulated loops and networks in prostate cancer. We also demonstrate instances of the results in a number of distinct biological settings, which are known to play crucial roles in prostate and other types of cancer. Our findings show that the proposed computational method can be used to effectively achieve notable insights into the poorly understood molecular mechanisms of miRNA-mediated interactions and dissect their functional roles in cancer in an effort to pave the way for miRNA-based therapeutics in clinical settings.
Affiliation(s)
- Ali Sobhi Afshar
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Joseph Xu
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
- John Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
|
18
|
MicroRNAs: master regulators of drug resistance, stemness, and metastasis. J Mol Med (Berl) 2014; 92:321-36. [PMID: 24509937 DOI: 10.1007/s00109-014-1129-2] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 12/04/2013] [Revised: 01/21/2014] [Accepted: 01/23/2014] [Indexed: 12/13/2022]
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs, 20-22 nucleotides long, that regulate gene expression post-transcriptionally. The last decade has witnessed emerging evidence of active roles of miRNAs in tumor development, progression, metastasis, and drug resistance. Many factors contribute to their dysregulation in cancer, such as chromosomal aberrations, differential methylation of their own or host genes' promoters and alterations in miRNA biogenesis pathways. miRNAs have been shown to act as tumor suppressors or oncogenes depending on the targets they regulate and the tissue where they are expressed. Because miRNAs can regulate dozens of genes simultaneously and they can function as tumor suppressors or oncogenes, they have been proposed as promising targets for cancer therapy. In this review, we focus on the role of miRNAs in driving drug resistance and metastasis, which are associated with stem cell properties of cancer cells. Furthermore, we discuss systems biology approaches to combine experimental and computational methods to study effects of miRNAs on gene or protein networks regulating these processes. Finally, we describe methods to target oncogenic or replace tumor suppressor miRNAs and current delivery strategies to sensitize refractory cells and to prevent metastasis. A holistic understanding of miRNAs' functions in drug resistance and metastasis, which are major causes of cancer-related deaths, and the development of novel strategies to target them efficiently will pave the way towards better translation of miRNAs into clinics and management of cancer therapy.
|
19
|
Kramer F, Bayerlová M, Beißbarth T. R-based software for the integration of pathway data into bioinformatic algorithms. Biology (Basel) 2014; 3:85-100. [PMID: 24833336 PMCID: PMC4009765 DOI: 10.3390/biology3010085] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 11/07/2013] [Revised: 11/29/2013] [Accepted: 01/31/2014] [Indexed: 11/16/2022]
Abstract
Putting new findings into the context of available literature knowledge is one approach to deal with the surge of high-throughput data results. Furthermore, prior knowledge can increase the performance and stability of bioinformatic algorithms, for example, methods for network reconstruction. In this review, we examine software packages for the statistical computing framework R, which enable the integration of pathway data for further bioinformatic analyses. Different approaches to integrate and visualize pathway data are identified and packages are stratified concerning their features according to a number of different aspects: data import strategies, the extent of available data, dependencies on external tools, integration with further analysis steps and visualization options are considered. A total of 12 packages integrating pathway data are reviewed in this manuscript. These are supplemented by five R-specific packages for visualization and six connector packages, which provide access to external tools.
Affiliation(s)
- Frank Kramer
- University Medical Center Göttingen, Department of Medical Statistics, Humboldtallee 32, D-37073 Göttingen, Germany.
- Michaela Bayerlová
- University Medical Center Göttingen, Department of Medical Statistics, Humboldtallee 32, D-37073 Göttingen, Germany.
- Tim Beißbarth
- University Medical Center Göttingen, Department of Medical Statistics, Humboldtallee 32, D-37073 Göttingen, Germany.
|
20
|
Sass S, Buettner F, Mueller NS, Theis FJ. A modular framework for gene set analysis integrating multilevel omics data. Nucleic Acids Res 2013; 41:9622-33. [PMID: 23975194 PMCID: PMC3834824 DOI: 10.1093/nar/gkt752] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/01/2023] Open
Abstract
Modern high-throughput methods allow the investigation of biological functions across multiple ‘omics’ levels. Levels include mRNA and protein expression profiling as well as additional knowledge on, for example, DNA methylation and microRNA regulation. The reason for this interest in multi-omics is that actual cellular responses to different conditions are best explained mechanistically when taking all omics levels into account. To map gene products to their biological functions, public ontologies like Gene Ontology are commonly used. Many methods have been developed to identify terms in an ontology, overrepresented within a set of genes. However, these methods are not able to appropriately deal with any combination of several data types. Here, we propose a new method to analyse integrated data across multiple omics-levels to simultaneously assess their biological meaning. We developed a model-based Bayesian method for inferring interpretable term probabilities in a modular framework. Our Multi-level ONtology Analysis (MONA) algorithm performed significantly better than conventional analyses of individual levels and yields best results even for sophisticated models including mRNA fine-tuning by microRNAs. The MONA framework is flexible enough to allow for different underlying regulatory motifs or ontologies. It is ready-to-use for applied researchers and is available as a standalone application from http://icb.helmholtz-muenchen.de/mona.
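For context, the conventional single-level analysis that this abstract contrasts with MONA is typically an over-representation test: given a gene list, ask whether an ontology term is hit more often than chance predicts. A minimal sketch of that baseline, using a one-sided hypergeometric test, is shown below. It does not implement MONA's Bayesian model, and the function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value for term over-representation.

    n_universe: genes in the background set
    n_term:     background genes annotated with the ontology term
    n_selected: genes in the list of interest
    n_overlap:  selected genes annotated with the term
    Returns P(X >= n_overlap) when n_selected genes are drawn at random.
    """
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 background genes, 100 carry the term,
# 200 genes were selected, and 10 of those carry the term.
p = enrichment_p_value(20000, 100, 200, 10)
print(f"p = {p:.3g}")
```

In practice one such test is run per term and the p-values are corrected for multiple testing. The limitation the abstract points to is that each test considers only one data type at a time.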
Affiliation(s)
- Steffen Sass
- Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany and Department of Mathematics, Technische Universität München, Boltzmannstraße 3, 85747 Garching, Germany
|
21
|
Le TD, Liu L, Liu B, Tsykin A, Goodall GJ, Satou K, Li J. Inferring microRNA and transcription factor regulatory networks in heterogeneous data. BMC Bioinformatics 2013; 14:92. [PMID: 23497388 PMCID: PMC3636059 DOI: 10.1186/1471-2105-14-92] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 06/15/2012] [Accepted: 02/26/2013] [Indexed: 01/11/2023] Open
Abstract
Background: Transcription factors (TFs) and microRNAs (miRNAs) are primary metazoan gene regulators. Regulatory mechanisms of the two main regulators are of great interest to biologists and may provide insights into the causes of diseases. However, the interplay between miRNAs and TFs in a regulatory network remains to be uncovered. Currently, it is very difficult to study the regulatory mechanisms that involve both miRNAs and TFs in a biological lab. Even at data level, a network involving miRNAs, TFs and genes will be too complicated to achieve. Previous research has been mostly directed at inferring either miRNA or TF regulatory networks from data. However, networks involving a single type of regulator may not fully reveal the complex gene regulatory mechanisms, for instance, the way in which a TF indirectly regulates a gene via a miRNA. Results: We propose a framework to learn from heterogeneous data the three-component regulatory networks, with the presence of miRNAs, TFs, and mRNAs. This method first utilises Bayesian network structure learning to construct a regulatory network from multiple sources of data: gene expression profiles of miRNAs, TFs and mRNAs, target information based on sequence data, and sample categories. Then, in order to produce more meaningful results for further biological experimentation and research, the method searches the learnt network to identify the interplay between miRNAs and TFs and applies a network motif finding algorithm to further infer the network. We apply the proposed framework to the data sets of epithelial-to-mesenchymal transition (EMT). The results elucidate the complex gene regulatory mechanism for EMT which involves both TFs and miRNAs. Several discovered interactions and molecular functions have been confirmed by literature. In addition, many other discovered interactions and bio-markers are of high statistical significance and thus can be good candidates for validation by experiments. Moreover, the results generated by our method are compact, involving a small number of interactions which have been proved highly relevant to EMT. Conclusions: We have designed a framework to infer gene regulatory networks involving both TFs and miRNAs from multiple sources of data, including gene expression data, target information, and sample categories. Results on the EMT data sets have shown that the proposed approach is able to produce compact and meaningful gene regulatory networks that are highly relevant to the biological conditions of the data sets. This framework has the potential for application to other heterogeneous datasets to reveal the complex gene regulatory relationships.
Affiliation(s)
- Thuc D Le
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, SA 5095, Australia.
|