1
Chen CK. Inference of genetic regulatory networks with regulatory hubs using vector autoregressions and automatic relevance determination with model selections. Stat Appl Genet Mol Biol 2021; 20:121-143. [PMID: 34963205 DOI: 10.1515/sagmb-2020-0054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2020] [Accepted: 11/15/2021] [Indexed: 12/11/2022]
Abstract
The inference of genetic regulatory networks (GRNs) reveals how genes interact with each other. A few genes can regulate many genes as targets to control cell functions. We present new methods based on the order-1 vector autoregression (VAR1) for inferring GRNs from gene expression time series. The methods use the automatic relevance determination (ARD) to incorporate the regulatory hub structure into the estimation of VAR1 in a Bayesian framework. Several sparse approximation schemes are applied to the estimated regression weights or VAR1 model to generate the sparse weighted adjacency matrices representing the inferred GRNs. We apply the proposed and several widespread reference methods to infer GRNs with up to 100 genes using simulated, DREAM4 in silico and experimental E. coli gene expression time series. We show that the proposed methods are efficient on simulated hub GRNs and scale-free GRNs using short time series simulated by VAR1s and outperform reference methods on small-scale DREAM4 in silico GRNs and E. coli GRNs. They can utilize the known major regulatory hubs to improve the performance on larger DREAM4 in silico GRNs and E. coli GRNs. The impact of nonlinear time series data on the performance of proposed methods is discussed.
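As a rough illustration of the order-1 vector autoregression mentioned in this abstract, the sketch below simulates a small hub network, fits the VAR1 weights, and thresholds them into a sparse weighted adjacency matrix. It is not the paper's method: plain least squares stands in for the Bayesian ARD estimation, and the function names, network size, noise level and threshold are all assumptions made for the example.

```python
import numpy as np


def simulate_var1(A, n_steps, noise_std, rng):
    """Simulate x_t = A @ x_{t-1} + noise for n_steps time points."""
    n_genes = A.shape[0]
    X = np.zeros((n_steps, n_genes))
    X[0] = rng.normal(size=n_genes)
    for t in range(1, n_steps):
        X[t] = A @ X[t - 1] + rng.normal(scale=noise_std, size=n_genes)
    return X


def fit_var1_least_squares(X):
    """Estimate A by ordinary least squares (an assumption, not ARD)."""
    past, future = X[:-1], X[1:]
    # Solve past @ B ~= future in the least-squares sense; then A_hat = B.T.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T


def sparsify(A_hat, threshold):
    """Zero out weights below the threshold to get a sparse adjacency matrix."""
    return np.where(np.abs(A_hat) >= threshold, A_hat, 0.0)


rng = np.random.default_rng(0)

# Hypothetical 5-gene hub network: gene 0 regulates itself and genes 1..4.
A_true = np.zeros((5, 5))
A_true[0, 0] = 0.5
A_true[1:, 0] = 0.6

X = simulate_var1(A_true, n_steps=2000, noise_std=0.1, rng=rng)
A_hat = fit_var1_least_squares(X)
adjacency = sparsify(A_hat, threshold=0.3)  # row i, column j: gene j -> gene i
```

With a long simulated series the least-squares estimate is well determined; the short series the abstract targets are exactly where a structured prior such as ARD would be expected to matter.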
Affiliation(s)
- Chi-Kan Chen
- Department of Applied Mathematics, National Chung Hsing University, 145 Xingda Rd., South District, Taichung City, Taiwan, ROC
2
Nepomuceno-Chamorro IA, Nepomuceno JA, Galván-Rojas JL, Vega-Márquez B, Rubio-Escudero C. Using prior knowledge in the inference of gene association networks. Appl Intell 2020. [DOI: 10.1007/s10489-020-01705-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
3
Delgado-Chaves FM, Gómez-Vela F, García-Torres M, Divina F, Vázquez Noguera JL. Computational Inference of Gene Co-Expression Networks for the identification of Lung Carcinoma Biomarkers: An Ensemble Approach. Genes (Basel) 2019; 10:E962. [PMID: 31766738 PMCID: PMC6947459 DOI: 10.3390/genes10120962] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/14/2019] [Revised: 10/28/2019] [Accepted: 10/31/2019] [Indexed: 12/22/2022]
Abstract
Gene networks (GNs) have emerged in recent years as a useful tool for the analysis of different diseases in the field of biomedicine. In particular, GNs have been widely applied to the study and analysis of different types of cancer. In this context, lung carcinoma is among the most common cancer types, and its short life expectancy is partly due to late diagnosis. For this reason, lung cancer biomarkers that can be easily measured are in high demand in biomedical research. In this work, we present an application of gene co-expression networks to the modelling of lung cancer gene regulatory networks, which ultimately led to the discovery of new biomarkers. For this, a robust GN inference was performed from microarray data, concomitantly using three different co-expression measures. Results identified a major cluster of genes involved in SRP-dependent co-translational protein targeting to membrane, as well as a set of 28 genes that were found exclusively in networks generated from cancer samples. Amongst potential biomarkers, the genes NCKAP1L and DMD are highlighted due to their implication in a considerable portion of lung and bronchus primary carcinomas. These findings demonstrate the potential of GN reconstruction in the rational prediction of biomarkers.
Affiliation(s)
- Fernando M. Delgado-Chaves
- Division of Computer Science, Pablo de Olavide University, 41013 Seville, Spain
- Francisco Gómez-Vela
- Division of Computer Science, Pablo de Olavide University, 41013 Seville, Spain
- Miguel García-Torres
- Division of Computer Science, Pablo de Olavide University, 41013 Seville, Spain
- Federico Divina
- Division of Computer Science, Pablo de Olavide University, 41013 Seville, Spain
4
Larmuseau M, Verbeke LPC, Marchal K. Associating expression and genomic data using co-occurrence measures. Biol Direct 2019; 14:10. [PMID: 31072345 PMCID: PMC6507230 DOI: 10.1186/s13062-019-0240-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/16/2018] [Accepted: 04/10/2019] [Indexed: 12/11/2022]
Abstract
Recent technological evolutions have led to an exponential increase in data in all the omics fields. It is expected that integration of these different data sources will drastically enhance our knowledge of the biological mechanisms behind genomic diseases such as cancer. However, the integration of different omics data remains a challenge. In this work we propose an intuitive workflow for the integrative analysis of expression, mutation and copy number data taken from the METABRIC study on breast cancer. First, we present evidence that the expression profile of many important breast cancer genes consists of two modes or ‘regimes’, which contain important clinical information. Then, we show how the co-occurrence of these expression regimes can be used as an association measure between genes and validate our findings on the TCGA-BRCA study. Finally, we demonstrate how these co-occurrence measures can also be applied to link expression regimes to genomic aberrations, providing a more complete, integrative view on breast cancer. As a case study, an integrative analysis of the identified MLPH-FOXA1 association is performed, illustrating that the obtained expression associations are intimately linked to the underlying genomic changes. Reviewers: This article was reviewed by Dirk Walther, Francisco Garcia and Isabel Nepomuceno.
Affiliation(s)
- Maarten Larmuseau
- Department of Information Technology, Ghent University - Imec, Technologiepark-Zwijnaarde 126, 9052, Ghent, Belgium
- Lieven P C Verbeke
- Department of Plant Biotechnology and Bioinformatics, Ghent University - Imec, Technologiepark-Zwijnaarde 126, 9052, Ghent, Belgium
- Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Ghent University - Imec, Technologiepark-Zwijnaarde 126, 9052, Ghent, Belgium
5
Huynh-Thu VA, Geurts P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. Methods Mol Biol 2019; 1883:195-215. [PMID: 30547401 DOI: 10.1007/978-1-4939-8882-2_8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/11/2022]
Abstract
In this chapter, we introduce the reader to a popular family of machine learning algorithms, called decision trees. We then review several approaches based on decision trees that have been developed for the inference of gene regulatory networks (GRNs). Decision trees have indeed several nice properties that make them well-suited for tackling this problem: they are able to detect multivariate interacting effects between variables, are non-parametric, have good scalability, and have very few parameters. In particular, we describe in detail the GENIE3 algorithm, a state-of-the-art method for GRN inference.
Affiliation(s)
- Vân Anh Huynh-Thu
- Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium.
- Pierre Geurts
- Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
6
Lazzarini N, Widera P, Williamson S, Heer R, Krasnogor N, Bacardit J. Functional networks inference from rule-based machine learning models. BioData Min 2016; 9:28. [PMID: 27597880 PMCID: PMC5011349 DOI: 10.1186/s13040-016-0106-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/11/2016] [Accepted: 08/11/2016] [Indexed: 11/26/2022]
Abstract
BACKGROUND: Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those found by similarity-based methods. RESULTS: We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in the context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods, and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. AVAILABILITY: The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html.
Affiliation(s)
- Nicola Lazzarini
- Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Paweł Widera
- Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Stuart Williamson
- Clinical and Experimental Pharmacology Group, Cancer Research UK Manchester Institute, University of Manchester, Manchester, UK
- Rakesh Heer
- Northern Institute for Cancer Research, Medical School, Newcastle University, Newcastle upon Tyne, UK
- Natalio Krasnogor
- Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
- Jaume Bacardit
- Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
7
Nepomuceno-Chamorro IA. Model tree to improve the inference of gene association networks. AI Commun 2016. [DOI: 10.3233/aic-160700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
8
Gómez-Vela F, Barranco CD, Díaz-Díaz N. Incorporating biological knowledge for construction of fuzzy networks of gene associations. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.01.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
9
Han H. Diagnostic biases in translational bioinformatics. BMC Med Genomics 2015; 8:46. [PMID: 26232237 PMCID: PMC4522082 DOI: 10.1186/s12920-015-0116-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/29/2014] [Accepted: 07/07/2015] [Indexed: 02/06/2023]
Abstract
BACKGROUND: With the surge of translational medicine and computational omics research, complex disease diagnosis increasingly relies on molecular signature detection driven by massive omics data. However, how to detect and prevent possible diagnostic biases in translational bioinformatics remains an unsolved problem despite its importance in the coming era of personalized medicine. METHODS: In this study, we comprehensively investigate the diagnostic bias problem by analyzing benchmark gene array, protein array, RNA-Seq and miRNA-Seq data under the framework of support vector machines for different model selection methods. We further categorize the diagnostic biases into different types by conducting rigorous kernel matrix analysis and provide effective machine learning methods to conquer the diagnostic biases. RESULTS: In this study, we comprehensively investigate the diagnostic bias problem by analyzing benchmark gene array, protein array, RNA-Seq and miRNA-Seq data under the framework of support vector machines. We have found that diagnostic biases occur for data with different distributions and for SVMs with different kernels. Moreover, we identify a total of three types of diagnostic biases in SVM diagnostics: overfitting bias, label skewness bias, and underfitting bias, and present the corresponding reasons through rigorous analysis. Compared with the overfitting and underfitting biases, the label skewness bias is more challenging to detect and conquer because its deceptive accuracy makes it easy to mistake for a normal diagnostic case. To tackle this problem, we propose a derivative component analysis based support vector machine (DCA-SVM) that conquers the label skewness bias while achieving rivaling clinical diagnostic results. CONCLUSIONS: Our studies demonstrate that the diagnostic biases are mainly caused by three major factors, i.e. kernel selection, the signal amplification mechanism in high-throughput profiling, and the training data label distribution. Moreover, the proposed DCA-SVM diagnosis provides a generic solution for overcoming the label skewness bias, owing to the powerful feature extraction capability of derivative component analysis. Our work identifies and solves an important but less addressed problem in translational research. It also has a positive impact on machine learning by adding new results to kernel-based learning for omics data.
Affiliation(s)
- Henry Han
- Department of Computer and Information Science, Fordham University, New York, NY 10023, USA; Quantitative Proteomics Center, Columbia University, New York, NY, USA
10
Nepomuceno-Chamorro IA, Marquez-Chamorro A, Aguilar-Ruiz JS. Building Transcriptional Association Networks in Cytoscape with RegNetC. IEEE/ACM Trans Comput Biol Bioinform 2015; 12:823-824. [PMID: 26357322 DOI: 10.1109/tcbb.2014.2385702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
The Regression Network plugin for Cytoscape (RegNetC) implements the RegNet algorithm for the inference of transcriptional association networks from gene expression profiles. This algorithm is a model tree-based method that detects the relationship between each gene and all the remaining genes simultaneously, instead of analyzing each pair of genes individually as correlation-based methods do. Model trees are a very useful technique for estimating gene expression values by regression models, and they favour localized similarities over more global similarity, the reliance on which is one of the major drawbacks of correlation-based methods. Here, we present an integrated software suite, named RegNetC, as a Cytoscape plugin that can also operate on its own. According to user-defined parameters, RegNetC produces the resulting transcriptional gene association network in .sif format for visualization and analysis, and interoperates with other Cytoscape plugins; the network can be exported for publication figures. In addition to the network, the RegNetC plugin also provides the quantitative relationships between the expression values of the genes involved in the inferred network, i.e., those defined by the regression models.
11
Gene network coherence based on prior knowledge using direct and indirect relationships. Comput Biol Chem 2015; 56:142-51. [DOI: 10.1016/j.compbiolchem.2015.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/22/2014] [Revised: 03/06/2015] [Accepted: 03/20/2015] [Indexed: 12/21/2022]
12
Transcriptional response to cardiac injury in the zebrafish: systematic identification of genes with highly concordant activity across in vivo models. BMC Genomics 2014; 15:852. [PMID: 25280539 PMCID: PMC4197235 DOI: 10.1186/1471-2164-15-852] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/15/2014] [Accepted: 09/25/2014] [Indexed: 12/26/2022]
Abstract
Background: Zebrafish is a clinically-relevant model of heart regeneration. Unlike mammals, it has a remarkable heart repair capacity after injury, and promises novel translational applications. Amputation and cryoinjury models are key research tools for understanding injury response and regeneration in vivo. An understanding of the transcriptional responses following injury is needed to identify key players of heart tissue repair, as well as potential targets for boosting this property in humans. Results: We investigated amputation and cryoinjury in vivo models of heart damage in the zebrafish through unbiased, integrative analyses of independent molecular datasets. To detect genes with potential biological roles, we derived computational prediction models with microarray data from heart amputation experiments. We focused on a top-ranked set of genes highly activated in the early post-injury stage, whose activity was further verified in independent microarray datasets. Next, we performed independent validations of expression responses with qPCR in a cryoinjury model. Across in vivo models, the top candidates showed highly concordant responses at 1 and 3 days post-injury, which highlights the predictive power of our analysis strategies and the possible biological relevance of these genes. Top candidates are significantly involved in cell fate specification and differentiation, and include heart failure markers such as periostin, as well as potential new targets for heart regeneration. For example, ptgis and ca2 were overexpressed, while usp2a, a regulator of the p53 pathway, was down-regulated in our in vivo models. Interestingly, a high activity of ptgis and ca2 has been previously observed in failing hearts from rats and humans. Conclusions: We identified genes with potential critical roles in the response to cardiac damage in the zebrafish. Their transcriptional activities are reproducible in different in vivo models of cardiac injury.
13
Diaz-Montana JJ, Diaz-Diaz N. Development and use of the Cytoscape app GFD-Net for measuring semantic dissimilarity of gene networks. F1000Res 2014; 3:142. [PMID: 25400907 PMCID: PMC4224201 DOI: 10.12688/f1000research.4573.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Accepted: 06/25/2014] [Indexed: 01/12/2023]
Abstract
Gene networks are one of the main computational models used to study the interaction between different elements during biological processes, and are widely used to represent gene–gene or protein–protein interaction complexes. We present GFD-Net, a Cytoscape app for visualizing and analyzing the functional dissimilarity of gene networks.
Affiliation(s)
- Norberto Diaz-Diaz
- School of Engineering, Pablo de Olavide University, Seville, 41013, Spain
14
Lo K, Raftery AE, Dombek KM, Zhu J, Schadt EE, Bumgarner RE, Yeung KY. Integrating external biological knowledge in the construction of regulatory networks from time-series expression data. BMC Syst Biol 2012; 6:101. [PMID: 22898396 PMCID: PMC3465231 DOI: 10.1186/1752-0509-6-101] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 01/25/2012] [Accepted: 07/24/2012] [Indexed: 01/27/2023]
Abstract
BACKGROUND: Inference about regulatory networks from high-throughput genomics data is of great interest in systems biology. We present a Bayesian approach to infer gene regulatory networks from time series expression data by integrating various types of biological knowledge. RESULTS: We formulate network construction as a series of variable selection problems and use linear regression to model the data. Our method summarizes additional data sources with an informative prior probability distribution over candidate regression models. We extend the Bayesian model averaging (BMA) variable selection method to select regulators in the regression framework. We summarize the external biological knowledge by an informative prior probability distribution over the candidate regression models. CONCLUSIONS: We demonstrate our method on simulated data and a set of time-series microarray experiments measuring the effect of a drug perturbation on gene expression levels, and show that it outperforms leading regression-based methods in the literature.
Affiliation(s)
- Kenneth Lo
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
- Adrian E Raftery
- Department of Statistics, University of Washington, Box 354320, Seattle, WA, 98195, USA
- Kenneth M Dombek
- Department of Biochemistry, University of Washington, Box 357350, Seattle, WA, 98195, USA
- Jun Zhu
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY, 10029, USA
- Eric E Schadt
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY, 10029, USA
- Roger E Bumgarner
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
- Ka Yee Yeung
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
15
Bassel GW, Glaab E, Marquez J, Holdsworth MJ, Bacardit J. Functional network construction in Arabidopsis using rule-based machine learning on large-scale data sets. Plant Cell 2011; 23:3101-16. [PMID: 21896882 PMCID: PMC3203449 DOI: 10.1105/tpc.111.088153] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 06/14/2011] [Revised: 08/01/2011] [Accepted: 08/25/2011] [Indexed: 05/17/2023]
Abstract
The meta-analysis of large-scale postgenomics data sets within public databases promises to provide important novel biological knowledge. Statistical approaches including correlation analyses in coexpression studies of gene expression have emerged as tools to elucidate gene function using these data sets. Here, we present a powerful and novel alternative methodology to computationally identify functional relationships between genes from microarray data sets using rule-based machine learning. This approach, termed "coprediction," is based on the collective ability of groups of genes co-occurring within rules to accurately predict the developmental outcome of a biological system. We demonstrate the utility of coprediction as a powerful analytical tool using publicly available microarray data generated exclusively from Arabidopsis thaliana seeds to compute a functional gene interaction network, termed Seed Co-Prediction Network (SCoPNet). SCoPNet predicts functional associations between genes acting in the same developmental and signal transduction pathways irrespective of the similarity in their respective gene expression patterns. Using SCoPNet, we identified four novel regulators of seed germination (ALTERED SEED GERMINATION5, 6, 7, and 8), and predicted interactions at the level of transcript abundance between these novel and previously described factors influencing Arabidopsis seed germination. An online Web tool to query SCoPNet has been developed as a community resource to dissect seed biology and is available at http://www.vseed.nottingham.ac.uk/.
Affiliation(s)
- George W Bassel
- Division of Plant and Crop Sciences, University of Nottingham, Loughborough, Leicestershire, UK.
16
Nepomuceno-Chamorro I, Azuaje F, Devaux Y, Nazarov PV, Muller A, Aguilar-Ruiz JS, Wagner DR. Prognostic transcriptional association networks: a new supervised approach based on regression trees. Bioinformatics 2010; 27:252-8. [PMID: 21098433 PMCID: PMC3018815 DOI: 10.1093/bioinformatics/btq645] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/29/2023]
Abstract
Motivation: The application of information encoded in molecular networks for prognostic purposes is a crucial objective of systems biomedicine. This approach has not been widely investigated in the cardiovascular research area. Within this area, the prediction of clinical outcomes after suffering a heart attack would represent a significant step forward. We developed a new quantitative prediction-based method for this prognostic problem based on the discovery of clinically relevant transcriptional association networks. This method integrates regression trees and clinical class-specific networks, and can be applied to other clinical domains. Results: Before analyzing our cardiovascular disease dataset, we tested the usefulness of our approach on a benchmark dataset with control and disease patients. We also compared it to several algorithms to infer transcriptional association networks and classification models. Comparative results provided evidence of the prediction power of our approach. Next, we discovered new models for predicting good and bad outcomes after myocardial infarction. Using blood-derived gene expression data, our models reported areas under the receiver operating characteristic curve above 0.70. Our model could also outperform different techniques based on co-expressed gene modules. We also predicted processes that may represent novel therapeutic targets for heart disease, such as the synthesis of leucine and isoleucine. Availability: The SATuRNo software is freely available at http://www.lsi.us.es/isanepo/toolsSaturno/. Contact: inepomuceno@us.es Supplementary information: Supplementary data are available at Bioinformatics online.