1
Ayoubi M, Teimourpour B, Hassanzadeh A. ExGenet, Integrating Design of Experiments and Response Surface Methodology for Cancer Gene Detection in Gene Regulatory Networks. Cancer Inform 2024; 23:11769351241255645. [PMID: 38854618 PMCID: PMC11159540 DOI: 10.1177/11769351241255645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/26/2024] [Indexed: 06/11/2024] Open
Abstract
Objective: Network analysis techniques often require tuning hyperparameters for optimal performance. For instance, the independent cascade model requires determining the probability of diffusion. Despite its importance, a consensus on effective parameter adjustment remains elusive. Methods: In this study, we propose a novel approach utilizing experimental design methodologies, specifically 2-Factorial Analysis for Screening and Response Surface Methodology (RSM), for parameter adjustment. We apply this methodology to the task of detecting cancer driver genes in colorectal cancer. Results: Through experimental validation on colorectal cancer data, we demonstrate the effectiveness of our proposed methodology. Compared with existing methods, our approach offers several advantages, including reduced computational overhead, systematic parameter selection grounded in statistical theory, and improved performance in detecting cancer driver genes. Conclusion: This study presents a significant advancement in the field of network analysis by providing a practical and systematic approach to hyperparameter tuning. By optimizing parameter settings, our methodology offers promising implications for critical biomedical applications such as cancer driver gene detection.
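The diffusion probability mentioned in this abstract can be made concrete with a small simulation. The following is a minimal Python sketch of one independent cascade run, assuming a directed graph stored as an adjacency dict and a single probability `p` shared by every edge. The function name `independent_cascade` and the toy graph are illustrative assumptions and are not taken from ExGenet.

```python
import random

def independent_cascade(graph, seeds, p, rng=None):
    """Simulate one independent cascade run.

    graph: dict mapping each node to a list of its out-neighbours.
    seeds: initially active nodes.
    p: diffusion probability, assumed identical for every edge.
    Returns the set of nodes that are active when the cascade stops.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour in graph.get(node, []):
                # A newly active node gets one attempt per inactive neighbour.
                if neighbour not in active and rng.random() < p:
                    active.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return active

# Hypothetical toy network, not real gene data.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
spread = independent_cascade(toy_graph, ["A"], p=0.5, rng=random.Random(0))
```

Tuning `p` would then amount to repeating such runs over a designed set of values and comparing the resulting spread, which is the kind of search the abstract proposes to organize with factorial screening and RSM.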
Affiliation(s)
- Mahboube Ayoubi
  - Department of Data Science, Tarbiat Modares University (TMU), Tehran, Iran
- Babak Teimourpour
  - Department of Information Technology Engineering, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Alireza Hassanzadeh
  - Professor and Head of Department of Information Technology Management, Tarbiat Modares University (TMU), Tehran, Iran
2
Zito F, Cutello V, Pavone M. A Machine Learning Approach to Simulate Gene Expression and Infer Gene Regulatory Networks. Entropy (Basel, Switzerland) 2023; 25:1214. [PMID: 37628244 PMCID: PMC10453511 DOI: 10.3390/e25081214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 07/20/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023]
Abstract
The ability to simulate gene expression and infer gene regulatory networks has vast potential applications in various fields, including medicine, agriculture, and environmental science. In recent years, machine learning approaches to simulate gene expression and infer gene regulatory networks have gained significant attention as a promising area of research. By simulating gene expression, we can gain insights into the complex mechanisms that control gene expression and how they are affected by various environmental factors. This knowledge can be used to develop new treatments for genetic diseases, improve crop yields, and better understand the evolution of species. In this article, we address this issue by focusing on a novel method capable of simulating the gene expression regulation of a group of genes and their mutual interactions. Our framework enables us to simulate the regulation of gene expression in response to alterations or perturbations that can affect the expression of a gene. We use both artificial and real benchmarks to empirically evaluate the effectiveness of our methodology. Furthermore, we compare our method with existing ones to understand its advantages and disadvantages. We also present future ideas for improvement to enhance the effectiveness of our method. Overall, our approach has the potential to greatly improve the field of gene expression simulation and gene regulatory network inference, possibly leading to significant advancements in genetics.
Affiliation(s)
- Mario Pavone
  - Department of Mathematics and Computer Science, University of Catania, 95125 Catania, Italy
3
Sheikh K, Sayeed S, Asif A, Siddiqui MF, Rafeeq MM, Sahu A, Ahmad S. Consequential Innovations in Nature-Inspired Intelligent Computing Techniques for Biomarkers and Potential Therapeutics Identification. Nature-Inspired Intelligent Computing Techniques in Bioinformatics 2023:247-274. [DOI: 10.1007/978-981-19-6379-7_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/24/2024]
4
Siddiqui MF, Alam A, Kalmatov R, Mouna A, Villela R, Mitalipova A, Mrad YN, Rahat SAA, Magarde BK, Muhammad W, Sherbaevna SR, Tashmatova N, Islamovna UG, Abuassi MA, Parween Z. Leveraging Healthcare System with Nature-Inspired Computing Techniques: An Overview and Future Perspective. Nature-Inspired Intelligent Computing Techniques in Bioinformatics 2023:19-42. [DOI: 10.1007/978-981-19-6379-7_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/24/2024]
5
Inference of gene regulatory networks based on the Light Gradient Boosting Machine. Comput Biol Chem 2022; 101:107769. [DOI: 10.1016/j.compbiolchem.2022.107769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 08/12/2022] [Accepted: 09/06/2022] [Indexed: 11/23/2022]
6
Hettich J, Gebhardt JCM. Periodic synchronization of isolated network elements facilitates simulating and inferring gene regulatory networks including stochastic molecular kinetics. BMC Bioinformatics 2022; 23:13. [PMID: 34986805 PMCID: PMC8729106 DOI: 10.1186/s12859-021-04541-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 12/16/2021] [Indexed: 11/10/2022] Open
Abstract
Background: The temporal progression of many fundamental processes in cells and organisms, including homeostasis, differentiation and development, is governed by gene regulatory networks (GRNs). GRNs balance fluctuations in the output of their genes, which trace back to the stochasticity of molecular interactions. Although highly desirable to understand life processes, predicting the temporal progression of gene products within a GRN is challenging when considering stochastic events such as transcription factor–DNA interactions or protein production and degradation.
Results: We report a method to simulate and infer GRNs including genes and biochemical reactions at molecular detail. In our approach, we consider each network element to be isolated from other elements during small time intervals, after which we synchronize molecule numbers across all network elements. Thereby, the temporal behaviour of network elements is decoupled and can be treated by local stochastic or deterministic solutions. We demonstrate the working principle of this modular approach with a repressive gene cascade comprising four genes. By considering a deterministic time evolution within each time interval for all elements, our method approaches the solution of the system of deterministic differential equations associated with the GRN. By allowing genes to stochastically switch between on and off states or by considering stochastic production of gene outputs, we are able to include increasing levels of stochastic detail and approximate the solution of a Gillespie simulation. Thereby, CaiNet is able to reproduce noise-induced bi-stability and oscillations in dynamically complex GRNs. Notably, our modular approach further allows for a simple consideration of deterministic delays. We further infer relevant regulatory connections and steady-state parameters of a GRN of up to ten genes from steady-state measurements by identifying each gene of the network with a single perceptron in an artificial neuronal network and using a gradient descent method originally designed to train recurrent neural networks. To facilitate setting up GRNs and using our simulation and inference method, we provide a fast computer-aided interactive network simulation environment, CaiNet.
Conclusion: We developed a method to simulate GRNs at molecular detail and to infer the topology and steady-state parameters of GRNs. Our method and associated user-friendly framework CaiNet should prove helpful to analyze or predict the temporal progression of reaction networks or GRNs in cellular and organismic biology. CaiNet is freely available at https://gitlab.com/GebhardtLab/CaiNet. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04541-6.
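The abstract uses a Gillespie simulation as the reference that its modular scheme approximates. As a point of comparison, here is a minimal Python sketch of the standard Gillespie direct method for a single gene product with constant production and first-order degradation. This is not CaiNet's algorithm, and the function name, rate names and example values are assumptions chosen for illustration.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng=None):
    """Exact stochastic simulation of production (rate k_prod)
    and degradation (rate k_deg * n) of one molecular species.

    Returns two lists: event times and molecule counts after each event.
    """
    rng = rng or random.Random()
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * n        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # no reaction can fire any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

# Hypothetical rates: the mean copy number should fluctuate around k_prod / k_deg.
times, counts = gillespie_birth_death(10.0, 1.0, 0, 5.0, rng=random.Random(1))
```

Each event changes the count by exactly one molecule, which is the level of stochastic detail the abstract says its method can approach by adding stochastic switching and production.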
Affiliation(s)
- Johannes Hettich
  - Institute of Biophysics, Ulm University, Albert-Einstein-Allee 11, 89081, Ulm, Germany
- J Christof M Gebhardt
  - Institute of Biophysics, Ulm University, Albert-Einstein-Allee 11, 89081, Ulm, Germany
7
Wani N, Barh D, Raza K. Modular network inference between miRNA-mRNA expression profiles using weighted co-expression network analysis. J Integr Bioinform 2021; 18:jib-2021-0029. [PMID: 34800012 PMCID: PMC8709739 DOI: 10.1515/jib-2021-0029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/31/2021] [Accepted: 10/28/2021] [Indexed: 12/14/2022] Open
Abstract
Connecting transcriptional and post-transcriptional regulatory networks solves an important puzzle in the elucidation of gene regulatory mechanisms. To decipher the complexity of these connections, we build co-expression network modules for mRNA as well as miRNA expression profiles of breast cancer data. We construct gene and miRNA co-expression modules using the weighted gene co-expression network analysis (WGCNA) method and establish the significance of these modules (genes/miRNAs) for the cancer phenotype. This work also infers an interaction network between the genes of the turquoise module from mRNA expression data and hubs of the turquoise module from miRNA expression data. A pathway enrichment analysis of the miRNA hubs and some of their targets, using the miRsystem web tool, reveals their enrichment in several important pathways associated with the progression of cancer.
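The first step of a weighted co-expression analysis is turning pairwise correlations into edge weights. Below is a minimal Python sketch of the unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, assuming expression profiles stored as plain lists. The helper names, the toy profiles and the default `beta=6` are illustrative assumptions; the abstract does not state which power or network type the authors used.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    profiles: dict mapping a gene name to its expression values.
    Returns a dict keyed by ordered gene pairs (g, h) with g != h.
    """
    names = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in names
        for h in names
        if g != h
    }

# Hypothetical expression profiles over four samples.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with g1
    "g3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with g1
    "g4": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with g1
}
adjacency = soft_threshold_adjacency(toy_profiles)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, so modules such as the turquoise module are then found by clustering on these weights.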
Affiliation(s)
- Nisar Wani
  - Computer Science and Engineering Department, Govt. College of Engineering and Technology Safapora, Ganderbal Kashmir, J&K, India
- Debmalya Barh
  - Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, WB, India; Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Khalid Raza
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India
8
Biswas S, Acharyya S. Multi-objective Simulated Annealing Variants to Infer Gene Regulatory Network: A Comparative Study. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2612-2623. [PMID: 32386161 DOI: 10.1109/tcbb.2020.2992304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
A Gene Regulatory Network (GRN) is formed by mutual transcriptional regulation within a set of protein-coding genes in the cellular context of an organism. Computational inference of a GRN is important to understand the behavior of each gene in terms of the change in its protein production rate (expression level). Because the Recurrent Neural Network (RNN) is efficient in GRN modeling, a bi-objective RNN formulation has been applied here. Based on Archived Multi Objective Simulated Annealing (AMOSA), four algorithms, namely AMOSA Revised (AMOSAR), Modified Freezing based AMOSA (AMOFSA), Tabu based AMOSA (AMOTSA) and Modified Freezing and Tabu based AMOSA (AMOFTSA), have been proposed and applied to the RNN (treated as a GRN) for parameter learning on four gene expression time series datasets. Comparative studies on the performance of the algorithms (based on each dataset) have been made in terms of the number of GRNs obtained in the final non-dominated front and the performance metrics, namely recall, precision and F1 score. Two proposed variants, namely AMOFSA and AMOTSA, have been found competitive in performance. Experimental observations and statistical analysis show that the modified algorithms are better than AMOSAR and the state-of-the-art algorithms with respect to the above-mentioned metrics.
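The recall, precision and F1 metrics named in this abstract compare an inferred edge set with a reference network. The following is a minimal Python sketch, assuming edges are represented as directed (regulator, target) pairs. The function name `edge_scores` and the toy edge sets are illustrative assumptions, not data from the study.

```python
def edge_scores(predicted, truth):
    """Return (precision, recall, f1) for a predicted edge set.

    predicted, truth: iterables of (regulator, target) pairs.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical reference network and inferred network.
true_edges = {("g1", "g2"), ("g2", "g3"), ("g3", "g1"), ("g1", "g4")}
predicted_edges = {("g1", "g2"), ("g2", "g3"), ("g4", "g1")}
precision, recall, f1 = edge_scores(predicted_edges, true_edges)
```

Here two of the three predicted edges are correct and two of the four true edges are recovered, so precision is 2/3, recall is 1/2 and F1 is 4/7. The reversed edge ("g4", "g1") counts as a false positive because direction matters in a regulatory network.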
9
Naseri A, Sharghi M, Hasheminejad SMH. Enhancing gene regulatory networks inference through hub-based data integration. Comput Biol Chem 2021; 95:107589. [PMID: 34673384 DOI: 10.1016/j.compbiolchem.2021.107589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Revised: 08/11/2021] [Accepted: 10/04/2021] [Indexed: 12/09/2022]
Abstract
One of the main research topics in computational biology is Gene Regulatory Network (GRN) reconstruction, which refers to inferring the relationships between genes involved in regulating cell conditions in response to internal or external stimuli. To this end, most computational methods use only transcriptional gene expression data to reconstruct gene regulatory networks, but recent studies suggest that gene expression data must be integrated with other types of data to obtain more accurate models predicting real relationships between genes. In this study, a diffusion-based method is enhanced to integrate biological data of network types besides structural prior knowledge. The Random Walk with Restart (RWR) algorithm, with an emphasis on hub nodes, is executed separately on each network, and low-dimensional feature vectors for network nodes are then jointly optimized by diffusion component analysis. Next, these feature vectors are used to infer gene regulatory networks. Fourteen centrality measures are studied for the detection of hub nodes to be used in the RWR algorithm, and the centrality measure with the greatest effect on the improvement of gene network inference is selected. A case study for the Saccharomyces cerevisiae and E. coli networks shows that using the proposed features, in comparison with gene expression data alone, results in an improvement of 0.02-0.08 units in the Area Under the Receiver Operating Characteristic (AUROC) criterion across different gene regulatory network inference methods. Furthermore, the proposed method was applied to esophageal cancer data to infer its gene regulatory network. The proposed framework substantially improves the accuracy and scalability of GRN inference. The fused features and the best centrality measure detected can be used to provide functional insights about genes or proteins in various biological applications. Moreover, it can serve as a general framework for network data and structural data integration and analysis problems in various scientific disciplines including biology.
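Random Walk with Restart, the diffusion step named in this abstract, can be written as the fixed-point iteration p = (1 - r) W p + r e, where W is the column-normalized transition matrix, e marks the seed node and r is the restart probability. Below is a minimal pure-Python sketch of the plain algorithm, assuming a weighted graph stored as a dict of dicts. The hub emphasis and the diffusion component analysis described by the authors are not reproduced, and the function name, the restart value of 0.3 and the toy graph are assumptions.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    adjacency: dict mapping each node to a dict {neighbour: edge weight};
               every neighbour must also appear as a key.
    """
    nodes = list(adjacency)
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}
    p = {u: 1.0 if u == seed else 0.0 for u in nodes}
    for _ in range(max_iter):
        # Restart mass always flows back to the seed node.
        nxt = {u: (restart if u == seed else 0.0) for u in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Node without outgoing edges: send its mass back to the seed.
                nxt[seed] += (1 - restart) * p[u]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical undirected path a - b - c - d with unit weights.
toy_network = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = random_walk_with_restart(toy_network, seed="a")
```

The resulting vector sums to 1 and decays with distance from the seed, which is what makes it usable as a per-node feature describing network proximity.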
Affiliation(s)
- Atefeh Naseri
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
- Mehran Sharghi
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
10
Yang B, Bao W, Zhang W, Wang H, Song C, Chen Y, Jiang X. Reverse engineering gene regulatory network based on complex-valued ordinary differential equation model. BMC Bioinformatics 2021; 22:448. [PMID: 34544363 PMCID: PMC8451084 DOI: 10.1186/s12859-021-04367-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/08/2021] [Accepted: 09/09/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Growing research in molecular biology reveals that complex life phenomena can exhibit various types of interactions at the genomic level. Establishing the interactions between genes or proteins and understanding the intrinsic mechanisms of biological systems have become an urgent need and a research hotspot. RESULTS: In order to forecast gene expression data and identify more accurate gene regulatory networks, a complex-valued version of the ordinary differential equation (CVODE) is proposed in this paper. In order to optimize the CVODE model, a complex-valued hybrid evolutionary method based on grammar-guided genetic programming and a complex-valued firefly algorithm is presented. CONCLUSIONS: When tested on three real gene expression datasets from E. coli and human cells, the experimental results suggest that the CVODE model could improve the prediction accuracy of gene expression data by 20-50%, and could also infer more true-positive regulatory relationships and fewer false-positive regulations than the ordinary differential equation model.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Wenzheng Bao
  - School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
- Wei Zhang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Haifeng Wang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Chuandong Song
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, 250022, China
- Xiuying Jiang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
11
|
|
12
Muzio G, O’Bray L, Borgwardt K. Biological network analysis with deep learning. Brief Bioinform 2021; 22:1515-1530. [PMID: 33169146 PMCID: PMC7986589 DOI: 10.1093/bib/bbaa257] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 04/30/2020] [Revised: 08/26/2020] [Accepted: 09/11/2020] [Indexed: 12/17/2022] Open
Abstract
Recent advancements in experimental high-throughput technologies have expanded the availability and quantity of molecular data in biology. Given the importance of interactions in biological processes, such as the interactions between proteins or the bonds within a chemical compound, this data is often represented in the form of a biological network. The rise of this data has created a need for new computational tools to analyze networks. One major trend in the field is to use deep learning for this goal and, more specifically, to use methods that work with networks, the so-called graph neural networks (GNNs). In this article, we describe biological networks and review the principles and underlying algorithms of GNNs. We then discuss domains in bioinformatics in which graph neural networks are frequently being applied at the moment, such as protein function prediction, protein-protein interaction prediction and in silico drug discovery and development. Finally, we highlight application areas such as gene regulatory networks and disease diagnosis where deep learning is emerging as a new tool to answer classic questions like gene interaction prediction and automatic disease prediction from data.
Affiliation(s)
- Giulia Muzio
  - Machine Learning and Computational Biology Lab at ETH Zürich
- Leslie O’Bray
  - Machine Learning and Computational Biology Lab at ETH Zürich
13
Johnson ZJ, Krutkin DD, Bohutskyi P, Kalyuzhnaya MG. Metals and methylotrophy: Via global gene expression studies. Methods Enzymol 2021; 650:185-213. [PMID: 33867021 DOI: 10.1016/bs.mie.2021.01.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
A number of minerals, such as copper, cobalt, and rare earth elements (REE), are essential modulators of microbial one-carbon metabolism. This chapter provides an overview of the gene expression study design and analysis protocols for uncovering REE-induced changes in methylotrophic bacteria. By interrogating relationships and differences in total gene expression induced by mineral micronutrients, a deeper understanding of gene regulation at a systems scale can be gained. With careful design and execution of RNA-sequencing experiments, thorough processing and assessment of read quality can be utilized to assess and adjust for possible biases. By ensuring only quality data are utilized in downstream processes, differential gene expression, overrepresentation analyses, and gene-set enrichment analyses provide a reliable and reproducible representation of the pathways and functions that are affected by changes in environmental conditions.
Affiliation(s)
- Zachary J Johnson
  - Department of Biology, San Diego State University, San Diego, CA, United States
- Dennis D Krutkin
  - Department of Biology, San Diego State University, San Diego, CA, United States
- Pavlo Bohutskyi
  - Pacific Northwest National Laboratory, Richland, WA, United States
- Marina G Kalyuzhnaya
  - Department of Biology, San Diego State University, San Diego, CA, United States
14
Wani N, Raza K. MKL-GRNI: A parallel multiple kernel learning approach for supervised inference of large-scale gene regulatory networks. PeerJ Comput Sci 2021; 7:e363. [PMID: 33817013 PMCID: PMC7924726 DOI: 10.7717/peerj-cs.363] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/19/2020] [Accepted: 12/29/2020] [Indexed: 06/12/2023]
Abstract
High-throughput multi-omics data generation coupled with heterogeneous genomic data fusion is defining new ways to build computational inference models. These models are scalable and can support very large genome sizes, with the added advantage of exploiting additional biological knowledge from the integration framework. However, the limitation of such an arrangement is the huge computational cost involved when learning from very large datasets in a sequential execution environment. To overcome this issue, we present a multiple kernel learning (MKL) based gene regulatory network (GRN) inference approach wherein multiple heterogeneous datasets are fused using the MKL paradigm. We formulate the GRN learning problem as a supervised classification problem, whereby genes regulated by a specific transcription factor are separated from other non-regulated genes. A parallel execution architecture is devised to learn a large-scale GRN by decomposing the initial classification problem into a number of subproblems that run as multiple processes on a multi-processor machine. We evaluate the approach in terms of increased speedup and inference potential using genomic data from Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The results thus obtained demonstrate that the proposed method exhibits better classification accuracy and enhanced speedup compared to other state-of-the-art methods while learning large-scale GRNs from multiple and heterogeneous datasets.
Affiliation(s)
- Nisar Wani
  - Govt. Degree College Baramulla, Jammu & Kashmir, India
- Khalid Raza
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India
15
Genomic signal processing of microarrays for cancer gene expression and identification using cluster-fuzzy adaptive networking. Soft Comput 2020. [DOI: 10.1007/s00500-020-05068-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
16
Liu W, Sun X, Peng L, Zhou L, Lin H, Jiang Y. RWRNET: A Gene Regulatory Network Inference Algorithm Using Random Walk With Restart. Front Genet 2020; 11:591461. [PMID: 33101398 PMCID: PMC7545090 DOI: 10.3389/fgene.2020.591461] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/04/2020] [Accepted: 09/02/2020] [Indexed: 11/30/2022] Open
Abstract
Inferring gene regulatory networks from expression data is essential in identifying complex regulatory relationships among genes and revealing the mechanism of certain diseases. Various computational methods have been developed for inferring gene regulatory networks. However, these methods focus on the local topology of the network rather than on the global topology. From a network optimisation standpoint, emphasising the global topology of the network also reduces redundant regulatory relationships. In this study, we propose a novel network inference algorithm using Random Walk with Restart (RWRNET) that combines local and global topology relationships. The method first captures the local topology through three elements of random walk and then combines the local topology with the global topology by Random Walk with Restart. The Markov Blanket discovery algorithm is then used to deal with isolated genes. The proposed method is compared with several state-of-the-art methods on six benchmark datasets. Experimental results demonstrate the effectiveness of the proposed method.
Affiliation(s)
- Wei Liu
  - School of Computer Science, Xiangtan University, Xiangtan, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
- Xingen Sun
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
- Lili Zhou
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Hui Lin
  - School of Computer Science, Xiangtan University, Xiangtan, China
- Yi Jiang
  - School of Computer Science, Xiangtan University, Xiangtan, China
17
Zhang Z, Zhao Y, Liao X, Shi W, Li K, Zou Q, Peng S. Deep learning in omics: a survey and guideline. Brief Funct Genomics 2020; 18:41-57. [PMID: 30265280 DOI: 10.1093/bfgp/ely030] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Received: 03/04/2018] [Revised: 07/31/2018] [Accepted: 08/30/2018] [Indexed: 01/17/2023] Open
Abstract
Omics, such as genomics, transcriptomics and proteomics, has been affected by the era of big data. The huge amount of high-dimensional and complex structured data has made conventional machine learning algorithms no longer applicable. Fortunately, deep learning technology can contribute toward resolving these challenges. There is evidence that deep learning can handle omics data well and resolve omics problems. This survey aims to provide an entry-level guideline for researchers to understand and use deep learning in order to solve omics problems. We first introduce several deep learning models and then discuss several research areas which have combined omics and deep learning in recent years. In addition, we summarize the general steps involved in using deep learning, which have not yet been systematically discussed in the existing literature on this topic. Finally, we compare the features and performance of current mainstream open source deep learning frameworks and present the opportunities and challenges involved in deep learning. This survey will be a good starting point and guideline for omics researchers to understand deep learning.
Affiliation(s)
- Zhiqiang Zhang
  - School of Computer Science, National University of Defense Technology, Changsha, China
- Yi Zhao
  - Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Xiangke Liao
  - School of Computer Science, National University of Defense Technology, Changsha, China
- Wenqiang Shi
  - School of Computer Science, National University of Defense Technology, Changsha, China
- Kenli Li
  - College of Computer Science and Electronic Engineering & National Supercomputer Centre in Changsha, Hunan University, Changsha, China
- Quan Zou
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
- Shaoliang Peng
  - School of Computer Science, National University of Defense Technology, Changsha, China; College of Computer Science and Electronic Engineering & National Supercomputer Centre in Changsha, Hunan University, Changsha, China
18
Wani N, Raza K. Integrative approaches to reconstruct regulatory networks from multi-omics data: A review of state-of-the-art methods. Comput Biol Chem 2019; 83:107120. [PMID: 31499298 DOI: 10.1016/j.compbiolchem.2019.107120] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 08/03/2018] [Revised: 02/22/2019] [Accepted: 08/27/2019] [Indexed: 02/06/2023]
Abstract
Data generation using high-throughput technologies has led to the accumulation of diverse types of molecular data. These data have different types (discrete, real, string, etc.) and occur in various formats and sizes. Datasets including gene expression, miRNA expression, protein-DNA binding data (ChIP-Seq/ChIP-ChIP), mutation data (copy number variation, single nucleotide polymorphisms), annotations, interactions, and association data are some of the commonly used biological datasets to study various cellular mechanisms of living organisms. Each of them provides a unique, complementary and partly independent view of the genome and hence embeds essential information about the regulatory mechanisms of genes and their products. Therefore, integrating these data and inferring regulatory interactions from them offers a system-level biological insight for predicting gene functions and their phenotypic outcomes. To study genome functionality through regulatory networks, different methods have been proposed for collective mining of information from an integrated dataset. We survey here integration methods that reconstruct regulatory networks using state-of-the-art techniques to handle multi-omics (i.e., genomic, transcriptomic, proteomic) and other biological datasets.
Affiliation(s)
- Nisar Wani
  - Govt. Degree College Baramulla, J & K, India; Department of Computer Science, Jamia Millia Islamia, New Delhi, India
- Khalid Raza
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India
19
García-Nieto J, Nebro AJ, Aldana-Montes JF. Inference of gene regulatory networks with multi-objective cellular genetic algorithm. Comput Biol Chem 2019; 80:409-418. [PMID: 31128452 DOI: 10.1016/j.compbiolchem.2019.05.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/16/2018] [Revised: 03/26/2019] [Accepted: 05/08/2019] [Indexed: 10/26/2022]
Abstract
Reverse engineering of biochemical networks remains an important open challenge in computational systems biology. The goal of model inference is to obtain, from time-series gene expression data, the sparse topological structure and parameters that quantitatively capture and reproduce the dynamics of biological systems. In this paper, we propose a multi-objective approach for the inference of S-System structures for Gene Regulatory Networks (GRNs) based on the theoretical concepts of Pareto dominance and Pareto optimality instead of the conventional single-objective evaluation of Mean Squared Error (MSE). Our motivation is that, using a multi-objective formulation for the GRN, it is possible to optimize the sparse topology of a given GRN as well as the kinetic order and rate constant parameters in a decoupled S-System, while avoiding the use of additional penalty weights. A flexible and robust Multi-Objective Cellular Evolutionary Algorithm is adapted to perform the tasks of parameter learning and network topology inference for the proposed approach. The resulting software, called MONET, is evaluated on real-based academic and synthetic time-series of gene expression taken from the DREAM3 challenge and the IRMA in vivo datasets. The ability to reproduce biological behavior and robustness to noise is assessed and compared. The results obtained are competitive and indicate that the proposed approach offers advantages over previously used methods. In addition, MONET is able to provide experts with a set of trade-off solutions involving GRNs with different topologies and MSEs.
Affiliation(s)
- José García-Nieto
- Dept. de Lenguajes y Ciencias de la Computación and Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, ETSI Informática, Campus de Teatinos, Malaga 29071, Spain.
- Antonio J Nebro
- Dept. de Lenguajes y Ciencias de la Computación and Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, ETSI Informática, Campus de Teatinos, Malaga 29071, Spain.
- José F Aldana-Montes
- Dept. de Lenguajes y Ciencias de la Computación and Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, ETSI Informática, Campus de Teatinos, Malaga 29071, Spain.
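Illustration (not from the paper): the abstract by García-Nieto et al. relies on Pareto dominance to compare candidate networks on several objectives at once instead of a single weighted score. A minimal sketch of that idea follows, assuming two objectives to minimize, model error (MSE) and number of regulatory edges. The objective pair, the function names, and the scores are hypothetical and are not taken from MONET.

```python
from typing import List, Tuple

# Each candidate network is scored on two objectives to be minimized:
# (mean squared error, number of regulatory edges). Hypothetical choice.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (MSE, edge count) pairs for five inferred networks.
scores = [(0.10, 12), (0.08, 20), (0.10, 15), (0.30, 5), (0.09, 20)]
front = pareto_front(scores)
print(front)
```

The surviving candidates are the trade-off solutions: none can be improved on one objective without getting worse on the other, which is why no penalty weight between error and sparsity is needed.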
20
Jana B, Mitra S, Acharyya S. Repository and Mutation based Particle Swarm Optimization (RMPSO): A new PSO variant applied to reconstruction of Gene Regulatory Network. Appl Soft Comput 2019. [DOI: 10.1016/j.asoc.2018.09.027] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 12/25/2022]
21
Raza K. Fuzzy logic based approaches for gene regulatory network inference. Artif Intell Med 2018; 97:189-203. [PMID: 30573378 DOI: 10.1016/j.artmed.2018.12.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/07/2018] [Revised: 12/10/2018] [Accepted: 12/12/2018] [Indexed: 12/26/2022]
Abstract
The rapid advancements in high-throughput techniques have fueled large-scale production of biological data at very affordable costs. Some of these techniques are microarrays and next-generation sequencing, which provide genome-level insight into living cells. As a result, the size of most biological databases, such as NCBI-GEO, NCBI-SRA, etc., is growing exponentially. These biological data are analyzed using various computational techniques for knowledge discovery - which is also one of the objectives of bioinformatics research. A gene regulatory network (GRN) is a gene-gene interaction network which plays a pivotal role in understanding gene regulation processes and disease mechanisms at the molecular level. Over the last couple of decades, researchers have been interested in developing computational algorithms for GRN inference (GRNI) from high-throughput experimental data. Several computational approaches have been proposed for inferring GRNs from gene expression data, including statistical techniques (correlation coefficient), information theory (mutual information), regression-based approaches, probabilistic approaches (Bayesian networks, naïve Bayes), artificial neural networks and fuzzy logic. Fuzzy logic, along with its hybridization with other intelligent approaches, is a well-studied technique in GRNI due to its several advantages. In this paper, we present a consolidated review of fuzzy logic and its hybrid approaches developed during the last two decades for GRNI.
Affiliation(s)
- Khalid Raza
- Department of Computer Science, Jamia Millia Islamia, New Delhi, India.
22
Biswas S, Acharyya S. A Bi-Objective RNN Model to Reconstruct Gene Regulatory Network: A Modified Multi-Objective Simulated Annealing Approach. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:2053-2059. [PMID: 29990170 DOI: 10.1109/tcbb.2017.2771360] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/08/2023]
Abstract
A Gene Regulatory Network (GRN) is a virtual network in a cellular context of an organism, comprising a set of genes and their internal relationships, through which genes regulate each other's protein production rate (gene expression level) via coded proteins. Computational reconstruction of GRNs from gene expression data is a widely applied research area. The Recurrent Neural Network (RNN) is a useful modeling scheme for GRN reconstruction. In this research, the RNN formulation of GRN reconstruction, which has a single objective function, has been modified to incorporate a new objective function. An existing multi-objective meta-heuristic algorithm, called Archived Multi Objective Simulated Annealing (AMOSA), has been modified and applied to this bi-objective RNN formulation. By executing the resulting algorithm (called AMOSA-GRN) on a gene expression dataset, a collection (termed the Archive) of non-dominated GRNs has been obtained. Ensemble averaging has been applied to the archives obtained through a sequence of executions of AMOSA-GRN. The accuracy of GRNs in the averaged archive, with respect to the gold standard GRN, varies in the range 0.875 - 1.0 (87.5 - 100 percent).
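Illustration (not from the paper): the abstract by Biswas and Acharyya does not spell out its RNN equations, so the sketch below assumes the discrete-time RNN form commonly used for GRN modeling, in which each gene's next expression level mixes a sigmoid of the weighted inputs with its current level. The names `weights`, `bias`, `tau` and `dt` are assumptions. Only the conventional error objective (MSE) is shown; the second objective introduced in the paper is not described in the abstract and is therefore omitted.

```python
from math import exp
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def rnn_step(x: List[float], weights: List[List[float]], bias: List[float],
             tau: List[float], dt: float) -> List[float]:
    """One update of the assumed RNN model.

    weights[i][j] is the regulatory effect of gene j on gene i
    (positive: activation, negative: repression, zero: no edge).
    """
    n = len(x)
    nxt = []
    for i in range(n):
        drive = sum(weights[i][j] * x[j] for j in range(n)) + bias[i]
        rate = dt / tau[i]
        nxt.append(rate * sigmoid(drive) + (1.0 - rate) * x[i])
    return nxt


def simulate(x0: List[float], weights: List[List[float]], bias: List[float],
             tau: List[float], dt: float, steps: int) -> List[List[float]]:
    """Roll the model forward and return the whole trajectory, x0 included."""
    trajectory = [list(x0)]
    for _ in range(steps):
        trajectory.append(rnn_step(trajectory[-1], weights, bias, tau, dt))
    return trajectory


def mse(predicted: List[List[float]], observed: List[List[float]]) -> float:
    """Mean squared error between two trajectories of equal shape."""
    total = sum((p - o) ** 2
                for p_row, o_row in zip(predicted, observed)
                for p, o in zip(p_row, o_row))
    count = sum(len(row) for row in observed)
    return total / count


# Hypothetical two-gene network: gene 0 activates gene 1, gene 1 represses gene 0.
weights = [[0.0, -2.0], [3.0, 0.0]]
trajectory = simulate([0.2, 0.9], weights, bias=[0.0, 0.0],
                      tau=[1.0, 1.0], dt=0.5, steps=10)
```

A search procedure would propose `weights`, `bias` and `tau`, simulate, and score the candidate by `mse` against the measured time series; the nonzero entries of `weights` give the inferred edges.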
23
Yang B, Chen Y, Zhang W, Lv J, Bao W, Huang DS. HSCVFNT: Inference of Time-Delayed Gene Regulatory Network Based on Complex-Valued Flexible Neural Tree Model. Int J Mol Sci 2018; 19:E3178. [PMID: 30326663 PMCID: PMC6214043 DOI: 10.3390/ijms19103178] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/11/2018] [Revised: 10/08/2018] [Accepted: 10/10/2018] [Indexed: 11/17/2022] Open
Abstract
Gene regulatory network (GRN) inference can help in understanding the growth and development of animals and plants, and reveal the mysteries of biology. Many computational approaches have been proposed to infer GRNs. However, these inference approaches have hardly met the needs of modeling, and redundancy-reduction methods based on a single information-theoretic measure have poor generality and stability. To overcome these limitations, this work proposes a novel algorithm, named HSCVFNT, to infer gene regulatory networks with time-delayed regulations by utilizing a hybrid scoring method and a complex-valued flexible neural tree (CVFNT). The regulations of each target gene can be obtained by iteratively performing HSCVFNT. For each target gene, the HSCVFNT algorithm utilizes a novel scoring method based on time-delayed mutual information (TDMI), time-delayed maximum information coefficient (TDMIC) and time-delayed correlation coefficient (TDCC) to reduce the redundancy of regulatory relationships and obtain the candidate regulatory factor set. Then, the TDCC method is utilized to create a time-delayed gene expression time-series matrix. Finally, a complex-valued flexible neural tree model is proposed to infer the time-delayed regulations of each target gene with the time-delayed time-series matrix. Three real time-series expression datasets from the SOS (Save Our Soul) DNA repair system in E. coli and from Saccharomyces cerevisiae are utilized to evaluate the performance of the HSCVFNT algorithm. As a result, HSCVFNT obtains outstanding F-scores of 0.923, 0.8 and 0.625 for SOS network and IRMA (In vivo Reverse-Engineering and Modeling Assessment) network inference, respectively, which are 5.5%, 14.3% and 72.2% higher than the best performance of other state-of-the-art GRN inference methods and time-delayed methods.
Affiliation(s)
- Bin Yang
- School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
- Yuehui Chen
- School of Information Science and Engineering, University of Jinan, Jinan 250002, China.
- Wei Zhang
- School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
- Jiaguo Lv
- School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
- Wenzheng Bao
- School of Computer Science, China University of Mining and Technology, Xuzhou 221000, China.
- De-Shuang Huang
- Institute of Machine Learning and Systems Biology, Tongji University, Shanghai 200092, China.
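Illustration (not from the paper): one ingredient named in the abstract by Yang et al. is a time-delayed correlation coefficient (TDCC). The abstract does not give its exact definition, so the sketch below assumes the simplest reading: the Pearson correlation between a regulator's expression at time t and a target's expression at time t + delay, with the delay chosen to maximize the absolute correlation. Function names and the example series are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def time_delayed_correlation(regulator: Sequence[float],
                             target: Sequence[float], delay: int) -> float:
    """Correlate regulator[t] with target[t + delay]."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same length")
    if delay < 0 or delay > len(regulator) - 2:
        raise ValueError("delay must leave at least two overlapping points")
    if delay == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-delay], target[delay:])


def best_delay(regulator: Sequence[float], target: Sequence[float],
               max_delay: int) -> int:
    """Delay in 0..max_delay with the largest absolute correlation."""
    return max(range(max_delay + 1),
               key=lambda d: abs(time_delayed_correlation(regulator, target, d)))


# Hypothetical series: the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 6, 3, 1]
target = [0, 0, 1, 3, 2, 5, 4, 6]
print(best_delay(regulator, target, max_delay=3))
```

Shifting each candidate regulator by its best delay is one plausible way to build the time-delayed expression matrix mentioned in the abstract; the paper's own procedure may differ.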
24
Yu M, Tang X, Lin Y, Wang X. Diesel engine modeling based on recurrent neural networks for a hardware-in-the-loop simulation system of diesel generator sets. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.12.054] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
25
Kordmahalleh MM, Sefidmazgi MG, Harrison SH, Homaifar A. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network. BioData Min 2017; 10:29. [PMID: 28785315 PMCID: PMC5543747 DOI: 10.1186/s13040-017-0146-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2017] [Accepted: 07/14/2017] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time-delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. METHODS We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. RESULTS Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network sizes and levels of stochastic noise. We found our HRNN method to be superior in terms of accuracy for nonlinear data sets with higher amounts of noise. CONCLUSIONS The proposed method identifies time-delayed gene-gene interactions of GRNs. The topology-based advancement of our HRNN worked as expected by more effectively modeling nonlinear data sets. As a non-fully connected network, an added benefit to HRNN was how it helped to find the few genes which regulated the target gene over different time delays.
Affiliation(s)
- Mina Moradi Kordmahalleh
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Mohammad Gorji Sefidmazgi
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Scott H Harrison
- Department of Biology, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Abdollah Homaifar
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
26
Liu L, Zhao T, Ma M, Wang Y. A new gene regulatory network model based on BP algorithm for interrogating differentially expressed genes of Sea Urchin. SpringerPlus 2016; 5:1911. [PMID: 27867818 PMCID: PMC5095099 DOI: 10.1186/s40064-016-3526-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/25/2016] [Accepted: 10/12/2016] [Indexed: 12/23/2022]
Abstract
Background Computer science and mathematical theories are combined to analyze the complex interactions among genes, which are simplified to a network to establish a theoretical model for the analysis of the structure, module and dynamic properties. However, traditional models of gene regulatory networks often lack an effective method for handling gene expression data because of high time and space complexity. In this paper, we propose a new model for constructing gene regulatory networks using a back propagation (BP) neural network based on predictive function and network topology. Results Combined with complex nonlinear mapping and self-learning, the BP neural network was mapped into a complex network. Network characteristics were obtained from the parameters of average path length, average clustering coefficient, average degree, modularity, and graph density to simulate the real gene network by an artificial network. Through statistical analysis and comparison of the network parameters of Sea Urchin mRNA microarray data under different temperatures, the values of the network parameters were observed. Differentially expressed Sea Urchin genes associated with temperature were determined by calculating the difference in the degree of each gene between different networks. Conclusion The new model we developed is suitable for simulating gene regulatory networks and is capable of determining differentially expressed genes.
Affiliation(s)
- Longlong Liu
- School of Mathematical Sciences, Ocean University of China, Qingdao, 266100 People's Republic of China
- Tingting Zhao
- School of Mathematical Sciences, Ocean University of China, Qingdao, 266100 People's Republic of China
- Meng Ma
- School of Mathematical Sciences, Ocean University of China, Qingdao, 266100 People's Republic of China
- Yan Wang
- Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101 People's Republic of China
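Illustration (not from the paper): the abstract by Liu et al. identifies temperature-associated genes by comparing each gene's degree across networks built under different conditions. A minimal sketch of that comparison follows, assuming each network is given as an undirected edge list. The gene names, the edge lists, and the ranking rule (largest absolute degree change first) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def degrees(edges: Iterable[Edge]) -> Counter:
    """Count how many edges touch each gene in an undirected network."""
    deg: Counter = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_differences(net_a: Iterable[Edge], net_b: Iterable[Edge]) -> Dict[str, int]:
    """Degree in network B minus degree in network A, for every gene seen in either."""
    deg_a, deg_b = degrees(net_a), degrees(net_b)
    genes = set(deg_a) | set(deg_b)
    # Counter returns 0 for genes that are absent from one of the networks.
    return {gene: deg_b[gene] - deg_a[gene] for gene in genes}


def rank_by_degree_change(net_a: Iterable[Edge], net_b: Iterable[Edge]) -> List[Tuple[str, int]]:
    """Genes ordered by absolute degree change, ties broken by gene name."""
    diffs = degree_differences(net_a, net_b)
    return sorted(diffs.items(), key=lambda item: (-abs(item[1]), item[0]))


# Hypothetical networks inferred at two temperatures.
low_temp = [("g1", "g2"), ("g2", "g3")]
high_temp = [("g1", "g2"), ("g1", "g3"), ("g1", "g4")]
ranking = rank_by_degree_change(low_temp, high_temp)
print(ranking)
```

Genes at the top of the ranking gain or lose the most connections between conditions; where to cut the list is a separate decision that this sketch leaves open.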