1
Xuan P, Wang W, Cui H, Wang S, Nakaguchi T, Zhang T. Mask-Guided Target Node Feature Learning and Dynamic Detailed Feature Enhancement for lncRNA-Disease Association Prediction. J Chem Inf Model 2024; 64:6662-6675. [PMID: 39112431 DOI: 10.1021/acs.jcim.4c00652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Identifying new relevant long noncoding RNAs (lncRNAs) for various human diseases can facilitate the exploration of the causes and progression of these diseases. Recently, several graph inference methods have been proposed to predict disease-related lncRNAs by exploiting the topological structure and node attributes within graphs. However, these methods did not prioritize the target lncRNA and disease nodes over auxiliary nodes like miRNA nodes, potentially limiting their ability to fully utilize the features of the target nodes. We propose a new method, mask-guided target node feature learning and dynamic detailed feature enhancement for lncRNA-disease association prediction (MDLD), to enhance node feature learning for improved lncRNA-disease association prediction. First, we designed a heterogeneous graph masked transformer autoencoder to guide feature learning, focusing more on the features of target lncRNA (disease) nodes. The target nodes were increasingly masked as training progressed, which helps develop a more robust prediction model. Second, we developed a graph convolutional network with dynamic residuals (GCNDR) to learn and integrate the heterogeneous topology and features of all lncRNA, disease, and miRNA nodes. GCNDR employs an interlayer residual strategy and a residual evolution strategy to mitigate oversmoothing caused by multilayer graph convolution. The interlayer residual strategy estimates the importance of node features learned in the previous GCN encoding layer for nodes in the current encoding layer. Additionally, since there are dependencies in the importance of features of individual lncRNA (disease, miRNA) nodes across multiple encoding layers, a gated recurrent unit-based strategy is proposed to encode these dependencies. Finally, we designed a perspective-level attention mechanism to obtain more informative features of lncRNA and disease node pairs from the perspectives of mask-enhanced and dynamic-enhanced node features. 
Cross-validation experimental results demonstrated that MDLD outperformed 10 other state-of-the-art prediction methods. Ablation experiments and case studies on candidate lncRNAs for three diseases further proved the technical contributions of MDLD and its capability to discover disease-related lncRNAs.
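The progressive masking described in this abstract (target nodes "increasingly masked as training progressed") could be sketched roughly as follows. This is only an illustration: the linear schedule, the `start`/`end` ratios, zeroing as the masking operation, and the function names `mask_ratio` and `mask_target_nodes` are all assumptions, not details taken from MDLD.

```python
import random


def mask_ratio(epoch, total_epochs, start=0.1, end=0.5):
    """Linearly raise the masking ratio from `start` to `end` (assumed schedule)."""
    if total_epochs <= 1:
        return end
    progress = epoch / (total_epochs - 1)
    return start + (end - start) * progress


def mask_target_nodes(features, target_ids, ratio, rng):
    """Zero out the feature rows of a random subset of target nodes.

    `features` is a list of feature rows, `target_ids` lists the row indices
    of target (lncRNA or disease) nodes. Returns the masked copy and the set
    of masked node indices.
    """
    n_masked = round(ratio * len(target_ids))
    masked = set(rng.sample(list(target_ids), n_masked))
    out = [
        [0.0] * len(row) if i in masked else list(row)
        for i, row in enumerate(features)
    ]
    return out, masked


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    rng = random.Random(0)
    for epoch in range(3):
        ratio = mask_ratio(epoch, total_epochs=3)
        _, masked = mask_target_nodes(feats, [0, 1, 2, 3], ratio, rng)
        print(epoch, round(ratio, 2), sorted(masked))
```

An autoencoder trained on such inputs would be asked to reconstruct the zeroed rows, so a growing ratio makes the reconstruction task harder over time.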
Affiliation(s)
- Ping Xuan
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Wei Wang
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Shuai Wang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China

2
Xuan P, Lu S, Cui H, Wang S, Nakaguchi T, Zhang T. Learning Association Characteristics by Dynamic Hypergraph and Gated Convolution Enhanced Pairwise Attributes for Prediction of Disease-Related lncRNAs. J Chem Inf Model 2024; 64:3569-3578. [PMID: 38523267 DOI: 10.1021/acs.jcim.4c00245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Because long non-coding RNAs (lncRNAs) play important roles in the occurrence and development of various human diseases, identifying disease-related lncRNAs can contribute to clarifying the pathogenesis of diseases. Most recent lncRNA-disease association prediction methods use multi-source data about lncRNAs and diseases. A single lncRNA may participate in multiple disease processes, and multiple lncRNAs are usually involved in the same disease process synergistically. However, previous methods did not fully exploit these biological characteristics to construct informative prediction models. We construct a prediction model based on adaptive hypergraph and gated convolution for lncRNA-disease association prediction (AGLDA), to embed and encode the biological characteristics of lncRNA-disease associations, the topological features from the perspective of the entire heterogeneous graph, and the gated enhanced pairwise features. First, the strategy for constructing hyperedges is designed to reflect the biological characteristic that multiple lncRNAs are involved in multiple disease processes. Furthermore, each hyperedge has its own biological perspective, and multiple hyperedges are beneficial for revealing the diverse relationships among multiple lncRNAs and diseases. Second, we encode the biological features of each lncRNA (disease) node using a strategy based on dynamic hypergraph convolutional networks. The strategy can adaptively learn the features of the hyperedges and formulate the dynamically evolved hypergraph topological structure. Third, a group convolutional network is established to integrate the entire heterogeneous topological structure and multiple types of node attributes within an lncRNA-disease-miRNA graph. Finally, a gated convolutional strategy is proposed to enhance the informative features of the lncRNA-disease node pairs. The comparison experiments indicate that AGLDA outperforms seven advanced prediction methods. The ablation studies confirm the effectiveness of the major innovations, and the case studies validate AGLDA's ability to discover potential disease-related lncRNA candidates.
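To make the hyperedge idea more concrete, here is a minimal sketch of one static hypergraph propagation step. It assumes, purely for illustration, that each disease defines one hyperedge grouping all lncRNAs associated with it, and it uses plain mean aggregation; AGLDA's actual hyperedge construction and its dynamic, learned hypergraph convolution are not specified in the abstract, and the names `build_incidence` and `hypergraph_step` are hypothetical.

```python
def build_incidence(assoc):
    """Build a node-by-hyperedge incidence matrix.

    assoc[i][j] == 1 means lncRNA i is associated with disease j. Under the
    simplifying assumption that hyperedge j groups every lncRNA linked to
    disease j, the incidence matrix is a copy of the association matrix.
    """
    return [list(row) for row in assoc]


def hypergraph_step(incidence, features):
    """One node -> hyperedge -> node mean-aggregation pass."""
    n_nodes, n_edges, dim = len(incidence), len(incidence[0]), len(features[0])

    # Hyperedge feature: mean of the features of its member nodes.
    edge_feat = []
    for e in range(n_edges):
        members = [v for v in range(n_nodes) if incidence[v][e]]
        edge_feat.append([
            sum(features[v][d] for v in members) / len(members) if members else 0.0
            for d in range(dim)
        ])

    # Node feature: mean of the features of the hyperedges it belongs to.
    out = []
    for v in range(n_nodes):
        edges = [e for e in range(n_edges) if incidence[v][e]]
        out.append([
            sum(edge_feat[e][d] for e in edges) / len(edges) if edges else 0.0
            for d in range(dim)
        ])
    return out


if __name__ == "__main__":
    assoc = [[1, 0], [1, 1], [0, 1]]  # 3 lncRNAs, 2 diseases
    print(hypergraph_step(build_incidence(assoc), [[1.0], [3.0], [5.0]]))
```

A dynamic variant would recompute the incidence matrix between layers from the updated features; that step is omitted here.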
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Siyuan Lu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Shuai Wang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China

3
Wang S, Hui C, Zhang T, Wu P, Nakaguchi T, Xuan P. Graph Reasoning Method Based on Affinity Identification and Representation Decoupling for Predicting lncRNA-Disease Associations. J Chem Inf Model 2023; 63:6947-6958. [PMID: 37906529 DOI: 10.1021/acs.jcim.3c01214] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/02/2023]
Abstract
An increasing number of studies have shown that dysregulation of lncRNAs is related to the occurrence of various diseases. Most previous methods, however, are designed on the homogeneity assumption that the representation of a target lncRNA (or disease) node should be updated by aggregating the attributes of its neighbor nodes. This assumption ignores the affinity nodes that are far from the target node. We present a novel prediction method, GAIRD, to fully leverage the heterogeneous information in the network and the decoupled node features. The first major innovation is a random walk strategy based on breadth-first search and depth-first search. Unlike previous methods that focus only on homogeneous information, the new strategy learns both the homogeneous information within local neighborhoods and the heterogeneous information within higher-order neighborhoods. The second innovation is a representation decoupling module that extracts purer attributes and purer topologies. Third, a module based on group convolution and deep separable convolution is developed to promote pairwise intrachannel and interchannel feature learning. The experimental results show that GAIRD outperforms the compared state-of-the-art methods, and the ablation studies confirm the contributions of the major innovations. We also performed case studies on 3 diseases to further demonstrate the effectiveness of the GAIRD model in applications.
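A random walk that interpolates between breadth-first and depth-first exploration can be sketched with the well-known node2vec-style second-order bias. This is a swapped-in, generic technique used only to illustrate the idea; the abstract does not say that GAIRD uses this exact scheme, and the parameters `p`, `q` and the function name `biased_walk` are assumptions.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order biased random walk (node2vec-style).

    adj maps each node to the set of its neighbours. A small `q` favours
    moving away from the previous node (depth-first flavour); a large `q`
    keeps the walk near the previous node (breadth-first flavour). `p`
    controls how likely the walk is to step straight back.
    """
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break  # dead end: stop early
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay within prev's neighbourhood
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(biased_walk(graph, 0, 8, p=4.0, q=0.5, rng=random.Random(1)))
```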
Affiliation(s)
- Shuai Wang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Cui Hui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Peiliang Wu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
  - Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Ping Xuan
  - Department of Computer Science, School of Engineering, Shantou University, Shantou 515063, China

4
Sheng QJ, Tan Y, Zhang L, Wu ZP, Wang B, He XY. Heterogeneous graph framework for predicting the association between lncRNA and disease and case on uterine fibroid. Comput Biol Med 2023; 165:107331. [PMID: 37619322 DOI: 10.1016/j.compbiomed.2023.107331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 07/24/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Long non-coding RNAs (lncRNAs) play crucial regulatory roles in various cellular processes, including gene expression, chromatin remodeling, and protein localization. Dysregulation of lncRNAs has been linked to several diseases, making it essential to understand their functions in disease mechanisms and therapeutic strategies. However, traditional experimental methods for studying lncRNA function are time-consuming, expensive, and offer limited insights. In recent years, computational methods have emerged as valuable tools for predicting lncRNA functions and their associations with diseases. However, many existing methods focus on constructing separate networks for lncRNA and disease similarity, resulting in information loss and insufficient processing capacity for isolated nodes. To address this, we developed 'RGLD' by combining Random Walk with Restart (RWR), Graph Neural Network (GNN), and Graph Attention Networks (GAT) to predict lncRNA-disease associations in a heterogeneous network. RGLD achieved an AUC of 0.88, outperforming other methods. It can also predict novel associations between lncRNAs and diseases. RGLD identified HOTAIR, MEG3, and PVT1 as lncRNAs associated with uterine fibroids. Biological experiments directly or indirectly verified the involvement of these three lncRNAs in uterine fibroids, validating the accuracy of RGLD's predictions. Furthermore, we extensively discussed the functions of the target genes regulated by these lncRNAs in uterine fibroids, providing evidence for their role in the development and progression of the disease.
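The random walk with restart component mentioned in this abstract follows a standard iteration, p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalised adjacency matrix and r is the restart probability. A minimal pure-Python sketch is shown below; the restart value, tolerance, and function name are illustrative assumptions, and nothing here reflects how RGLD combines RWR with its GNN and GAT parts.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W @ p + restart * p0 until convergence.

    adj is a square list-of-lists adjacency matrix and seed is the index of
    the start node. W[i][j] is the probability of stepping from j to i.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(random_walk_with_restart(path_graph, seed=0))
```

The resulting stationary vector can be read as a proximity score of every node to the seed node.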
Affiliation(s)
- Qing-Jing Sheng
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Yuan Tan
  - Department of Integrated Traditional Chinese Medicine (TCM) & Western Medicine, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Liyuan Zhang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Zhi-Ping Wu
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Beiying Wang
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Xiao-Ying He
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China.

5
Xuan P, Bai H, Cui H, Zhang X, Nakaguchi T, Zhang T. Specific topology and topological connection sensitivity enhanced graph learning for lncRNA-disease association prediction. Comput Biol Med 2023; 164:107265. [PMID: 37531860 DOI: 10.1016/j.compbiomed.2023.107265] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/12/2023] [Revised: 06/26/2023] [Accepted: 07/16/2023] [Indexed: 08/04/2023]
Abstract
Predicting disease-related candidate long noncoding RNAs (lncRNAs) is beneficial for exploring disease pathogenesis due to the close relations between lncRNAs and the occurrence and development of human diseases. It is a long-term and challenging task to adequately extract the specific and local topologies in the individual lncRNA network and the individual disease network, and to integrate the information of the connection relationships. We propose a new graph learning-based prediction method, NCPred, to encode specific and local topologies from each individual network, neighbor topologies with different connection relationships, and pairwise attributes. We first construct a lncRNA network composed of all the lncRNA nodes and their similarities, and a single disease network that contains all the disease nodes and disease similarities. Then, a network-aware graph convolutional autoencoder is constructed to encode the specific and local topologies of each network. Second, a heterogeneous network is established to embed all lncRNA, disease, and miRNA nodes and their various connections. Afterwards, a connection-sensitive graph neural network is designed to deeply integrate the neighbor node attributes and connection characteristics in the heterogeneous network and learn neighbor topological representations. We also construct both connection-level and topology representation-level attention mechanisms to extract informative connections and topological representations. Finally, we build a multi-layer convolutional neural network with weighted residuals to adaptively complement the detailed features in pairwise attribute encoding. Comprehensive experiments and comparison results demonstrated that NCPred outperforms seven state-of-the-art prediction methods. The ablation studies demonstrated the importance of local topology learning, neighbor topology learning, and pairwise attribute encoding. Case studies on prostate, lung, and breast cancers further revealed NCPred's capacity to screen potential candidate disease-related lncRNAs.
Affiliation(s)
- Ping Xuan
  - Department of Computer Science, School of Engineering, Shantou University, Shantou, China
- Honglei Bai
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
- Xiaowen Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China; School of Mathematical Science, Heilongjiang University, Harbin, China.

6
Xuan P, Zhao Y, Cui H, Zhan L, Jin Q, Zhang T, Nakaguchi T. Semantic Meta-Path Enhanced Global and Local Topology Learning for lncRNA-Disease Association Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1480-1491. [PMID: 36173783 DOI: 10.1109/tcbb.2022.3209571] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
Since abnormal expression of long non-coding RNAs (lncRNAs) is associated with various human diseases, identifying disease-related lncRNAs helps reveal the pathogenesis of diseases. Existing methods for lncRNA-disease association prediction mainly focus on multi-sourced data related to lncRNAs and diseases. The rich semantic information of meta-paths, composed of multiple kinds of connections between lncRNA and disease nodes, is neglected. We propose a new prediction method, MGLDA, to encode and integrate the semantics of multiple meta-paths, the global topology of heterogeneous graph, and pairwise attributes of lncRNA and disease nodes. First, a tri-layer heterogeneous graph is constructed to associate multi-sourced data across the lncRNA, disease, and miRNA nodes. Afterwards, we establish multiple meta-paths connecting the lncRNA and disease nodes to derive and denote various semantics. Each meta-path contains its specific semantics formulated by an embedding strategy, and each embedding covers local topology formed by the diverse semantic connections among the lncRNA, disease, and miRNA nodes. We construct multiple graph convolutional autoencoders (GCA) with topology-level attention to learn global and multiple local topologies from the tri-layer graph and each meta-path, respectively. The topology-level attention mechanism can learn the importance of various global and local topologies for adaptive pairwise topology fusion. Finally, a convolutional autoencoder learns the attribute representations of lncRNA-disease pairs, which integrates the learnt detailed and representative pairwise features. Experimental results show that MGLDA outperforms other state-of-the-art prediction methods in comparison and retrieves more real lncRNA-disease associations in the top-ranked candidates. The ablation study also demonstrates the important contributions of the local and global topology learning, and pairwise attribute learning. 
Case studies on three diseases further demonstrate MGLDA's ability to identify potential disease-related lncRNAs.
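A meta-path such as lncRNA-miRNA-disease is a sequence of node types, and its instances are the concrete paths in the graph that follow that sequence. The sketch below enumerates such instances on a toy graph; the node names, the chosen meta-path, and the function `metapath_instances` are hypothetical, and the embedding and attention steps that MGLDA builds on top of meta-paths are not covered.

```python
def metapath_instances(edges, node_type, metapath, start):
    """Enumerate simple paths from `start` whose node types follow `metapath`.

    edges is an iterable of undirected (u, v) pairs, node_type maps each node
    to its type label, and metapath is a sequence of type labels.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Start with the single-node path if the start node has the right type.
    paths = [[start]] if node_type[start] == metapath[0] else []
    for wanted in metapath[1:]:
        # Extend every partial path by each neighbour of the required type.
        paths = [
            path + [nxt]
            for path in paths
            for nxt in sorted(adj.get(path[-1], ()))
            if node_type[nxt] == wanted and nxt not in path
        ]
    return paths


if __name__ == "__main__":
    toy_edges = [("L1", "M1"), ("L1", "M2"), ("M1", "D1"),
                 ("M2", "D1"), ("M2", "D2"), ("L1", "D2")]
    toy_types = {"L1": "lncRNA", "M1": "miRNA", "M2": "miRNA",
                 "D1": "disease", "D2": "disease"}
    for inst in metapath_instances(toy_edges, toy_types,
                                   ("lncRNA", "miRNA", "disease"), "L1"):
        print(" - ".join(inst))
```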

7
Recent advances in predicting lncRNA-disease associations based on computational methods. Drug Discov Today 2023; 28:103432. [PMID: 36370992 DOI: 10.1016/j.drudis.2022.103432] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/30/2022] [Revised: 10/19/2022] [Accepted: 11/03/2022] [Indexed: 11/11/2022]
Abstract
Mutations in and dysregulation of long non-coding RNAs (lncRNAs) are closely associated with the development of various human complex diseases, but only a few lncRNAs have been experimentally confirmed to be associated with human diseases. Predicting new potential lncRNA-disease associations (LDAs) will help us to understand the pathogenesis of human diseases and to detect disease markers, as well as in disease diagnosis, prevention and treatment. Computational methods can effectively narrow down the screening scope of biological experiments, thereby reducing the duration and cost of such experiments. In this review, we outline recent advances in computational methods for predicting LDAs, focusing on LDA databases, lncRNA/disease similarity calculations, and advanced computational models. In addition, we analyze the limitations of various computational models and discuss future challenges and directions for development.

8
Li J, Wang D, Yang Z, Liu M. HEGANLDA: A Computational Model for Predicting Potential LncRNA-Disease Associations Based On Multiple Heterogeneous Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:388-398. [PMID: 34932483 DOI: 10.1109/tcbb.2021.3136886] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Long non-coding RNAs (lncRNAs) play vital regulatory roles in many complex human diseases; however, the number of validated lncRNA-disease associations remains notably small. Predicting potential lncRNA-disease associations precisely through computational methods remains challenging. In this study, we proposed a novel method, LDVCHN (LncRNA-Disease Vector Calculation Heterogeneous Networks), and also developed the corresponding model, HEGANLDA (Heterogeneous Embedding Generative Adversarial Networks LncRNA-Disease Association), for predicting potential lncRNA-disease associations. In HEGANLDA, the graph embedding algorithm (HeGAN) was introduced to map all nodes in the lncRNA-miRNA-disease heterogeneous network into low-dimensional vectors, which served as the inputs of LDVCHN. HEGANLDA adopted the XGBoost (eXtreme Gradient Boosting) classifier, trained on the low-dimensional vectors, to predict potential lncRNA-disease associations. The 10-fold cross-validation method was used to evaluate the performance of our model, which achieved an area under the ROC curve of 0.983. According to the experimental results, HEGANLDA outperformed each of five current state-of-the-art methods. To further evaluate the effectiveness of HEGANLDA in predicting potential lncRNA-disease associations, both case studies and robustness tests were performed, and the results confirmed its effectiveness and robustness. The source code and data of HEGANLDA are available at https://github.com/HEGANLDA/HEGANLDA.
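The evaluation protocol named in this abstract (10-fold cross-validation scored by area under the ROC curve) can be illustrated without any particular classifier. The sketch below computes AUC from its rank interpretation and generates fold indices; the helper names and the toy scores are assumptions, and it does not reproduce HEGANLDA's XGBoost pipeline or its reported 0.983.

```python
import random


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    for train, test in k_fold_indices(25, k=10):
        print(len(train), len(test))
```

In practice a model would be fitted on each `train` split, scored on the matching `test` split, and the per-fold AUC values averaged.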

9
Xuan P, Wang S, Cui H, Zhao Y, Zhang T, Wu P. Learning global dependencies and multi-semantics within heterogeneous graph for predicting disease-related lncRNAs. Brief Bioinform 2022; 23:6695267. [DOI: 10.1093/bib/bbac361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 07/18/2022] [Accepted: 08/05/2022] [Indexed: 11/14/2022] Open
Abstract
Motivation
Long noncoding RNAs (lncRNAs) play an important role in the occurrence and development of diseases. Predicting disease-related lncRNAs can help to understand the pathogenesis of diseases deeply. The existing methods mainly rely on multi-source data related to lncRNAs and diseases when predicting the associations between lncRNAs and diseases. There are interdependencies among node attributes in a heterogeneous graph composed of all lncRNAs, diseases and micro RNAs. The meta-paths composed of various connections between them also contain rich semantic information. However, the existing methods neglect to integrate attribute information of intermediate nodes in meta-paths.
Results
We propose a novel association prediction model, GSMV, to learn and deeply integrate the global dependencies, semantic information of meta-paths and node-pair multi-view features related to lncRNAs and diseases. First, we formulate the global representations of the lncRNA and disease nodes by establishing a self-attention mechanism to capture and learn the global dependencies among node attributes. Second, starting from the lncRNA and disease nodes, respectively, multiple meta-paths are established to reveal different semantic information. Considering that each meta-path contains specific semantics and has multiple meta-path instances that contribute differently to revealing meta-path semantics, we design a graph neural network-based module that consists of a meta-path instance encoding strategy and two novel attention mechanisms. The proposed meta-path instance encoding strategy is used to learn the contextual connections between nodes within a meta-path instance. One of the two new attention mechanisms is at the meta-path instance level, which learns rich and informative meta-path instances. The other attention mechanism integrates various semantic information from multiple meta-paths to learn the semantic representation of lncRNA and disease nodes. Finally, a dilated convolution-based learning module with adjustable receptive fields is proposed to learn multi-view features of lncRNA-disease node pairs. The experimental results show that our method outperforms the seven state-of-the-art methods compared for lncRNA-disease association prediction. Ablation experiments demonstrate the contributions of the proposed global representation learning, semantic information learning, pairwise multi-view feature learning and the meta-path instance encoding strategy. Case studies on three cancers further demonstrate our method’s ability to discover potential disease-related lncRNA candidates.
Contact
zhang@hlju.edu.cn or peiliangwu@ysu.edu.cn
Supplementary information
Supplementary data are available at Briefings in Bioinformatics online.
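The "adjustable receptive fields" of the dilated convolution module mentioned in the Results above come from spacing the kernel taps apart. A minimal one-dimensional sketch follows, assuming no padding and a stride of one; the function name and the toy kernel are illustrative, and GSMV's actual module operates on learned pairwise feature maps rather than a plain list.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D convolution with dilated kernel taps.

    As is usual in deep learning, the kernel is applied without flipping
    (cross-correlation). The receptive field of one output value is
    (len(kernel) - 1) * dilation + 1 input positions.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    w = [1, 0, -1]
    for d in (1, 2):
        print(d, dilated_conv1d(x, w, dilation=d))
```

Raising `dilation` widens the receptive field without adding kernel weights, which is why stacking layers with different dilation rates yields multi-scale views of the same input.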
Affiliation(s)
- Ping Xuan
  - School of Information Science and Engineering (School of Software), Yanshan University, Qinhuangdao 066004, China
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Shuai Wang
  - School of Information Science and Engineering (School of Software), Yanshan University, Qinhuangdao 066004, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Yue Zhao
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Peiliang Wu
  - School of Information Science and Engineering (School of Software), Yanshan University, Qinhuangdao 066004, China

10
Chen L, Li S, Shi W, Wu Y. An Integrative Transcriptomic Analysis Reveals EGFR Exon-19 E746-A750 Fragment Deletion Regulated miRNA, circRNA, mRNA and lncRNA Networks in Lung Carcinoma. Int J Gen Med 2022; 15:6031-6042. [PMID: 35818580 PMCID: PMC9270948 DOI: 10.2147/ijgm.s370247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Accepted: 06/27/2022] [Indexed: 11/23/2022] Open
Abstract
Introduction: Competing endogenous RNA (ceRNA) appears to be an important post-transcriptional mechanism that regulates gene expression in a miRNA-mediated manner. Mutations in exon-19 of EGFR are frequently observed in lung cancer and are associated with EGFR activity and EGFR-targeted therapies. Methods: We explored the transcriptome regulated by mutation in the EGFR exon-19 E746-A750 fragment using a network modeling strategy. We applied transcriptome sequencing to detect the deletion process of the EGFR exon-19 E746-A750 fragment. Bioinformatics analyses were used to predict the gene target pairs and explain their potential roles in tumorigenesis and progression of lung cancer. Results: We conducted an explorative lncRNA/miRNA/circRNA and mRNA expression study with two groups of lung adenocarcinoma tissues, an EGFR exon-19 E746-A750 deletion group and an EGFR exon-19 wild-type group. We also screened out the hub genes related to EGFR-19-D patients. Significant pathways and biological functions potentially regulated by the 128 deregulated non-coding genes were enriched. Conclusion: Our work provides an important theoretical, experimental and clinical foundation for further research on more effective targets for the diagnosis, therapy and prognosis of lung cancer.
Affiliation(s)
- Ling Chen
  - The Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu Province, People’s Republic of China
- Shenyi Li
  - The Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu Province, People’s Republic of China
- Weifeng Shi
  - The Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu Province, People’s Republic of China
- Yibo Wu
  - The Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu Province, People’s Republic of China
  - Correspondence: Yibo Wu; Weifeng Shi, The Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu Province, People’s Republic of China, Tel +86-510-68089762; +86-510-68089762, Fax +86-510-68089762

11
Xu H, Hu X, Yan X, Zhong W, Yin D, Gai Y. Exploring noncoding RNAs in thyroid cancer using a graph convolutional network approach. Comput Biol Med 2022; 145:105447. [DOI: 10.1016/j.compbiomed.2022.105447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 03/20/2022] [Accepted: 03/21/2022] [Indexed: 12/01/2022]

12
Xuan P, Zhan L, Cui H, Zhang T, Nakaguchi T, Zhang W. Graph Triple-Attention Network for Disease-related LncRNA Prediction. IEEE J Biomed Health Inform 2021; 26:2839-2849. [PMID: 34813484 DOI: 10.1109/jbhi.2021.3130110] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/06/2022]
Abstract
Abnormal expressions of long non-coding RNAs (lncRNAs) are associated with various human diseases. Identifying disease-related lncRNAs can help clarify complex disease pathogeneses. The latest methods for lncRNA-disease association prediction rely on diverse data about lncRNAs and diseases. These methods, however, cannot adequately integrate the neighbour topological information of lncRNA and disease nodes. Moreover, more intrinsic features of lncRNA-disease node pairs can be explored to better predict the latent associations between lncRNAs and diseases. We developed a novel method, named GTAN, to predict the association propensities between lncRNAs and diseases. GTAN integrates various information about lncRNAs and diseases, including similarities, associations and interactions among lncRNAs, diseases and miRNAs, and exploits neighbour topology and attribute representations of a pair of lncRNA-disease nodes. We adopted in GTAN a graph neural network architecture with three attention mechanisms and multi-layer convolutional neural networks. First, a neighbour-level self-attention mechanism is constructed to learn the importance of each neighbour for an interested lncRNA or disease node. Second, topology-level attention is proposed to enhance contextual dependencies among multiple local topology representations of the lncRNA or disease node. An attention-enhanced graph neural network framework is then established to learn a topology representation of top-ranked neighbours for a pair of lncRNA-disease nodes. GTAN also has attribute-level attention to distinguish various contributions of attributes of the lncRNA-disease pair. Finally, attribute representation is learned by multi-layer CNN to integrate detailed features and representative features of the pair. Extensive experimental results demonstrated that GTAN outperformed state-of-the-art methods. 
The improved recall rates also showed GTAN's capacity for retrieving more actual lncRNA-disease associations among the top-ranked candidates. The ablation studies confirmed the important contributions of the three attention mechanisms. Case studies on lung cancer, prostate cancer and colon cancer further showed GTAN's ability to discover potential lncRNA candidates related to diseases.
|
13
|
Bamunu Mudiyanselage T, Lei X, Senanayake N, Zhang Y, Pan Y. Predicting CircRNA disease associations using novel node classification and link prediction models on Graph Convolutional Networks. Methods 2021; 198:32-44. [PMID: 34748953 DOI: 10.1016/j.ymeth.2021.10.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/25/2021] [Revised: 09/21/2021] [Accepted: 10/22/2021] [Indexed: 12/17/2022] Open
Abstract
Accumulated studies have discovered that circular RNAs (circRNAs) are closely related to many complex human diseases. Because of this close relationship, circRNAs can serve as biomarkers for disease diagnosis and as therapeutic targets. However, the number of experimentally verified circRNA-disease associations is still small, and wet-lab experiments are constrained by their small scale and by their cost in time and labour. Therefore, effective computational methods are required to predict circRNA-disease associations that are promising candidates for small-scale biological and clinical experiments. In this paper, we propose novel computational models based on Graph Convolutional Networks (GCN) for predicting potential circRNA-disease associations. Most existing prediction methods use shallow learning algorithms. Instead, the proposed models combine the strengths of deep learning and graphs. First, they integrate multi-source similarity information into the association network. Next, the models predict potential associations using graph convolution, which explores the relational knowledge in that network structure. Two circRNA-disease association prediction models, GCN-based Node Classification (GCN-NC) and GCN-based Link Prediction (GCN-LP), are introduced in this work; they demonstrate promising results in various experiments and outperform other existing methods. Further, a case study shows that some of the predicted results were confirmed by published literature and that all top results could be verified using gene-gene interaction networks.
Affiliation(s)
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China.
- Nipuna Senanayake
  - Department of Computer Science, Georgia State University, Atlanta, USA.
- Yanqing Zhang
  - Department of Computer Science, Georgia State University, Atlanta, USA.
- Yi Pan
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
|
14
|
Yao Y, Ji B, Lv Y, Li L, Xiang J, Liao B, Gao W. Predicting LncRNA-Disease Association by a Random Walk With Restart on Multiplex and Heterogeneous Networks. Front Genet 2021; 12:712170. [PMID: 34490041 PMCID: PMC8417042 DOI: 10.3389/fgene.2021.712170] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/20/2021] [Accepted: 07/23/2021] [Indexed: 02/05/2023] Open
Abstract
Studies have found that long non-coding RNAs (lncRNAs) play important roles in many human biological processes, and it is critical to explore potential lncRNA-disease associations, especially cancer-associated lncRNAs. However, traditional biological experiments are costly and time-consuming, so it is of great significance to develop effective computational models. We developed a random walk with restart algorithm on multiplex and heterogeneous networks of lncRNAs and diseases to predict lncRNA-disease associations (MHRWRLDA). First, multiple disease similarity networks were constructed by using different approaches to calculate similarity scores between diseases, and multiple lncRNA similarity networks were constructed in the same way from similarity scores between lncRNAs. Then, a multiplex and heterogeneous network was constructed by integrating the multiple disease similarity networks and the multiple lncRNA similarity networks with the lncRNA-disease associations, and a random walk with restart on this network was performed to predict lncRNA-disease associations. Leave-one-out cross-validation (LOOCV) gave an area under the curve (AUC) of 0.68736, an improvement over the classical algorithms of recent years. Finally, we confirmed by literature mining a few novel predicted lncRNAs associated with specific diseases such as colon cancer. In summary, MHRWRLDA contributes to the prediction of lncRNA-disease associations.
Affiliation(s)
- Yuhua Yao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
  - Key Laboratory of Data Science and Intelligence Education, Ministry of Education, Hainan Normal University, Haikou, China
  - Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou, China
- Binbin Ji
  - Geneis Beijing Co., Ltd., Beijing, China
- Yaping Lv
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Ling Li
  - Basic Courses Department, Zhejiang Shuren University, Hangzhou, China
- Ju Xiang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Department of Basic Medical Sciences, Changsha Medical University, Changsha, China
  - Department of Computer Science, Changsha Medical University, Changsha, China
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Wei Gao
  - Departments of Internal Medicine-Oncology, Fujian Cancer Hospital & Fujian Medical University Cancer Hospital, Fuzhou, China
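The random walk with restart named in this abstract can be illustrated with a minimal sketch. It is not the MHRWRLDA implementation: it assumes a single-layer toy network instead of the multiplex and heterogeneous one, and the function name, the restart probability of 0.7 and the four-node graph are all hypothetical choices made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker that restarts at `seeds`."""
    n = len(adj)
    # Column-normalise: column j holds the transition probabilities out of node j.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    # The restart distribution is uniform over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network: nodes 0 and 1 are lncRNAs, nodes 2 and 3 are diseases.
# Edges: lncRNA0-lncRNA1 (similarity), lncRNA0-disease2 (known association),
# disease2-disease3 (similarity).
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
# Rank candidate lncRNAs for disease 3 by restarting the walker at node 3.
scores = random_walk_with_restart(adj, seeds={3})
```

In this toy setting lncRNA 0, which is linked to a disease similar to the query disease, receives a higher score than lncRNA 1, which is only reachable through lncRNA 0.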
|
15
|
Yao Y, Ji B, Lv Y, Li L, Xiang J, Liao B, Gao W. Predicting LncRNA–Disease Association by a Random Walk With Restart on Multiplex and Heterogeneous Networks. Front Genet 2021. [DOI: 10.3389/fgene.2021.712170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2023] Open
|
16
|
DBNLDA: Deep Belief Network based representation learning for lncRNA-disease association prediction. Appl Intell 2021. [DOI: 10.1007/s10489-021-02675-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/03/2023]
|
17
|
smORFunction: a tool for predicting functions of small open reading frames and microproteins. BMC Bioinformatics 2020; 21:455. [PMID: 33054771 PMCID: PMC7559452 DOI: 10.1186/s12859-020-03805-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/22/2020] [Accepted: 10/08/2020] [Indexed: 12/14/2022] Open
Abstract
Background A small open reading frame (smORF) is an open reading frame with a length of less than 100 codons. Microproteins, translated from smORFs, have been found to participate in a variety of biological processes such as muscle formation and contraction, cell proliferation, and immune activation. Although previous studies have collected and annotated a large number of smORFs, the functions of the vast majority of smORFs are still unknown. It is thus increasingly important to develop computational methods to annotate the functions of these smORFs. Results In this study, we collected 617,462 unique smORFs from three studies. The expression of smORF RNAs was estimated by reannotated microarray probes. Using a speed-optimized correlation algorithm, the functions of smORFs were predicted from their correlated genes with known functional annotations. When applied to 5 known microproteins from the literature, our method successfully predicted their functions. Further validation against the UniProt database showed that at least one function was predicted for 202 out of 270 microproteins. Conclusions We developed a method, smORFunction, to provide function predictions of smORFs/microproteins in at most 265 models generated from 173 datasets, including 48 tissues/cells and 82 diseases (and normal). The tool is available at https://www.cuilab.cn/smorfunction.
|
18
|
Tong Z, Zhou Y, Wang J. Identification and Functional Analysis of Long Non-coding RNAs in Autism Spectrum Disorders. Front Genet 2020; 11:849. [PMID: 33193567 PMCID: PMC7525012 DOI: 10.3389/fgene.2020.00849] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/29/2020] [Accepted: 07/13/2020] [Indexed: 01/08/2023] Open
Abstract
Genetic and environmental factors, alone or in combination, contribute to the pathogenesis of autism spectrum disorder (ASD). Although many protein-coding genes have now been identified as disease risk genes for ASD, a detailed illustration of long non-coding RNAs (lncRNAs) associated with ASD remains elusive. In this study, we first identified ASD-related lncRNAs based on genomic variant data of individuals with ASD from a twin study. In total, 532 ASD-related lncRNAs were identified, and 86.7% of these ASD-related lncRNAs were further validated by an independent copy number variant (CNV) dataset. Then, the functions and associated biological pathways of ASD-related lncRNAs were explored by enrichment analysis of their three different types of functional neighbor genes (i.e., genomic neighbors, competing endogenous RNA (ceRNA) neighbors, and gene co-expression neighbors in the cortex). The results have shown that most of the functional neighbor genes of ASD-related lncRNAs were enriched in nervous system development, inflammatory responses, and transcriptional regulation. Moreover, we explored the differential functions of ASD-related lncRNAs in distinct brain regions by using gene co-expression network analysis based on tissue-specific gene expression profiles. As a set, ASD-related lncRNAs were mainly associated with nervous system development and dopaminergic synapse in the cortex, but associated with transcriptional regulation in the cerebellum. In addition, a functional network analysis was conducted for the highly reliable functional neighbor genes of ASD-related lncRNAs. We found that all the highly reliable functional neighbor genes were connected in a single functional network, which provided additional clues for the action mechanisms of ASD-related lncRNAs. Finally, we predicted several potential drugs based on the enrichment of drug-induced pathway sets in the ASD-altered biological pathway list. 
Among these drugs, several (e.g., amoxapine, piperine, and diflunisal) were partly supported by the previous reports. In conclusion, ASD-related lncRNAs participated in the pathogenesis of ASD through various known biological pathways, which may be differential in distinct brain regions. Detailed investigation into ASD-related lncRNAs can provide clues for developing potential ASD diagnosis biomarkers and therapy.
Affiliation(s)
- Zhan Tong
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Yuan Zhou
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Juan Wang
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
  - Autism Research Center of Peking University Health Science Center, Peking University, Beijing, China
|
19
|
A random forest based computational model for predicting novel lncRNA-disease associations. BMC Bioinformatics 2020; 21:126. [PMID: 32216744 PMCID: PMC7099795 DOI: 10.1186/s12859-020-3458-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 12/19/2019] [Accepted: 03/18/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Accumulated evidence shows that the abnormal regulation of long non-coding RNA (lncRNA) is associated with various human diseases. Accurately identifying disease-associated lncRNAs helps to study the mechanisms of lncRNAs in diseases and to explore new therapies. Many lncRNA-disease association (LDA) prediction models have been implemented by integrating multiple kinds of data resources. However, most of the existing models ignore the interference of noisy and redundant information among these data resources. RESULTS To improve the ability of LDA prediction models, we implemented a random forest and feature selection based LDA prediction model (RFLDA for short). First, RFLDA integrates the experiment-supported miRNA-disease associations (MDAs) and LDAs, the disease semantic similarity (DSS), the lncRNA functional similarity (LFS) and the lncRNA-miRNA interactions (LMI) as input features. Then, RFLDA chooses the most useful features to train the prediction model by feature selection based on the random forest variable importance score, which takes into account not only the effect of each individual feature on the prediction results but also the joint effects of multiple features. Finally, a random forest regression model is trained to score potential lncRNA-disease associations. With an area under the receiver operating characteristic curve (AUC) of 0.976 and an area under the precision-recall curve (AUPR) of 0.779 under 5-fold cross-validation, RFLDA performs better than several state-of-the-art LDA prediction models. Moreover, case studies on three cancers demonstrate that 43 of the 45 lncRNAs predicted by RFLDA are validated by experimental data, and the other two predicted lncRNAs are supported by other LDA prediction models. CONCLUSIONS Cross-validation and case studies indicate that RFLDA has an excellent ability to identify potential disease-associated lncRNAs.
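The AUC values quoted in these abstracts can be read as the probability that a true association is scored above a non-association. The sketch below computes that quantity directly from pairwise comparisons; it is a generic illustration and not code from any of the cited papers, and the function name and the four example scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out fold: 1 marks a known lncRNA-disease association.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc_example = roc_auc(labels, scores)  # 3 of the 4 positive/negative pairs are ordered correctly
```

Under k-fold or leave-one-out cross-validation the same function would be applied to the scores of the held-out associations, either per fold or pooled, depending on the protocol a given paper follows.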
|
20
|
Zhang Y, Chen M, Li A, Cheng X, Jin H, Liu Y. LDAI-ISPS: LncRNA-Disease Associations Inference Based on Integrated Space Projection Scores. Int J Mol Sci 2020; 21:E1508. [PMID: 32098405 PMCID: PMC7073162 DOI: 10.3390/ijms21041508] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/31/2019] [Revised: 02/18/2020] [Accepted: 02/19/2020] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (long ncRNAs, lncRNAs) of all kinds have been implicated in a range of cell developmental processes and diseases, although they are not translated into proteins. Inferring disease-associated lncRNAs by computational methods can help to understand the pathogenesis of diseases, but current computational methods have not yet achieved remarkable predictive performance, owing to problems such as the inaccurate construction of similarity networks and the inadequate number of known lncRNA-disease associations. In this research, we proposed lncRNA-disease association inference based on integrated space projection scores (LDAI-ISPS), composed of the following key steps: changing the Boolean network of known lncRNA-disease associations into weighted networks by combining all the global information (e.g., disease semantic similarities, lncRNA functional similarities, and known lncRNA-disease associations); and obtaining the space projection scores via vector projections of the weighted networks to form the final prediction scores without biases. The leave-one-out cross validation (LOOCV) results showed that, compared with other methods, LDAI-ISPS had a higher accuracy, with an area-under-the-curve (AUC) value of 0.9154 for inferring diseases, 0.8865 for inferring new lncRNAs (whose associations with diseases are unknown), and 0.7518 for inferring isolated diseases (whose associations with lncRNAs are unknown). A case study also confirmed the predictive performance of LDAI-ISPS as an aid to traditional biological experiments in inferring potential lncRNA-disease associations and isolated diseases.
Affiliation(s)
- Yi Zhang
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Min Chen
  - Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Ang Li
  - Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Xiaohui Cheng
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Hong Jin
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Yarong Liu
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
|
21
|
Wang Q, Yan G. IDLDA: An Improved Diffusion Model for Predicting LncRNA-Disease Associations. Front Genet 2019; 10:1259. [PMID: 31867043 PMCID: PMC6909379 DOI: 10.3389/fgene.2019.01259] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/26/2019] [Accepted: 11/14/2019] [Indexed: 11/13/2022] Open
Abstract
It has been demonstrated that long non-coding RNAs (lncRNAs) play important roles in a variety of biological processes associated with human diseases. However, the identification of lncRNA–disease associations by experimental methods is time-consuming and labor-intensive. Computational methods provide an effective strategy for predicting more potential lncRNA–disease associations. Based on the hypothesis that phenotypically similar diseases are often associated with functionally similar lncRNAs and vice versa, we developed an improved diffusion model to predict potential lncRNA–disease associations (IDLDA). Our model performed well in the global and local cross-validations, which indicated that IDLDA performs well in predicting novel associations. Case studies of colon cancer, breast cancer, and gastric cancer were also carried out; all lncRNAs ranked in the top 10 in both databases were verified by databases and related literature. The results showed that IDLDA might play a key role in biomedical research.
Affiliation(s)
- Qi Wang
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Guiying Yan
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
|
22
|
Xuan P, Jia L, Zhang T, Sheng N, Li X, Li J. LDAPred: A Method Based on Information Flow Propagation and a Convolutional Neural Network for the Prediction of Disease-Associated lncRNAs. Int J Mol Sci 2019; 20:E4458. [PMID: 31510011 PMCID: PMC6771133 DOI: 10.3390/ijms20184458] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/22/2019] [Revised: 09/05/2019] [Accepted: 09/06/2019] [Indexed: 12/26/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) play a crucial role in the pathogenesis and development of complex diseases. Predicting potential lncRNA-disease associations can improve our understanding of the molecular mechanisms of human diseases and help identify biomarkers for disease diagnosis, treatment, and prevention. Previous research methods have mostly integrated the similarity and association information of lncRNAs and diseases, without considering the topological structure information among these nodes, which is important for predicting lncRNA-disease associations. We propose a method based on information flow propagation and convolutional neural networks, called LDAPred, to predict disease-related lncRNAs. LDAPred not only integrates the similarities, associations, and interactions among lncRNAs, diseases, and miRNAs, but also exploits the topological structures formed by them. In this study, we construct a dual convolutional neural network-based framework that comprises the left and right sides. The embedding layer on the left side is established by utilizing lncRNA, miRNA, and disease-related biological premises. On the right side of the frame, multiple types of similarity, association, and interaction relationships among lncRNAs, diseases, and miRNAs are calculated based on information flow propagation on the bi-layer networks, such as the lncRNA-disease network. They contain the network topological structure and they are learned by the right side of the framework. The experimental results based on five-fold cross-validation indicate that LDAPred performs better than several state-of-the-art methods. Case studies on breast cancer, colon cancer, and osteosarcoma further demonstrate LDAPred's ability to discover potential lncRNA-disease associations.
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
  - Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China.
- Lan Jia
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China.
- Nan Sheng
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
- Xiaokun Li
  - Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China.
- Jinbao Li
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
|
23
|
Xuan P, Pan S, Zhang T, Liu Y, Sun H. Graph Convolutional Network and Convolutional Neural Network Based Method for Predicting lncRNA-Disease Associations. Cells 2019; 8:E1012. [PMID: 31480350 PMCID: PMC6769579 DOI: 10.3390/cells8091012] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 06/17/2019] [Revised: 08/19/2019] [Accepted: 08/26/2019] [Indexed: 12/11/2022] Open
Abstract
Aberrant expression of long non-coding RNAs (lncRNAs) is often associated with diseases, and identification of disease-related lncRNAs is helpful for elucidating complex pathogenesis. Recent methods for predicting associations between lncRNAs and diseases integrate their pertinent heterogeneous data. However, they fail to deeply integrate the topological information of the heterogeneous network comprising lncRNAs, diseases, and miRNAs. We proposed a novel method based on the graph convolutional network and convolutional neural network, referred to as GCNLDA, to infer disease-related lncRNA candidates. First, the heterogeneous network containing the lncRNA, disease, and miRNA nodes is constructed. The embedding matrix of a lncRNA-disease node pair is constructed according to various biological premises about lncRNAs, diseases, and miRNAs. A new framework based on a graph convolutional network and a convolutional neural network was developed to learn network and local representations of the lncRNA-disease pair. On the left side of the framework, the autoencoder based on graph convolution deeply integrates topological information within the heterogeneous lncRNA-disease-miRNA network. Moreover, as different node features contribute differently to the association prediction, an attention mechanism at the node feature level is constructed. The left side learns the network representation of the lncRNA-disease pair. The convolutional neural networks on the right side of the framework learn the local representation of the lncRNA-disease pair by focusing on the similarities, associations, and interactions that are related only to the pair. Compared to several state-of-the-art prediction methods, GCNLDA had superior performance. Case studies on stomach cancer, osteosarcoma, and lung cancer confirmed that GCNLDA effectively discovers potential lncRNA-disease associations.
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Shuxiang Pan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Yong Liu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hao Sun
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
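Graph convolution over a lncRNA-disease-miRNA network, as mentioned in this abstract, can be sketched with one propagation step. The example assumes the widely used symmetric-normalisation rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W); the abstract does not state which rule GCNLDA uses, and the three-node graph, feature matrix and weight matrix are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: add self-loops, normalise symmetrically by
    node degree, aggregate neighbour features, project, then apply ReLU."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: node 0 is a lncRNA, node 1 a disease, node 2 a miRNA.
# The lncRNA is linked to both the disease and the miRNA.
adj = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot node features
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]               # fixed 3x2 projection, not learned
hidden = gcn_layer(adj, feats, weight)                       # 3 nodes x 2 hidden features
```

Stacking several such layers lets each node's representation absorb information from neighbours further away; in a trained model the weight matrices would be learned instead of fixed.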
|
24
|
Xuan P, Cao Y, Zhang T, Kong R, Zhang Z. Dual Convolutional Neural Networks With Attention Mechanisms Based Method for Predicting Disease-Related lncRNA Genes. Front Genet 2019; 10:416. [PMID: 31130990 PMCID: PMC6509943 DOI: 10.3389/fgene.2019.00416] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 02/15/2019] [Accepted: 04/16/2019] [Indexed: 12/30/2022] Open
Abstract
A lot of studies indicated that aberrant expression of long non-coding RNA genes (lncRNAs) is closely related to human diseases. Identifying disease-related lncRNAs (disease lncRNAs) is critical for understanding the pathogenesis and etiology of diseases. Most of the previous methods focus on prioritizing the potential disease lncRNAs based on shallow learning methods. The methods fail to extract the deep and complex feature representations of lncRNA-disease associations. Furthermore, nearly all the methods ignore the discriminative contributions of the similarity, association, and interaction relationships among lncRNAs, disease, and miRNAs for the association prediction. A dual convolutional neural networks with attention mechanisms based method is presented for predicting the candidate disease lncRNAs, and it is referred to as CNNLDA. CNNLDA deeply integrates the multiple source data like the lncRNA similarities, the disease similarities, the lncRNA-disease associations, the lncRNA-miRNA interactions, and the miRNA-disease associations. The diverse biological premises about lncRNAs, miRNAs, and diseases are combined to construct the feature matrix from the biological perspectives. A novel framework based on the dual convolutional neural networks is developed to learn the global and attention representations of the lncRNA-disease associations. The left part of the framework exploits the various information contained by the feature matrix to learn the global representation of lncRNA-disease associations. The different connection relationships among the lncRNA, miRNA, and disease nodes and the different features of these nodes have the discriminative contributions for the association prediction. Hence we present the attention mechanisms from the relationship level and the feature level respectively, and the right part of the framework learns the attention representation of associations. 
The experimental results based on cross validation indicate that CNNLDA performs better than several state-of-the-art methods. Case studies on stomach cancer, lung cancer, and colon cancer further demonstrate CNNLDA's ability to discover potential disease lncRNAs.
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Yangkun Cao
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin, China
- Rui Kong
  - Department of Pancreatic and Biliary Surgery, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- Zhaogong Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
|
25
|
Wang L, Xuan Z, Zhou S, Kuang L, Pei T. A Novel Model for Predicting LncRNA-disease Associations based on the LncRNA-MiRNA-Disease Interactive Network. Curr Bioinform 2019. [DOI: 10.2174/1574893613666180703105258] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 12/11/2022]
Abstract
Background:
Accumulating experimental studies have manifested that long-non-coding
RNAs (lncRNAs) play an important part in various biological process. It has been shown that their
alterations and dysregulations are closely related to many critical complex diseases.
Objective:
It is of great importance to develop effective computational models for predicting
potential lncRNA-disease associations.
Method:
Based on the hypothesis that there would be potential associations between a lncRNA
and a disease if both of them have associations with the same group of microRNAs, and similar
diseases tend to be in close association with functionally similar lncRNAs. A novel method for
calculating similarities of both lncRNAs and diseases is proposed, and then a novel prediction
model LDLMD for inferring potential lncRNA-disease associations is proposed.
Results:
LDLMD can achieve an AUC of 0.8925 in the Leave-One-Out Cross Validation
(LOOCV), which demonstrated that the newly proposed model LDLMD significantly outperforms
previous state-of-the-art methods and could be a great addition to the biomedical research field.
Conclusion:
Here, we present a new method for predicting lncRNA-disease associations;
moreover, the method can decrease the time and cost of biological experiments.
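The first hypothesis in the Method paragraph (a lncRNA and a disease that share miRNA partners are candidates for association) can be illustrated with a minimal sketch. This is not the LDLMD similarity measure, which the abstract does not specify; the Jaccard overlap, the function names, and the toy data below are all assumptions made for illustration.

```python
# Hypothetical toy data: identifiers and associations are illustrative only.
lncrna_mirnas = {
    "lncA": {"miR-1", "miR-2", "miR-3"},
    "lncB": {"miR-4"},
}
disease_mirnas = {
    "diseaseX": {"miR-2", "miR-3", "miR-5"},
}


def shared_mirna_score(lnc_set, dis_set):
    """Jaccard overlap of two miRNA sets; 0.0 when both sets are empty."""
    union = lnc_set | dis_set
    if not union:
        return 0.0
    return len(lnc_set & dis_set) / len(union)


def rank_lncrnas(disease, lncrna_mirnas, disease_mirnas):
    """Rank lncRNAs for one disease by descending shared-miRNA score."""
    dis_set = disease_mirnas[disease]
    scores = {
        lnc: shared_mirna_score(mirnas, dis_set)
        for lnc, mirnas in lncrna_mirnas.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_lncrnas("diseaseX", lncrna_mirnas, disease_mirnas)
```

Here `lncA` shares two of the four miRNAs in the union with `diseaseX`, so it ranks above `lncB`, which shares none.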
Affiliation(s)
- Lei Wang
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Zhanwei Xuan
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Shunxian Zhou
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Linai Kuang
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Tingrui Pei
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| |
|
26
|
Manzanarez-Ozuna E, Flores DL, Gutiérrez-López E, Cervantes D, Juárez P. Model based on GA and DNN for prediction of mRNA-Smad7 expression regulated by miRNAs in breast cancer. Theor Biol Med Model 2018; 15:24. [PMID: 30594253 PMCID: PMC6310970 DOI: 10.1186/s12976-018-0095-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2018] [Accepted: 11/30/2018] [Indexed: 01/06/2023] Open
Abstract
Background The Smad7 protein is a negative regulator of the TGF-β signaling pathway, which is upregulated in patients with breast cancer. miRNAs regulate protein expression by arresting or degrading mRNAs. The purpose of this work is to identify a miRNA profile that regulates the expression of the mRNA coding for Smad7 in breast cancer, using data from patients with breast cancer obtained from the Cancer Genome Atlas Project. Methods We developed an automatic search method based on genetic algorithms to find a predictive model based on deep neural networks (DNN) that fits the set of biological data, and applied the Olden algorithm to identify the relative importance of each miRNA. Results We present a non-linear regression model, based on deep neural networks, that predicts the regulation by miRNAs of the target mRNA coding for the Smad7 protein in patients with breast cancer, with an R² of 0.99 and an MSE of 0.00001. In addition, the model is validated against the results of in vivo and in vitro experiments reported in the literature. The set of miRNAs hsa-mir-146a, hsa-mir-93, hsa-mir-375, hsa-mir-205, hsa-mir-15a, hsa-mir-21, hsa-mir-20a, hsa-mir-503, hsa-mir-29c, hsa-mir-497, hsa-mir-107, hsa-mir-125a, hsa-mir-200c, hsa-mir-212, hsa-mir-429, hsa-mir-34a, hsa-let-7c, hsa-mir-92b, hsa-mir-33a, hsa-mir-15b, hsa-mir-224, hsa-mir-185 and hsa-mir-10b forms a profile that critically regulates the expression of the mRNA coding for Smad7 in breast cancer. Conclusions We developed a genetic algorithm to select the best features (miRNAs) as DNN inputs. The genetic algorithm also builds the best DNN architecture by optimizing the parameters. Although the results have not yet been confirmed by laboratory experiments, they suggest that the miRNA profile could be used as biomarkers or as targets in targeted therapies.
Electronic supplementary material The online version of this article (10.1186/s12976-018-0095-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Edgar Manzanarez-Ozuna
- Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
| | - Dora-Luz Flores
- Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico.
| | - Everardo Gutiérrez-López
- Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
| | - David Cervantes
- Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
| | - Patricia Juárez
- Centro de Investigación Científica y de Educación Superior de Ensenada, Carretera Ensenada-Tijuana No. 3918, Zona Playitas, C.P. 22860, Ensenada, B.C., Mexico
| |
|
27
|
A Novel Probability Model for LncRNA–Disease Association Prediction Based on the Naïve Bayesian Classifier. Genes (Basel) 2018; 9:genes9070345. [PMID: 29986541 PMCID: PMC6071012 DOI: 10.3390/genes9070345] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2018] [Revised: 06/24/2018] [Accepted: 07/03/2018] [Indexed: 12/17/2022] Open
Abstract
An increasing number of studies have indicated that long non-coding RNAs (lncRNAs) play crucial roles in biological processes and in the diagnosis, prognosis, and treatment of complex diseases. However, experimentally validated associations between lncRNAs and diseases are still very limited. Recently, computational models have been developed to discover potential associations between lncRNAs and diseases by integrating multiple heterogeneous biological data; this has become a hot topic in biological research. In this article, we constructed a global tripartite network by integrating a variety of biological information including miRNA–disease, miRNA–lncRNA, and lncRNA–disease associations and interactions. Then, we constructed a global quadruple network by appending gene–lncRNA interaction, gene–disease association, and gene–miRNA interaction networks to the global tripartite network. Subsequently, based on these two global networks, a novel approach built on the naïve Bayesian classifier was proposed to predict potential lncRNA–disease associations (NBCLDA). Compared with state-of-the-art methods, our new method does not rely entirely on known lncRNA–disease associations, and achieves reliable performance in terms of the area under the ROC curve (AUC) in leave-one-out cross-validation. Moreover, to further estimate the performance of NBCLDA, case studies of colorectal cancer, prostate cancer, and glioma were carried out, and the simulation results demonstrated that NBCLDA can be an excellent tool for biomedical research in the future.
|
28
|
A Novel Approach for Predicting Disease-lncRNA Associations Based on the Distance Correlation Set and Information of the miRNAs. Comput Math Methods Med 2018; 2018:6747453. [PMID: 30046354 PMCID: PMC6038663 DOI: 10.1155/2018/6747453] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/06/2017] [Revised: 04/04/2018] [Accepted: 04/17/2018] [Indexed: 12/29/2022]
Abstract
Recently, accumulating laboratory studies have indicated that many long noncoding RNAs (lncRNAs) play important roles in various biological processes and are associated with many complex human diseases. Therefore, developing powerful computational models to predict correlations between lncRNAs and diseases based on heterogeneous biological datasets is important. However, there are few approaches to calculating and analyzing lncRNA-disease associations on the basis of information about miRNAs. In this article, a new computational method based on the distance correlation set is developed to predict lncRNA-disease associations (DCSLDA). Compared with existing state-of-the-art methods, the major novelty of DCSLDA lies in the introduction of the lncRNA-miRNA-disease network and the distance correlation set; thus DCSLDA can be applied to predict potential lncRNA-disease associations without requiring any known disease-lncRNA associations. Simulation results show that DCSLDA significantly improves on previous models, with a reliable AUC of 0.8517 in leave-one-out cross-validation. Furthermore, when DCSLDA was used to prioritize candidate lncRNAs for three important cancers, 17 predicted associations in the top 0.5% of the forecast results were verified by other independent studies and biological experiments. Hence, it is anticipated that DCSLDA could be a great addition to the biomedical research field.
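Several abstracts in this list report an AUC from leave-one-out cross-validation. As a hedged illustration of what that number measures, the sketch below computes the AUC as the probability that a held-out known association is scored above an unknown candidate pair. It covers only the AUC step, not the cross-validation protocol of DCSLDA or any other listed method; the function name and the toy scores are assumptions.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prediction scores; label 1 marks a held-out known association.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
```

With these toy values, three of the four positive-negative pairs are ordered correctly, so `toy_auc` is 0.75.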
|
29
|
Yang YX, Wei L, Zhang YJ, Hayano T, Piñeiro Pereda MDP, Nakaoka H, Li Q, Barragán Mallofret I, Lu YZ, Tamagnone L, Inoue I, Li X, Luo JY, Zheng K, You H. Long non-coding RNA p10247, high expressed in breast cancer (lncRNA-BCHE), is correlated with metastasis. Clin Exp Metastasis 2018; 35:109-121. [DOI: 10.1007/s10585-018-9901-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2017] [Accepted: 05/11/2018] [Indexed: 10/14/2022]
|
30
|
ncRNA-disease association prediction based on sequence information and tripartite network. BMC Syst Biol 2018; 12:37. [PMID: 29671405 PMCID: PMC5907179 DOI: 10.1186/s12918-018-0527-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
Background Current technology has demonstrated that mutation and deregulation of non-coding RNAs (ncRNAs) are associated with diverse human diseases and important biological processes. Therefore, developing a novel computational method for predicting potential ncRNA-disease associations could benefit pathologists in understanding the correlation between ncRNAs and disease diagnosis, treatment, and prevention. However, only a few studies have investigated these associations in pathogenesis. Results This study utilizes a disease-target-ncRNA tripartite network, and computes prediction scores between each disease-ncRNA pair by integrating biological information derived from pairwise similarity based upon sequence expressions with weights obtained from a multi-layer resource allocation technique. Our proposed algorithm was evaluated based on a 5-fold-cross-validation with optimal kernel parameter tuning. In addition, we achieved an average AUC that varies from 0.75 without link cut to 0.57 with link cut methods, which outperforms a previous method using the same evaluation methodology. Furthermore, the algorithm predicted 23 ncRNA-disease associations supported by other independent biological experimental studies. Conclusions Taken together, these results demonstrate the capability and accuracy of predicting further biological significant associations between ncRNAs and diseases and highlight the importance of adding biological sequence information to enhance predictions.
|
31
|
Chen X, You ZH, Yan GY, Gong DW. IRWRLDA: improved random walk with restart for lncRNA-disease association prediction. Oncotarget 2016; 7:57919-57931. [PMID: 27517318 PMCID: PMC5295400 DOI: 10.18632/oncotarget.11141] [Citation(s) in RCA: 146] [Impact Index Per Article: 20.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2016] [Accepted: 07/06/2016] [Indexed: 12/11/2022] Open
Abstract
In recent years, accumulating evidence has shown that the dysregulation of lncRNAs is associated with a wide range of human diseases. It is necessary and feasible to analyze known lncRNA-disease associations, predict potential lncRNA-disease associations, and provide the most probable lncRNA-disease pairs for experimental validation. Considering the limitations of traditional Random Walk with Restart (RWR), the model of Improved Random Walk with Restart for LncRNA-Disease Association prediction (IRWRLDA) was developed to predict novel lncRNA-disease associations by integrating known lncRNA-disease associations, disease semantic similarity, and various lncRNA similarity measures. The novelty of IRWRLDA lies in the incorporation of lncRNA expression similarity and disease semantic similarity to set the initial probability vector of the RWR. Therefore, IRWRLDA can be applied to diseases without any known related lncRNAs. IRWRLDA significantly improved on previous classical models, with reliable AUCs of 0.7242 and 0.7872 on two known lncRNA-disease association datasets downloaded from the lncRNADisease database. Further case studies of colon cancer and leukemia were carried out with IRWRLDA, and 60% of the lncRNAs in the top 10 prediction lists have been confirmed by recent experimental reports.
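For readers unfamiliar with the baseline that IRWRLDA modifies, the following is a minimal sketch of a plain random walk with restart, iterating p ← (1 − r)·W·p + r·p0 on a column-normalized network. It is not the IRWRLDA model: the abstract only says that the initial vector p0 is set from similarity information, so the uniform single-seed p0, the restart probability of 0.5, and the three-node toy network here are assumptions.

```python
def column_normalize(A):
    """Divide each column of a square matrix (list of rows) by its sum."""
    n = len(A)
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    return [
        [A[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(W, p0, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the L1 change < tol."""
    n = len(p0)
    p = list(p0)
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(W[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical 3-node path network (0 - 1 - 2) with the walk seeded at node 0.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W = column_normalize(adjacency)
steady = random_walk_with_restart(W, [1.0, 0.0, 0.0])
```

The steady-state probabilities decrease with distance from the seed node, which is what makes them usable as a ranking score for candidate nodes.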
Affiliation(s)
- Xing Chen
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Zhu-Hong You
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
| | - Gui-Ying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China.,National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
| | - Dun-Wei Gong
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| |
|
32
|
Chen X, Yan CC, Zhang X, You ZH. Long non-coding RNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2017; 18:558-576. [PMID: 27345524 PMCID: PMC5862301 DOI: 10.1093/bib/bbw060] [Citation(s) in RCA: 312] [Impact Index Per Article: 39.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2016] [Indexed: 02/07/2023] Open
Abstract
LncRNAs have attracted considerable attention from researchers worldwide in recent decades. With the rapid advances in both experimental technology and computational prediction algorithms, thousands of lncRNAs have been identified in eukaryotic organisms ranging from nematodes to humans in the past few years. A growing body of evidence indicates that lncRNAs are involved in almost the whole life cycle of cells through different mechanisms and play important roles in many critical biological processes. Therefore, it is not surprising that mutations and dysregulation of lncRNAs contribute to the development of various complex human diseases. In this review, we first briefly introduce the functions of lncRNAs, five important lncRNA-related diseases, five critical disease-related lncRNAs and some important publicly available lncRNA-related databases covering sequence, expression, function, etc. To date, only a limited number of lncRNAs have been experimentally reported to be related to human diseases. Therefore, analyzing available lncRNA–disease associations and predicting potential human lncRNA–disease associations have become important tasks of bioinformatics, which would benefit the understanding of complex human disease mechanisms at the lncRNA level, disease biomarker detection, and disease diagnosis, treatment, prognosis and prevention. Furthermore, we introduce some state-of-the-art computational models that can be used to identify disease-related lncRNAs on a large scale and select the most promising disease-related lncRNAs for experimental validation. We also analyze the limitations of these models and discuss future directions for developing computational models for lncRNA research.
Affiliation(s)
- Xing Chen
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, China
- Corresponding authors. Xing Chen, School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China; Zhu-Hong You, School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
| | | | - Xu Zhang
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, China
| | - Zhu-Hong You
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
| |
|
33
|
Peng H, Lan C, Liu Y, Liu T, Blumenstein M, Li J. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes. Oncotarget 2017; 8:78901-78916. [PMID: 29108274 PMCID: PMC5668007 DOI: 10.18632/oncotarget.20481] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2017] [Accepted: 07/19/2017] [Indexed: 12/15/2022] Open
Abstract
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data to a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the distribution of disease genes over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease-gene-enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first in differentiating between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
Affiliation(s)
- Hui Peng
- Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW, Australia
| | - Chaowang Lan
- Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW, Australia
| | - Yuansheng Liu
- Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW, Australia
| | - Tao Liu
- Centre for Childhood Cancer Research, University of New South Wales, Sydney, Kensington, NSW, Australia
| | - Michael Blumenstein
- School of Software, University of Technology Sydney, Broadway, NSW, Australia
| | - Jinyan Li
- Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW, Australia
| |
|
34
|
Cao C, Fan R, Zhao J, Zhao X, Yang J, Zhang Z, Xu S. Impact of exudative diathesis induced by selenium deficiency on LncRNAs and their roles in the oxidative reduction process in broiler chick veins. Oncotarget 2017; 8:20695-20705. [PMID: 28157700 PMCID: PMC5400537 DOI: 10.18632/oncotarget.14971] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2016] [Accepted: 01/24/2017] [Indexed: 02/07/2023] Open
Abstract
Selenium deficiency may induce exudative diathesis (ED) in broiler chicks, and this damage is closely related to oxidative damage. Long noncoding RNA (LncRNA) can regulate the redox state in vivo. The aim of the present study was to clarify the LncRNA expression profile in broiler veins and to screen and verify the LncRNAs related to oxidative damage in ED. This study established an ED model induced by selenium deficiency and presented the expression and characterization of LncRNAs in normal and ED samples. A total of 15412 LncRNAs (including 8052 novel LncRNAs) were generated in six cDNA libraries using the Illumina Hi-Seq 4000 platform. A total of 635 differentially expressed LncRNAs (up-regulated, fold change > 1.5; down-regulated, fold change < 0.67) were filtered. Gene ontology enrichment of LncRNA target genes showed that the oxidative reduction process was important. This study also defined and verified 19 target mRNAs of 23 LncRNAs related to the oxidative reduction process. The in vivo and in vitro experiments also demonstrated that these 23 LncRNAs can participate in the oxidative reduction process. This study presents the LncRNA expression profile in broiler chick veins for the first time and confirms 23 LncRNAs involved in vein oxidative damage in ED.
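The fold-change thresholds quoted in this abstract (greater than 1.5 for up-regulation, less than 0.67 for down-regulation) can be applied as in the sketch below. It covers only the threshold step; the abstract does not state which statistical test defined differential expression, so none is included. The function names and the toy fold-change values are assumptions.

```python
def classify_fold_change(fold_change, up=1.5, down=0.67):
    """Label a fold change as 'up', 'down', or 'unchanged' by fixed thresholds."""
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "unchanged"


def filter_by_fold_change(fold_changes, up=1.5, down=0.67):
    """Keep only transcripts whose fold change passes either threshold."""
    labelled = {
        name: classify_fold_change(fc, up, down)
        for name, fc in fold_changes.items()
    }
    return {name: label for name, label in labelled.items() if label != "unchanged"}


# Hypothetical ED-versus-normal expression ratios for three transcripts.
toy_fold_changes = {"lnc1": 2.0, "lnc2": 0.5, "lnc3": 1.0}
changed = filter_by_fold_change(toy_fold_changes)
```

The two thresholds are roughly reciprocal (1/1.5 ≈ 0.67), so the filter treats increases and decreases of the same magnitude symmetrically.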
Affiliation(s)
- Changyu Cao
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Ruifeng Fan
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Jinxin Zhao
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Xia Zhao
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Jie Yang
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Ziwei Zhang
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Shiwen Xu
- Department of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China.,Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, P. R. China
| |
|
35
|
Xu C, Qi R, Ping Y, Li J, Zhao H, Wang L, Du MY, Xiao Y, Li X. Systemically identifying and prioritizing risk lncRNAs through integration of pan-cancer phenotype associations. Oncotarget 2017; 8:12041-12051. [PMID: 28076842 PMCID: PMC5355324 DOI: 10.18632/oncotarget.14510] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2016] [Accepted: 12/12/2016] [Indexed: 02/01/2023] Open
Abstract
LncRNAs have emerged as a major class of regulatory molecules involved in normal cellular physiology and disease. However, our knowledge of lncRNAs remains very limited, and discovering novel disease-related lncRNAs in cancers has become a major research challenge. Based on the assumption that diverse diseases with similar phenotype associations show similar molecular mechanisms, we presented a pan-cancer network-based prioritization approach to systematically identify disease-specific risk lncRNAs by integrating disease phenotype associations. We applied this strategy to approximately 2800 tumor samples from 14 cancer types for prioritizing disease risk lncRNAs. Our approach yielded an average area under the ROC curve (AUC) of 80.66%, with the highest AUC (98.14%) for medulloblastoma. When evaluated using leave-one-out cross-validation (LOOCV) for prioritization of disease candidate genes, an average AUC score of 97.16% was achieved. Moreover, we demonstrated the robustness of this approach as well as the importance of the information it integrates, including disease phenotype associations, known disease genes and the number of cancer types. Taking glioblastoma multiforme as a case study, we identified a candidate lncRNA gene, SNHG1, as a novel disease risk factor for disease diagnosis and prognosis. In summary, we provided a novel lncRNA prioritization approach integrating pan-cancer phenotype associations that could help researchers better understand the important roles of lncRNAs in human cancers.
Affiliation(s)
- Chaohan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Rui Qi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Yanyan Ping
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Jie Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Hongying Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Li Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | | | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China.,Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| |
|
36
|
Yu G, Fu G, Lu C, Ren Y, Wang J. BRWLDA: bi-random walks for predicting lncRNA-disease associations. Oncotarget 2017; 8:60429-60446. [PMID: 28947982 PMCID: PMC5601150 DOI: 10.18632/oncotarget.19588] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2017] [Accepted: 06/19/2017] [Indexed: 12/20/2022] Open
Abstract
Increasing efforts have been made to elucidate the associations between lncRNAs and complex diseases. Many computational models combine various lncRNA similarity networks and disease similarity networks with known lncRNA-disease associations to infer novel associations. However, most of them neglect the structural difference between the lncRNA network and the disease network, the hierarchical relationships between diseases, and the pattern of newly discovered associations. In this study, we developed a model that performs Bi-Random Walks to predict novel LncRNA-Disease Associations (BRWLDA for short). This model utilizes multiple heterogeneous data to construct the lncRNA functional similarity network, and Disease Ontology to construct a disease network. It then constructs a directed bi-relational network based on these two networks and the available lncRNA-disease associations. Next, it applies bi-random walks on the network to predict potential associations. BRWLDA achieves reliable performance, better than that of the compared methods, not only on experimentally verified associations but also in simulated experiments with masked associations. Case studies further demonstrate the feasibility of BRWLDA in identifying new lncRNA-disease associations.
Affiliation(s)
- Guoxian Yu
- College of Computer and Information Sciences, Southwest University, Chongqing, China
| | - Guangyuan Fu
- College of Computer and Information Sciences, Southwest University, Chongqing, China
| | - Chang Lu
- College of Computer and Information Sciences, Southwest University, Chongqing, China
| | - Yazhou Ren
- Big Data Research Center, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Jun Wang
- College of Computer and Information Sciences, Southwest University, Chongqing, China
| |
|
37
|
Yao Q, Wu L, Li J, Yang LG, Sun Y, Li Z, He S, Feng F, Li H, Li Y. Global Prioritizing Disease Candidate lncRNAs via a Multi-level Composite Network. Sci Rep 2017; 7:39516. [PMID: 28051121 PMCID: PMC5209722 DOI: 10.1038/srep39516] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2016] [Accepted: 10/21/2016] [Indexed: 01/14/2023] Open
Abstract
LncRNAs play pivotal roles in many important biological processes, but research on the functions of lncRNAs in human disease is still in its infancy. Therefore, it is urgent to prioritize lncRNAs that are potentially associated with diseases. In this work, we developed a novel algorithm, LncPriCNet, that uses a multi-level composite network to prioritize candidate lncRNAs associated with diseases. By integrating genes, lncRNAs, phenotypes and their associations, LncPriCNet achieves an overall performance superior to that of previous methods, with high AUC values of up to 0.93. Notably, LncPriCNet still performs well when information on known disease lncRNAs is lacking. When applied to breast cancer, LncPriCNet identified known breast cancer-related lncRNAs, revealed novel lncRNA candidates and inferred their functions via pathway analysis. We further constructed the human disease-lncRNA landscape, revealed the modularity of the disease-lncRNA network and identified several lncRNA hotspots. In summary, LncPriCNet is a useful tool for prioritizing disease-related lncRNAs and may facilitate understanding of the molecular mechanisms of human disease at the lncRNA level.
Affiliation(s)
- Qianlan Yao
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200031, China
| | - Leilei Wu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200031, China
| | - Jia Li
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Li guang Yang
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yidi Sun
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhen Li
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200031, China
| | - Sheng He
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Fangyoumin Feng
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Hong Li
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Yixue Li
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200031, China
- CAS Key Laboratory for Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200433, China
| |
|
38
|
Wang J, Ma R, Ma W, Chen J, Yang J, Xi Y, Cui Q. LncDisease: a sequence based bioinformatics tool for predicting lncRNA-disease associations. Nucleic Acids Res 2016; 44:e90. [PMID: 26887819 PMCID: PMC4872090 DOI: 10.1093/nar/gkw093] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 11/02/2015] [Revised: 02/04/2016] [Accepted: 02/06/2016] [Indexed: 02/07/2023]
Abstract
LncRNAs represent a large class of noncoding RNA molecules that have important functions and play key roles in a variety of human diseases. There is an urgent need to develop bioinformatics tools to gain insight into lncRNAs. This study developed a sequence-based bioinformatics method, LncDisease, to predict lncRNA-disease associations based on the crosstalk between lncRNAs and miRNAs. Using LncDisease, we predicted the lncRNAs associated with breast cancer and hypertension. The breast-cancer-associated lncRNAs were studied in two breast tumor cell lines, MCF-7 and MDA-MB-231. The qRT-PCR results showed that 11 (91.7%) of the 12 predicted lncRNAs could be validated in both breast cancer cell lines. The hypertension-associated lncRNAs were further evaluated in human vascular smooth muscle cells (VSMCs) stimulated with angiotensin II (Ang II). The qRT-PCR results showed that 3 (75.0%) of the 4 predicted lncRNAs could be validated in Ang II-treated human VSMCs. In addition, we predicted 6 diseases associated with the lncRNA GAS5 and validated 4 (66.7%) of them by literature mining. These results strongly support the specificity and efficacy of LncDisease in the study of lncRNAs in human diseases. The LncDisease software is freely available on the Software Page: http://www.cuilab.cn/.
Affiliation(s)
- Junyi Wang
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- MOE Key Lab of Cardiovascular Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
| | - Ruixia Ma
- Mitchell Cancer Institute, University of South Alabama, 1160 Springhill Ave, Mobile, AL 36604, USA
| | - Wei Ma
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- MOE Key Lab of Cardiovascular Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Ji Chen
- MOE Key Lab of Cardiovascular Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Jichun Yang
- MOE Key Lab of Cardiovascular Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Yaguang Xi
- Mitchell Cancer Institute, University of South Alabama, 1160 Springhill Ave, Mobile, AL 36604, USA
| | - Qinghua Cui
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- MOE Key Lab of Cardiovascular Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Beijing Key Laboratory of Tumor Systems Biology, Peking University, 38 Xueyuan Road, Beijing 100191, China
| |
|
39
|
Chen X. KATZLDA: KATZ measure for the lncRNA-disease association prediction. Sci Rep 2015; 5:16840. [PMID: 26577439 PMCID: PMC4649494 DOI: 10.1038/srep16840] [Citation(s) in RCA: 151] [Impact Index Per Article: 15.1] [Received: 09/03/2015] [Accepted: 10/21/2015] [Indexed: 12/28/2022]
Abstract
Accumulating experimental studies have demonstrated important associations between alterations and dysregulations of lncRNAs and the development and progression of various complex human diseases. Developing effective computational models to integrate vast amounts of heterogeneous biological data for the identification of potential disease-lncRNA associations has become a hot topic in the fields of human complex diseases and lncRNAs, which could benefit lncRNA biomarker detection for disease diagnosis, treatment, and prevention. Considering the limitations of previous computational methods, the model of KATZ measure for LncRNA-Disease Association prediction (KATZLDA) was developed to uncover potential lncRNA-disease associations by integrating known lncRNA-disease associations, lncRNA expression profiles, lncRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. KATZLDA could work for diseases without known related lncRNAs and lncRNAs without known associated diseases. KATZLDA obtained reliable AUCs of 0.7175, 0.7886, and 0.7719 in the local and global leave-one-out cross validation and 5-fold cross validation, respectively, significantly improving on previous classical methods. Furthermore, case studies of colon, gastric, and renal cancer were implemented, and 60% of the top 10 predictions have been confirmed by recent biological experiments. It is anticipated that KATZLDA could be an important resource with potential value for biomedical research.
Affiliation(s)
- Xing Chen
- National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
| |
|
40
|
Gong Z, Yang Q, Zeng Z, Zhang W, Li X, Zu X, Deng H, Chen P, Liao Q, Xiang B, Zhou M, Li X, Li Y, Xiong W, Li G. An integrative transcriptomic analysis reveals p53 regulated miRNA, mRNA, and lncRNA networks in nasopharyngeal carcinoma. Tumour Biol 2015; 37:3683-95. [PMID: 26462838 DOI: 10.1007/s13277-015-4156-x] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 08/25/2015] [Accepted: 09/23/2015] [Indexed: 12/12/2022]
Abstract
It has been reported that p53 dysfunction is closely related to the carcinogenesis of nasopharyngeal carcinoma (NPC). Recently, an increasing body of evidence has indicated that microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) participate in p53-associated signaling pathways and, in addition to mRNAs, form a complex regulation network to promote tumor occurrence and progression. The aim of this study was to elucidate the p53-regulated miRNAs, mRNAs, and lncRNAs and their regulating networks in NPC. Firstly, we overexpressed p53 in the NPC cell line HNE2 and performed transcriptomic gene expression profiling (GEP) analysis, which included miRNAs, mRNAs, and lncRNAs, using microarray technology at 0, 12, 24, and 48 h after transfection. There were 38 miRNAs (33 upregulated and 5 downregulated), 2107 mRNAs (296 upregulated and 1811 downregulated), and 1190 lncRNAs (133 upregulated and 1057 downregulated) that were significantly dysregulated by p53. Some of the dysregulated molecules were confirmed by quantitative real-time polymerase chain reaction (qRT-PCR). Then, we integrated previously published miRNAs, mRNAs, and lncRNAs GEP datasets from NPC biopsies to investigate the expression of these p53 regulated molecules and found that 7 miRNAs, 218 mRNAs, and 101 lncRNAs regulated by p53 were also differentially expressed in NPC tissues. Finally, p53-regulated miRNA, mRNA, and lncRNA networks were constructed using bioinformatics methods. These miRNAs, mRNAs, and lncRNAs may participate in p53 downstream signaling pathways and play important roles in the carcinogenesis of NPC. Thorough investigations of their biological functions and regulating relationships will provide a novel view of the p53 signaling pathway, and the restoration of p53 functioning or its downstream gene regulating network is potentially of great value in treating NPC patients.
Affiliation(s)
- Zhaojian Gong
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Qian Yang
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- School of Nursing, Hunan Polytechnic of Environment and Biology, Hengyang, Hunan, China
| | - Zhaoyang Zeng
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Wenling Zhang
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Xiayu Li
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Xuyu Zu
- Clinical Research Institution, the First Affiliated Hospital, University of South China, Hengyang, Hunan, China
| | - Hao Deng
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Pan Chen
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
| | - Qianjin Liao
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
| | - Bo Xiang
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Ming Zhou
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Xiaoling Li
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Yong Li
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Department of Cancer Biology, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Wei Xiong
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Guiyuan Li
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis of Ministry of Health and Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| |
|
41
|
Chen X. Predicting lncRNA-disease associations and constructing lncRNA functional similarity network based on the information of miRNA. Sci Rep 2015; 5:13186. [PMID: 26278472 PMCID: PMC4538606 DOI: 10.1038/srep13186] [Citation(s) in RCA: 152] [Impact Index Per Article: 15.2] [Received: 05/01/2015] [Accepted: 07/22/2015] [Indexed: 12/16/2022]
Abstract
Accumulating experimental studies have indicated that lncRNAs play important roles in various critical biological processes and that their alterations and dysregulations have been associated with many important complex diseases. Developing effective computational models to predict potential disease-lncRNA associations could benefit not only the understanding of disease mechanisms at the lncRNA level, but also the detection of disease biomarkers for disease diagnosis, treatment, prognosis, and prevention. However, known experimentally confirmed disease-lncRNA associations are still very limited. In this study, a novel model of HyperGeometric distribution for LncRNA-Disease Association inference (HGLDA) was developed to predict lncRNA-disease associations by integrating miRNA-disease associations and lncRNA-miRNA interactions. Although HGLDA did not rely on any known disease-lncRNA associations, it still obtained an AUC of 0.7621 in the leave-one-out cross validation. Furthermore, 19 predicted associations for breast cancer, lung cancer, and colorectal cancer were verified by biological experimental studies. In addition, the model of LncRNA Functional Similarity Calculation based on the information of MiRNA (LFSCM) was developed to calculate lncRNA functional similarity on a large scale by integrating disease semantic similarity, miRNA-disease associations, and miRNA-lncRNA interactions. It is anticipated that HGLDA and LFSCM could be effective biological tools for biomedical research.
Affiliation(s)
- Xing Chen
- National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
| |
|
42
|
Chen X, Yan CC, Luo C, Ji W, Zhang Y, Dai Q. Constructing lncRNA functional similarity network based on lncRNA-disease associations and disease semantic similarity. Sci Rep 2015; 5:11338. [PMID: 26061969 PMCID: PMC4462156 DOI: 10.1038/srep11338] [Citation(s) in RCA: 156] [Impact Index Per Article: 15.6] [Received: 01/20/2015] [Accepted: 05/21/2015] [Indexed: 12/28/2022]
Abstract
Increasing evidence has indicated that many lncRNAs play important roles in critical biological processes. Developing powerful computational models to construct lncRNA functional similarity networks based on heterogeneous biological datasets is one of the most important and popular topics in the fields of both lncRNAs and complex diseases. Functional similarity network construction could benefit model development for both lncRNA function inference and lncRNA-disease association identification. However, little effort has been made to analyze and calculate lncRNA functional similarity on a large scale. In this study, based on the assumption that functionally similar lncRNAs tend to be associated with similar diseases, we developed two novel lncRNA functional similarity calculation models (LNCSIM). LNCSIM was evaluated by introducing similarity scores into the model of Laplacian Regularized Least Squares for LncRNA–Disease Association (LRLSLDA) for lncRNA-disease association prediction. As a result, the new predictive models improved the performance of LRLSLDA in the leave-one-out cross validation of various known lncRNA-disease association datasets. Furthermore, some of the predictive results for colorectal cancer and lung cancer were verified by independent biological experimental studies. It is anticipated that LNCSIM could be a useful and important biological tool for human disease diagnosis, treatment, and prevention.
Affiliation(s)
- Xing Chen
- National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
| | | | - Cai Luo
- Department of Automation, Tsinghua University, Beijing, 100084, China
| | - Wen Ji
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
| | - Yongdong Zhang
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
| | - Qionghai Dai
- Department of Automation, Tsinghua University, Beijing, 100084, China
| |
|