1
Xie GB, Chen RB, Lin ZY, Gu GS, Yu JR, Liu ZG, Cui J, Lin LQ, Chen LC. Predicting lncRNA-disease associations based on combining selective similarity matrix fusion and bidirectional linear neighborhood label propagation. Brief Bioinform 2023; 24:6966536. [PMID: 36592062 DOI: 10.1093/bib/bbac595] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/31/2022] [Revised: 11/30/2022] [Accepted: 12/04/2022] [Indexed: 01/03/2023] Open
Abstract
Recent studies have revealed that long noncoding RNAs (lncRNAs) are closely linked to several human diseases, providing new opportunities for their use in detection and therapy. Many graph propagation and similarity fusion approaches can be used for predicting potential lncRNA-disease associations. However, existing similarity fusion approaches suffer from noise and self-similarity loss in the fusion process. To address these problems, a new prediction approach, termed SSMF-BLNP, which combines selective similarity matrix fusion (SSMF) and bidirectional linear neighborhood label propagation (BLNP), is proposed in this paper to predict lncRNA-disease associations. In SSMF, self-similarity networks of lncRNAs and diseases are obtained by selective preprocessing and nonlinear iterative fusion. The fusion process assigns weights to each initial similarity network and introduces a unit matrix that can reduce noise and compensate for the loss of self-similarity. In BLNP, the initial lncRNA-disease associations are employed in both lncRNA and disease directions as label information for linear neighborhood label propagation. The propagation is then performed on the self-similarity network obtained from SSMF to derive the scoring matrix for predicting the relationships between lncRNAs and diseases. Experimental results showed that SSMF-BLNP performed better than seven other state-of-the-art approaches. Furthermore, a case study demonstrated up to 100% and 80% accuracy in 10 lncRNAs associated with hepatocellular carcinoma and 10 lncRNAs associated with renal cell carcinoma, respectively. The source code and datasets used in this paper are available at: https://github.com/RuiBingo/SSMF-BLNP.
Affiliation(s)
- Guo-Bo Xie
  - School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Rui-Bin Chen
  - School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhi-Yi Lin
  - School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guo-Sheng Gu
  - School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Jun-Rui Yu
  - School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhen-Guo Liu
  - Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China
- Ji Cui
  - Department of Gastrointestinal Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China
- Lie-Qing Lin
  - Center of Campus Network & Modern Educational Technology, Guangdong University of Technology, Guangzhou, 510000, China
- Lang-Cheng Chen
  - Center of Campus Network & Modern Educational Technology, Guangdong University of Technology, Guangzhou, 510000, China
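The bidirectional label propagation summarized in the SSMF-BLNP abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked in the abstract). It assumes a generic propagation update F ← αWF + (1 − α)Y, uses row-normalized similarity in place of linear-neighborhood weights, and simply averages the two directions. All function names, the `alpha` value, and the toy matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def propagate(sim, labels, alpha=0.5, iters=50):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y, W = row-normalized sim."""
    w = row_normalize(sim)
    f = [row[:] for row in labels]
    for _ in range(iters):
        wf = matmul(w, f)
        f = [[alpha * wf[i][j] + (1 - alpha) * labels[i][j]
              for j in range(len(labels[0]))]
             for i in range(len(labels))]
    return f


def bidirectional_scores(lnc_sim, dis_sim, assoc, alpha=0.5, iters=50):
    """Average the scores propagated from the lncRNA side and the disease side."""
    from_lnc = propagate(lnc_sim, assoc, alpha, iters)
    from_dis = transpose(propagate(dis_sim, transpose(assoc), alpha, iters))
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(from_lnc, from_dis)]


# Toy data (assumed): 3 lncRNAs, 2 diseases; lncRNA 1 has no known association.
lnc_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
dis_sim = [[1.0, 0.2], [0.2, 1.0]]
assoc = [[1, 0], [0, 0], [0, 1]]
scores = bidirectional_scores(lnc_sim, dis_sim, assoc)
```

In this toy case lncRNA 1 is most similar to lncRNA 0, so it receives a higher score for disease 0 than for disease 1.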
2
Silva ABOV, Spinosa EJ. Graph Convolutional Auto-Encoders for Predicting Novel lncRNA-Disease Associations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2264-2271. [PMID: 33819159 DOI: 10.1109/tcbb.2021.3070910] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/12/2023]
Abstract
LncRNAs are intermediate molecules that participate in a wide variety of biological processes in humans, such as gene expression control and X-chromosome inactivation. Numerous studies have associated lncRNAs with a wide range of diseases, such as breast cancer, leukemia, and many other conditions. In this work, we propose a graph-based method named PANDA. This method treats the prediction of new associations between lncRNAs and diseases as a link prediction problem in a graph. We start by building a heterogeneous graph that contains the known associations between lncRNAs and diseases and additional information such as gene expression levels and symptoms of diseases. We then use a Graph Auto-encoder to learn the representation of the nodes' features and edges, and finally apply a Neural Network to predict potentially interesting novel edges. The experimental results indicate that PANDA achieved a 0.976 AUC-ROC, surpassing state-of-the-art methods for the same problem, showing that PANDA could be a promising approach to generate embeddings to predict potentially novel lncRNA-disease associations.
3
Chen M, Deng Y, Li A, Tan Y. Inferring Latent Disease-lncRNA Associations by Label-Propagation Algorithm and Random Projection on a Heterogeneous Network. Front Genet 2022; 13:798632. [PMID: 35186029 PMCID: PMC8854791 DOI: 10.3389/fgene.2022.798632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 01/18/2022] [Indexed: 11/13/2022] Open
Abstract
Long noncoding RNA (lncRNA), a type of non-coding RNA longer than 200 nucleotides, is related to various complex diseases. Precisely identifying potential lncRNA–disease associations is important for understanding disease pathogenesis, developing new drugs, and designing individualized diagnosis and treatment methods for different human diseases. Compared with complex and costly biological experiments, computational methods can quickly and effectively predict potential lncRNA–disease associations. Thus, developing computational methods for lncRNA-disease prediction is a promising avenue. However, owing to the low prediction accuracy of state-of-the-art methods, accurately and effectively identifying lncRNA-disease associations remains very challenging. This article proposes an integrated method called LPARP, which is based on a label-propagation algorithm and random projection, to address the issue. Specifically, the label-propagation algorithm is first used to obtain the estimated scores of lncRNA–disease associations, and random projections are then used to accurately predict disease-related lncRNAs. The empirical experiments showed that LPARP achieved good prediction performance on three gold datasets and was superior to existing state-of-the-art prediction methods. It can also be used to predict isolated diseases and new lncRNAs. Case studies of bladder cancer, esophageal squamous-cell carcinoma, and colorectal cancer further support the reliability of the method. The proposed LPARP algorithm can predict potential lncRNA–disease interactions stably and effectively with fewer data. LPARP can be used as an effective and reliable tool for biomedical research.
4
Zhang Y, Chen M, Huang L, Xie X, Li X, Jin H, Wang X, Wei H. Fusion of KATZ measure and space projection to fast probe potential lncRNA-disease associations in bipartite graphs. PLoS One 2021; 16:e0260329. [PMID: 34807960 PMCID: PMC8608294 DOI: 10.1371/journal.pone.0260329] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/12/2021] [Accepted: 11/06/2021] [Indexed: 11/19/2022] Open
Abstract
It is well known that numerous long noncoding RNAs (lncRNAs) are closely related to the physiological and pathological processes of human diseases and can serve as potential biomarkers. Therefore, lncRNA-disease associations identified by computational methods can serve as targeted candidates, reducing the cost of follow-up biological experiments. However, because of inherent problems such as inaccurate construction of similarity networks and inadequate numbers of known lncRNA–disease associations, many mature computational methods developed over the years still have limitations. This motivated us to explore a new computational method that fuses the KATZ measure with space projection to quickly probe potential lncRNA-disease associations (namely KATZSP). KATZSP comprises the following key steps: combining all the global information to change the Boolean network of known lncRNA–disease associations into weighted networks; changing the similarity calculation into counting the number of walks that connect lncRNA nodes and disease nodes in bipartite graphs; and obtaining the space projection scores to refine the primary prediction scores. The process of fusing the KATZ measure and space projection is simple, requiring only one attenuation factor. The leave-one-out cross validation (LOOCV) experimental results showed that, compared with other state-of-the-art methods (NCPLDA, LDAI-ISPS and IIRWR), KATZSP had a higher predictive accuracy, as measured by the area-under-the-curve (AUC) value, on the three datasets built, and that KATZSP worked well for inferring potential associations related to new lncRNAs (or isolated diseases). The results from real case studies (such as pancreas cancer, lung cancer and colorectal cancer) further confirmed that KATZSP has superior predictive ability and can be applied as a guide for traditional biological experiments.
Affiliation(s)
- Yi Zhang
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
  - Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin, China
- Min Chen
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, China
  - The Future Laboratory, Tsinghua University, Beijing, China
- Xiaolan Xie
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Xin Li
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Hong Jin
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Xiaohua Wang
  - Pharmacy School, Guilin Medical University, Guilin, China
- Hanyan Wei
  - Pharmacy School, Guilin Medical University, Guilin, China
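The walk-counting idea behind the KATZ measure mentioned in the KATZSP abstract can be sketched as follows. This is a generic, truncated Katz score on a plain bipartite association matrix, not the KATZSP method itself: the weighted networks and the space projection step described in the abstract are omitted, and the function name, the attenuation value `beta`, and the toy matrix are assumptions. In a bipartite graph only odd-length walks connect an lncRNA to a disease, so the score sums `beta**k` times the number of walks of length k for k = 1, 3, ...

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def katz_bipartite(assoc, beta=0.1, max_len=3):
    """Truncated Katz score between lncRNAs (rows) and diseases (columns)."""
    # disease -> lncRNA -> disease: extends any walk by two edges
    two_step = matmul(transpose(assoc), assoc)
    walks = [row[:] for row in assoc]  # walk counts of length 1
    scores = [[0.0] * len(assoc[0]) for _ in assoc]
    k = 1
    while k <= max_len:
        for i, row in enumerate(walks):
            for j, count in enumerate(row):
                scores[i][j] += (beta ** k) * count
        walks = matmul(walks, two_step)  # walk counts of length k + 2
        k += 2
    return scores


# Toy data (assumed): 3 lncRNAs x 2 diseases; lncRNA 2 has no known association.
assoc = [[1, 1], [0, 1], [0, 0]]
scores = katz_bipartite(assoc, beta=0.1, max_len=3)
```

Here lncRNA 1 has no direct link to disease 0, but one walk of length 3 (through disease 1 and lncRNA 0) gives it a small nonzero score, while the isolated lncRNA 2 scores zero everywhere. That zero is why a method of this kind needs additional information, such as similarity networks, to handle new lncRNAs.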
5
ICLRBBN: a tool for accurate prediction of potential lncRNA disease associations. MOLECULAR THERAPY-NUCLEIC ACIDS 2020; 23:501-511. [PMID: 33510939 PMCID: PMC7806946 DOI: 10.1016/j.omtn.2020.12.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/02/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]
Abstract
Growing evidence has elucidated that long non-coding RNAs (lncRNAs) are involved in a variety of complex diseases in human bodies. In recent years, it has become a hot topic to develop effective computational models to identify potential lncRNA-disease associations. In this article, a novel method called ICLRBBN (Internal Confidence-Based Local Radial Basis Biological Network) is proposed to detect potential lncRNA-disease associations by adopting an internal confidence-based radial basis biological network. In ICLRBBN, a novel internal confidence-based collaborative filtering recommendation algorithm was designed first to mine hidden features between lncRNAs and diseases, which guarantees that ICLRBBN can be more effectively applied to predict new diseases. Then, a unique three-layer local radial basis function network consisting of diseases and lncRNAs was constructed, based on which the association probability between diseases and lncRNAs was calculated by combining different characteristics of lncRNAs with local information of diseases. Finally, we compared ICLRBBN with 6 state-of-the-art methods based on two different validation frameworks. Simulation results showed that area under the receiver operating characteristic curve (AUC) values achieved by ICLRBBN outperformed all competing methods. Furthermore, case studies illustrated that ICLRBBN has a promising future as a powerful tool in the practical application of lncRNA-disease association prediction. A web service for prediction of potential lncRNA-disease associations is available at http://leelab2997.cn/.
6
Lei X, Zhang C, Wang Y. Predicting Metabolite-Disease Associations Based on Spy Strategy and ABC Algorithm. Front Mol Biosci 2020; 7:603121. [PMID: 33344506 PMCID: PMC7747351 DOI: 10.3389/fmolb.2020.603121] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/05/2020] [Accepted: 10/08/2020] [Indexed: 12/12/2022] Open
Abstract
In recent years, latent metabolite-disease associations have been a significant focus in the biomedical domain, and a growing body of experimental evidence indicates that metabolites correlate with the diagnosis of complex human diseases. Several computational methods have been developed to detect potential metabolite-disease associations. In this article, we propose a novel method based on the spy strategy and an artificial bee colony (ABC) algorithm for metabolite-disease association prediction (SSABCMDA). Because many true associations are missing among unconfirmed metabolite-disease pairs, the spy strategy is adopted to extract reliable negative samples from unconfirmed pairs. Considering the effects of parameters, the ABC algorithm is utilized to optimize them. In relevant cross-validation experiments, our method achieves excellent predictive performance. Moreover, three types of case studies are conducted on three common diseases to demonstrate the validity and utility of the SSABCMDA method. Relevant experimental results indicate that our method can predict potential associations between metabolites and diseases effectively.
Affiliation(s)
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Cheng Zhang
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yueyue Wang
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
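The spy strategy mentioned in the SSABCMDA abstract comes from positive-unlabeled learning, and its general shape can be sketched as below. The abstract does not state which classifier or threshold the authors use, so everything here is an assumption: a few known positives ("spies") are hidden among the unlabeled pairs, a scorer is fitted with the unlabeled set treated as negative, and unlabeled items that score below every spy are kept as reliable negatives. The `centroid_scorer` is a hypothetical stand-in for a real classifier, and the function needs at least two positives so that some remain after the spies are removed.

```python
import random


def centroid_scorer(pos, neg):
    """Hypothetical scorer: higher means closer to the positive centroid."""
    def mean(points):
        return [sum(c) / len(points) for c in zip(*points)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cp, cn = mean(pos), mean(neg)
    return lambda x: dist2(x, cn) - dist2(x, cp)


def spy_reliable_negatives(positives, unlabeled, fit_scorer,
                           spy_fraction=0.2, seed=0):
    """Return unlabeled items that score below every hidden spy."""
    rng = random.Random(seed)
    n_spies = max(1, int(len(positives) * spy_fraction))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]
    # Fit with the remaining positives vs. unlabeled + spies treated as negative.
    score = fit_scorer(kept, unlabeled + spies)
    # Assumed threshold: the lowest score obtained by any spy.
    threshold = min(score(s) for s in spies)
    return [u for u in unlabeled if score(u) < threshold]


# Toy feature vectors (assumed) for metabolite-disease pairs.
positives = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (1.0, 0.8), (0.8, 1.0)]
unlabeled = [(0.95, 1.0), (0.0, 0.0), (0.1, -0.1), (-0.2, 0.1)]
reliable_negatives = spy_reliable_negatives(positives, unlabeled, centroid_scorer)
```

The three unlabeled points far from the positives are selected as reliable negatives. Whether the point near the positives is excluded depends on which positive is drawn as the spy.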
7
Yan C, Zhang Z, Bao S, Hou P, Zhou M, Xu C, Sun J. Computational Methods and Applications for Identifying Disease-Associated lncRNAs as Potential Biomarkers and Therapeutic Targets. MOLECULAR THERAPY. NUCLEIC ACIDS 2020; 21:156-171. [PMID: 32585624 PMCID: PMC7321789 DOI: 10.1016/j.omtn.2020.05.018] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/13/2020] [Revised: 04/06/2020] [Accepted: 05/18/2020] [Indexed: 12/12/2022]
Abstract
Long non-coding RNAs (lncRNAs) have been recognized as critical components of a broad genomic regulatory network and play pivotal roles in physiological and pathological processes. Identification of disease-associated lncRNAs is becoming increasingly crucial for fundamentally improving our understanding of molecular mechanisms of disease and developing novel biomarkers and therapeutic targets. Considering lower efficiency and higher time and labor cost of biological experiments, computer-aided inference of disease-associated RNAs has become a promising avenue for facilitating the study of lncRNA functions and provides complementary value for experimental studies. In this study, we first summarize data and knowledge resources publicly available for the study of lncRNA-disease associations. Then, we present an updated systematic overview of dozens of computational methods and models for inferring lncRNA-disease associations proposed in recent years. Finally, we explore the perspectives and challenges for further studies. Our study provides a guide for biologists and medical scientists to look for dedicated resources and more competent tools for accelerating the unraveling of disease-associated lncRNAs.
Affiliation(s)
- Congcong Yan
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Zicheng Zhang
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Siqi Bao
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Ping Hou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Meng Zhou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Chongyong Xu
  - Department of Radiology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou 325027, P.R. China
- Jie Sun
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
8
A random forest based computational model for predicting novel lncRNA-disease associations. BMC Bioinformatics 2020; 21:126. [PMID: 32216744 PMCID: PMC7099795 DOI: 10.1186/s12859-020-3458-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 12/19/2019] [Accepted: 03/18/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Accumulated evidence shows that the abnormal regulation of long non-coding RNA (lncRNA) is associated with various human diseases. Accurately identifying disease-associated lncRNAs helps in studying the mechanisms of lncRNAs in diseases and in exploring new therapies. Many lncRNA-disease association (LDA) prediction models have been implemented by integrating multiple kinds of data resources. However, most of the existing models ignore the interference of noisy and redundant information among these data resources. RESULTS: To improve the ability of LDA prediction models, we implemented a random forest and feature selection based LDA prediction model (RFLDA for short). First, RFLDA integrates the experiment-supported miRNA-disease associations (MDAs) and LDAs, the disease semantic similarity (DSS), the lncRNA functional similarity (LFS) and the lncRNA-miRNA interactions (LMI) as input features. Then, RFLDA chooses the most useful features for training the prediction model by feature selection based on the random forest variable importance score, which takes into account not only the effect of individual features on prediction results but also the joint effects of multiple features. Finally, a random forest regression model is trained to score potential lncRNA-disease associations. With an area under the receiver operating characteristic curve (AUC) of 0.976 and an area under the precision-recall curve (AUPR) of 0.779 under 5-fold cross-validation, RFLDA performs better than several state-of-the-art LDA prediction models. Moreover, case studies on three cancers demonstrate that 43 of the 45 lncRNAs predicted by RFLDA are validated by experimental data, and the other two predicted lncRNAs are supported by other LDA prediction models. CONCLUSIONS: Cross-validation and case studies indicate that RFLDA has excellent ability to identify potential disease-associated lncRNAs.
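Several abstracts in this list, including the RFLDA entry, report performance as the area under the ROC curve (AUC). As a reminder of what that number measures, here is a minimal sketch using the rank-based definition: the AUC equals the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair, with ties counting one half. The function name and toy values are assumptions, and this is not code from any of the cited papers.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (assumed): 1 = known association, 0 = unconfirmed pair.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 positive-negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of samples, so it suits small illustrations rather than full cross-validation runs.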
9
Zhang Y, Chen M, Li A, Cheng X, Jin H, Liu Y. LDAI-ISPS: LncRNA-Disease Associations Inference Based on Integrated Space Projection Scores. Int J Mol Sci 2020; 21:E1508. [PMID: 32098405 PMCID: PMC7073162 DOI: 10.3390/ijms21041508] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/31/2019] [Revised: 02/18/2020] [Accepted: 02/19/2020] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (long ncRNAs, lncRNAs) of all kinds have been implicated in a range of cell developmental processes and diseases, although they are not translated into proteins. Inferring disease-associated lncRNAs by computational methods can help in understanding the pathogenesis of diseases, but current computational methods have not yet achieved remarkable predictive performance, owing to problems such as inaccurate construction of similarity networks and inadequate numbers of known lncRNA-disease associations. In this research, we proposed a method for lncRNA-disease association inference based on integrated space projection scores (LDAI-ISPS), composed of the following key steps: changing the Boolean network of known lncRNA-disease associations into weighted networks by combining all the global information (e.g., disease semantic similarities, lncRNA functional similarities, and known lncRNA-disease associations); and obtaining the space projection scores via vector projections of the weighted networks to form the final prediction scores without biases. The leave-one-out cross validation (LOOCV) results showed that, compared with other methods, LDAI-ISPS had a higher accuracy, with an area-under-the-curve (AUC) value of 0.9154 for inferring diseases, 0.8865 for inferring new lncRNAs (whose associations with diseases are unknown), and 0.7518 for inferring isolated diseases (whose associations with lncRNAs are unknown). A case study also confirmed the predictive performance of LDAI-ISPS as an aid to traditional biological experiments in inferring potential lncRNA-disease associations and isolated diseases.
Affiliation(s)
- Yi Zhang
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Min Chen
  - Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Ang Li
  - Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Xiaohui Cheng
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Hong Jin
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Yarong Liu
  - School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China