1
Wu J, Zhao X, He Y, Pan B, Lai J, Ji M, Li S, Huang J, Han J. IDMIR: identification of dysregulated miRNAs associated with disease based on a miRNA-miRNA interaction network constructed through gene expression data. Brief Bioinform 2024; 25:bbae258. [PMID: 38801703 PMCID: PMC11129766 DOI: 10.1093/bib/bbae258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/10/2024] [Accepted: 05/15/2024] [Indexed: 05/29/2024]
Abstract
Micro ribonucleic acids (miRNAs) play a pivotal role in governing the human transcriptome in various biological phenomena. Hence, the accumulation of miRNA expression dysregulation frequently assumes a noteworthy role in the initiation and progression of complex diseases. However, accurate identification of dysregulated miRNAs still faces challenges at the current stage. Several bioinformatics tools have recently emerged for forecasting the associations between miRNAs and diseases. Nonetheless, the existing reference tools mainly identify the miRNA-disease associations in a general state and fall short of pinpointing dysregulated miRNAs within a specific disease state. Additionally, no studies adequately consider miRNA-miRNA interactions (MMIs) when analyzing the miRNA-disease associations. Here, we introduced a systematic approach, called IDMIR, which enabled the identification of expression dysregulated miRNAs through an MMI network under the gene expression context, where the network's architecture was designed to implicitly connect miRNAs based on their shared biological functions within a particular disease context. The advantage of IDMIR is that it uses gene expression data for the identification of dysregulated miRNAs by analyzing variations in MMIs. We illustrated the excellent predictive power for dysregulated miRNAs of the IDMIR approach through data analysis on breast cancer and bladder urothelial cancer. IDMIR could surpass several existing miRNA-disease association prediction approaches through comparison. We believe the approach complements the deficiencies in predicting miRNA-disease association and may provide new insights and possibilities for diagnosing and treating diseases. The IDMIR approach is now available as a free R package on CRAN (https://CRAN.R-project.org/package=IDMIR).
Affiliation(s)
- Jiashuo Wu
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Xilong Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Yalan He
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Bingyue Pan
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Jiyin Lai
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Miao Ji
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Siyuan Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Junling Huang
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Junwei Han
  - College of Bioinformatics Science and Technology, Harbin Medical University, No. 157 Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
2
Wang W, Han P, Li Z, Nie R, Wang K, Wang L, Liao H. LMGATCDA: Graph Neural Network With Labeling Trick for Predicting circRNA-Disease Associations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:289-300. [PMID: 38231821 DOI: 10.1109/tcbb.2024.3355093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Previous studies have shown that circular RNAs (circRNAs) are inextricably connected to the etiology and pathophysiology of complex diseases. Since conventional biological research is frequently small-scale, expensive, and time-consuming, it is essential to establish an efficient and reasonable computation-based method to identify disease-related circRNAs. In this article, we proposed a novel ensemble model for predicting probable circRNA-disease associations based on multi-source similarity information (LMGATCDA). In particular, LMGATCDA first incorporates information on circRNA functional similarity, disease semantic similarity, and the Gaussian interaction profile (GIP) kernel similarity as explicit features, along with node-labeling of the three-hop subgraphs extracted from each linked target node as graph structural features. After that, the fused features are used as input, and further implicit features are extracted by graph sampling aggregation (GraphSAGE) and a multi-hop attention graph neural network (MAGNA). Finally, the prediction scores are obtained through a fully connected layer. With five-fold cross-validation, LMGATCDA demonstrated excellent competitiveness against gold standard data, reaching 95.37% accuracy and 91.31% recall with an AUC of 94.25% on the circR2Disease benchmark dataset. Collectively, the noteworthy findings from these case studies support our conclusion that the LMGATCDA model can provide reliable circRNA-disease associations for clinical research while helping to mitigate experimental uncertainties in wet-lab investigations.
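The abstract above names Gaussian interaction profile (GIP) kernel similarity as one of its explicit features. As a minimal sketch of the commonly used GIP definition (not the LMGATCDA code itself), the similarity between two rows of a binary association matrix can be computed as exp(-gamma * squared distance), with gamma normalized by the mean squared norm of the rows. The function name `gip_kernel` and the parameter `gamma_prime` are illustrative assumptions.

```python
import math


def gip_kernel(assoc, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a 0/1 association matrix.

    Hypothetical helper for illustration; `assoc` is a list of equal-length rows.
    """
    n = len(assoc)
    # The bandwidth is normalized by the mean squared norm of the interaction profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in assoc) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(assoc[i], assoc[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Toy example: rows are circRNAs, columns are diseases (made-up values).
toy_assoc = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
toy_sim = gip_kernel(toy_assoc)
```

Rows with identical interaction profiles get similarity 1, and the similarity decays as the profiles diverge. Applying the same function to the transposed matrix would give the disease-side kernel.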
3
Yao HB, Hou ZJ, Zhang WG, Li H, Chen Y. Prediction of MicroRNA-Disease Potential Association Based on Sparse Learning and Multilayer Random Walks. J Comput Biol 2024; 31:241-256. [PMID: 38377572 DOI: 10.1089/cmb.2023.0266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
More and more studies have shown that microRNAs (miRNAs) play an indispensable role in the study of complex diseases in humans. Traditional biological experiments to detect miRNA-disease associations are expensive and time-consuming. Therefore, it is necessary to propose efficient and meaningful computational models to predict miRNA-disease associations. In this study, we aim to propose a miRNA-disease association prediction model based on sparse learning and multilayer random walks (SLMRWMDA). The miRNA-disease association matrix is decomposed and reconstructed by the sparse learning method to obtain richer association information, and at the same time, the initial probability matrix for the random walk with restart algorithm is obtained. The disease similarity network, miRNA similarity network, and miRNA-disease association network are used to construct heterogeneous networks, and the stable probability is obtained based on the topological structure features of diseases and miRNAs through a multilayer random walk algorithm to predict miRNA-disease potential association. The experimental results show that the prediction accuracy of this model is significantly improved compared with the previous related models. We evaluated the model using global leave-one-out cross-validation (global LOOCV) and fivefold cross-validation (5-fold CV). The area under the curve (AUC) value for the LOOCV is 0.9368. The mean AUC value for 5-fold CV is 0.9335 and the variance is 0.0004. In the case study, the results show that SLMRWMDA is effective in inferring the potential association of miRNA-disease.
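The abstract above relies on the random walk with restart algorithm. As a hedged illustration, here is a plain single-network random walk with restart, iterating p = (1 - r) * W * p + r * p0 on a column-normalized adjacency matrix. It is a generic sketch, not the multilayer heterogeneous-network walk of SLMRWMDA; the function name, the default restart probability, and the toy graph are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seeds`.

    Hypothetical helper for illustration; `adj` is a square list-of-lists
    adjacency matrix and `seeds` is a list of node indices.
    """
    if not seeds:
        raise ValueError("at least one seed node is required")
    n = len(adj)
    # Column-normalize so that each non-empty column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The initial distribution puts equal mass on every seed node.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 with the walk seeded at node 0.
scores = random_walk_with_restart([[0, 1, 0], [1, 0, 1], [0, 1, 0]], seeds=[0])
```

Nodes closer to the seed receive higher steady-state probability, which is what makes the scores usable for ranking candidate associations.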
Affiliation(s)
- Hai-Bin Yao
  - Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Zhen-Jie Hou
  - Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Wen-Guang Zhang
  - Life Sciences, Inner Mongolia Agricultural University, Hohhot, China
- Han Li
  - Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Yan Chen
  - Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
4
Ergün S, Sankaranarayanan R, Petrović N. Clinically informative microRNAs for SARS-CoV-2 infection. Epigenomics 2023; 15:705-716. [PMID: 37661862 PMCID: PMC10476648 DOI: 10.2217/epi-2023-0179] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/23/2023] [Accepted: 08/07/2023] [Indexed: 09/05/2023]
Abstract
COVID-19 is a viral respiratory infection caused by the newly discovered coronavirus SARS-CoV-2. miRNA is an example of a strong and direct regulator of a gene's transcriptional activity. The interaction between miRNAs and their target molecules is responsible for homeostasis. During SARS-CoV-2 infection, virus-derived and host-derived miRNAs are involved in evading immune system cells and in inducing the inflammatory reaction through interplay with associated genes. Research on miRNAs has improved the understanding of the mechanisms and pathophysiology of SARS-CoV-2 infection. In this review, the effects and biological roles of miRNAs in SARS-CoV-2 pathogenicity and life cycle are described. The therapeutic potential of miRNAs against SARS-CoV-2 infection is also discussed.
Affiliation(s)
- Sercan Ergün
  - Department of Medical Biology, Faculty of Medicine, Ondokuz Mayis University, Samsun, Turkey
  - Department of Multidisciplinary Molecular Medicine, Institute of Graduate Studies, Ondokuz Mayis University, Samsun, Turkey
- Nina Petrović
  - Laboratory for Radiobiology & Molecular Genetics, Department of Health & Environment, ‘VINČA’ Institute of Nuclear Sciences – National Institute of the Republic of Serbia, University of Belgrade, Mike Petrovića Alasa 12–14, Belgrade, 11001, Serbia
  - Department of Experimental Oncology, Institute for Oncology & Radiology of Serbia, Pasterova 14, Belgrade, 11000, Serbia
5
Liao Q, Ye Y, Li Z, Chen H, Zhuo L. Prediction of miRNA-disease associations in microbes based on graph convolutional networks and autoencoders. Front Microbiol 2023; 14:1170559. [PMID: 37187536 PMCID: PMC10175670 DOI: 10.3389/fmicb.2023.1170559] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/21/2023] [Accepted: 03/21/2023] [Indexed: 05/17/2023]
Abstract
MicroRNAs (miRNAs) are short RNA molecules that regulate gene expression by targeting and inhibiting the expression of specific RNAs. Because miRNAs affect many diseases in microbial ecology, it is necessary to predict their associations with diseases at the microbial level. To this end, we propose a novel model, termed GCNA-MDA, in which a dual autoencoder and a graph convolutional network (GCN) are integrated to predict miRNA-disease associations. The proposed method leverages autoencoders to extract robust representations of miRNAs and diseases, and at the same time exploits the GCN to capture the topological information of miRNA-disease networks. To alleviate the impact of insufficient information in the original data, the association similarity and feature similarity data are combined to calculate a more complete initial vector for each node. The experimental results on the benchmark datasets demonstrate that, compared with existing representative methods, the proposed method achieves superior performance, with precision reaching up to 0.8982. These results demonstrate that the proposed method can serve as a tool for exploring miRNA-disease associations in microbial environments.
Affiliation(s)
- Qingquan Liao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Yuxiang Ye
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
- Zihang Li
  - School of Computing and Data Science, Xiamen University Malaysia, Sepang, Selangor, Malaysia
- Hao Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
  - Correspondence: Hao Chen
- Linlin Zhuo
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
6
Wang Y, Xiang J, Liu C, Tang M, Hou R, Bao M, Tian G, He J, He B. Drug repositioning for SARS-CoV-2 by Gaussian kernel similarity bilinear matrix factorization. Front Microbiol 2022; 13:1062281. [PMID: 36545200 PMCID: PMC9762482 DOI: 10.3389/fmicb.2022.1062281] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2022] [Accepted: 11/21/2022] [Indexed: 12/12/2022]
Abstract
Coronavirus disease 2019 (COVID-19), a disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is currently spreading rapidly around the world. Since SARS-CoV-2 seriously threatens human life and health as well as the development of the world economy, it is urgent to identify effective drugs against this virus. However, traditional methods to develop new drugs are costly and time-consuming, which makes drug repositioning a promising direction for this purpose. In this study, we collected known antiviral drugs to form five virus-drug association datasets, and then explored drug repositioning for SARS-CoV-2 by Gaussian kernel similarity bilinear matrix factorization (VDA-GKSBMF). By 5-fold cross-validation, we found that VDA-GKSBMF has area under the curve (AUC) values of 0.8851, 0.8594, 0.8807, 0.8824, and 0.8804, respectively, on the five datasets, which are higher than those of other state-of-the-art algorithms on four of the datasets. Based on known virus-drug association data, we used VDA-GKSBMF to prioritize the top-k candidate antiviral drugs that are most likely to be effective against SARS-CoV-2. We confirmed that the top-10 drugs on the five datasets can be molecularly docked with the viral spike protein/human ACE2 by AutoDock. Among them, four antiviral drugs (ribavirin, remdesivir, oseltamivir, and zidovudine) have been under clinical trials or are supported by recent literature. The results suggest that VDA-GKSBMF is an effective algorithm for identifying potential antiviral drugs against SARS-CoV-2.
Affiliation(s)
- Yibai Wang
  - School of Information Engineering, Changsha Medical University, Changsha, China
- Ju Xiang
  - School of Information Engineering, Changsha Medical University, Changsha, China
  - Academician Workstation, Changsha Medical University, Changsha, China
  - Correspondence: Ju Xiang
- Cuicui Liu
  - School of Information Engineering, Changsha Medical University, Changsha, China
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu, China
- Rui Hou
  - Geneis (Beijing) Co., Ltd., Beijing, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Meihua Bao
  - School of Pharmacy, Changsha Medical University, Changsha, China
  - Key Laboratory Breeding Base of Hunan Oriented Fundamental and Applied Research of Innovative Pharmaceutics, Changsha Medical University, Changsha, China
- Geng Tian
  - Geneis (Beijing) Co., Ltd., Beijing, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Jianjun He
  - Academician Workstation, Changsha Medical University, Changsha, China
  - School of Pharmacy, Changsha Medical University, Changsha, China
  - Key Laboratory Breeding Base of Hunan Oriented Fundamental and Applied Research of Innovative Pharmaceutics, Changsha Medical University, Changsha, China
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
  - School of Pharmacy, Changsha Medical University, Changsha, China
  - Key Laboratory Breeding Base of Hunan Oriented Fundamental and Applied Research of Innovative Pharmaceutics, Changsha Medical University, Changsha, China
7
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022]
Abstract
Currently, there exist no generally accepted strategies of evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross validations and case studies seem to be must-have procedures, the value of K, the evaluation metrics, and the choice of query diseases as well as the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined on a case-by-case basis and depending on the researchers' choices. In the current review, we include a comprehensive analysis on how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
8
Cao B, Li R, Xiao S, Deng S, Zhou X, Zhou L. Predicting miRNA-disease association through combining miRNA function and network topological similarities based on MINE. iScience 2022; 25:105299. [DOI: 10.1016/j.isci.2022.105299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 07/08/2022] [Accepted: 09/28/2022] [Indexed: 11/16/2022]
9
Chen J, Lin J, Hu Y, Ye M, Yao L, Wu L, Zhang W, Wang M, Deng T, Guo F, Huang Y, Zhu B, Wang D. RNADisease v4.0: an updated resource of RNA-associated diseases, providing RNA-disease analysis, enrichment and prediction. Nucleic Acids Res 2022; 51:D1397-D1404. [PMID: 36134718 PMCID: PMC9825423 DOI: 10.1093/nar/gkac814] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 07/04/2022] [Revised: 09/06/2022] [Accepted: 09/09/2022] [Indexed: 02/06/2023]
Abstract
Numerous studies have shown that RNA plays an important role in the occurrence and development of diseases, and RNA-disease associations are not limited to noncoding RNAs in mammals but also exist for protein-coding RNAs. Furthermore, RNA-associated diseases are found across species including plants and nonmammals. To better analyze diseases at the RNA level and facilitate researchers in exploring the pathogenic mechanism of diseases, we decided to update and change MNDR v3.0 to RNADisease v4.0, a repository for RNA-disease association (http://www.rnadisease.org/ or http://www.rna-society.org/mndr/). Compared to the previous version, new features include: (i) expanded data sources and categories of species, RNA types, and diseases; (ii) the addition of a comprehensive analysis of RNAs from thousands of high-throughput sequencing data of cancer samples and normal samples; (iii) the addition of an RNA-disease enrichment tool and (iv) the addition of four RNA-disease prediction tools. In summary, RNADisease v4.0 provides a comprehensive and concise data resource of RNA-disease associations which contains a total of 3 428 058 RNA-disease entries covering 18 RNA types, 117 species and 4090 diseases to meet the needs of biological research and lay the foundation for future therapeutic applications of diseases.
Affiliation(s)
- Le Wu
  - Department of Bioinformatics, Guangdong Province Key Laboratory of Molecular Tumor Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Wenhai Zhang
  - Department of Bioinformatics, Guangdong Province Key Laboratory of Molecular Tumor Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Meiyi Wang
  - Department of Bioinformatics, Guangdong Province Key Laboratory of Molecular Tumor Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Tingting Deng
  - Department of Bioinformatics, Guangdong Province Key Laboratory of Molecular Tumor Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Feng Guo
  - School of Medicine, Tsinghua University, Beijing 100084, China
- Yan Huang
  - Cancer Research Institute, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Bofeng Zhu
  - Correspondence may also be addressed to Bofeng Zhu. Tel: +86 20 61648787; Fax: +86 20 61648787;
- Dong Wang
  - To whom correspondence should be addressed. Tel: +86 20 61648279; Fax: +86 20 61648279;
10
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022]
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources yields a more comprehensive research perspective, but poses a challenge to algorithm design for generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce the taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
11
Ma M, Na S, Zhang X, Chen C, Xu J. SFGAE: a self-feature-based graph autoencoder model for miRNA-disease associations prediction. Brief Bioinform 2022; 23:6678419. [PMID: 36037084 DOI: 10.1093/bib/bbac340] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/14/2022] [Revised: 07/21/2022] [Accepted: 07/25/2022] [Indexed: 11/13/2022]
Abstract
Increasing evidence has suggested that microRNAs (miRNAs) are important biomarkers of various diseases. Numerous graph neural network (GNN) models have been proposed for predicting miRNA-disease associations. However, the existing GNN-based methods have an over-smoothing issue: the learned feature embeddings of miRNA nodes and disease nodes become indistinguishable when multiple GNN layers are stacked. This issue makes the performance of the methods sensitive to the number of layers, and significantly hurts the performance when more layers are employed. In this study, we resolve this issue with a novel self-feature-based graph autoencoder model, abbreviated as SFGAE. The key novelty of SFGAE is to construct miRNA-self embeddings and disease-self embeddings, and let them be independent of graph interactions between the two types of nodes. The novel self-feature embeddings enrich the information of typical aggregated feature embeddings, which aggregate the information from direct neighbors and hence heavily rely on graph interactions. SFGAE adopts a graph encoder with an attention mechanism to concatenate aggregated feature embeddings and self-feature embeddings, and adopts a bilinear decoder to predict links. Our experiments show that SFGAE achieves state-of-the-art performance. In particular, SFGAE improves the average AUC upon recent GAEMDA [1] on the benchmark datasets HMDD v2.0 and HMDD v3.2, and consistently performs better when fewer (e.g. 10%) training samples are used. Furthermore, SFGAE effectively overcomes the over-smoothing issue and performs stably well on deeper models (e.g. eight layers). Finally, we carry out case studies on three human diseases, colon neoplasms, esophageal neoplasms and kidney neoplasms, and perform a survival analysis using kidney neoplasm as an example. The results suggest that SFGAE is a reliable tool for predicting potential miRNA-disease associations.
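The abstract above mentions a bilinear decoder for link prediction. A minimal sketch of the general idea scores a miRNA-disease pair as sigmoid(z_m^T W z_d), where z_m and z_d are node embeddings and W is a learned weight matrix. The sigmoid output, the function name `bilinear_score`, and the toy values are assumptions for illustration and are not taken from the SFGAE implementation.

```python
import math


def bilinear_score(z_mirna, weight, z_disease):
    """Link probability from a bilinear form followed by a sigmoid.

    Hypothetical helper for illustration; `weight` is a list of rows whose shape is
    len(z_mirna) x len(z_disease).
    """
    # First compute W z_d, then take the dot product with z_m.
    projected = [sum(w * d for w, d in zip(row, z_disease)) for row in weight]
    logit = sum(m * p for m, p in zip(z_mirna, projected))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy example with an identity weight matrix (made-up embeddings).
identity = [[1.0, 0.0], [0.0, 1.0]]
aligned = bilinear_score([1.0, 0.0], identity, [1.0, 0.0])
orthogonal = bilinear_score([1.0, 0.0], identity, [0.0, 1.0])
```

With an identity weight matrix the decoder reduces to a sigmoid of the dot product, so aligned embeddings score above 0.5 and orthogonal ones score exactly 0.5. In a trained model, W would be learned jointly with the encoder.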
Affiliation(s)
- Mingyuan Ma
  - Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
- Sen Na
  - International Computer Science Institute and Department of Statistics, University of California, Berkeley, Berkeley CA, USA
- Xiaolu Zhang
  - Department of Information Systems, City University of Hong Kong, Hong Kong, China
- Congzhou Chen
  - Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
- Jin Xu
  - Key Laboratory of High Confidence Software Technologies of Ministry of Education, School of Computer Science, Peking University, Beijing, China
12
Li G, Fang T, Zhang Y, Liang C, Xiao Q, Luo J. Predicting miRNA-disease associations based on graph attention network with multi-source information. BMC Bioinformatics 2022; 23:244. [PMID: 35729531 PMCID: PMC9215044 DOI: 10.1186/s12859-022-04796-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 03/31/2022] [Accepted: 06/15/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND There is a growing body of evidence from biological experiments suggesting that microRNAs (miRNAs) play a significant regulatory role in both diverse cellular activities and pathological processes. Exploring miRNA-disease associations can not only decipher pathogenic mechanisms but also provide treatment solutions for diseases. As it is inefficient to identify undiscovered relationships between diseases and miRNAs using biotechnology, an explosion of computational methods has been advanced. However, the prediction accuracy of existing models is hampered by the sparsity of the known association network and by single-category features, which make it hard to model the complicated relationships between diseases and miRNAs.
RESULTS In this study, we advance a new computational framework (GATMDA) to discover unknown miRNA-disease associations based on a graph attention network with multi-source information, which effectively fuses linear and non-linear features. In our method, the linear features of diseases and miRNAs are constructed from disease-lncRNA correlation profiles and miRNA-lncRNA correlation profiles, respectively. Then, the graph attention network is employed to extract the non-linear features of diseases and miRNAs by aggregating the information of each neighbor with different weights. Finally, the random forest algorithm is applied to infer disease-miRNA correlation pairs by fusing the linear and non-linear features of diseases and miRNAs. As a result, GATMDA achieves impressive performance: an average AUC of 0.9566 with five-fold cross validation, which is superior to previous models. In addition, case studies conducted on breast cancer, colon cancer and lymphoma indicate that 50, 50 and 48 out of the top fifty prioritized candidates are verified by biological experiments.
CONCLUSIONS The extensive experimental results demonstrate the accuracy and utility of GATMDA, and we anticipate that it may serve as a useful tool for identifying unobserved disease-miRNA relationships.
Affiliation(s)
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Tao Fang
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Yuejin Zhang
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qiu Xiao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
13
Peng L, Yang C, Huang L, Chen X, Fu X, Liu W. RNMFLP: Predicting circRNA-disease associations based on robust nonnegative matrix factorization and label propagation. Brief Bioinform 2022; 23:6582881. [PMID: 35534179 DOI: 10.1093/bib/bbac155] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/21/2022] [Revised: 03/09/2022] [Accepted: 04/06/2022] [Indexed: 12/22/2022]
Abstract
Circular RNAs (circRNAs) are a class of structurally stable endogenous noncoding RNA molecules. A growing number of studies indicate that circRNAs play vital roles in human diseases. However, validating disease-related circRNAs in vivo is costly and time-consuming, so a reliable and effective computational method to identify circRNA-disease associations deserves further study. In this study, we propose a computational method called RNMFLP that combines robust nonnegative matrix factorization (RNMF) and a label propagation (LP) algorithm to predict circRNA-disease associations. First, to reduce the impact of false negative data, the original circRNA-disease adjacency matrix is updated by matrix multiplication using the integrated circRNA similarity and the disease similarity information. Subsequently, the RNMF algorithm is used to obtain the restricted latent space to capture potential circRNA-disease pairs from the association matrix. Finally, the LP algorithm is utilized to predict more accurate circRNA-disease associations from the integrated circRNA similarity network and the integrated disease similarity network, respectively. Fivefold cross-validation on four datasets shows that RNMFLP is superior to state-of-the-art methods. In addition, case studies on lung cancer, hepatocellular carcinoma and colorectal cancer further demonstrate the reliability of our method in discovering disease-related circRNAs.
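The final label propagation step in this abstract can be sketched as follows. This is a minimal illustration that assumes the common iterative update F ← αSF + (1 − α)Y over a row-normalized similarity matrix S; the abstract does not state RNMFLP's exact update rule, and the function names, `alpha` and the toy matrices are hypothetical.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def label_propagation(sim, labels, alpha=0.5, iterations=50):
    """Spread known association labels over a similarity network.

    sim:    n x n similarity matrix (e.g. circRNA-circRNA similarity)
    labels: n x m binary matrix of known associations (e.g. circRNA x disease)
    alpha:  share of each score taken from neighbors rather than the initial labels
    """
    s = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        # F <- alpha * S F + (1 - alpha) * Y, computed from the previous scores
        scores = [
            [
                alpha * sum(s[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return scores


# Toy example: circRNA 1 is similar to circRNA 0, which has a known disease link.
toy_sim = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_labels = [[1.0], [0.0], [0.0]]
propagated = label_propagation(toy_sim, toy_labels)
```

In this toy network the unlabeled circRNA 1 picks up a positive score from its similar neighbor, while the isolated circRNA 2 stays at zero.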
Affiliation(s)
- Li Peng: School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China; Hunan Key Laboratory for Service computing and Novel Software Technology
- Cheng Yang: School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Li Huang: Academy of Arts and Design, Tsinghua University, 10084, Beijing, China; The Future Laboratory, Tsinghua University, 10084, Beijing, China
- Xiang Chen: School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Xiangzheng Fu: College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Wei Liu: College of Information Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
|
14
|
Liu W, Lin H, Huang L, Peng L, Tang T, Zhao Q, Yang L. Identification of miRNA-disease associations via deep forest ensemble learning based on autoencoder. Brief Bioinform 2022; 23:6553934. [PMID: 35325038 DOI: 10.1093/bib/bbac104] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 01/11/2022] [Revised: 02/18/2022] [Accepted: 03/01/2022] [Indexed: 12/31/2022] Open
Abstract
Increasing evidence shows that the occurrence of human complex diseases is closely related to microRNA (miRNA) variation and imbalance. For this reason, predicting disease-related miRNAs is essential for the diagnosis and treatment of complex human diseases. Although some current computational methods can effectively predict potential disease-related miRNAs, the accuracy of prediction should be further improved. In our study, a new computational method via deep forest ensemble learning based on autoencoder (DFELMDA) is proposed to predict miRNA-disease associations. Specifically, a new feature representation strategy is proposed to obtain different types of feature representations (from miRNA and disease) for each miRNA-disease association. Then, two types of low-dimensional feature representations are extracted by two deep autoencoders for predicting miRNA-disease associations. Finally, two prediction scores of the miRNA-disease associations are obtained by the deep random forest and combined to determine the final results. DFELMDA is compared with several classical methods on the Human microRNA Disease Database (HMDD) dataset. Results reveal that the performance of this method is superior. The area under the receiver operating characteristic curve (AUC) values obtained by DFELMDA through 5-fold and 10-fold cross-validation are 0.9552 and 0.9560, respectively. In addition, case studies on colon, breast and lung tumors further demonstrate the excellent ability of DFELMDA to predict disease-associated miRNAs. Performance analysis shows that DFELMDA can be used as an effective computational tool for predicting miRNA-disease associations.
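To make the evaluation in this abstract concrete, the sketch below combines two prediction scores and computes an AUC in plain Python. The abstract does not say how DFELMDA combines its two scores, so the weighted average in `combine` is an assumption, and the labels and score lists are made-up toy values rather than results from the paper.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def combine(scores_a, scores_b, weight=0.5):
    """Weighted average of two score lists (assumed combination rule)."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


# Toy example: 1 marks a known miRNA-disease association, 0 an unlabeled pair.
toy_labels = [1, 1, 0, 0]
scores_from_model_a = [0.9, 0.4, 0.5, 0.1]
scores_from_model_b = [0.6, 0.8, 0.3, 0.2]
final_scores = combine(scores_from_model_a, scores_from_model_b)
final_auc = auc(toy_labels, final_scores)
```

In a k-fold setting, `auc` would be computed on each held-out fold and the fold values averaged. The pairwise form used here is quadratic in the number of samples, which is fine for illustration but slow for large datasets.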
Affiliation(s)
- Wei Liu: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China; School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Hui Lin: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China; School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Li Huang: Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Peng: School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, China
- Ting Tang: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China; School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Qi Zhao: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Li Yang: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
|
15
|
Wang CC, Li TH, Huang L, Chen X. Prediction of potential miRNA-disease associations based on stacked autoencoder. Brief Bioinform 2022; 23:6529883. [PMID: 35176761 DOI: 10.1093/bib/bbac021] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 11/17/2021] [Revised: 01/05/2022] [Accepted: 01/14/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, a growing number of biological experiments and scientific studies have demonstrated that microRNA (miRNA) plays an important role in the development of human complex diseases. Therefore, discovering miRNA-disease associations can contribute to accurate diagnosis and effective treatment of diseases. Identifying miRNA-disease associations through computational methods based on biological data has been proven to be low-cost and highly efficient. In this study, we proposed a computational model named Stacked Autoencoder for potential MiRNA-Disease Association prediction (SAEMDA). In SAEMDA, all the miRNA-disease samples were used to pretrain a Stacked Autoencoder (SAE) in an unsupervised manner. Then, the positive samples and the same number of selected negative samples were utilized to fine-tune the SAE in a supervised manner after adding an output layer with a softmax classifier to the SAE. SAEMDA can make full use of the feature information of all unlabeled miRNA-disease pairs. Therefore, SAEMDA is suitable for our dataset, which contains few labeled samples and many unlabeled samples. As a result, SAEMDA achieved AUCs of 0.9210 and 0.8343 in global and local leave-one-out cross-validation, respectively. In addition, SAEMDA obtained an average AUC and standard deviation of 0.9102 ± 0.0029 over 100 runs of 5-fold cross-validation. These results were better than those of previous models. Moreover, we carried out three case studies to further demonstrate the predictive accuracy of SAEMDA. As a result, 82% (breast neoplasms), 100% (lung neoplasms) and 90% (esophageal neoplasms) of the top 50 predicted miRNAs were verified by databases. Thus, SAEMDA could be a useful and reliable model to predict potential miRNA-disease associations.
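The fine-tuning set described in this abstract (all positive samples plus the same number of selected negatives) can be sketched as below. The abstract does not say how SAEMDA selects its negatives, so uniform random sampling from the unlabeled pairs is an assumption, and the function name and toy matrix are hypothetical.

```python
import random


def balanced_training_pairs(association, seed=0):
    """Return (positives, negatives) of equal size from a binary association matrix.

    association: rows are miRNAs, columns are diseases; 1 marks a known association.
    Negatives are drawn uniformly at random from the unlabeled (0) pairs, which is
    an assumed selection rule, not necessarily the one used by SAEMDA.
    """
    positives, unlabeled = [], []
    for i, row in enumerate(association):
        for j, value in enumerate(row):
            (positives if value == 1 else unlabeled).append((i, j))
    if len(unlabeled) < len(positives):
        raise ValueError("not enough unlabeled pairs to match the positives")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    negatives = rng.sample(unlabeled, len(positives))
    return positives, negatives


# Toy example: 3 miRNAs x 4 diseases with three known associations.
toy_association = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
pos_pairs, neg_pairs = balanced_training_pairs(toy_association)
```

Because unlabeled pairs are not confirmed negatives, some sampled pairs may be undiscovered associations. The unsupervised pretraining on all pairs described in the abstract is what lets the model use the remaining unlabeled data.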
Affiliation(s)
- Chun-Chun Wang: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
- Tian-Hao Li: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Li Huang: Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Xing Chen: Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|
16
|
Logistic matrix factorisation and generative adversarial neural network-based method for predicting drug-target interactions. Mol Divers 2021; 25:1497-1516. [PMID: 34297278 DOI: 10.1007/s11030-021-10273-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2021] [Accepted: 07/04/2021] [Indexed: 12/21/2022]
Abstract
Identifying drug-target protein association pairs is a prerequisite and a crucial task in drug discovery and development. Numerous computational models, based on different assumptions and algorithms, have been proposed as an alternative to the laborious, costly, and time-consuming traditional wet-lab methods. Most proposed methods focus on separate drug and target descriptors, calculated, respectively, from chemical structures and protein sequences, and fail to introduce and extract features in which the interaction information is embedded. In this paper, we propose a new three-step method based on matrix factorisation and a generative adversarial network (GAN) for drug-target interaction prediction. Firstly, the matrix factorisation technique is used to capture and extract the joint interaction feature, for both drugs and targets, from the drug-target interaction matrix. Then, a GAN is introduced for data augmentation. It generates fake positive samples similar to the real positive samples (known interactions) in order to balance the classes, allow the entire negative set to be used, and increase the data size for accurate prediction. Finally, a fully connected four-layer neural network is built for classification. Experimental results show that the proposed method achieves higher prediction performance than shallow classifiers and state-of-the-art methods, with an accuracy higher than 97%. Moreover, the effect of data generation is confirmed by evaluating the proposed method with and without the generation step. These results demonstrate the effectiveness of the latent interaction features and data generation for predicting new drugs or repurposing existing drugs.
[Figure: Overview of the WGANMF-DTI workflow for the Drug-Target Interaction Prediction task.]
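The first step of this method, extracting latent drug and target features from the interaction matrix, can be sketched with a small logistic matrix factorisation. This is a simplified illustration under stated assumptions: plain batch gradient descent, L2 regularisation, and no bias or weighting terms. The abstract does not give the paper's exact objective, and the function name, hyperparameters and toy matrix are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logistic_mf(interactions, rank=2, lr=0.1, reg=0.01, epochs=300, seed=0):
    """Fit P[i][j] = sigmoid(U[i] . V[j]) to a binary interaction matrix.

    Returns (U, V, losses): latent drug features, latent target features, and
    the regularised log-loss recorded at the start of each epoch.
    """
    rng = random.Random(seed)
    n, m = len(interactions), len(interactions[0])
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(m)]
    losses = []
    for _ in range(epochs):
        P = [[sigmoid(sum(U[i][k] * V[j][k] for k in range(rank))) for j in range(m)]
             for i in range(n)]
        loss = -sum(
            math.log(P[i][j]) if interactions[i][j] == 1 else math.log(1.0 - P[i][j])
            for i in range(n) for j in range(m)
        )
        loss += 0.5 * reg * (sum(v * v for row in U for v in row)
                             + sum(v * v for row in V for v in row))
        losses.append(loss)
        # Gradients of the regularised log-loss with respect to U and V
        grad_U = [[sum((P[i][j] - interactions[i][j]) * V[j][k] for j in range(m))
                   + reg * U[i][k] for k in range(rank)] for i in range(n)]
        grad_V = [[sum((P[i][j] - interactions[i][j]) * U[i][k] for i in range(n))
                   + reg * V[j][k] for k in range(rank)] for j in range(m)]
        # Simultaneous update: both gradients were computed from the old U and V
        U = [[U[i][k] - lr * grad_U[i][k] for k in range(rank)] for i in range(n)]
        V = [[V[j][k] - lr * grad_V[j][k] for k in range(rank)] for j in range(m)]
    return U, V, losses


# Toy example: 3 drugs x 3 targets, 1 marks a known interaction.
toy_interactions = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
drug_features, target_features, loss_history = logistic_mf(toy_interactions)
```

The rows of `drug_features` and `target_features` play the role of the joint interaction features mentioned in the abstract. In the described pipeline, features of this kind would then feed the GAN-based augmentation and the final classifier.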
|