1
Sulaimany S, Farahmandi K, Mafakheri A. Computational prediction of new therapeutic effects of probiotics. Sci Rep 2024; 14:11932. [PMID: 38789535 PMCID: PMC11126595 DOI: 10.1038/s41598-024-62796-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 05/21/2024] [Indexed: 05/26/2024] Open
Abstract
Probiotics are living microorganisms that provide health benefits to their hosts, potentially aiding in the treatment or prevention of various diseases, including diarrhea, irritable bowel syndrome, ulcerative colitis, and Crohn's disease. Motivated by successful applications of link prediction in medical and biological networks, we applied link prediction to the probiotic-disease network to identify unreported relations. Using data from the Probio database and International Classification of Diseases-10th Revision (ICD-10) resources, we constructed a bipartite graph focused on the relationship between probiotics and diseases. We applied customized link prediction algorithms for this bipartite network, including common neighbors, Jaccard coefficient, and Adamic/Adar ranking formulas. We evaluated the results using Area under the Curve (AUC) and precision metrics. Our analysis revealed that common neighbors outperformed the other methods, with an AUC of 0.96 and precision of 0.6, indicating that basic formulas can predict at least six out of ten probable relations correctly. To support our findings, we conducted an exact search of the top 20 predictions and found six confirming papers on Google Scholar and Science Direct. Evidence suggests that Lactobacillus jensenii may provide prophylactic and therapeutic benefits for gastrointestinal diseases and that Lactobacillus acidophilus may have potential activity against urologic and female genital illnesses. Further investigation of other predictions through additional preclinical and clinical studies is recommended. Future research may focus on deploying more powerful link prediction algorithms to achieve better and more accurate results.
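As a rough illustration of the three ranking formulas named in this abstract, the sketch below scores one candidate probiotic-disease pair on a toy bipartite graph. A probiotic and a disease never share direct neighbors in a bipartite graph, so the sketch assumes one common adaptation: comparing the probiotic's neighbors with the disease's two-hop neighbors. The abstract does not specify the authors' customization, and the node names, edges, and function names here are hypothetical.

```python
import math

def build_adjacency(edges):
    """Undirected adjacency sets from (probiotic, disease) pairs."""
    adj = {}
    for probiotic, disease in edges:
        adj.setdefault(probiotic, set()).add(disease)
        adj.setdefault(disease, set()).add(probiotic)
    return adj

def two_hop(adj, node):
    """Nodes on the same side as `node` that share at least one neighbor with it."""
    reached = set()
    for neighbor in adj.get(node, ()):
        reached |= adj[neighbor]
    reached.discard(node)
    return reached

def link_scores(adj, probiotic, disease):
    """Common neighbors, Jaccard and Adamic/Adar scores for one candidate pair."""
    left = adj.get(probiotic, set())   # diseases already linked to the probiotic
    right = two_hop(adj, disease)      # diseases sharing a probiotic with `disease`
    shared = left & right
    union = left | right
    jaccard = len(shared) / len(union) if union else 0.0
    # Skip degree-1 nodes, whose log-degree is zero.
    adamic_adar = sum(1.0 / math.log(len(adj[z])) for z in shared if len(adj[z]) > 1)
    return {"common_neighbors": len(shared), "jaccard": jaccard, "adamic_adar": adamic_adar}

# Hypothetical toy network; P* are probiotics, D* are diseases.
toy_edges = [("P1", "D1"), ("P1", "D2"), ("P2", "D2"),
             ("P2", "D3"), ("P3", "D1"), ("P3", "D3")]
toy_adj = build_adjacency(toy_edges)
print(link_scores(toy_adj, "P1", "D3"))
```

Ranking all unlinked pairs by one of these scores and checking the top of the list against the literature mirrors, in miniature, the validation step the abstract describes.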
Affiliation(s)
- Sadegh Sulaimany
  - Social and Biological Network Analysis Laboratory (SBNA), Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran
- Kajal Farahmandi
  - Department of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Aso Mafakheri
  - Social and Biological Network Analysis Laboratory (SBNA), Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran
2
Daniel Thomas S, Vijayakumar K, John L, Krishnan D, Rehman N, Revikumar A, Kandel Codi JA, Prasad TSK, S S V, Raju R. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome. OMICS: A Journal of Integrative Biology 2024; 28:213-233. [PMID: 38752932 DOI: 10.1089/omi.2024.0047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article presents the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases.
Affiliation(s)
- Sonet Daniel Thomas
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Krithika Vijayakumar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Levin John
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Deepak Krishnan
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Niyas Rehman
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Amjesh Revikumar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Kerala Genome Data Centre, Kerala Development and Innovation Strategic Council, Thiruvananthapuram, Kerala, India
- Jalaluddin Akbar Kandel Codi
  - Department of Surgical Oncology, Yenepoya Medical College, Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Vinodchandra S S
  - Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
- Rajesh Raju
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
3
Ghafouri-Fard S, Askari A, Hussen BM, Taheri M, Akbari Dilmaghani N. Role of miR-424 in the carcinogenesis. Clin Transl Oncol 2024; 26:16-38. [PMID: 37178445 PMCID: PMC10761534 DOI: 10.1007/s12094-023-03209-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 04/27/2023] [Indexed: 05/15/2023]
Abstract
Recent studies have revealed the impact of microRNAs (miRNAs) on the carcinogenic process. miR-424 is a miRNA whose role in this process is still being identified. Experiments in ovarian cancer, cervical cancer, hepatocellular carcinoma, neuroblastoma, breast cancer, osteosarcoma, intrahepatic cholangiocarcinoma, prostate cancer, endometrial cancer, non-small cell lung cancer, hemangioma and gastric cancer have reported down-regulation of miR-424. On the other hand, this miRNA has been found to be up-regulated in melanoma, laryngeal and esophageal squamous cell carcinomas, glioma, multiple myeloma and thyroid cancer. Expression of this miRNA is regulated by the methylation status of its promoter. In addition, LINC00641, CCAT2, PVT1, LINC00657, LINC00511 and NNT-AS1 are among the lncRNAs that act as molecular sponges for miR-424, thus regulating its expression. Moreover, several members of the SNHG family of lncRNAs have been found to regulate expression of miR-424. This miRNA is also involved in the regulation of E2F transcription factors. The current review aims to summarize the role of miR-424 in cancer evolution and its impact on the clinical outcome of patients, in order to find appropriate markers for malignancies.
Affiliation(s)
- Soudeh Ghafouri-Fard
  - Department of Medical Genetics, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Arian Askari
  - Phytochemistry Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Bashdar Mahmud Hussen
  - Department of Clinical Analysis, College of Pharmacy, Hawler Medical University, Kurdistan Region, Erbil, Iraq
- Mohammad Taheri
  - Institute of Human Genetics, Jena University Hospital, Jena, Germany
  - Urology and Nephrology Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Nader Akbari Dilmaghani
  - Skull Base Research Center, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
4
Das G, Das T, Parida S, Ghosh Z. LncRTPred: Predicting RNA-RNA mode of interaction mediated by lncRNA. IUBMB Life 2024; 76:53-68. [PMID: 37606159 DOI: 10.1002/iub.2778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 07/19/2023] [Indexed: 08/23/2023]
Abstract
Long non-coding RNAs (lncRNAs) play a significant role in various biological processes. Hence, it is of utmost importance to elucidate their functions in order to understand the molecular mechanisms of a complex biological system. This versatile RNA molecule has diverse modes of interaction, one of which is lncRNA-mRNA interaction. Identifying the target mRNA is therefore essential to understand the function of an lncRNA explicitly. Existing lncRNA target prediction tools mainly adopt a thermodynamics-based approach. Long execution times and the inability to perform real-time prediction limit their usage. Further, the lack of a negative training dataset has hindered the development of machine learning (ML) based lncRNA target prediction tools. In this work, we have developed an ML-based lncRNA-mRNA target prediction model, 'LncRTPred'. Here we have addressed the existing problems by generating a reliable negative dataset and creating robust ML models. We have identified the non-interacting lncRNAs and mRNAs from the unlabelled dataset using BLAT. This set is further filtered to get a reliable set of outliers. LncRTPred provides a cumulative_model_score as the final output against each query. In terms of prediction accuracy, LncRTPred outperforms other popular target prediction protocols like LncTar. Further, we have tested its performance against experimentally validated disease-specific lncRNA-mRNA interactions. Overall, the performance of LncRTPred is heavily dependent on the size of the training dataset, which is reflected in the difference in its performance for human and mouse. When applied to unknown data, it performs better for human than for mouse because the mouse training dataset is smaller. The availability of more lncRNA-mRNA interaction data for mouse will improve the performance of LncRTPred in future.
Both webserver and standalone versions of LncRTPred are available. Web server link: http://bicresources.jcbose.ac.in/zhumur/lncrtpred/index.html. GitHub link: https://github.com/zglabDIB/LncRTPred.
Affiliation(s)
- Gourab Das
  - Division of Bioinformatics, Bose Institute, Kolkata, India
- Troyee Das
  - Division of Bioinformatics, Bose Institute, Kolkata, India
- Sibun Parida
  - Division of Bioinformatics, Bose Institute, Kolkata, India
- Zhumur Ghosh
  - Division of Bioinformatics, Bose Institute, Kolkata, India
5
Xie W, Chen X, Zheng Z, Wang F, Zhu X, Lin Q, Sun Y, Wong KC. LncRNA-Top: Controlled deep learning approaches for lncRNA gene regulatory relationship annotations across different platforms. iScience 2023; 26:108197. [PMID: 37965148 PMCID: PMC10641498 DOI: 10.1016/j.isci.2023.108197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 08/10/2023] [Accepted: 10/10/2023] [Indexed: 11/16/2023] Open
Abstract
By sponging microRNAs (miRNAs), long non-coding RNAs (lncRNAs) have the potential to regulate gene expression. Few methods based on this mechanism have been developed to predict lncRNA-gene relationships. Hence, we present lncRNA-Top to forecast potential lncRNA-gene regulation relationships. Specifically, we constructed controlled deep-learning methods using 12417 lncRNAs and 16127 genes. We have provided retrospective and innovative views on negative sampling, random seeds, cross-validation, metrics, and independent datasets. The AUC, AUPR, and our defined precision@k were used to evaluate performance. In-depth case studies demonstrate that 47 out of the top 100 predicted unknown pairings were recorded in publications, supporting the predictive power. Our additional software can annotate the scores with target candidates. lncRNA-Top will be a helpful tool for uncovering prospective lncRNA targets and better understanding the regulatory processes of lncRNAs.
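To make the precision@k idea concrete, here is a minimal sketch of the standard definition: the fraction of the k highest-scored pairs that are confirmed positives. The abstract refers to "our defined precision@k", so the authors' exact variant may differ; the pair identifiers and scores below are hypothetical.

```python
def precision_at_k(scored_pairs, known_positives, k):
    """Fraction of the k highest-scored pairs that appear in `known_positives`.

    scored_pairs: iterable of ((lncrna, gene), score) tuples.
    known_positives: set of (lncrna, gene) tuples treated as confirmed.
    """
    top = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _score in top if pair in known_positives)
    return hits / len(top)

# Hypothetical predictions and confirmations.
example_scores = [(("L1", "G1"), 0.9), (("L2", "G2"), 0.8),
                  (("L3", "G3"), 0.7), (("L4", "G4"), 0.1)]
example_known = {("L1", "G1"), ("L3", "G3")}
print(precision_at_k(example_scores, example_known, 2))
```

Under this definition, the case-study figure of 47 confirmed pairs among the top 100 predictions would correspond to a precision@100 of 0.47.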
Affiliation(s)
- Weidun Xie
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xingjian Chen
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Zetian Zheng
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiaowei Zhu
  - Department of Neuroscience, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Qiuzhen Lin
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
  - Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
6
Tang X, Ji L. Predicting Plant miRNA-lncRNA Interactions via a Deep Learning Method. IEEE Trans Nanobioscience 2023; 22:728-733. [PMID: 37167036 DOI: 10.1109/tnb.2023.3275178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
In recent years, research on miRNA-lncRNA interaction prediction has increased exponentially because it helps elucidate the functional mechanisms of miRNAs and lncRNAs. However, this prediction task remains challenging in the bioinformatics domain. It is expensive and time-consuming to verify the interactions by biological experiments. The existing prediction models have some limitations, such as the need to extract features manually, the potential loss of features during pre-processing, and unresolved long-distance dependencies. Additionally, most current models focus on animal data, so an efficient and accurate plant miRNA-lncRNA interaction prediction model is needed. In this work, a new deep learning model called PmlIPM is presented to infer plant miRNA-lncRNA associations. PmlIPM is a four-step framework comprising Input Embedding, Positional Encoding, Multi-Head Attention and Max Pooling. PmlIPM accepts the miRNA and the lncRNA as separate inputs to extract sequence features, avoiding the information loss caused by directly splicing the two sequences into a single model input. The attention mechanisms give the model the ability to capture long-distance features. PmlIPM is compared with the existing models on two benchmark datasets. The results show that our model performs better than other methods and obtains AUC scores of 0.8412, 0.8587, 0.9666 and 0.9225 on the four independent test sets of Arabidopsis lyrata (A.ly), Solanum lycopersicum (S.ly), Brachypodium distachyon (B.di) and Solanum tuberosum (S.tu), respectively.
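The four-step pipeline named in this abstract can be sketched in a few lines to show how data flows through it. This is only an untrained toy under stated assumptions, not the PmlIPM implementation: it assumes one-hot embeddings, sinusoidal positional encodings, and a single attention head with identity projections in place of learned multi-head attention. All function names are hypothetical.

```python
import math

ALPHABET = "ACGU"

def one_hot(seq):
    """Input embedding (assumed): one 4-dimensional one-hot vector per base."""
    return [[1.0 if base == a else 0.0 for a in ALPHABET] for base in seq]

def positional_encoding(length, dim):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    table = []
    for pos in range(length):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(x[0])
    out = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x]
        weights = softmax(scores)
        out.append([sum(w * value[j] for w, value in zip(weights, x)) for j in range(dim)])
    return out

def max_pool(x):
    """Max over positions, giving one fixed-length vector per sequence."""
    return [max(column) for column in zip(*x)]

def encode_sequence(seq):
    if not seq:
        raise ValueError("sequence must be non-empty")
    embedded = one_hot(seq)
    encoding = positional_encoding(len(seq), len(ALPHABET))
    x = [[e + p for e, p in zip(e_row, p_row)] for e_row, p_row in zip(embedded, encoding)]
    return max_pool(self_attention(x))

def pair_features(mirna, lncrna):
    """Encode the two sequences separately, then concatenate their features."""
    return encode_sequence(mirna) + encode_sequence(lncrna)

print(pair_features("UGAC", "AUGGCUA"))
```

Encoding the miRNA and lncRNA separately, as in `pair_features`, reflects the abstract's point that the two sequences are not spliced into one input; max pooling is what lets sequences of different lengths yield feature vectors of the same size.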
7
Sheng N, Wang Y, Huang L, Gao L, Cao Y, Xie X, Fu Y. Multi-task prediction-based graph contrastive learning for inferring the relationship among lncRNAs, miRNAs and diseases. Brief Bioinform 2023; 24:bbad276. [PMID: 37529914 DOI: 10.1093/bib/bbad276] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Revised: 07/09/2023] [Accepted: 07/11/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION: Identifying the relationships among long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and diseases is highly valuable for diagnosing, preventing, treating and prognosing diseases. The development of effective computational prediction methods can reduce experimental costs. While numerous methods have been proposed, they often treat the prediction of lncRNA-disease associations (LDAs), miRNA-disease associations (MDAs) and lncRNA-miRNA interactions (LMIs) as separate tasks. Models capable of predicting all three relationships simultaneously remain relatively scarce. Our aim is to perform multi-task predictions, which not only construct a unified framework, but also facilitate mutual complementarity of information among lncRNAs, miRNAs and diseases. RESULTS: In this work, we propose a novel unsupervised embedding method called graph contrastive learning for multi-task prediction (GCLMTP). Our approach aims to predict LDAs, MDAs and LMIs by simultaneously extracting embedding representations of lncRNAs, miRNAs and diseases. To achieve this, we first construct a triple-layer lncRNA-miRNA-disease heterogeneous graph (LMDHG) that integrates the complex relationships between these entities based on their similarities and correlations. Next, we employ an unsupervised embedding model based on graph contrastive learning to extract potential topological features of lncRNAs, miRNAs and diseases from the LMDHG. The graph contrastive learning leverages graph convolutional network architectures to maximize the mutual information between patch representations and corresponding high-level summaries of the LMDHG. Subsequently, for the three prediction tasks, multiple classifiers are explored to predict LDA, MDA and LMI scores. Comprehensive experiments are conducted on two datasets (from older and newer versions of the database, respectively). The results show that GCLMTP outperforms other state-of-the-art methods for the disease-related lncRNA and miRNA prediction tasks.
Additionally, case studies on two datasets further demonstrate the ability of GCLMTP to accurately discover new associations. To ensure reproducibility of this work, we have made the datasets and source code publicly available at https://github.com/sheng-n/GCLMTP.
Affiliation(s)
- Nan Sheng
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
  - School of Artificial Intelligence, Jilin University, 130012 Changchun, China
- Lan Huang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Ling Gao
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yangkun Cao
  - School of Artificial Intelligence, Jilin University, 130012 Changchun, China
- Xuping Xie
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- Yuan Fu
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, UK
8
Sheng N, Huang L, Gao L, Cao Y, Xie X, Wang Y. A Survey of Computational Methods and Databases for lncRNA-miRNA Interaction Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2810-2826. [PMID: 37030713 DOI: 10.1109/tcbb.2023.3264254] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/19/2023]
Abstract
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) are two prevalent non-coding RNAs in current research. They play critical regulatory roles in the life processes of animals and plants. Studies have shown that lncRNAs can interact with miRNAs to participate in post-transcriptional regulatory processes, mainly involved in regulating cancer development, metastatic progression, and drug resistance. Additionally, these interactions have significant effects on plant growth, development, and responses to biotic and abiotic stresses. Deciphering the potential relationships between lncRNAs and miRNAs may provide new insights into the biological functions of lncRNAs and miRNAs and the pathogenesis of complex diseases. However, gathering information on lncRNA-miRNA interactions (LMIs) through biological experiments is expensive and time-consuming. With the accumulation of multi-omics data, computational models are extremely attractive for systematically exploring potential LMIs. To the best of our knowledge, this is the first comprehensive review of computational methods for identifying LMIs. Specifically, we first summarized the available public databases for predicting animal and plant LMIs. Second, we comprehensively reviewed the computational methods for predicting LMIs and classified them into two categories: network-based methods and sequence-based methods. Third, we analyzed the standard evaluation methods and metrics used in LMI prediction. Finally, we pointed out some problems in current studies and discussed future research directions. Relevant databases and the latest advances in LMI prediction are summarized in a GitHub repository, https://github.com/sheng-n/lncRNA-miRNA-interaction-methods, which we will keep updated.
9
Li H, Wu B, Sun M, Ye Y, Zhu Z, Chen K. Multi-view graph neural network with cascaded attention for lncRNA-miRNA interaction prediction. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
10
Chen L, Sun ZL. PmliHFM: Predicting Plant miRNA-lncRNA Interactions with Hybrid Feature Mining Network. Interdiscip Sci 2023; 15:44-54. [PMID: 36223068 DOI: 10.1007/s12539-022-00540-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 09/27/2022] [Accepted: 09/27/2022] [Indexed: 11/07/2022]
Abstract
Due to the crucial role of interactions between microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) in biological processes, the study of their biological functions is necessary. So far, various computational methods have been employed to predict miRNA-lncRNA interactions, which compensate for the inadequacy of biological experiments. However, the existing methods do not consider the differences between miRNA and lncRNA in feature extraction. In this paper, we propose a hybrid feature mining network, named PmliHFM, for predicting plant miRNA-lncRNA interactions. First, miRNAs and lncRNAs, which have different sequence lengths, are encoded with different encodings, which can reduce the loss of information caused by using the same coding approach. Then, a hybrid feature mining network is designed to adapt to the different encoding methods and extract more useful feature information than a single network. Finally, an ensemble module is utilized to integrate the training results of the hybrid feature mining network, while a prediction module is employed to determine whether there are interactions. In tests on multiple test sets, PmliHFM outperforms several state-of-the-art approaches. The results show that the AUC of PmliHFM improves by 0.8%, 3.1% and 0.4%, respectively, on three balanced datasets, and by 2.1% and 1.8%, respectively, on two imbalanced datasets. These experiments demonstrate the feasibility of the proposed method.
Affiliation(s)
- Lin Chen
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
  - School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
- Zhan-Li Sun
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
  - School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
11
Zhang H, Wang Y, Pan Z, Sun X, Mou M, Zhang B, Li Z, Li H, Zhu F. ncRNAInter: a novel strategy based on graph neural network to discover interactions between lncRNA and miRNA. Brief Bioinform 2022; 23:6747810. [PMID: 36198065 DOI: 10.1093/bib/bbac411] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 06/07/2022] [Revised: 08/04/2022] [Accepted: 08/23/2022] [Indexed: 12/14/2022] Open
Abstract
In recent years, many studies have illustrated the significant role that non-coding RNA (ncRNA) plays in biological activities, in which lncRNA, miRNA and especially their interactions have been proved to affect many biological processes. Some in silico methods have been proposed and applied to identify novel lncRNA-miRNA interactions (LMIs), but there are still imperfections in their RNA representation and information extraction approaches, which implies that there is still room to improve their performance. Meanwhile, only a few of them are accessible at present, which limits their practical application. The construction of a new tool for LMI prediction is thus imperative for a better understanding of the relevant biological mechanisms. This study proposed a novel method, ncRNAInter, for LMI prediction. A comprehensive strategy for RNA representation and an optimized graph neural network deep learning algorithm were utilized in this study. ncRNAInter was robust and achieved a Matthews correlation coefficient 26.7% higher than existing reputable methods for human LMI prediction. In addition, ncRNAInter proved its universal applicability in dealing with LMIs from various species and successfully identified novel LMIs associated with various diseases, which further verified its effectiveness and usability. All source code and datasets are freely available at https://github.com/idrblab/ncRNAInter.
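For readers unfamiliar with the Matthews correlation coefficient (MCC) reported above, the following sketch computes it from a binary confusion matrix using the standard formula. The counts in the example are hypothetical and are not taken from the ncRNAInter evaluation.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal count is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Hypothetical confusion-matrix counts for an interaction classifier.
print(round(matthews_corrcoef(tp=90, tn=80, fp=20, fn=10), 4))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when interacting and non-interacting pairs are imbalanced, which is one reason it is often reported alongside accuracy and AUC.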
Affiliation(s)
- Hanyu Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xiuna Sun
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Bing Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhaorong Li
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Honglin Li
  - School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  - Shanghai Key Laboratory of New Drug Design, East China University of Science and Technology, Shanghai 200237, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
12
Wang W, Zhang L, Sun J, Zhao Q, Shuai J. Predicting the potential human lncRNA-miRNA interactions based on graph convolution network with conditional random field. Brief Bioinform 2022; 23:6775599. [PMID: 36305458 DOI: 10.1093/bib/bbac463] [Citation(s) in RCA: 134] [Impact Index Per Article: 67.0] [Received: 07/10/2022] [Revised: 09/10/2022] [Accepted: 09/27/2022] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNA (lncRNA) and microRNA (miRNA) are two typical types of non-coding RNAs (ncRNAs), and their interaction plays an important regulatory role in many biological processes. Exploring unknown interactions between lncRNA and miRNA can help us better understand the functional relationship between lncRNA and miRNA. At present, the interactions between lncRNA and miRNA are mainly obtained through biological experiments, but such experiments are often time-consuming and labor-intensive, so it is necessary to design a computational method that can predict the interactions between lncRNA and miRNA. In this paper, we propose a method based on a graph convolutional network (GCN) and a conditional random field (CRF) for predicting human lncRNA-miRNA interactions, named GCNCRF. First, we construct a heterogeneous network using the known interactions of lncRNA and miRNA in the LncRNASNP2 database, the lncRNA/miRNA integration similarity network, and the lncRNA/miRNA feature matrix. Second, the initial embedding of nodes is obtained using a GCN. A CRF placed in the GCN hidden layer updates the preliminary embeddings so that similar nodes have similar embeddings. At the same time, an attention mechanism is added to the CRF layer to reassign weights to nodes, so as to better capture the feature information of important nodes and ignore nodes with less influence. Finally, the final embedding is decoded and scored through the decoding layer. In a 5-fold cross-validation experiment, GCNCRF achieves an area under the receiver operating characteristic curve of 0.947 on the main dataset, a higher prediction accuracy than the other six state-of-the-art methods.
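The evaluation protocol mentioned here, area under the ROC curve under 5-fold cross-validation, can be illustrated with a small sketch. It computes AUC as the probability that a positive pair is scored above a negative pair (ties count half) and splits sample indices into folds in round-robin order. The labels, scores, and function names are hypothetical, and the fold assignment is a simplification (no shuffling or stratification), not the authors' procedure.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def k_fold_indices(n_samples, k=5):
    """Round-robin assignment of sample indices to k folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]

# Hypothetical labels (1 = known interaction) and predicted scores.
example_labels = [1, 1, 0, 0]
example_preds = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(example_labels, example_preds))
print(k_fold_indices(10, k=5))
```

In a cross-validation run, each fold in turn is held out for scoring while the model is trained on the other four, and the per-fold AUC values are averaged.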
Affiliation(s)
- Wenya Wang
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Jianqiang Sun
  - School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianwei Shuai
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), and Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, 325001, China
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, 361005, China
  - National Institute for Data Science in Health and Medicine, and State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, 361005, China
13
|
Baruah C, Nath P, Barah P. LncRNAs in neuropsychiatric disorders and computational insights for their prediction. Mol Biol Rep 2022; 49:11515-11534. [PMID: 36097122 DOI: 10.1007/s11033-022-07819-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/03/2022] [Revised: 07/20/2022] [Accepted: 07/24/2022] [Indexed: 12/06/2022]
Abstract
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that do not encode proteins or possess limited coding ability. LncRNAs epigenetically control several biological functions such as gene regulation, transcription, mRNA splicing, protein interaction, and genomic imprinting. Over the years, substantial progress has been made in understanding the role of lncRNAs in diverse biological processes. LncRNAs are reported to show tissue-specific expression patterns, suggesting their potential as novel candidate biomarkers for diseases. Among non-coding RNAs, lncRNAs are highly expressed within the brain-enriched or brain-specific regions of the neural tissues. They are abundantly expressed in the neocortex and pre-mature frontal regions of the brain. LncRNAs are co-expressed with protein-coding genes and have a significant role in the evolution of brain functions. Deregulation of lncRNAs contributes to disruptions in normal brain functions, resulting in multiple neurological disorders. Neuropsychiatric disorders such as schizophrenia, bipolar disease, autism spectrum disorders, and anxiety are associated with the abnormal expression and regulation of lncRNAs. This review aims to highlight the understanding of lncRNAs concerning normal brain functions and their deregulation associated with neuropsychiatric disorders. We also provide a survey of the available computational tools for the prediction of lncRNAs, their protein-coding potential, and their sub-cellular locations, along with a section on existing online databases of known lncRNAs and their interactions with other molecules.
Affiliation(s)
- Cinmoyee Baruah
- Department of Molecular Biology and Biotechnology, Tezpur University, 784028, Napaam, Sonitpur, Assam, India
- Prangan Nath
- Department of Molecular Biology and Biotechnology, Tezpur University, 784028, Napaam, Sonitpur, Assam, India
- Pankaj Barah
- Department of Molecular Biology and Biotechnology, Tezpur University, 784028, Napaam, Sonitpur, Assam, India
14
Asim MN, Ibrahim MA, Zehe C, Trygg J, Dengel A, Ahmed S. BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA–miRNA interaction prediction. Interdiscip Sci 2022; 14:841-862. [PMID: 35947255 PMCID: PMC9581873 DOI: 10.1007/s12539-022-00535-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 06/16/2022] [Accepted: 07/12/2022] [Indexed: 11/30/2022]
Abstract
Background and objective: Interactions of long non-coding ribonucleic acids (lncRNAs) with micro-ribonucleic acids (miRNAs) play an essential role in gene regulation and in cellular metabolic and pathological processes. Existing purely sequence-based computational approaches lack robustness and efficiency, mainly due to the high length variability of lncRNA sequences. Hence, the prime focus of the current study is to find optimal length trade-offs between highly flexible length lncRNA sequences. Method: The paper at hand performs an in-depth exploration of diverse copy padding and sequence truncation approaches, and presents a novel idea of utilizing only subregions of lncRNA sequences to generate fixed-length lncRNA sequences. Furthermore, it presents a novel bag of tricks-based deep learning approach, “BoT-Net”, which leverages a single-layer long short-term memory network regularized through DropConnect to capture higher-order residue dependencies, pooling to retain the most salient features, normalization to prevent exploding and vanishing gradient issues, learning rate decay, and dropout to regularize a precise neural network for lncRNA–miRNA interaction prediction. Results: BoT-Net outperforms the state-of-the-art lncRNA–miRNA interaction prediction approach by 2%, 8%, and 4% in terms of accuracy, specificity, and Matthews correlation coefficient, respectively. Furthermore, a case study analysis indicates that BoT-Net also outperforms the state-of-the-art lncRNA–protein interaction predictor on a benchmark dataset by 10% in accuracy, 19% in sensitivity, 6% in specificity, 14% in precision, and 26% in Matthews correlation coefficient. Conclusion: In the benchmark lncRNA–miRNA interaction prediction dataset, the length of the lncRNA sequence varies from 213 residues to 22,743 residues, and in the benchmark lncRNA–protein interaction prediction dataset, lncRNA sequences vary from 15 residues to 1504 residues. For such highly flexible length sequences, fixed-length generation using copy padding introduces a significant level of bias, which makes a large number of lncRNA sequences nearly identical to each other and eventually derails classifier generalizability. Empirical evaluation reveals that the first 50 residues of long lncRNA sequences contain a highly informative distribution for lncRNA–miRNA interaction prediction, a crucial finding exploited by the proposed BoT-Net approach to optimize the lncRNA fixed-length generation process. Availability: BoT-Net web server can be accessed at https://sds_genetic_analysis.opendfki.de/lncmiRNA/.
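The two fixed-length strategies contrasted in this abstract can be sketched as follows. This is a minimal illustration, assuming that "copy padding" means repeating a sequence cyclically until the target length is reached; the function names and the `N` filler character are hypothetical and not taken from BoT-Net.

```python
def copy_pad(seq, length):
    """Repeat seq cyclically until it is exactly `length` residues long
    (assumed interpretation of copy padding)."""
    if not seq:
        raise ValueError("cannot pad an empty sequence")
    reps = -(-length // len(seq))  # ceiling division
    return (seq * reps)[:length]


def head_subregion(seq, length=50, pad_char="N"):
    """Keep only the first `length` residues; right-fill short sequences
    with a placeholder character (hypothetical choice)."""
    return seq[:length].ljust(length, pad_char)


short_seq = "ACG"
long_seq = "ACGU" * 20  # 80 residues

padded = copy_pad(short_seq, 7)      # "ACGACGA"
head = head_subregion(long_seq)      # first 50 residues only
filled = head_subregion(short_seq, 5)  # "ACGNN"
```

Under copy padding, a short sequence padded to a length of thousands of residues consists almost entirely of repeats, which is the bias the abstract describes; keeping a short head subregion avoids generating that repeated content.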
Affiliation(s)
- Muhammad Nabeel Asim
- Department of Computer Science, Technical University of Kaiserslautern, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- Muhammad Ali Ibrahim
- Department of Computer Science, Technical University of Kaiserslautern, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- Christoph Zehe
- Sartorius Stedim Cellca GmbH, 88471, Laupheim, Baden-Wurttemberg, Germany
- Johan Trygg
- Sartorius Stedim Cellca GmbH, 88471, Laupheim, Baden-Wurttemberg, Germany
- Computational Life Science Cluster (CLiC), Umea University, 90187, Umea, Sweden
- Andreas Dengel
- Department of Computer Science, Technical University of Kaiserslautern, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- Sheraz Ahmed
- German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Rhineland-Palatinate, Germany
- Computational Life Science Cluster (CLiC), Umea University, 90187, Umea, Sweden
15
Recent Deep Learning Methodology Development for RNA–RNA Interaction Prediction. Symmetry (Basel) 2022. [DOI: 10.3390/sym14071302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023] Open
Abstract
Genetic regulation of organisms involves complicated RNA–RNA interactions (RRIs) among messenger RNA (mRNA), microRNA (miRNA), and long non-coding RNA (lncRNA). Detecting RRIs is beneficial for discovering biological mechanisms as well as designing new drugs. In recent years, with more and more experimentally verified RNA–RNA interactions being deposited into databases, statistical machine learning methods, especially recent deep-learning-based automatic algorithms, have been widely applied to RRI prediction with remarkable success. This paper first gives a brief introduction to the traditional machine learning methods applied to RRI prediction and the benchmark databases for training the models, and then provides an overview of recent deep learning methodology for the prediction of miRNA–mRNA interactions and lncRNA–miRNA interactions.
16
Song J, Tian S, Yu L, Yang Q, Xing Y, Zhang C, Dai Q, Duan X. MD-MLI: Prediction of miRNA-lncRNA Interaction by Using Multiple Features and Hierarchical Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1724-1733. [PMID: 33125334 DOI: 10.1109/tcbb.2020.3034922] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/11/2023]
Abstract
Long non-coding RNA (lncRNA) can interact with microRNA (miRNA) and play an important role in inhibiting or activating the expression of target genes and in the occurrence and development of tumors. A growing number of studies focus on the prediction of miRNA-lncRNA interactions, mostly through biological experiments and machine learning methods. These methods have long cycles and high costs, and require much human intervention. In this paper, a data-driven hierarchical deep learning framework is proposed, composed of a capsule network, an independent recurrent neural network with an attention mechanism, and a bi-directional long short-term memory network. This framework combines the advantages of the different networks and uses multiple sequence-derived features of the original sequence together with secondary-structure features to mine the dependencies between features and obtain better results. In the experiment, five-fold cross-validation was used to evaluate the performance of the model, and on the Zea mays data set it achieved a better classification effect than the compared models. In addition, Sorghum, Brachypodium distachyon and bryophyte data sets were used to test the model, and the accuracy reached 0.9850, 0.9859 and 0.9777, respectively, which verified the model's good generalization ability.
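The five-fold cross-validation mentioned in this abstract can be illustrated with a small index-splitting sketch. The function name, the seed, and the round-robin fold assignment are assumptions for illustration; the abstract does not describe how the authors partitioned their data.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(10, k=5))
```

Each sample lands in exactly one test fold, so every interaction pair is scored once by a model that did not see it during training.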
17
Huang YA, Huang ZA, Li JQ, You ZH, Wang L, Yi HC, Yu CQ. GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences. BMC Genomics 2022; 22:916. [PMID: 35296232 PMCID: PMC8925046 DOI: 10.1186/s12864-022-08423-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 02/25/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. The dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify the disorder-specific microbes from the biological "haystack" merely by routine wet-lab experiments. With the developments in next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS: Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with area under the ROC curve (AUC) values of 0.9456 and 0.8866 in leave-one-out cross-validation and five-fold cross-validation, respectively. In case studies of colorectal carcinoma, 80% of the top-20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION: Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we here propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the most promising microbes associated with various human diseases. Based on the sequence information of genes, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure the microbe-microbe similarity from different perspectives. The disease-disease similarity is calculated by capturing the hierarchy information from the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.
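The AUC values reported here can be read as the probability that a known association is ranked above a non-association. The sketch below computes AUC in that pairwise form; the function name and the scores are made up for illustration and have no connection to the HMDAD results.

```python
def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive receives the higher score; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical prediction scores for held-out known associations
# (positives) and unobserved microbe-disease pairs (negatives).
held_out_known = [0.9, 0.4]
unobserved = [0.5, 0.1]
auc = pairwise_auc(held_out_known, unobserved)
```

In this toy case three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75.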
Affiliation(s)
- Yu-An Huang
- Department of Information Engineering, Xijing University, Xi'an, 710123, China
- Zhi-An Huang
- Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
- Jian-Qiang Li
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
- Zhu-Hong You
- Department of Information Engineering, Xijing University, Xi'an, 710123, China
- Lei Wang
- Guangxi Academy of Science, Nanning, 530000, China
- Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830000, China
- Chang-Qing Yu
- Department of Information Engineering, Xijing University, Xi'an, 710123, China
18
Heydarnezhad Asl M, Pasban Khelejani F, Bahojb Mahdavi SZ, Emrahi L, Jebelli A, Mokhtarzadeh A. The various regulatory functions of long noncoding RNAs in apoptosis, cell cycle, and cellular senescence. J Cell Biochem 2022; 123:995-1024. [PMID: 35106829 DOI: 10.1002/jcb.30221] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 04/16/2021] [Revised: 12/28/2021] [Accepted: 01/11/2022] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) are a group of noncoding cellular RNAs involved in significant biological phenomena such as differentiation, cell development, genomic imprinting, adjusting the enzymatic activity, regulating chromosome conformation, apoptosis, cell cycle, and cellular senescence. The misregulation of lncRNAs, which interrupts normal biological processes, has been implicated in tumor formation and metastasis, resulting in cancer. Apoptosis and cell cycle, two main biological phenomena, are highly conserved and intimately coupled mechanisms. Hence, some cell cycle regulators can influence both programmed cell death and cell division. Apoptosis eliminates defective and unwanted cells, and the cell cycle enables cells to replicate themselves. The improper regulation of apoptosis and cell cycle contributes to numerous disorders such as neurodegenerative and autoimmune diseases, viral infection, anemia, and mainly cancer. Cellular senescence is a tumor-suppressing response initiated by environmental and internal stress factors. This phenomenon has recently attracted more attention due to its therapeutic implications in the field of senotherapy. In this review, the regulatory roles of lncRNAs in apoptosis, cell cycle, and senescence are discussed. First, the role of lncRNAs in mitochondrial dynamics and apoptosis is addressed. Next, the interaction between lncRNAs and caspases, pro/antiapoptotic proteins, and the EGFR/PI3K/PTEN/AKT/mTORC1 signaling pathway is investigated. Furthermore, the effect of lncRNAs on the cell cycle is surveyed through their interaction with cyclins, cdks, p21, and the wnt/β-catenin/c-myc pathway. Finally, the function of essential lncRNAs in cellular senescence is mentioned.
Affiliation(s)
- Faezeh Pasban Khelejani
- Department of Cell and Molecular Biology, Faculty of Basic Sciences, University of Maragheh, Maragheh, Iran
- Leila Emrahi
- Department of Medical Genetics, Faculty of Medical Science, Tarbiat Modares University, Tehran, Iran
- Asiyeh Jebelli
- Department of Biological Science, Faculty of Basic Science, Higher Education Institute of Rab-Rashid, Tabriz, Iran
- Tuberculosis and Lung Disease Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Ahad Mokhtarzadeh
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
19
Rincón-Riveros A, Morales D, Rodríguez JA, Villegas VE, López-Kleine L. Bioinformatic Tools for the Analysis and Prediction of ncRNA Interactions. Int J Mol Sci 2021; 22:11397. [PMID: 34768830 PMCID: PMC8583695 DOI: 10.3390/ijms222111397] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/11/2021] [Revised: 09/30/2021] [Accepted: 09/30/2021] [Indexed: 12/16/2022] Open
Abstract
Noncoding RNAs (ncRNAs) play prominent roles in the regulation of gene expression via their interactions with other biological molecules such as proteins and nucleic acids. Although much of our knowledge about how these ncRNAs operate in different biological processes has been obtained from experimental findings, computational biology can also substantially boost this knowledge by suggesting possible novel interactions of these ncRNAs with other molecules. Computational predictions are thus used as an alternative source of new insights through a process of mutual enrichment, because the information obtained through experiments continuously feeds into computational methods. The results of these predictions in turn shed light on possible interactions that are subsequently validated experimentally. This review describes the latest advances in databases, bioinformatic tools, and new in silico strategies that allow the establishment or prediction of biological interactions of ncRNAs, particularly miRNAs and lncRNAs. This work places special emphasis on the ncRNA species found in humans, but information on ncRNAs of other species is also included.
Affiliation(s)
- Andrés Rincón-Riveros
- Bioinformatics and Systems Biology Group, Universidad Nacional de Colombia, Bogotá 111221, Colombia
- Duvan Morales
- Centro de Investigaciones en Microbiología y Biotecnología-UR (CIMBIUR), Facultad de Ciencias Naturales, Universidad del Rosario, Bogotá 111221, Colombia
- Josefa Antonia Rodríguez
- Grupo de Investigación en Biología del Cáncer, Instituto Nacional de Cancerología, Bogotá 111221, Colombia
- Victoria E. Villegas
- Centro de Investigaciones en Microbiología y Biotecnología-UR (CIMBIUR), Facultad de Ciencias Naturales, Universidad del Rosario, Bogotá 111221, Colombia
- Liliana López-Kleine
- Department of Statistics, Faculty of Science, Universidad Nacional de Colombia, Bogotá 111221, Colombia
20
Zhang XM, Liang L, Liu L, Tang MJ. Graph Neural Networks and Their Current Applications in Bioinformatics. Front Genet 2021; 12:690049. [PMID: 34394185 PMCID: PMC8360394 DOI: 10.3389/fgene.2021.690049] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 04/02/2021] [Accepted: 05/28/2021] [Indexed: 12/22/2022] Open
Abstract
Graph neural networks (GNNs), as a branch of deep learning in non-Euclidean space, perform particularly well in various tasks that process graph-structured data. With the rapid accumulation of biological network data, GNNs have also become an important tool in bioinformatics. In this research, a systematic survey of GNNs and their advances in bioinformatics is presented from multiple perspectives. We first introduce some commonly used GNN models and their basic principles. Then, three representative tasks are proposed based on the three levels of structural information that can be learned by GNNs: node classification, link prediction, and graph generation. Meanwhile, according to the specific applications for various omics data, we categorize and discuss the related studies in three aspects: disease prediction, drug discovery, and biomedical imaging. Based on the analysis, we provide an outlook on the shortcomings of current studies and point out their development prospects. Although GNNs have achieved excellent results in many biological tasks at present, they still face challenges in terms of low-quality data processing, methodology, and interpretability, and have a long road ahead. We believe that GNNs are potentially an excellent method for solving various biological problems in bioinformatics research.
Affiliation(s)
- Xiao-Meng Zhang
- School of Information, Yunnan Normal University, Kunming, China
- Li Liang
- School of Information, Yunnan Normal University, Kunming, China
- Lin Liu
- School of Information, Yunnan Normal University, Kunming, China
- Key Laboratory of Educational Informatization for Nationalities Ministry of Education, Yunnan Normal University, Kunming, China
- Ming-Jing Tang
- Key Laboratory of Educational Informatization for Nationalities Ministry of Education, Yunnan Normal University, Kunming, China
- School of Life Sciences, Yunnan Normal University, Kunming, China
21
Classification of Breast Cancer and Breast Neoplasm Scenarios Based on Machine Learning and Sequence Features from lncRNAs-miRNAs-Diseases Associations. Interdiscip Sci 2021; 13:572-581. [PMID: 34152557 DOI: 10.1007/s12539-021-00451-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/19/2021] [Revised: 04/28/2021] [Accepted: 06/09/2021] [Indexed: 10/21/2022]
Abstract
The influence of non-coding RNAs, such as lncRNAs (long non-coding RNAs) and miRNAs (microRNAs), is undeniable in several diseases, for example, in the formation of neoplasms and cancer scenarios. However, there are challenges due to the scarcity of validated datasets and the imbalance in the data. We found that the research of associations between miRNAs-lncRNAs and diseases is limited or done separately. In addition, those investigations, which use Machine Learning models joined with genomic sequence features extracted from miRNAs and lncRNAs, are few compared with using some methods such as genomic expression or Deep Learning techniques. In this paper, we propose a structure of using supervised and unsupervised machine learning models with genomic sequence features, such as k-mers, sequence alignments, and energy folding values, to validate miRNAs and lncRNAs association with breast cancer and neoplasms scenarios. Using One-Class SVM for outlier detection and comparing two supervised models such as SVM and Random Forest, we manage to obtain accuracy results of 95.44% for the One-class model, with 88.79% and 99.65% for the SVM and Random Forest models, respectively. The results showed a promising path for the study of sequence features interactions joined with Machine Learning models comparable to those found in the existing literature.
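Of the sequence features named in this abstract, k-mers are the simplest to illustrate. The sketch below turns an RNA sequence into a normalized k-mer frequency vector; the function name, the default `k=3`, the `ACGU` alphabet, and the T-to-U conversion are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=3, alphabet="ACGU"):
    """Return the frequency of every possible k-mer over `alphabet`,
    in lexicographic product order, normalized by the window count."""
    seq = seq.upper().replace("T", "U")  # treat DNA input as RNA
    n_windows = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    total = max(n_windows, 1)  # avoid division by zero for short input
    return [counts["".join(p)] / total for p in product(alphabet, repeat=k)]


# "ACGA" contains the 2-mers AC, CG and GA, one occurrence each.
vector = kmer_features("ACGA", k=2)
```

The resulting fixed-length vector (4^k entries) can be fed to classifiers such as an SVM or a random forest regardless of the original sequence length.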
22
Song J, Zhang Z. Long non‑coding RNA SNHG20 promotes cell proliferation, migration and invasion in retinoblastoma via the miR‑335‑5p/E2F3 axis. Mol Med Rep 2021; 24:543. [PMID: 34080033 DOI: 10.3892/mmr.2021.12182] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/18/2020] [Accepted: 03/08/2021] [Indexed: 11/05/2022] Open
Abstract
Current therapies for retinoblastoma (RB) are unsatisfactory and there is an urgent need for the development of new treatment modalities. Small nucleolar RNA host gene 20 (SNHG20) has been reported to serve a key oncogenic role in the development of various types of cancer, but its role in RB tumorigenesis remains to be fully determined. The present study aimed to investigate the expression patterns and biological roles of SNHG20 in RB. The expression levels of SNHG20 were measured via reverse transcription‑quantitative PCR in RB tissues and cell lines. The impact of SNHG20 status on cell proliferation, survival, migration and invasion was determined using small interfering RNA and a range of established experimental assays. The SNHG20/microRNA (miR)‑335‑5p/E2F transcription factor 3 (E2F3) signaling axis was further investigated using a dual‑luciferase activity reporter system and an RNA pull‑down assay combined with bioinformatics analyses. SNHG20 expression was significantly increased in RB tissues and cell lines. Silencing of SNHG20 in RB cells was shown to inhibit cell proliferation, clonogenic survival, migration and invasion. Moreover, mechanistic investigations demonstrated that SNHG20 could enhance the expression of E2F3 by sponging of miR‑335‑5p. These data suggested that the long non‑coding RNA SNHG20 may promote cell proliferation, migration and invasion in RB via the miR‑335‑5p/E2F3 axis.
Affiliation(s)
- Jing Song
- Department of Ophthalmology, The First People's Hospital of Lianyungang, Lianyungang, Jiangsu 222000, P.R. China
- Ziping Zhang
- Department of Ophthalmology, The First People's Hospital of Lianyungang, Lianyungang, Jiangsu 222000, P.R. China
23
Kang Q, Meng J, Shi W, Luan Y. Ensemble Deep Learning Based on Multi-level Information Enhancement and Greedy Fuzzy Decision for Plant miRNA-lncRNA Interaction Prediction. Interdiscip Sci 2021; 13:603-614. [PMID: 33900552 DOI: 10.1007/s12539-021-00434-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/13/2021] [Revised: 04/01/2021] [Accepted: 04/16/2021] [Indexed: 12/18/2022]
Abstract
MicroRNAs (miRNAs) and long non-coding RNAs (lncRNAs) are both non-coding RNAs (ncRNAs), and their interactions play important roles in biological processes. Computational methods, such as machine learning and various bioinformatics tools, can predict potential miRNA-lncRNA interactions, which is significant for studying their mechanisms and biological functions. A growing number of RNA interaction predictors for animals have been reported, but they are unreliable for plants because of the differences between animal and plant ncRNAs. A reliable plant predictor, especially one that works across species, is urgently needed. This paper proposes an ensemble deep learning model based on multi-level information enhancement and greedy fuzzy decision (PmliPEMG) for plant miRNA-lncRNA interaction prediction. Fusion of complex features, multi-scale convolutional long short-term memory networks, and an attention mechanism are adopted to enhance the sample information at the feature, scale, and model levels, respectively. An ensemble deep learning model is built based on a novel method (greedy fuzzy decision) which greatly improves efficiency. The multi-level information enhancement and greedy fuzzy decision are verified to have positive effects on prediction performance. PmliPEMG can be applied to cross-species prediction. It shows better performance and stronger generalization ability than state-of-the-art predictors and may provide valuable references for related research.
Affiliation(s)
- Qiang Kang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
- Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
- Wenhao Shi
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
- Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian, 116024, Liaoning, China
24
Heydarzadeh S, Ranjbar M, Karimi F, Seif F, Alivand MR. Overview of host miRNA properties and their association with epigenetics, long non-coding RNAs, and Xeno-infectious factors. Cell Biosci 2021; 11:43. [PMID: 33632341 PMCID: PMC7905430 DOI: 10.1186/s13578-021-00552-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/27/2020] [Accepted: 02/06/2021] [Indexed: 12/19/2022] Open
Abstract
MicroRNA-derived structures play impressive roles in various biological processes, so dysregulation of miRNAs can lead to different human diseases. Recent studies have extended our comprehension of the control of miRNA function and features. Here, we overview some remarkable miRNA properties that have potential implications for miRNA function, including different variants of a miRNA called isomiRs, miRNA arm selection/arm switching, and the effect of these factors on miRNA target selection. In addition, we review some aspects of miRNA interactions: the interaction between epigenetics and miRNA (different miRNAs and their related processing enzymes are epigenetically regulated by multiple DNA methylation enzymes; moreover, DNA methylation can be controlled by diverse mechanisms related to miRNAs); direct and indirect crosstalk between miRNAs and long non-coding RNAs (lncRNAs) as a further mode of intercellular regulation, called "competing endogenous RNA" (ceRNA), that is involved in the pathogenesis of different diseases; and the interaction between miRNA activities and some xeno-infectious (virus/bacteria/parasite) factors, which modulates the pathogenesis of infections. This review surveys related studies to support a better understanding of the mechanisms of miRNA involvement and to help overcome the complexity of related diseases, which may be useful for prognostic, diagnostic, and therapeutic purposes and for personalized medicine in the future.
Affiliation(s)
- Samaneh Heydarzadeh
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Maryam Ranjbar
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Farokh Karimi
- Department of Biotechnology, Faculty of Science, University of Maragheh, Maragheh, Iran
- Farhad Seif
- Department of Immunology and Allergy, Academic Center for Education, Culture, and Research (ACECR), Tehran, Iran
- Neuroscience Research Center, Iran University of Medical Sciences, Tehran, Iran
- Mohammad Reza Alivand
- Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
25
Cantile M, Di Bonito M, Tracey De Bellis M, Botti G. Functional Interaction among lncRNA HOTAIR and MicroRNAs in Cancer and Other Human Diseases. Cancers (Basel) 2021; 13:570. [PMID: 33540611 PMCID: PMC7867281 DOI: 10.3390/cancers13030570] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 12/31/2020] [Revised: 01/22/2021] [Accepted: 01/28/2021] [Indexed: 02/06/2023] Open
Abstract
Simple Summary: This review aimed to describe the contribution of the functional interaction between the lncRNA HOTAIR and microRNAs in human diseases, including cancer. HOTAIR/miRNA complexes interfere with different cellular processes during carcinogenesis, mainly by deregulating a series of oncogenic signaling pathways. A great number of ncRNA-related databases have been established, supported by bioinformatics technologies, to identify the ncRNA-mediated sponge regulatory network. These approaches need experimental validation through cell and animal model studies. The optimization of systems to interfere with the HOTAIR/miRNA interplay could represent a new tool for the definition of diagnostic therapeutics in cancer patients.
Abstract: LncRNAs are a class of non-coding RNAs mostly involved in the regulation of cancer initiation, metastatic progression, and drug resistance, through participation in post-transcriptional regulatory processes by interacting with different miRNAs. LncRNAs are able to compete with endogenous RNAs by binding and sequestering miRNAs, thereby regulating the expression of their target genes, which are often oncogenes. The lncRNA HOX transcript antisense RNA (HOTAIR) represents a diagnostic, prognostic, and predictive biomarker in many human cancers, and its functional interaction with miRNAs has been described as crucial in the modulation of different cellular processes during cancer development. The aim of this review is to highlight the relation between the lncRNA HOTAIR and different microRNAs in human diseases, discussing the contribution of these functional interactions, especially in cancer development and progression.
Affiliation(s)
- Monica Cantile
  - Pathology Unit, Istituto Nazionale Tumori-Irccs-Fondazione G.Pascale, 80131 Naples, Italy
  - Correspondence: Tel.: +39-081-590-3471; Fax: +39-081-590-3718
- Maurizio Di Bonito
  - Pathology Unit, Istituto Nazionale Tumori-Irccs-Fondazione G.Pascale, 80131 Naples, Italy
- Maura Tracey De Bellis
  - Scientific Direction, Istituto Nazionale Tumori-Irccs-Fondazione G.Pascale, 80131 Naples, Italy
- Gerardo Botti
  - Scientific Direction, Istituto Nazionale Tumori-Irccs-Fondazione G.Pascale, 80131 Naples, Italy
|
26
|
Wu QW, Xia JF, Ni JC, Zheng CH. GAERF: predicting lncRNA-disease associations by graph auto-encoder and random forest. Brief Bioinform 2021; 22:6067881. [PMID: 33415333 DOI: 10.1093/bib/bbaa391] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 09/23/2020] [Revised: 11/26/2020] [Accepted: 11/30/2020] [Indexed: 12/11/2022] Open
Abstract
Predicting disease-related long non-coding RNAs (lncRNAs) is beneficial for finding new biomarkers for the prevention, diagnosis and treatment of complex human diseases. In this paper, we propose a machine learning-based classification approach to identify disease-related lncRNAs using a graph auto-encoder (GAE) and random forest (RF), termed GAERF. First, we combined the relationships among lncRNAs, miRNAs and diseases into a heterogeneous network. Then, low-dimensional representation vectors of nodes were learned from the network by the GAE, which reduces the dimension and heterogeneity of the biological data. Taking these feature vectors as input, we trained an RF classifier to predict new lncRNA-disease associations (LDAs). Experimental results show that the proposed representation characterizes lncRNA-disease pairs accurately. GAERF achieves superior performance owing to the ensemble learning method, outperforming other methods significantly. Moreover, case studies further demonstrated that GAERF is an effective method for predicting LDAs.
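As a rough illustration of the first step described in this abstract (combining lncRNA, miRNA and disease relationships into one heterogeneous network), the sketch below builds a symmetric adjacency matrix in plain Python. The node names, the edge list, and the function name `build_heterogeneous_adjacency` are hypothetical and are not taken from the GAERF paper; the graph auto-encoder and random forest stages are not shown.

```python
def build_heterogeneous_adjacency(lncrnas, mirnas, diseases, edges):
    """Return (nodes, adjacency) for an undirected heterogeneous network.

    Nodes are ordered as lncRNAs, then miRNAs, then diseases, so the matrix
    has a block structure. Node names are assumed to be unique across types.
    """
    nodes = list(lncrnas) + list(mirnas) + list(diseases)
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        adjacency[i][j] = 1  # known association a-b
        adjacency[j][i] = 1  # keep the matrix symmetric
    return nodes, adjacency


# Toy, made-up associations used only to show the data layout.
toy_nodes, toy_adjacency = build_heterogeneous_adjacency(
    lncrnas=["lnc_A"],
    mirnas=["mir_1", "mir_2"],
    diseases=["dis_X"],
    edges=[("lnc_A", "mir_1"), ("mir_1", "dis_X")],
)
```

A matrix of this kind is a common input format for graph embedding methods; whether GAERF uses exactly this layout is not stated in the abstract.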
Affiliation(s)
- Qing-Wen Wu
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
- Jun-Feng Xia
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, China
- Jian-Cheng Ni
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Chun-Hou Zheng
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
|
27
|
A Compressive Review about Taxol®: History and Future Challenges. Molecules 2020; 25:molecules25245986. [PMID: 33348838 PMCID: PMC7767101 DOI: 10.3390/molecules25245986] [Citation(s) in RCA: 141] [Impact Index Per Article: 35.3] [Received: 11/30/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/17/2022] Open
Abstract
Taxol®, which is also known as paclitaxel, is a chemotherapeutic agent widely used to treat different cancers. Since the discovery of its antitumoral activity, Taxol® has been used to treat over one million patients, making it one of the most widely employed antitumoral drugs. Taxol® was the first microtubule targeting agent described in the literature, with its main mechanism of action consisting of the disruption of microtubule dynamics, thus inducing mitotic arrest and cell death. However, secondary mechanisms for achieving apoptosis have also been demonstrated. Despite its wide use, Taxol® has certain disadvantages. The main challenges facing Taxol® are the need to find an environmentally sustainable production method based on the use of microorganisms, increase its bioavailability without exerting adverse effects on the health of patients and minimize the resistance presented by a high percentage of cells treated with paclitaxel. This review details, in a succinct manner, the main aspects of this important drug, from its discovery to the present day. We highlight the main challenges that must be faced in the coming years, in order to increase the effectiveness of Taxol® as an anticancer agent.
|
28
|
Alam T, Al-Absi HRH, Schmeier S. Deep Learning in LncRNAome: Contribution, Challenges, and Perspectives. Noncoding RNA 2020; 6:E47. [PMID: 33266128 PMCID: PMC7711891 DOI: 10.3390/ncrna6040047] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/25/2020] [Revised: 10/27/2020] [Accepted: 11/06/2020] [Indexed: 12/11/2022] Open
Abstract
Long non-coding RNAs (lncRNA), the pervasively transcribed part of the mammalian genome, have played a significant role in changing our protein-centric view of genomes. The abundance of lncRNAs and their diverse roles across cell types have opened numerous avenues for the research community regarding lncRNAome. To discover and understand lncRNAome, many sophisticated computational techniques have been leveraged. Recently, deep learning (DL)-based modeling techniques have been successfully used in genomics due to their capacity to handle large amounts of data and produce relatively better results than traditional machine learning (ML) models. DL-based modeling techniques have now become a choice for many modeling tasks in the field of lncRNAome as well. In this review article, we summarized the contribution of DL-based methods in nine different lncRNAome research areas. We also outlined DL-based techniques leveraged in lncRNAome, highlighting the challenges computational scientists face while developing DL-based models for lncRNAome. To the best of our knowledge, this is the first review article that summarizes the role of DL-based techniques in multiple areas of lncRNAome.
Affiliation(s)
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar
- Hamada R. H. Al-Absi
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar
- Sebastian Schmeier
  - School of Natural and Computational Sciences, Massey University, Auckland 0632, New Zealand
|
29
|
Wang W, Guan X, Khan MT, Xiong Y, Wei DQ. LMI-DForest: A deep forest model towards the prediction of lncRNA-miRNA interactions. Comput Biol Chem 2020; 89:107406. [PMID: 33120126 DOI: 10.1016/j.compbiolchem.2020.107406] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/16/2020] [Revised: 10/12/2020] [Accepted: 10/15/2020] [Indexed: 02/07/2023]
Abstract
The interactions between miRNAs and long non-coding RNAs (lncRNAs) have been the subject of intensive recent study due to their critical role in gene regulation. Computational prediction of lncRNA-miRNA interactions has become a popular alternative to experimental methods for identifying the underlying interactions. It is desirable to develop machine learning-based models for predicting lncRNA-miRNA interactions from experimentally validated interactions between lncRNAs and miRNAs. The accuracy and robustness of existing machine learning-based models leave room for further improvement. Considering that the attributes of lncRNAs and miRNAs are of key importance in the interaction between these two RNAs, a deep learning model named LMI-DForest is proposed here, combining the deep forest and autoencoder strategies. Systematic comparison on experimentally validated lncRNA-miRNA interaction datasets demonstrates that the proposed method consistently shows superior performance over the other machine learning models in lncRNA-miRNA interaction prediction.
Affiliation(s)
- Wei Wang
  - School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
- Xiaoqing Guan
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Muhammad Tahir Khan
  - Institute of Molecular Biology and Biotechnology, The University of Lahore Pakistan, Pakistan
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Peng Cheng Laboratory, Shenzhen, Guangdong, China
|
30
|
Kang Q, Meng J, Cui J, Luan Y, Chen M. PmliPred: a method based on hybrid model and fuzzy decision for plant miRNA-lncRNA interaction prediction. Bioinformatics 2020; 36:2986-2992. [PMID: 32087005 DOI: 10.1093/bioinformatics/btaa074] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 10/14/2019] [Revised: 12/18/2019] [Accepted: 01/27/2020] [Indexed: 12/28/2022] Open
Abstract
MOTIVATION: Studies have indicated that not only do microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) play important roles in biological activities, but their interactions also affect biological processes. A growing number of studies focus on miRNA-lncRNA interactions, but few of them address plants. Predicting these interactions is important for understanding the mechanism of interaction between miRNAs and lncRNAs in plants. RESULTS: This article proposes a new method for plant miRNA-lncRNA interaction prediction (PmliPred). A deep learning model and a shallow machine learning model are trained using raw sequences and manually extracted features, respectively. They are then hybridized based on fuzzy decision for prediction. PmliPred shows better performance and generalization ability than existing methods. Several new miRNA-lncRNA interactions in Solanum lycopersicum were successfully identified using quantitative real-time polymerase chain reaction from the candidates predicted by PmliPred, which further verifies its effectiveness. AVAILABILITY AND IMPLEMENTATION: The source code of PmliPred is freely available at http://bis.zju.edu.cn/PmliPred/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qiang Kang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Jun Meng
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Jun Cui
  - School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Yushi Luan
  - School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Ming Chen
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
31
|
Application of deep learning in genomics. SCIENCE CHINA-LIFE SCIENCES 2020; 63:1860-1878. [PMID: 33051704 DOI: 10.1007/s11427-020-1804-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/13/2020] [Accepted: 08/15/2020] [Indexed: 12/19/2022]
Abstract
In recent years, deep learning has been widely used in diverse fields of research, such as speech recognition, image classification, autonomous driving and natural language processing. Deep learning has shown dramatically improved performance in complex classification and regression problems, where the intricate structure in high-dimensional data is difficult to discover using conventional machine learning algorithms. In biology, applications of deep learning are gaining popularity in predicting the structure and function of genomic elements, such as promoters, enhancers, or gene expression levels. In this review paper, we describe the basic concepts of machine learning and artificial neural networks, followed by an elaboration of the workflow for using convolutional neural networks in genomics. We then provide a concise introduction to deep learning applications in genomics and synthetic biology at the levels of DNA, RNA and protein. Finally, we discuss the current challenges and future perspectives of deep learning in genomics.
|
32
|
Yang S, Wang Y, Lin Y, Shao D, He K, Huang L. LncMirNet: Predicting LncRNA-miRNA Interaction Based on Deep Learning of Ribonucleic Acid Sequences. Molecules 2020; 25:E4372. [PMID: 32977679 PMCID: PMC7583909 DOI: 10.3390/molecules25194372] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 08/03/2020] [Revised: 09/19/2020] [Accepted: 09/22/2020] [Indexed: 12/22/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) are both non-coding RNAs that play significant regulatory roles in many life processes. There is accumulating evidence that the interaction patterns between lncRNAs and miRNAs are closely related to cancer development, gene regulation, cellular metabolic processes, etc. At the same time, with the rapid development of RNA sequencing technology, numerous novel lncRNAs and miRNAs have been found, which might help to explore novel regulatory patterns. However, the growing number of unknown interactions between lncRNAs and miRNAs may hinder the discovery of novel regulatory patterns, and wet-lab experiments to identify potential interactions are costly and time-consuming. Furthermore, few computational tools are available for predicting lncRNA-miRNA interactions at the sequence level. In this paper, we propose a hybrid sequence feature-based model, LncMirNet (lncRNA-miRNA interactions network), to predict lncRNA-miRNA interactions via deep convolutional neural networks (CNN). First, four categories of sequence-based features are introduced to encode lncRNA/miRNA sequences: k-mer (k = 1, 2, 3, 4), composition transition distribution (CTD), doc2vec, and graph embedding features. Then, to fit the CNN learning pattern, a histogram-dd method is incorporated to fuse multiple types of features into a matrix. Finally, LncMirNet attained excellent performance in comparison with six other state-of-the-art methods on a real dataset collected from lncRNASNP2 via five-fold cross validation. LncMirNet increased both accuracy and area under the curve (AUC) by more than 3% over the other tools, and improved the Matthews correlation coefficient (MCC) by more than 6%. These results show that LncMirNet can obtain high confidence in predicting potential interactions between lncRNAs and miRNAs.
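To make the k-mer features mentioned in this abstract concrete, the following is a minimal sketch of k-mer frequency encoding for an RNA sequence with k = 1, 2, 3, 4. It is a generic illustration under stated assumptions, not the LncMirNet implementation: the function names, the A/C/G/U alphabet ordering, the T-to-U conversion, and the per-k normalization are all choices made here for the example.

```python
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGU"):
    """Return normalized k-mer frequencies in lexicographic k-mer order."""
    seq = seq.upper().replace("T", "U")  # assumption: accept DNA-style input
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


def kmer_feature_vector(seq, ks=(1, 2, 3, 4)):
    """Concatenate the k-mer frequency vectors for every k in ks."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    return features


# Made-up sequence, used only to show the shape of the output.
example_vector = kmer_feature_vector("AUGGCUAGCUAGGA")
```

With the four-letter alphabet and k = 1 to 4, the concatenated vector has 4 + 16 + 64 + 256 = 340 entries, regardless of sequence length.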
Affiliation(s)
- Sen Yang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Yu Lin
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Dan Shao
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Kai He
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Lan Huang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
|