1
Dias BDS, Diniz LFA, Corrêa LD, de Souza RP, Ferreira LT, Pasqualin DDC, de Cicco R, da Silva EHT, Severino P. Comparative analysis of miRNA-mRNA interaction prediction tools based on experimental head and neck cancer data. EINSTEIN-SAO PAULO 2025; 23:eAO1372. [PMID: 40266039 PMCID: PMC12061445 DOI: 10.31744/einstein_journal/2025ao1372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Accepted: 10/20/2024] [Indexed: 04/24/2025]
Abstract
BACKGROUND We evaluated the performance of TargetScan, miRDB, and miRWalk for predicting miRNA-mRNA interactions in HNSCC. Based on clinical tumor and cancer-free tissue data, miRWalk emerged as the most comprehensive tool. Validation using NanoString technology and miRTarBase confirmed key predictions, highlighting the important roles of the PI3K-Akt and Wnt pathways. This study underscores the importance of integrating bioinformatics and experimental data to better understand HNSCC.
■ miRWalk had the highest number of predicted interactions and validated miRNA networks in HNSCC.
■ Around 3.3% of interactions overlapped across tools, emphasizing the need for multitool approaches.
■ Dysregulated genes and miRNAs were tied to the cancer-driving PI3K-Akt and Wnt pathways.
■ The validated approach highlights the importance of integrating computational and molecular data.
OBJECTIVE Head and neck squamous cell carcinoma (HNSCC) has a poor prognosis, largely due to late diagnosis and a lack of reliable biomarkers. MicroRNAs (miRNAs), small non-coding RNAs that regulate gene expression, are promising biomarkers for HNSCC. This study evaluated miRNA-mRNA interactions in HNSCC using conventional computational tools and validated the results using molecular data.
METHODS We compared three miRNA-mRNA interaction prediction tools, TargetScan, miRDB, and miRWalk, using differentially expressed miRNAs and mRNAs from HNSCC and cancer-free tissues. NanoString nCounter was used to measure miRNA and mRNA expression, and the miRTarBase database was used to validate the predicted miRNA-mRNA interactions.
RESULTS TargetScan and miRWalk provide a comprehensive overview of potential interactions, whereas miRDB provides functional insights. We identified 77 differentially expressed miRNAs and 154 differentially expressed mRNAs in HNSCC. miRWalk predicted the highest number of miRNA-mRNA interactions, followed by miRDB and TargetScan. Only 3.3% of interactions were common among the tools. The miRTarBase analysis confirmed a small subset of the predictions. Biological pathway analysis highlighted the dysregulation of PI3K-Akt and Wnt signaling; miRWalk was the best tool for elucidating how miRNAs modulate target mRNAs in these key pathways during HNSCC progression.
CONCLUSION miRWalk emerged as the most robust tool for predicting miRNA-mRNA interactions. Our findings highlight the importance of integrating bioinformatics predictions with experimental data to better understand the regulatory networks in HNSCC and identify potential biomarkers for diagnosis and therapy.
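The cross-tool overlap reported in this abstract can be illustrated with a small set computation. The sketch below is only an assumption about how such a figure might be derived: it takes the share of (miRNA, gene) pairs predicted by every tool relative to the union of all predictions, which may differ from the denominator the authors used. The function name `overlap_summary` and the toy interaction pairs are hypothetical and are not taken from the study.

```python
from typing import Dict, Set, Tuple

Interaction = Tuple[str, str]  # (miRNA, target gene)


def overlap_summary(predictions: Dict[str, Set[Interaction]]) -> Dict[str, float]:
    """Count interactions predicted by all tools, relative to the union."""
    if not predictions:
        return {"union": 0, "common": 0, "percent_common": 0.0}
    sets = list(predictions.values())
    union = set().union(*sets)          # predicted by at least one tool
    common = set.intersection(*sets)    # predicted by every tool
    percent = 100.0 * len(common) / len(union) if union else 0.0
    return {"union": len(union), "common": len(common), "percent_common": percent}


# Hypothetical toy predictions, for illustration only.
toy_predictions = {
    "TargetScan": {("miR-21-5p", "PTEN"), ("miR-21-5p", "PDCD4")},
    "miRDB": {("miR-21-5p", "PTEN"), ("miR-155-5p", "SOCS1")},
    "miRWalk": {
        ("miR-21-5p", "PTEN"),
        ("miR-21-5p", "PDCD4"),
        ("miR-155-5p", "SOCS1"),
        ("miR-31-5p", "LATS2"),
    },
}

summary = overlap_summary(toy_predictions)
print(summary)
```

Normalizing by the union is one reasonable choice; normalizing by the smallest tool's prediction set would give a larger percentage, so the denominator should always be stated alongside the figure.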
Affiliation(s)
- Bárbara dos Santos Dias
- Hospital Israelita Albert Einstein, São Paulo, SP, Brazil.
- Programa de Pós-Graduação Interunidades em Biotecnologia, Universidade de São Paulo, São Paulo, SP, Brazil.
- Lucca D’Arco Corrêa
- Hospital Israelita Albert Einstein, São Paulo, SP, Brazil.
- Universidade Estadual Paulista "Júlio de Mesquita Filho", São Paulo, SP, Brazil.
- Rafael Pereira de Souza
- Instituto do Câncer Dr. Arnaldo Vieira de Carvalho, São Paulo, SP, Brazil.
- Leticia Torres Ferreira
- Hospital Israelita Albert Einstein, São Paulo, SP, Brazil.
- Denise da Cunha Pasqualin
- Hospital Israelita Albert Einstein, São Paulo, SP, Brazil.
- Rafael de Cicco
- Instituto do Câncer Dr. Arnaldo Vieira de Carvalho, São Paulo, SP, Brazil.
- Eloiza Helena Tajara da Silva
- Faculdade de Medicina de São José do Rio Preto, São José do Rio Preto, SP, Brazil.
- Patricia Severino
- Hospital Israelita Albert Einstein, São Paulo, SP, Brazil.
- Programa de Pós-Graduação Interunidades em Biotecnologia, Universidade de São Paulo, São Paulo, SP, Brazil.
2
Wang Y, Wang B, Zou J, Wu A, Liu Y, Wan Y, Luo J, Wu J. Capsule neural network and its applications in drug discovery. iScience 2025; 28:112217. [PMID: 40241764 PMCID: PMC12002614 DOI: 10.1016/j.isci.2025.112217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2025]
Abstract
Deep learning holds great promise in drug discovery, yet its application is hindered by high labeling costs and limited datasets. Developing algorithms that effectively learn from sparsely labeled data is crucial. Capsule networks (CapsNet), introduced in 2017, solve the spatial information loss in traditional neural networks and excel in handling small datasets by capturing spatial hierarchical relationships among features. This capability makes CapsNet particularly promising for drug discovery, where data scarcity is a common challenge. Various modified CapsNet architectures have been successfully applied to drug design and discovery tasks. This review provides a comprehensive analysis of CapsNet's theoretical foundations, its current applications in drug discovery, and its performance in addressing key challenges in the field. Additionally, the study highlights the limitations of CapsNet and outlines potential future research directions to further enhance its utility in drug discovery, offering valuable insights for researchers in both computational and pharmaceutical sciences.
Affiliation(s)
- Yiwei Wang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Key Laboratory of Medical Electrophysiology, Ministry of Education & Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou 646000, China
- Binyou Wang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Anguo Wu
- Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
- Yuan Liu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Ying Wan
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jiesi Luo
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jianming Wu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Key Laboratory of Medical Electrophysiology, Ministry of Education & Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou 646000, China
- Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
3
Mohebbi M, Manzourolajdad A, Bennett E, Williams P. A Multi-Input Neural Network Model for Accurate MicroRNA Target Site Detection. Noncoding RNA 2025; 11:23. [PMID: 40126347 PMCID: PMC11932204 DOI: 10.3390/ncrna11020023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2024] [Revised: 02/07/2025] [Accepted: 03/03/2025] [Indexed: 03/25/2025]
Abstract
(1) Background: MicroRNAs are non-coding RNA sequences that regulate cellular functions by targeting messenger RNAs and inhibiting protein synthesis. Identifying their target sites is vital to understanding their roles. However, it is challenging due to the high cost and time demands of experimental methods and the high false-positive rates of computational approaches. (2) Methods: We introduce a Multi-Input Neural Network (MINN) algorithm that integrates diverse biologically relevant features, including the microRNA duplex structure, substructures, minimum free energy, and base-pairing probabilities. For each feature derived from a microRNA target-site duplex, we create a corresponding image. These images are processed in parallel by the MINN algorithm, allowing it to learn a comprehensive and precise representation of the underlying biological mechanisms. (3) Results: On an experimentally validated test set, our method detects target sites with an AUPRC of 0.9373, a precision of 0.8725, and a recall of 0.8703, and it outperforms several commonly used computational methods for microRNA target-site prediction. (4) Conclusions: Incorporating diverse biologically explainable features, such as duplex structure, substructures, their MFEs, and binding probabilities, enables our model to perform well on experimentally validated test data. These features, rather than nucleotide sequences, enable our model to generalize beyond specific sequence contexts and perform well on sequentially distant samples.
Affiliation(s)
- Mohammad Mohebbi
- Department of Computer Science and Information Science, University of North Georgia, Dahlonega, GA 30597, USA
- Ethan Bennett
- Department of Computer Science and Information Science, University of North Georgia, Dahlonega, GA 30597, USA
- Phillip Williams
- Department of Computer Science and Information Science, University of North Georgia, Dahlonega, GA 30597, USA
4
Cuinat C, Pan J, Comelli EM. Host-dependent alteration of the gut microbiota: the role of luminal microRNAs. MICROBIOME RESEARCH REPORTS 2025; 4:15. [PMID: 40207285 PMCID: PMC11977366 DOI: 10.20517/mrr.2024.46] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2024] [Revised: 01/22/2025] [Accepted: 02/10/2025] [Indexed: 04/11/2025]
Abstract
MicroRNAs (miRNAs) are short, non-coding RNAs that play gene expression regulatory roles in eukaryotes. MiRNAs are also released in body fluids, and in the intestine, they are found in the lumen and feces. Here, together with exogenous dietary-derived miRNAs, they constitute the fecal miRNome. Several miRNAs were identified in the feces of healthy adults, including, as shown here, core miRNAs hsa-miR-21-5p and hsa-miR-1246. These miRNAs are important for intestinal homeostasis. Recent evidence suggests that miRNAs may interact with gut bacteria. This represents a new avenue to understand host-bacteria crosstalk in the gut and its role in health and disease. This review provides a comprehensive overview of current knowledge on fecal miRNAs, their representation across individuals, and their effects on the gut microbiota. It also discusses existing evidence on potential mechanisms of uptake and interaction with bacterial genomes, drawing from knowledge of prokaryotic small RNAs (sRNAs) regulation of gene expression. Finally, we review in silico and experimental approaches for profiling miRNA-mRNA interactions in bacterial species, highlighting challenges in target validation. This work emphasizes the need for further research into host miRNA-bacterial interactions to better understand their regulatory roles in the gut ecosystem and support their exploitation for disease prevention and treatment.
Affiliation(s)
- Céline Cuinat
- Department of Nutritional Sciences, Faculty of Medicine, University of Toronto, Toronto M5S 1A8, Canada
- Authors contributed equally
- Jiali Pan
- Department of Nutritional Sciences, Faculty of Medicine, University of Toronto, Toronto M5S 1A8, Canada
- Authors contributed equally
- Elena M. Comelli
- Department of Nutritional Sciences, Faculty of Medicine, University of Toronto, Toronto M5S 1A8, Canada
- Joannah and Brian Lawson Centre for Child Nutrition, Faculty of Medicine, University of Toronto, Toronto M5S 1A8, Canada
5
Zhakypbek Y, Belkozhayev AM, Kerimkulova A, Kossalbayev BD, Murat T, Tursbekov S, Turysbekova G, Tursunova A, Tastambek KT, Allakhverdiev SI. MicroRNAs in Plant Genetic Regulation of Drought Tolerance and Their Function in Enhancing Stress Adaptation. PLANTS (BASEL, SWITZERLAND) 2025; 14:410. [PMID: 39942972 PMCID: PMC11820447 DOI: 10.3390/plants14030410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2024] [Revised: 01/18/2025] [Accepted: 01/23/2025] [Indexed: 02/16/2025]
Abstract
Adverse environmental conditions, including drought stress, pose a significant threat to plant survival and agricultural productivity, necessitating innovative and efficient approaches to enhance their resilience. MicroRNAs (miRNAs) are recognized as key elements in regulating plant adaptation to drought stress, with a notable ability to modulate various physiological and molecular mechanisms. This review provides an in-depth analysis of the role of miRNAs in drought response mechanisms, including abscisic acid (ABA) signaling, reactive oxygen species (ROS) detoxification, and the optimization of root system architecture. Additionally, it examines the effectiveness of bioinformatics tools, such as those employed in in silico analyses, for studying miRNA-mRNA interactions, as well as the potential for their integration with experimental methods. Advanced methods such as microarray analysis, high-throughput sequencing (HTS), and RACE-PCR are discussed for their contributions to miRNA target identification and validation. Moreover, new data and perspectives are presented on the role of miRNAs in plant responses to abiotic stresses, particularly drought adaptation. This review aims to deepen the understanding of genetic regulatory mechanisms in plants and to establish a robust scientific foundation for the development of drought-tolerant crop varieties.
Affiliation(s)
- Yryszhan Zhakypbek
- Department of Surveying and Geodesy, Mining and Metallurgical Institute Named After O.A. Baikonurov, Satbayev University, Almaty 050043, Kazakhstan
- Ayaz M. Belkozhayev
- Department of Chemical and Biochemical Engineering, Geology and Oil-Gas Business Institute Named After K. Turyssov, Satbayev University, Almaty 050043, Kazakhstan
- Department of Biotechnology, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan
- Aygul Kerimkulova
- Department of Chemical and Biochemical Engineering, Geology and Oil-Gas Business Institute Named After K. Turyssov, Satbayev University, Almaty 050043, Kazakhstan
- Bekzhan D. Kossalbayev
- Department of Chemical and Biochemical Engineering, Geology and Oil-Gas Business Institute Named After K. Turyssov, Satbayev University, Almaty 050043, Kazakhstan
- Ecology Research Institute, Khoja Akhmet Yassawi International Kazakh Turkish University, Turkistan 161200, Kazakhstan
- Sustainability of Ecology and Bioresources, Al-Farabi Kazakh National University, Al-Farabi 71, Almaty 050038, Kazakhstan
- Toktar Murat
- Department of Surveying and Geodesy, Mining and Metallurgical Institute Named After O.A. Baikonurov, Satbayev University, Almaty 050043, Kazakhstan
- Department of Agronomy and Forestry, Faculty of Agrotechnology, Kozybayev University, Petropavlovsk 150000, Kazakhstan
- Department of Soil Ecology, Kazakh Research Institute of Soil Science and Agrochemistry, Named After U.U. Uspanov, Al-Farabi Ave. 75, Almaty 050060, Kazakhstan
- Serik Tursbekov
- Department of Surveying and Geodesy, Mining and Metallurgical Institute Named After O.A. Baikonurov, Satbayev University, Almaty 050043, Kazakhstan
- Gaukhar Turysbekova
- Department of Metallurgy and Mineral Processing, Satbayev University, Almaty 050000, Kazakhstan
- Alnura Tursunova
- Kazakh Research Institute of Plant Protection and Quarantine Named After Zhazken Zhiembayev, Almaty 050070, Kazakhstan
- Kuanysh T. Tastambek
- Ecology Research Institute, Khoja Akhmet Yassawi International Kazakh Turkish University, Turkistan 161200, Kazakhstan
- Sustainability of Ecology and Bioresources, Al-Farabi Kazakh National University, Al-Farabi 71, Almaty 050038, Kazakhstan
- Suleyman I. Allakhverdiev
- Department of Plant Physiology, Faculty of Biology, M.V. Lomonosov Moscow State University, Leninskie Gory 1-12, 119991 Moscow, Russia
- Controlled Photobiosynthesis Laboratory, K.A. Timiryazev Institute of Plant Physiology RAS, Botanicheskaya Street 35, 127276 Moscow, Russia
- Faculty of Engineering and Natural Sciences, Bahcesehir University, Istanbul 34353, Turkey
6
Bereczki Z, Benczik B, Balogh OM, Marton S, Puhl E, Pétervári M, Váczy-Földi M, Papp ZT, Makkos A, Glass K, Locquet F, Euler G, Schulz R, Ferdinandy P, Ágg B. Mitigating off-target effects of small RNAs: conventional approaches, network theory and artificial intelligence. Br J Pharmacol 2025; 182:340-379. [PMID: 39293936 DOI: 10.1111/bph.17302] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 11/30/2023] [Revised: 05/07/2024] [Accepted: 06/17/2024] [Indexed: 09/20/2024]
Abstract
Three types of highly promising small RNA therapeutics, namely, small interfering RNAs (siRNAs), microRNAs (miRNAs) and the RNA subtype of antisense oligonucleotides (ASOs), offer advantages over small-molecule drugs. These small RNAs can target any gene product, opening up new avenues of effective and safe therapeutic approaches for a wide range of diseases. In preclinical research, synthetic small RNAs play an essential role in the investigation of physiological and pathological pathways as silencers of specific genes, facilitating discovery and validation of drug targets in different conditions. Off-target effects of small RNAs, however, could make it difficult to interpret experimental results in the preclinical phase and may contribute to adverse events of small RNA therapeutics. Of the two major types of off-target effects, we focused on the hybridization-dependent type, especially miRNA-like off-target effects. Our main aim was to discuss several approaches, including sequence design, chemical modifications and target prediction, to reduce hybridization-dependent off-target effects that should be considered even at the early development phase of small RNA therapy. Because there is no standard way of predicting hybridization-dependent off-target effects, this review provides an overview of all major state-of-the-art computational methods and proposes new approaches, such as the possible inclusion of network theory and artificial intelligence (AI) in the prediction workflows. Case studies and a concise survey of experimental methods for validating in silico predictions are also presented. These methods could help to interpret experimental results, to minimize off-target effects and, hopefully, to avoid off-target-related adverse events of small RNA therapeutics. LINKED ARTICLES: This article is part of a themed issue Non-coding RNA Therapeutics. To view the other articles in this section visit http://onlinelibrary.wiley.com/doi/10.1111/bph.v182.2/issuetoc.
Affiliation(s)
- Zoltán Bereczki
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Bettina Benczik
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, Szeged, Hungary
- Olivér M Balogh
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Szandra Marton
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- Eszter Puhl
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- Mátyás Pétervári
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Sanovigado Kft, Budapest, Hungary
- Máté Váczy-Földi
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Zsolt Tamás Papp
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- András Makkos
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, Szeged, Hungary
- Kimberly Glass
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Fabian Locquet
- Physiologisches Institut, Justus-Liebig-Universität Gießen, Giessen, Germany
- Gerhild Euler
- Physiologisches Institut, Justus-Liebig-Universität Gießen, Giessen, Germany
- Rainer Schulz
- Physiologisches Institut, Justus-Liebig-Universität Gießen, Giessen, Germany
- Péter Ferdinandy
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, Szeged, Hungary
- Bence Ágg
- Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary
- HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, Szeged, Hungary
7
Yoon S, Yoon H, Cho J, Lee K. AEmiGAP: AutoEncoder-Based miRNA-Gene Association Prediction Using Deep Learning Method. Int J Mol Sci 2024; 25:13075. [PMID: 39684787 DOI: 10.3390/ijms252313075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Revised: 11/28/2024] [Accepted: 12/03/2024] [Indexed: 12/18/2024]
Abstract
MicroRNAs (miRNAs) play a crucial role in gene regulation and are strongly linked to various diseases, including cancer. This study presents AEmiGAP, an advanced deep learning model that integrates autoencoders with long short-term memory (LSTM) networks to predict miRNA-gene associations. By enhancing feature extraction through autoencoders, AEmiGAP captures intricate, latent relationships between miRNAs and genes with unprecedented accuracy, outperforming all existing models in miRNA-gene association prediction. A thoroughly curated dataset of positive and negative miRNA-gene pairs was generated using distance-based filtering methods, significantly improving the model's AUC and overall predictive accuracy. Additionally, this study proposes two case studies to highlight AEmiGAP's application: first, a top 30 list of miRNA-gene pairs with the highest predicted association scores among previously unknown pairs, and second, a list of the top 10 miRNAs strongly associated with each of five key oncogenes. These findings establish AEmiGAP as a new benchmark in miRNA-gene association prediction, with considerable potential to advance both cancer research and precision medicine.
Affiliation(s)
- Seungwon Yoon
- Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Republic of Korea
- Hyewon Yoon
- Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Republic of Korea
- Jaeeun Cho
- Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Republic of Korea
- Kyuchul Lee
- Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Republic of Korea
8
Yin R, Zhao H, Li L, Yang Q, Zeng M, Yang C, Bian J, Xie M. Gra-CRC-miRTar: The pre-trained nucleotide-to-graph neural networks to identify potential miRNA targets in colorectal cancer. Comput Struct Biotechnol J 2024; 23:3020-3029. [PMID: 39171252 PMCID: PMC11338065 DOI: 10.1016/j.csbj.2024.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/13/2024] [Accepted: 07/13/2024] [Indexed: 08/23/2024]
Abstract
Colorectal cancer (CRC) is the third most diagnosed cancer and the second deadliest cancer worldwide representing a major public health problem. In recent years, increasing evidence has shown that microRNA (miRNA) can control the expression of targeted human messenger RNA (mRNA) by reducing their abundance or translation, acting as oncogenes or tumor suppressors in various cancers, including CRC. Due to the significant up-regulation of oncogenic miRNAs in CRC, elucidating the underlying mechanism and identifying dysregulated miRNA targets may provide a basis for improving current therapeutic interventions. In this paper, we proposed Gra-CRC-miRTar, a pre-trained nucleotide-to-graph neural network framework, for identifying potential miRNA targets in CRC. Different from previous studies, we constructed two pre-trained models to encode RNA sequences and transformed them into de Bruijn graphs. We employed different graph neural networks to learn the latent representations. The embeddings generated from de Bruijn graphs were then fed into a Multilayer Perceptron (MLP) to perform the prediction tasks. Our extensive experiments show that Gra-CRC-miRTar achieves better performance than other deep learning algorithms and existing predictors. In addition, our analyses also successfully revealed 172 out of 201 functional interactions through experimentally validated miRNA-mRNA pairs in CRC. Collectively, our effort provides an accurate and efficient framework to identify potential miRNA targets in CRC, which can also be used to reveal miRNA target interactions in other malignancies, facilitating the development of novel therapeutics. The Gra-CRC-miRTar web server can be found at: http://gra-crc-mirtar.com/.
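The step of transforming an RNA sequence into a de Bruijn graph, mentioned in this abstract, can be sketched generically as follows. This is a textbook construction under stated assumptions, not the authors' pipeline: nodes are taken to be (k-1)-mers, each k-mer in the sequence contributes one directed edge, and the value k = 3, the function name `de_bruijn_edges` and the toy sequence are all hypothetical. The abstract does not state the k used or how node features are derived from the pre-trained encoders.

```python
from collections import defaultdict
from typing import Dict, List


def de_bruijn_edges(seq: str, k: int = 3) -> Dict[str, List[str]]:
    """Build a de Bruijn adjacency list: one edge per k-mer, nodes are (k-1)-mers."""
    if k < 2:
        raise ValueError("k must be at least 2")
    graph: Dict[str, List[str]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Edge from the k-mer's prefix node to its suffix node.
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Hypothetical toy RNA sequence, for illustration only.
graph = de_bruijn_edges("AUGGAUG", k=3)
for node, successors in graph.items():
    print(node, "->", successors)
```

Repeated k-mers show up as repeated edges (here the k-mer AUG occurs twice), which is one way multiplicity can be carried into a downstream graph neural network as an edge weight.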
Affiliation(s)
- Rui Yin
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Hongru Zhao
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Lu Li
- Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
- Qiang Yang
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Min Zeng
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Carl Yang
- Department of Computer Science, Emory University, Atlanta, GA, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Mingyi Xie
- Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
9
Petković M, Menkovski V. Description Generation Using Variational Auto-Encoders for Precursor microRNA. ENTROPY (BASEL, SWITZERLAND) 2024; 26:921. [PMID: 39593866 PMCID: PMC11592592 DOI: 10.3390/e26110921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Revised: 10/14/2024] [Accepted: 10/25/2024] [Indexed: 11/28/2024]
Abstract
MicroRNAs (miRNAs) are a type of non-coding RNA involved in gene regulation and can be associated with diseases such as cancer and cardiovascular and neurological diseases. As such, identifying the complete set of miRNAs in a genome can be of great relevance. Since experimental methods for novel precursor miRNA (pre-miRNA) detection are complex and expensive, computational detection using Machine Learning (ML) could be useful. Existing ML methods are often complex black boxes that do not create an interpretable structural description of pre-miRNA. In this paper, we propose a novel framework that makes use of generative modeling through Variational Auto-Encoders (VAEs) to uncover the generative factors of pre-miRNA. After training the VAE, the pre-miRNA description is developed using a decision tree on the lower-dimensional latent space. Applying the framework to miRNA classification, we obtain high reconstruction and classification performance while also developing an accurate miRNA description.
Affiliation(s)
- Marko Petković
- Department of Applied Physics and Science Education, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Eindhoven Artificial Intelligence Systems Institute, 5612AZ Eindhoven, The Netherlands
| | - Vlado Menkovski
- Eindhoven Artificial Intelligence Systems Institute, 5612AZ Eindhoven, The Netherlands
- Department of Mathematics and Computer Science, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
|
10
|
Agrawal M, Mani A. Integrative in silico approaches to analyse microRNA-mediated responses in human diseases. J Gene Med 2024; 26:e3734. [PMID: 39197943 DOI: 10.1002/jgm.3734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Revised: 07/23/2024] [Accepted: 08/13/2024] [Indexed: 09/01/2024] Open
Abstract
Advancements in sequencing technologies have facilitated the generation of omics-level information for various human diseases. High-throughput technologies have become a powerful tool for differential expression studies and transcriptional network analysis. An understanding of complex transcriptional networks in human diseases requires integration of datasets representing different RNA species, including microRNA (miRNA) and messenger RNA (mRNA). This review emphasises a conceptual explanation of a generalized workflow and methodologies for analysing miRNA-mediated responses in human diseases using different in silico analyses. Although there have been many prior explorations of miRNA-mediated responses in human diseases, their advantages and limitations, and ways of overcoming those limitations through different statistical techniques, have not yet been discussed. This review focuses on miRNAs as important gene regulators in human diseases, methodologies for miRNA-target gene prediction, and data-driven methods for enrichment and network analysis of miRNome-targetome interactions. Additionally, it proposes an integrative workflow to analyse structural components of networks obtained from high-throughput data. This review explains how to apply the existing methods to analyse miRNA-mediated responses in human diseases. It addresses the unique characteristics of different analyses, their limitations, and the statistical solutions that influence the choice of methods throughout a workflow. Moreover, it provides an overview of promising common integrative approaches to comprehend miRNA-mediated gene regulatory events in biological processes in humans. The proposed methodologies and workflow should help in the analysis of multi-source data to identify molecular signatures of various human diseases.
Affiliation(s)
- Meghna Agrawal
- Department of Biotechnology, Motilal Nehru Institute of Technology Allahabad, Prayagraj, India
| | - Ashutosh Mani
- Department of Biotechnology, Motilal Nehru Institute of Technology Allahabad, Prayagraj, India
|
11
|
Cohen-Davidi E, Veksler-Lublinsky I. Benchmarking the negatives: Effect of negative data generation on the classification of miRNA-mRNA interactions. PLoS Comput Biol 2024; 20:e1012385. [PMID: 39186797 PMCID: PMC11379385 DOI: 10.1371/journal.pcbi.1012385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 09/06/2024] [Accepted: 08/04/2024] [Indexed: 08/28/2024] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally. In animals, this regulation is achieved via base-pairing with partially complementary sequences, mainly in the 3' UTR of messenger RNAs (mRNAs). Computational approaches that predict miRNA target interactions (MTIs) facilitate the process of narrowing down potential targets for experimental validation. The availability of new datasets of high-throughput, direct MTIs has led to the development of machine learning (ML) based methods for MTI prediction. To train an ML algorithm, it is beneficial to provide entries from all class labels (i.e., positive and negative). Currently, no high-throughput assays exist for capturing negative examples. Therefore, current ML approaches must rely on either artificially generated or inferred negative examples deduced from experimentally identified positive miRNA-target datasets. Moreover, the lack of uniform standards for generating such data leads to biased results and hampers comparisons between studies. In this comprehensive study, we collected methods for generating negative data for animal miRNA-target interactions and investigated their impact on the classification of true human MTIs. Our study relies on training ML models on a fixed positive dataset in combination with different negative datasets and evaluating their intra- and cross-dataset performance. As a result, we were able to examine each method independently and evaluate ML models' sensitivity to the methodologies utilized in negative data generation. To achieve a deep understanding of the performance results, we analyzed unique features that distinguish between datasets. In addition, we examined whether one-class classification models that utilize solely positive interactions for training are suitable for the task of MTI classification.
We demonstrate the importance of negative data in MTI classification, analyze specific methodological characteristics that differentiate negative datasets, and highlight the challenge of ML models generalizing interaction rules from training to testing sets derived from different approaches. This study provides valuable insights into the computational prediction of MTIs that can be further used to establish standards in the field.
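One way to picture an artificially generated negative is a mock miRNA obtained by shuffling a real miRNA sequence and then pairing it with a real target site. The sketch below illustrates that general idea only; the function name `make_mock_mirna`, the seed window (nucleotides 2-8), the retry limit, and the example sequences are assumptions for illustration and are not taken from the study.

```python
import random

# Assumption: the seed is taken as miRNA nucleotides 2-8 (0-based slice 1:8).
SEED_START, SEED_END = 1, 8


def make_mock_mirna(mirna, known_seeds, rng, max_tries=1000):
    """Shuffle a miRNA until its seed matches no known seed.

    Returns the shuffled sequence, or None if no candidate is found
    within max_tries attempts.
    """
    bases = list(mirna)
    for _ in range(max_tries):
        rng.shuffle(bases)
        candidate = "".join(bases)
        if candidate[SEED_START:SEED_END] not in known_seeds:
            return candidate
    return None


# Hypothetical input: two example miRNA sequences.
real_mirnas = ["UGAGGUAGUAGGUUGUAUAGUU", "UAGCUUAUCAGACUGAUGUUGA"]
known_seeds = {m[SEED_START:SEED_END] for m in real_mirnas}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
mock = make_mock_mirna(real_mirnas[0], known_seeds, rng)
print(mock)
```

Shuffling preserves nucleotide composition, so a classifier cannot separate such negatives by base content alone. A negative set built this way still carries its own bias, which is the kind of method-specific effect the abstract describes.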
Affiliation(s)
- Efrat Cohen-Davidi
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Isana Veksler-Lublinsky
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
|
12
|
Yang S, Kim SH, Yang E, Kang M, Joo JY. Molecular insights into regulatory RNAs in the cellular machinery. Exp Mol Med 2024; 56:1235-1249. [PMID: 38871819 PMCID: PMC11263585 DOI: 10.1038/s12276-024-01239-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2023] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 06/15/2024] Open
Abstract
It is apparent that various functional units within the cellular machinery are derived from RNAs. The evolution of sequencing techniques has resulted in significant insights into approaches for transcriptome studies. Organisms utilize RNA to govern cellular systems, and a heterogeneous class of RNAs is involved in regulatory functions. In particular, regulatory RNAs are increasingly recognized to participate in intricately functioning machinery across almost all levels of biological systems. These systems include those mediating chromatin arrangement, transcription, suborganelle stabilization, and posttranscriptional modifications. Any class of RNA exhibiting regulatory activity can be termed a class of regulatory RNA and is typically represented by noncoding RNAs, which constitute a substantial portion of the genome. These RNAs function based on the principle of structural changes through cis and/or trans regulation to facilitate mutual RNA‒RNA, RNA‒DNA, and RNA‒protein interactions. It has not been clearly elucidated whether regulatory RNAs identified through deep sequencing actually function in the anticipated mechanisms. This review addresses the dominant properties of regulatory RNAs at various layers of the cellular machinery and covers regulatory activities, structural dynamics, modifications, associated molecules, and further challenges related to therapeutics and deep learning.
Affiliation(s)
- Sumin Yang
- Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi-do, 15588, Republic of Korea
| | - Sung-Hyun Kim
- Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi-do, 15588, Republic of Korea
| | - Eunjeong Yang
- Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi-do, 15588, Republic of Korea
| | - Mingon Kang
- Department of Computer Science, University of Nevada, Las Vegas, NV, 89154, USA
| | - Jae-Yeol Joo
- Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi-do, 15588, Republic of Korea
|
13
|
Hwang H, Jeon H, Yeo N, Baek D. Big data and deep learning for RNA biology. Exp Mol Med 2024; 56:1293-1321. [PMID: 38871816 PMCID: PMC11263376 DOI: 10.1038/s12276-024-01243-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 06/15/2024] Open
Abstract
The exponential growth of big data in RNA biology (RB) has led to the development of deep learning (DL) models that have driven crucial discoveries. As constantly evidenced by DL studies in other fields, the successful implementation of DL in RB depends heavily on the effective utilization of large-scale datasets from public databases. In achieving this goal, data encoding methods, learning algorithms, and techniques that align well with biological domain knowledge have played pivotal roles. In this review, we provide guiding principles for applying these DL concepts to various problems in RB by demonstrating successful examples and associated methodologies. We also discuss the remaining challenges in developing DL models for RB and suggest strategies to overcome these challenges. Overall, this review aims to illuminate the compelling potential of DL for RB and ways to apply this powerful technology to investigate the intriguing biology of RNA more effectively.
Affiliation(s)
- Hyeonseo Hwang
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
| | - Hyeonseong Jeon
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Genome4me Inc., Seoul, Republic of Korea
| | - Nagyeong Yeo
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
| | - Daehyun Baek
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Genome4me Inc., Seoul, Republic of Korea
|
14
|
Bayraktar R, Fontana B, Calin GA, Nemeth K. miRNA Biology in Chronic Lymphocytic Leukemia. Semin Hematol 2024; 61:181-193. [PMID: 38724414 DOI: 10.1053/j.seminhematol.2024.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 02/23/2024] [Accepted: 03/11/2024] [Indexed: 07/13/2024]
Abstract
microRNAs (miRNAs) are a class of small non-coding RNAs that play a crucial regulatory role in fundamental biological processes and have been implicated in various diseases, including cancer. The first evidence of the cancer-related function of miRNAs was discovered in chronic lymphocytic leukemia (CLL) in the early 2000s. Alterations in miRNA expression have since been shown to strongly influence the clinical course, prognosis, and response to treatment in patients with CLL. Therefore, the identification of specific miRNA alterations not only enhances our understanding of the molecular mechanisms underlying CLL but also holds promise for the development of novel diagnostic and therapeutic strategies. This review aims to provide a comprehensive summary of the current knowledge and recent insights into miRNA dysregulation in CLL, emphasizing its pivotal roles in disease progression, including the development of the lethal Richter syndrome, and to provide an update on the latest translational research in this field.
Affiliation(s)
- Recep Bayraktar
- Translational Molecular Pathology Department, The University of Texas MD Anderson Cancer Center, Houston, TX
| | - Beatrice Fontana
- Translational Molecular Pathology Department, The University of Texas MD Anderson Cancer Center, Houston, TX; Department of Medical and Surgical Sciences (DIMEC), University of Bologna, Bologna, Italy
| | - George A Calin
- Translational Molecular Pathology Department, The University of Texas MD Anderson Cancer Center, Houston, TX; The RNA Interference and Non-coding RNA Center, The University of Texas MD Anderson Cancer Center, Houston, TX
| | - Kinga Nemeth
- Translational Molecular Pathology Department, The University of Texas MD Anderson Cancer Center, Houston, TX
|
15
|
Daniel Thomas S, Vijayakumar K, John L, Krishnan D, Rehman N, Revikumar A, Kandel Codi JA, Prasad TSK, S S V, Raju R. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome. OMICS: A Journal of Integrative Biology 2024; 28:213-233. [PMID: 38752932 DOI: 10.1089/omi.2024.0047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article offers the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, and because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases.
Affiliation(s)
- Sonet Daniel Thomas
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | - Krithika Vijayakumar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | - Levin John
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | - Deepak Krishnan
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | - Niyas Rehman
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | - Amjesh Revikumar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Kerala Genome Data Centre, Kerala Development and Innovation Strategic Council, Thiruvananthapuram, Kerala, India
| | - Jalaluddin Akbar Kandel Codi
- Department of Surgical Oncology, Yenepoya Medical College, Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
| | | | - Vinodchandra S S
- Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
| | - Rajesh Raju
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
|
16
|
Lu H, Zhang J, Cao Y, Wu S, Wei Y, Yin R. Advances in applications of artificial intelligence algorithms for cancer-related miRNA research. Zhejiang Da Xue Xue Bao Yi Xue Ban 2024; 53:231-243. [PMID: 38650448 PMCID: PMC11057993 DOI: 10.3724/zdxbyxb-2023-0511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 01/30/2024] [Indexed: 04/25/2024]
Abstract
MiRNAs are a class of small non-coding RNAs, which regulate gene expression post-transcriptionally by partial complementary base pairing. Aberrant miRNA expressions have been reported in tumor tissues and peripheral blood of cancer patients. In recent years, artificial intelligence algorithms such as machine learning and deep learning have been widely used in bioinformatic research. Compared to traditional bioinformatic tools, miRNA target prediction tools based on artificial intelligence algorithms have higher accuracy, and can successfully predict subcellular localization and redistribution of miRNAs to deepen our understanding. Additionally, the construction of clinical models based on artificial intelligence algorithms could significantly improve the mining efficiency of miRNA used as biomarkers. In this article, we summarize recent development of bioinformatic miRNA tools based on artificial intelligence algorithms, focusing on the potential of machine learning and deep learning in cancer-related miRNA research.
Affiliation(s)
- Hongyu Lu
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
| | - Jia Zhang
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
| | - Yixin Cao
- Department of Medical Oncology, Affiliated Hospital of Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
| | - Shuming Wu
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
| | - Yuan Wei
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
| | - Runting Yin
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
|
17
|
Yin R, Zhao H, Li L, Yang Q, Zeng M, Yang C, Bian J, Xie M. Gra-CRC-miRTar: The pre-trained nucleotide-to-graph neural networks to identify potential miRNA targets in colorectal cancer. bioRxiv: The Preprint Server for Biology 2024:2024.04.15.589599. [PMID: 38659732 PMCID: PMC11042274 DOI: 10.1101/2024.04.15.589599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Colorectal cancer (CRC) is the third most diagnosed cancer and the second deadliest cancer worldwide representing a major public health problem. In recent years, increasing evidence has shown that microRNA (miRNA) can control the expression of targeted human messenger RNA (mRNA) by reducing their abundance or translation, acting as oncogenes or tumor suppressors in various cancers, including CRC. Due to the significant up-regulation of oncogenic miRNAs in CRC, elucidating the underlying mechanism and identifying dysregulated miRNA targets may provide a basis for improving current therapeutic interventions. In this paper, we proposed Gra-CRC-miRTar, a pre-trained nucleotide-to-graph neural network framework, for identifying potential miRNA targets in CRC. Different from previous studies, we constructed two pre-trained models to encode RNA sequences and transformed them into de Bruijn graphs. We employed different graph neural networks to learn the latent representations. The embeddings generated from de Bruijn graphs were then fed into a Multilayer Perceptron (MLP) to perform the prediction tasks. Our extensive experiments show that Gra-CRC-miRTar achieves better performance than other deep learning algorithms and existing predictors. In addition, our analyses also successfully revealed 172 out of 201 functional interactions through experimentally validated miRNA-mRNA pairs in CRC. Collectively, our effort provides an accurate and efficient framework to identify potential miRNA targets in CRC, which can also be used to reveal miRNA target interactions in other malignancies, facilitating the development of novel therapeutics.
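The step of turning an RNA sequence into a de Bruijn graph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the k-mer size, the node features, or how the pre-trained embeddings are attached, so the function name `de_bruijn_graph`, the choice `k=3`, and the toy sequence are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(seq, k=3):
    """Build a de Bruijn graph as {(k-1)-mer: [successor (k-1)-mers]}.

    Each k-mer in seq contributes one directed edge from its prefix
    (first k-1 bases) to its suffix (last k-1 bases); repeated k-mers
    yield repeated edges.
    """
    if k < 2 or len(seq) < k:
        raise ValueError("need k >= 2 and len(seq) >= k")
    graph = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Toy sequence (hypothetical); the k-mer AUG occurs twice.
graph = de_bruijn_graph("AUGGCAUG", k=3)
for node, successors in graph.items():
    print(node, "->", successors)
```

In a graph neural network setting, each node would additionally carry a feature vector, and the repeated edge from `AU` to `UG` could be stored as an edge weight of 2 instead of a duplicate entry.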
Affiliation(s)
- Rui Yin
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- These authors contributed equally
| | - Hongru Zhao
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- These authors contributed equally
| | - Lu Li
- Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
| | - Qiang Yang
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
| | - Min Zeng
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Carl Yang
- Department of Computer Science, Emory University, Atlanta, GA, USA
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
| | - Mingyi Xie
- Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, FL, USA
|
18
|
Yang T, Wang Y, He Y. TEC-miTarget: enhancing microRNA target prediction based on deep learning of ribonucleic acid sequences. BMC Bioinformatics 2024; 25:159. [PMID: 38643080 PMCID: PMC11032603 DOI: 10.1186/s12859-024-05780-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 04/12/2024] [Indexed: 04/22/2024] Open
Abstract
BACKGROUND MicroRNAs play a critical role in regulating gene expression by binding to specific target sites within gene transcripts, making the identification of microRNA targets a prominent focus of research. Conventional experimental methods for identifying microRNA targets are both time-consuming and expensive, prompting the development of computational tools for target prediction. However, the existing computational tools exhibit limited performance in meeting the demands of practical applications, highlighting the need to improve the performance of microRNA target prediction models. RESULTS In this paper, we utilize the most popular natural language processing and computer vision technologies to propose a novel approach, called TEC-miTarget, for microRNA target prediction based on transformer encoder and convolutional neural networks. TEC-miTarget treats RNA sequences as a natural language and encodes them using a transformer encoder, a widely used encoder in natural language processing. It then combines the representations of a pair of microRNA and its candidate target site sequences into a contact map, which is a three-dimensional array similar to a multi-channel image. Therefore, the contact map's features are extracted using a four-layer convolutional neural network, enabling the prediction of interactions between microRNA and its candidate target sites. We applied a series of comparative experiments to demonstrate that TEC-miTarget significantly improves microRNA target prediction, compared with existing state-of-the-art models. Our approach is the first approach to perform comparisons with other approaches at both sequence and transcript levels. Furthermore, it is the first approach compared with both deep learning-based and seed-match-based methods. We first compared TEC-miTarget's performance with approaches at the sequence level, and our approach delivers substantial improvements in performance using the same datasets and evaluation metrics. 
Moreover, we utilized TEC-miTarget to predict microRNA targets in long mRNA sequences, which involves two steps: selecting candidate target site sequences and applying sequence-level predictions. We finally showed that TEC-miTarget outperforms other approaches at the transcript level, including the popular seed match methods widely used in previous years. CONCLUSIONS We propose a novel approach for predicting microRNA targets at both sequence and transcript levels, and demonstrate that our approach outperforms other methods based on deep learning or seed match. We also provide our approach as easy-to-use software, TEC-miTarget, at https://github.com/tingpeng17/TEC-miTarget. Our results provide new perspectives for microRNA target prediction.
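The contact map described in this abstract, a three-dimensional array built from the representations of a microRNA and a candidate target site, can be pictured with the sketch below. It is an illustration under assumptions rather than the TEC-miTarget code: one-hot vectors stand in for the transformer encoder output, and combining each position pair by elementwise product and absolute difference is a hypothetical choice that the abstract does not specify.

```python
ALPHABET = "ACGU"


def one_hot(seq):
    """Stand-in embedding: one 4-dimensional vector per nucleotide."""
    return [[1.0 if base == ref else 0.0 for ref in ALPHABET] for base in seq]


def contact_map(mirna_emb, site_emb):
    """Return a len(mirna) x len(site) x 2d nested list.

    For every (miRNA position, site position) pair, the channel vector is
    the elementwise product followed by the elementwise absolute difference
    of the two d-dimensional embeddings (an assumed combination rule).
    """
    cmap = []
    for u in mirna_emb:
        row = []
        for v in site_emb:
            product = [a * b for a, b in zip(u, v)]
            difference = [abs(a - b) for a, b in zip(u, v)]
            row.append(product + difference)
        cmap.append(row)
    return cmap


# Hypothetical short sequences, chosen only to keep the output small.
cmap = contact_map(one_hot("UGAGG"), one_hot("CCUCA"))
shape = (len(cmap), len(cmap[0]), len(cmap[0][0]))
print(shape)  # (5, 5, 8)
```

The result has the layout of a multi-channel image (height, width, channels), which is what allows a convolutional network to be applied to it.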
Affiliation(s)
- Tingpeng Yang
- Peng Cheng Laboratory, Shenzhen, 518055, China
- Tsinghua Shenzhen International Graduate School, Shenzhen, 518055, China
| | - Yu Wang
- Peng Cheng Laboratory, Shenzhen, 518055, China
| | - Yonghong He
- Peng Cheng Laboratory, Shenzhen, 518055, China
- Tsinghua Shenzhen International Graduate School, Shenzhen, 518055, China
|
19
|
Hadad E, Rokach L, Veksler-Lublinsky I. Empowering prediction of miRNA-mRNA interactions in species with limited training data through transfer learning. Heliyon 2024; 10:e28000. [PMID: 38560149 PMCID: PMC10981012 DOI: 10.1016/j.heliyon.2024.e28000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Revised: 03/06/2024] [Accepted: 03/11/2024] [Indexed: 04/04/2024] Open
Abstract
MicroRNAs (miRNAs) play a crucial role in mRNA regulation. Identifying functionally important mRNA targets of a specific miRNA is essential for uncovering its biological function and assisting miRNA-based drug development. Datasets of high-throughput direct bona fide miRNA-target interactions (MTIs) exist only for a few model organisms, prompting the need for computational prediction. However, the scarcity of data poses a challenge in training accurate machine learning models for MTI prediction. In this study, we explored the potential of transfer learning technique (with ANN and XGB) to address the limited data challenge by leveraging the similarities in interaction rules between species. Furthermore, we introduced a novel approach called TransferSHAP for estimating the feature importance of transfer learning in tabular dataset tasks. We demonstrated that transfer learning improves MTI prediction accuracy for species with limited datasets and identified the specific interaction features the models employed to transfer information across different species.
Affiliation(s)
- Eyal Hadad
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, David Ben-Gurion Blvd. 1, Beer-Sheva 8410501, Israel
| | - Lior Rokach
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, David Ben-Gurion Blvd. 1, Beer-Sheva 8410501, Israel
| | - Isana Veksler-Lublinsky
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, David Ben-Gurion Blvd. 1, Beer-Sheva 8410501, Israel
|
20
|
Yang TH, Chen JC, Lee YH, Lu SY, Wu SH, Chang FY, Huang YC, Lee MH, Tseng YY, Wu WS. Identifying Human miRNA Target Sites via Learning the Interaction Patterns between miRNA and mRNA Segments. J Chem Inf Model 2024; 64:2445-2453. [PMID: 37903033 DOI: 10.1021/acs.jcim.3c01150] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 11/01/2023]
Abstract
miRNAs (microRNAs) target specific mRNA (messenger RNA) sites to regulate their translation. Although miRNA targeting can rely on seed region base pairing, animal miRNAs, including human miRNAs, typically cooperate with several cofactors, leading to various noncanonical pairing rules. Therefore, identifying the binding sites of animal miRNAs remains challenging. Because experiments for mapping miRNA targets are costly, computational methods are preferred for extracting potential miRNA-mRNA fragment binding pairs first. However, existing prediction tools can have significant false positives due to the prevalent noncanonical miRNA binding behaviors and the information-biased training negative sets that were used while constructing these tools. To overcome these obstacles, we first prepared an information-balanced miRNA binding pair ground-truth data set. A miRNA-mRNA interaction-aware model was then designed to help identify miRNA binding events. On the test set, our model (auROC = 94.4%) outperformed existing models by at least 2.8% in auROC. Furthermore, we showed that this model can suggest potential binding patterns for miRNA-mRNA sequence interacting pairs. Finally, we made the prepared data sets and the designed model available at http://cosbi2.ee.ncku.edu.tw/mirna_binding/download.
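For contrast with the learned model in this abstract, canonical seed region base pairing can be checked with a plain string scan. The sketch below assumes the common convention that the seed spans miRNA nucleotides 2-8 and looks only for perfect Watson-Crick matches; the function names and the toy mRNA fragment are hypothetical, and noncanonical sites of the kind the abstract emphasizes would be missed by design.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Reverse complement of miRNA nucleotides 2-8.

    This is the mRNA motif (read 5' to 3') that pairs perfectly
    with the seed region.
    """
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(mirna, mrna):
    """Return 0-based start positions of every perfect seed match in mrna."""
    motif = seed_site(mirna)
    width = len(motif)
    return [i for i in range(len(mrna) - width + 1) if mrna[i:i + width] == motif]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA sequence
mrna = "AAGCUACCUCAGGAACUACCUCUU"  # toy mRNA fragment (hypothetical)

print(seed_site(mirna))              # CUACCUC
print(find_seed_sites(mirna, mrna))  # [3, 15]
```

A scan like this is fast and interpretable, but every hit is only a candidate: it says nothing about site accessibility or cofactors.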
Affiliation(s)
- Tzu-Hsien Yang
- Department of Biomedical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Medical Device Innovation Center, National Cheng Kung University, No.1 University Road, Tainan 701, Taiwan
| | - Jhih-Cheng Chen
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
| | - Yuan-Han Lee
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
| | - Shang-Yi Lu
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
| | - Sheng-Hang Wu
- Department of Information Management, National University of Kaohsiung, Kaohsiung University Rd, Kaohsiung 811, Taiwan
| | - Fang-Yuan Chang
- Department of Information Management, National University of Kaohsiung, Kaohsiung University Rd, Kaohsiung 811, Taiwan
| | - Yan-Cheng Huang
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
| | - Mei-Hsien Lee
- Department of Mathematics, University of Taipei, No.1, Ai-Guo West Road, Taipei 100234, Taiwan
| | - Yan-Yuan Tseng
- Center for Molecular Medicine and Genetics, Wayne State University, School of Medicine, Detroit, Michigan 48201, United States
| | - Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
|
21
|
Liu C, Yu C, Song G, Fan X, Peng S, Zhang S, Zhou X, Zhang C, Geng X, Wang T, Cheng W, Zhu W. Comprehensive analysis of miRNA-mRNA regulatory pairs associated with colorectal cancer and the role in tumor immunity. BMC Genomics 2023; 24:724. [PMID: 38036953 PMCID: PMC10688136 DOI: 10.1186/s12864-023-09635-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 08/29/2023] [Indexed: 12/02/2023] Open
Abstract
BACKGROUND MicroRNAs (miRNAs), which act as post-transcriptional regulators of mRNAs by base-pairing with complementary sequences within them, are involved in the complex interactions between the immune system and tumors. In this research, we elucidated the expression profiles of miRNAs and their target mRNAs, and their associations with the phenotypic hallmarks of colorectal cancer (CRC), by integrating transcriptomic, immunophenotype, methylation, mutation and survival data. RESULTS We analyzed differential miRNA/mRNA expression profiles using the GEO, TCGA and GTEx databases, and assessed the correlation between miRNAs and their target mRNAs using miRTarBase and TarBase. We then measured expression by qRT-PCR and validated the diagnostic value of miRNA-mRNA regulatory pairs using ROC curves, calibration curves and decision curve analysis (DCA). Phenotypic hallmarks of the regulatory pairs, including tumor-infiltrating lymphocytes, tumor microenvironment, tumor mutation burden, global methylation and gene mutation, were also described. The expression levels of miRNAs and target mRNAs were measured in 80 paired colon tissue samples. Ultimately, we selected two pivotal regulatory pairs (miR-139-5p/STC1 and miR-20a-5p/FGL2) and verified the diagnostic value of a combined model comprising these four signatures in 3 testing GEO datasets and an external validation cohort. CONCLUSIONS We found that 2 miRNAs, by targeting 2 metastasis-related mRNAs, were correlated with tumor-infiltrating macrophages and with HRAS and BRAF gene mutation status. Our results established a diagnostic model containing 2 miRNAs and their respective target mRNAs that distinguishes CRC from normal controls, and revealed their complex roles in CRC pathogenesis, especially in tumor immunity.
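As a rough illustration of the ROC step mentioned in the abstract, the area under the ROC curve can be read as the probability that a randomly chosen tumor sample scores higher than a randomly chosen control. The sketch below computes it that way. The function name `roc_auc` and the score values are hypothetical toy inputs, not data or code from the cited study.

```python
# Sketch only: AUC as P(score_tumor > score_control), with ties counted as 0.5.
# All scores are invented toy values, not results from the cited paper.

def roc_auc(scores_pos, scores_neg):
    """Return the ROC AUC from two lists of classifier scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0      # positive sample ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(scores_pos) * len(scores_neg))

tumor_scores = [0.9, 0.8, 0.55, 0.4]     # hypothetical model outputs for tumors
control_scores = [0.7, 0.3, 0.2, 0.4]    # hypothetical model outputs for controls
print(roc_auc(tumor_scores, control_scores))
```

The pairwise form is quadratic in the number of samples, which is fine for a cohort of a few dozen pairs. A rank-based implementation would be the usual choice for larger data.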
Affiliation(s)
- Cheng Liu
- Department of Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Chun Yu
- Department of Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Guoxin Song
- Department of Pathology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, Jiangsu, China
- Xingchen Fan
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Shuang Peng
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Shiyu Zhang
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Xin Zhou
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Cheng Zhang
- Department of Science and Technology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, Jiangsu, China
- Xiangnan Geng
- Department of Clinical Engineer, the First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, Jiangsu, China
- Tongshan Wang
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Wenfang Cheng
- Department of Gastroenterology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
- Wei Zhu
- Department of Oncology, the First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
22
Przybyszewski J, Malawski M, Lichołai S. GraphTar: applying word2vec and graph neural networks to miRNA target prediction. BMC Bioinformatics 2023; 24:436. [PMID: 37978418 PMCID: PMC10657114 DOI: 10.1186/s12859-023-05564-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/15/2023] [Accepted: 11/09/2023] [Indexed: 11/19/2023] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) are short, non-coding RNA molecules that regulate gene expression by binding to specific mRNAs, inhibiting their translation. They play a critical role in regulating various biological processes and are implicated in many diseases, including cardiovascular, oncological, gastrointestinal diseases, and viral infections. Computational methods that can identify potential miRNA-mRNA interactions from raw data use one-dimensional miRNA-mRNA duplex representations and simple sequence encoding techniques, which may limit their performance. RESULTS We have developed GraphTar, a new target prediction method that uses a novel graph-based representation to reflect the spatial structure of the miRNA-mRNA duplex. Unlike existing approaches, we use the word2vec method to accurately encode RNA sequence information. In conjunction with the novel encoding method, we use a graph neural network classifier that can accurately predict miRNA-mRNA interactions based on graph representation learning. As part of a comparative study, we evaluate three different node embedding approaches within the GraphTar framework and compare them with other state-of-the-art target prediction methods. The results show that the proposed method achieves similar performance to the best methods in the field and outperforms them on one of the datasets. CONCLUSIONS In this study, a novel miRNA target prediction approach called GraphTar is introduced. Results show that GraphTar is as effective as existing methods and even outperforms them in some cases, opening new avenues for further research. However, the expansion of available datasets is critical for advancing the field towards real-world applications.
Affiliation(s)
- Jan Przybyszewski
- Sano Centre for Computational Medicine, Czarnowiejska 36, 30-054, Cracow, Poland
- Maciej Malawski
- Sano Centre for Computational Medicine, Czarnowiejska 36, 30-054, Cracow, Poland
- Sabina Lichołai
- Division of Molecular Biology and Clinical Genetics, Faculty of Medicine, Jagiellonian University Medical College, Skawińska 8, 31-066, Cracow, Poland
23
Zhang J, Lang M, Zhou Y, Zhang Y. Predicting RNA structures and functions by artificial intelligence. Trends Genet 2023; 40:S0168-9525(23)00229-9. [PMID: 39492264 DOI: 10.1016/j.tig.2023.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/25/2023] [Revised: 08/22/2023] [Accepted: 10/03/2023] [Indexed: 11/05/2024]
Abstract
RNA functions by interacting with its intended targets structurally. However, due to the dynamic nature of RNA molecules, RNA structures are difficult to determine experimentally or predict computationally. Artificial intelligence (AI) has revolutionized many biomedical fields and has been progressively utilized to deduce RNA structures, target binding, and associated functionality. Integrating structural and target binding information could also help improve the robustness of AI-based RNA function prediction and RNA design. Given the rapid development of deep learning (DL) algorithms, AI will provide an unprecedented opportunity to elucidate the sequence-structure-function relation of RNAs.
Affiliation(s)
- Jun Zhang
- National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, 518060, China
- Mei Lang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China
- Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China
- Yang Zhang
- School of Science, Harbin Institute of Technology, Shenzhen, Guangdong, 518055, China
24
Ajila V, Colley L, Ste-Croix DT, Nissan N, Cober ER, Mimee B, Samanfar B, Green JR. Species-specific microRNA discovery and target prediction in the soybean cyst nematode. Sci Rep 2023; 13:17657. [PMID: 37848601 PMCID: PMC10582106 DOI: 10.1038/s41598-023-44469-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 10/09/2023] [Indexed: 10/19/2023] Open
Abstract
The soybean cyst nematode (SCN) is a devastating pathogen for economic and food security considerations. Although the SCN genome has recently been sequenced, the presence of any miRNA has not been systematically explored and reported. This paper describes the development of a species-specific SCN miRNA discovery pipeline and its application to the SCN genome. Experiments on well-documented model nematodes (Caenorhabditis elegans and Pristionchus pacificus) are used to tune the pipeline's hyperparameters and confirm its recall and precision. Application to the SCN genome identifies 3342 high-confidence putative SCN miRNA. Prediction specificity within SCN is confirmed by applying the pipeline to RNA hairpins from known exonic regions of the SCN genome (i.e., sequences known to not be miRNA). Prediction recall is confirmed by building a positive control set of SCN miRNA, based on a limited deep sequencing experiment. Interestingly, a number of novel miRNA are predicted to be encoded within the intronic regions of effector genes, known to be involved in SCN parasitism, suggesting that these miRNA may also be involved in the infection process or virulence. Beyond miRNA discovery, gene targets within SCN are predicted for all high-confidence novel miRNA using a miRNA:mRNA target prediction system. Lastly, cross-kingdom miRNA targeting is investigated, where putative soybean mRNA targets are identified for novel SCN miRNA. All predicted miRNA and gene targets are made available in appendix and through a Borealis DataVerse open repository ( https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP3/30DEXA ).
Affiliation(s)
- Victoria Ajila
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada
- Laura Colley
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada
- Dave T Ste-Croix
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, J3B 7B5, Canada
- Nour Nissan
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6, Canada
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, K1S 5B6, Canada
- Elroy R Cober
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6, Canada
- Benjamin Mimee
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, J3B 7B5, Canada
- Bahram Samanfar
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6, Canada
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, K1S 5B6, Canada
- James R Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada
25
Grafanaki K, Grammatikakis I, Ghosh A, Gopalan V, Olgun G, Liu H, Kyriakopoulos GC, Skeparnias I, Georgiou S, Stathopoulos C, Hannenhalli S, Merlino G, Marie KL, Day CP. Noncoding RNA circuitry in melanoma onset, plasticity, and therapeutic response. Pharmacol Ther 2023; 248:108466. [PMID: 37301330 PMCID: PMC10527631 DOI: 10.1016/j.pharmthera.2023.108466] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/06/2023] [Revised: 05/24/2023] [Accepted: 05/31/2023] [Indexed: 06/12/2023]
Abstract
Melanoma, the cancer of the melanocyte, is the deadliest form of skin cancer with an aggressive nature, propensity to metastasize and tendency to resist therapeutic intervention. Studies have identified that the re-emergence of developmental pathways in melanoma contributes to melanoma onset, plasticity, and therapeutic response. Notably, it is well known that noncoding RNAs play a critical role in the development and stress response of tissues. In this review, we focus on the noncoding RNAs, including microRNAs, long non-coding RNAs, circular RNAs, and other small RNAs, for their functions in developmental mechanisms and plasticity, which drive onset, progression, therapeutic response and resistance in melanoma. Going forward, elucidation of noncoding RNA-mediated mechanisms may provide insights that accelerate development of novel melanoma therapies.
Affiliation(s)
- Katerina Grafanaki
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA; Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece
- Ioannis Grammatikakis
- Cancer Genetics Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Arin Ghosh
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Vishaka Gopalan
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Gulden Olgun
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Huaitian Liu
- Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- George C Kyriakopoulos
- Department of Biochemistry, School of Medicine, University of Patras, 26504 Patras, Greece
- Ilias Skeparnias
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
- Sophia Georgiou
- Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece
- Sridhar Hannenhalli
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Glenn Merlino
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Kerrie L Marie
- Division of Molecular and Cellular Function, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Chi-Ping Day
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
26
Baig MS, Deepanshu, Prakash P, Alam P, Krishnan A. In silico analysis reveals hypoxia-induced miR-210-3p specifically targets SARS-CoV-2 RNA. J Biomol Struct Dyn 2023; 41:12305-12327. [PMID: 36752331 DOI: 10.1080/07391102.2023.2175255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/22/2022] [Accepted: 01/01/2023] [Indexed: 02/09/2023]
Abstract
Until the emergence of SARS in 2003, human coronaviruses (HCoVs) were associated with mild colds and upper respiratory tract infections. SARS-CoV-2, the cause of the ongoing pandemic, has a greater potential for infection and transmission than other known members of this family. MicroRNAs (miRNAs) are non-coding RNAs, 21-25 nucleotides long, that bind to the 3' UTR of genes and regulate almost every aspect of cellular function. Several human miRNAs are known to target viral genomes, mostly to downregulate their expression and sometimes to upregulate it. In some cases, host miRNAs may be sequestered by the viral genome to create conditions favourable to the virus. The ongoing SARS-CoV-2 pandemic is unique in its transmissibility and severity, and we hypothesised that there could be a unique mechanism for its pathogenesis. In this study, we used an in silico approach to identify human respiratory system-specific miRNAs targeting the viral genomes of three highly pathogenic HCoVs (SARS-CoV-2 Wuhan strain, SARS-CoV, and MERS-CoV) and three low pathogenic HCoVs (OC43, NL63, and HKU1). We identified ten common microRNAs that target all HCoVs studied here. In addition, we identified unique miRNAs that each targeted one particular HCoV. miR-210-3p was the single unique lung-specific miRNA, and it was found to target the NSP3, NSP4, and NSP13 genes of SARS-CoV-2. Further, the miR-210-NSP3, miR-210-NSP4, and miR-210-NSP13 SARS-CoV-2 duplexes were docked with the hAGO2 protein (PDB ID 4F3T), which showed Z-score values of -1.9, -1.7, and -1.6, respectively. The role of miR-210-3p as a master regulator of hypoxia and in inflammation regulation may be important for SARS-CoV-2 pathogenesis. Overall, this analysis advocates that miR-210-3p be investigated experimentally in SARS-CoV-2 infection. Communicated by Ramaswamy H. Sarma.
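A common first-pass criterion in in silico miRNA target searches of this kind is a canonical seed match: the target RNA contains the reverse complement of miRNA nucleotides 2-8. The sketch below shows only that criterion. It is not the pipeline used in the cited study. The function names are hypothetical and both sequences are invented toy strings, not real miRNA or viral sequences.

```python
# Sketch only: canonical seed matching (miRNA nucleotides 2-8, a "7mer-m8" site).
# Both sequences are made-up toy examples, not miR-210-3p or SARS-CoV-2 RNA.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def seed_match_positions(mirna: str, target: str) -> list[int]:
    """0-based start positions in `target` that pair with miRNA nucleotides 2-8."""
    seed = mirna[1:8]                  # nucleotides 2-8 of the miRNA, 5'->3'
    site = reverse_complement(seed)    # sequence the target must contain, 5'->3'
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]

toy_mirna = "UGACCGUAAGCUUAGCAUGGCA"     # hypothetical 22-nt miRNA
toy_target = "AAUACGGUCAGGCUUACGGUCAA"   # hypothetical target fragment
print(seed_match_positions(toy_mirna, toy_target))   # prints [2, 14]
```

Seed complementarity alone produces many false positives, so prediction tools typically combine it with further features such as site context, conservation, or duplex energy before a site is reported.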
Affiliation(s)
- Deepanshu
- Department of Molecular Medicine, Jamia Hamdard, New Delhi, India
- Prem Prakash
- Department of Molecular Medicine, Jamia Hamdard, New Delhi, India
- Pravej Alam
- Department of Biology, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Anuja Krishnan
- Department of Molecular Medicine, Jamia Hamdard, New Delhi, India
27
Implementing computational methods in tandem with synonymous gene recoding for therapeutic development. Trends Pharmacol Sci 2023; 44:73-84. [PMID: 36307252 DOI: 10.1016/j.tips.2022.09.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/21/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 12/24/2022]
Abstract
Synonymous gene recoding, the substitution of synonymous variants into the genetic sequence, has been used to overcome many production limitations in therapeutic development. However, the safety and efficacy of recoded therapeutics can be difficult to evaluate because synonymous codon substitutions can result in subtle, yet impactful changes in protein features and require sensitive methods for detection. Given that computational approaches have made significant leaps in recent years, we propose that machine-learning (ML) tools may be leveraged to assess gene-recoded therapeutics and foresee an opportunity to adapt codon contexts to enhance some powerful existing tools. Here, we examine how synonymous gene recoding has been used to address challenges in therapeutic development, explain the biological mechanisms underlying its effects, and explore the application of computational platforms to improve the surveillance of functional variants in therapeutic design.
28
Ahmed B, Haque MA, Iquebal MA, Jaiswal S, Angadi UB, Kumar D, Rai A. DeepAProt: Deep learning based abiotic stress protein sequence classification and identification tool in cereals. Frontiers in Plant Science 2023; 13:1008756. [PMID: 36714750 PMCID: PMC9877618 DOI: 10.3389/fpls.2022.1008756] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/01/2022] [Accepted: 11/14/2022] [Indexed: 06/18/2023]
Abstract
The impact of climate change on crop growth has been alarming. Extreme weather conditions can stress crops and reduce the yield of major crops, including those of the Poaceae family, which supplies 50% of the world's food calories and 20% of its protein intake. Computational approaches such as artificial intelligence-based techniques have come to the forefront of prediction-based data interpretation and the study of plant stress responses. In this study, we proposed a novel activation function, Gaussian Error Linear Unit with Sigmoid (SIELU), which was implemented in a Deep Learning (DL) model, along with other hyperparameters, for classifying unknown abiotic stress protein sequences from crops of the Poaceae family. To develop these models, data on proteins responsive to four abiotic stresses (cold, drought, heat and salinity) in crops of the Poaceae family were retrieved from the public domain. The DL models using the proposed SIELU activation function outperformed models using the GeLU activation function, SVM and RF, with 95.11%, 80.78%, 94.97%, and 81.69% accuracy for cold, drought, heat and salinity, respectively. A web-based tool, DeepAProt (http://login1.cabgrid.res.in:5500/), was also developed using the Flask API, along with a mobile app. This server/app will provide researchers with a rapid and economical tool for identifying proteins for abiotic stress management in crops of the Poaceae family, in pursuit of higher production for food security and combating hunger, in line with UN SDG 2.
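For orientation, the GeLU baseline named in the abstract is the standard Gaussian Error Linear Unit, GELU(x) = x·Φ(x), where Φ is the standard normal CDF. The sketch below implements that baseline and its widely used tanh approximation. The abstract does not define the proposed SIELU function, so it is deliberately not reproduced here. The function names are illustrative only.

```python
# Sketch only: the standard GELU baseline, not the SIELU function proposed in the
# cited paper (the abstract does not give its definition).
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Widely used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

# Compare the exact form and the approximation at a few points.
for x in (-2.0, 0.0, 1.0):
    print(x, round(gelu(x), 4), round(gelu_tanh(x), 4))
```

Unlike ReLU, GELU is smooth and lets small negative inputs pass through with a small negative output.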
Affiliation(s)
- Bulbul Ahmed
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Md Ashraful Haque
- Division of Computer Application, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Mir Asif Iquebal
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sarika Jaiswal
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- U. B. Angadi
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Department of Biotechnology, School of Interdisciplinary and Applied Sciences, Central University of Haryana, Mahendergarh, Haryana, India
- Anil Rai
- Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
29
Ajila V, Colley L, Ste-Croix DT, Nissan N, Golshani A, Cober ER, Mimee B, Samanfar B, Green JR. P-TarPmiR accurately predicts plant-specific miRNA targets. Sci Rep 2023; 13:332. [PMID: 36609461 PMCID: PMC9822942 DOI: 10.1038/s41598-022-27283-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/14/2022] [Accepted: 12/29/2022] [Indexed: 01/09/2023] Open
Abstract
MicroRNAs (miRNAs) are small non-coding ribonucleic acids that post-transcriptionally regulate gene expression by targeting messenger RNAs (mRNAs). Most miRNA target predictors have focused on animal species, and prediction performance drops substantially when they are applied to plant species. Several rule-based miRNA target predictors have been developed for plant species, but they often fail to discover new miRNA targets with non-canonical miRNA-mRNA binding. Here, the recently published TarDB database of plant miRNA-mRNA data is leveraged to retrain the TarPmiR miRNA target predictor for application to plant species. Rigorous experimental design across four plant test species demonstrates that animal-trained predictors fail to sustain performance on plant species, and that the use of plant-specific training data improves accuracy depending on the quantity of plant training data used. Surprisingly, our results indicate that the complete exclusion of animal training data leads to the most accurate plant-specific miRNA target predictor, indicating that animal-based data may detract from miRNA target prediction in plants. Our final plant-specific miRNA prediction method, dubbed P-TarPmiR, is freely available for use at http://ptarpmir.cu-bic.ca. The final P-TarPmiR method is used to predict targets for all miRNAs within the soybean genome. Those ranked predictions, together with GO term enrichment, are shared with the research community.
Affiliation(s)
- Victoria Ajila
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6 Canada
- Laura Colley
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6 Canada
- Dave T. Ste-Croix
- Saint-Jean-sur-Richelieu Research and Development Center, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, J3B 7B5 Canada
- Nour Nissan
- Ottawa Research and Development Center, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6 Canada
- Department of Biology, Carleton University, Ottawa, K1S 5B6 Canada
- Ashkan Golshani
- Department of Biology, Carleton University, Ottawa, K1S 5B6 Canada
- Elroy R. Cober
- Ottawa Research and Development Center, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6 Canada
- Benjamin Mimee
- Saint-Jean-sur-Richelieu Research and Development Center, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, J3B 7B5 Canada
- Bahram Samanfar
- Ottawa Research and Development Center, Agriculture and Agri-Food Canada, Ottawa, K1A 0C6 Canada
- Department of Biology, Carleton University, Ottawa, K1S 5B6 Canada
- James R. Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6 Canada
30
Schmitz U. Overview of Computational and Experimental Methods to Identify Tissue-Specific MicroRNA Targets. Methods Mol Biol 2023; 2630:155-177. [PMID: 36689183 DOI: 10.1007/978-1-0716-2982-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
As ubiquitous posttranscriptional regulators of gene expression, microRNAs (miRNAs) play key roles in cell physiology and function across taxa. In the last two decades, we have gained a good understanding of miRNA biogenesis pathways, modes of action, and consequences of miRNA-mediated gene regulation. More recently, research has focused on exploring causes for miRNA dysregulation, miRNA-mediated crosstalk between genes and signaling pathways, and the role of miRNAs in disease. This chapter discusses methods for the identification of miRNA-target interactions and causes for tissue-specific miRNA-target regulation. Computational approaches for predicting miRNA target sites and assessing tissue-specific target regulation are discussed. Moreover, there is an emphasis on features that affect miRNA target recognition and how high-throughput sequencing protocols can help in assessing miRNA-mediated gene regulation on a genome-wide scale. In addition, this chapter introduces some experimental approaches for the validation of miRNA targets as well as web-based resources sharing predicted and validated miRNA-target interactions.
Affiliation(s)
- Ulf Schmitz
- Department of Molecular & Cell Biology, College of Public Health, Medical & Vet Sciences, James Cook University, Douglas, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, Australia
31
Small RNA Targets: Advances in Prediction Tools and High-Throughput Profiling. Biology 2022; 11:biology11121798. [PMID: 36552307 PMCID: PMC9775672 DOI: 10.3390/biology11121798] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/14/2022] [Revised: 11/27/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022]
Abstract
MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs that regulate gene expression at the post-transcriptional level. They are suggested to be involved in most biological processes of the cell primarily by targeting messenger RNAs (mRNAs) for cleavage or translational repression. Their binding to their target sites is mediated by the Argonaute (AGO) family of proteins. Thus, miRNA target prediction is pivotal for research and clinical applications. Moreover, transfer-RNA-derived fragments (tRFs) and other types of small RNAs have been found to be potent regulators of Ago-mediated gene expression. Their role in mRNA regulation is still to be fully elucidated, and advancements in the computational prediction of their targets are in their infancy. To shed light on these complex RNA-RNA interactions, the availability of good quality high-throughput data and reliable computational methods is of utmost importance. Even though the arsenal of computational approaches in the field has been enriched in the last decade, there is still a degree of discrepancy between the results they yield. This review offers an overview of the relevant advancements in the field of bioinformatics and machine learning and summarizes the key strategies utilized for small RNA target prediction. Furthermore, we report the recent development of high-throughput sequencing technologies, and explore the role of non-miRNA AGO driver sequences.
32
Xu D, Liu B, Wang J, Zhang Z. Bibliometric analysis of artificial intelligence for biotechnology and applied microbiology: Exploring research hotspots and frontiers. Front Bioeng Biotechnol 2022; 10:998298. [PMID: 36277390 PMCID: PMC9585160 DOI: 10.3389/fbioe.2022.998298] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/19/2022] [Accepted: 09/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background: In the biotechnology and applied microbiology sectors, artificial intelligence (AI) has been extensively used in disease diagnostics, drug research and development, functional genomics, biomarker recognition, and medical imaging diagnostics. In our study, from 2000 to 2021, science publications focusing on AI in biotechnology were reviewed, and quantitative, qualitative, and modeling analyses were performed. Methods: On 6 May 2022, the Web of Science Core Collection (WoSCC) was screened for AI applications in biotechnology and applied microbiology; 3,529 studies were identified between 2000 and 2022, and analyzed. The following information was collected: publication, country or region, references, knowledgebase, institution, keywords, journal name, and research hotspots, and examined using VOSviewer and CiteSpace V bibliometric platforms. Results: We showed that 128 countries published articles related to AI in biotechnology and applied microbiology; the United States had the most publications. In addition, 584 global institutions contributed to publications, with the Chinese Academy of Science publishing the most. Reference clusters from studies were categorized into ten headings: deep learning, prediction, support vector machines (SVM), object detection, feature representation, synthetic biology, amyloid, human microRNA precursors, systems biology, and single cell RNA-Sequencing. Research frontier keywords were represented by microRNA (2012–2020) and protein-protein interactions (PPIs) (2012–2020). Conclusion: We systematically, objectively, and comprehensively analyzed AI-related biotechnology and applied microbiology literature, and additionally, identified current hot spots and future trends in this area. Our review provides researchers with a comprehensive overview of the dynamic evolution of AI in biotechnology and applied microbiology and identifies future key research areas.
Affiliation(s)
- Dongyu Xu
- Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
- Bing Liu
- Department of Bone Oncology, The People’s Hospital of Liaoning Province, Shenyang, Liaoning, China
- Jian Wang
- Department of Pathogenic Biology, School of Basic Medicine, China Medical University, Shenyang, Liaoning, China
- Zhichang Zhang
- Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
- *Correspondence: Zhichang Zhang
|
33
|
Feitosa RM, Prieto-Oliveira P, Brentani H, Machado-Lima A. MicroRNA target prediction tools for animals: Where we are at and where we are going to - A systematic review. Comput Biol Chem 2022; 100:107729. [DOI: 10.1016/j.compbiolchem.2022.107729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 11/26/2022]
|
34
|
Mégret L, Mendoza C, Arrieta Lobo M, Brouillet E, Nguyen TTY, Bouaziz O, Chambaz A, Néri C. Precision machine learning to understand micro-RNA regulation in neurodegenerative diseases. Front Mol Neurosci 2022; 15:914830. [PMID: 36157078 PMCID: PMC9500540 DOI: 10.3389/fnmol.2022.914830] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/07/2022] [Accepted: 08/19/2022] [Indexed: 11/13/2022]
Abstract
Micro-RNAs (miRNAs) are short (∼21 nt) non-coding RNAs that regulate gene expression through the degradation or translational repression of mRNAs. Accumulating evidence points to a role of miRNA regulation in the pathogenesis of a wide range of neurodegenerative (ND) diseases such as, for example, Alzheimer’s disease, Parkinson’s disease, amyotrophic lateral sclerosis and Huntington disease (HD). Several systems level studies aimed to explore the role of miRNA regulation in NDs, but these studies remain challenging. Part of the problem may be related to the lack of sufficiently rich or homogeneous data, such as time series or cell-type-specific data obtained in model systems or human biosamples, to account for context dependency. Part of the problem may also be related to the methodological challenges associated with the accurate system-level modeling of miRNA and mRNA data. Here, we critically review the main families of machine learning methods used to analyze expression data, highlighting the added value of using shape-analysis concepts as a solution for precisely modeling highly dimensional miRNA and mRNA data such as the ones obtained in the study of the HD process, and elaborating on the potential of these concepts and methods for modeling complex omics data.
Affiliation(s)
- Lucile Mégret
- Sorbonne Université, Centre National de la Recherche Scientifique UMR 8256, Paris, France
- *Correspondence: Lucile Mégret
- Cloé Mendoza
- Sorbonne Université, Centre National de la Recherche Scientifique UMR 8256, Paris, France
- Maialen Arrieta Lobo
- Sorbonne Université, Centre National de la Recherche Scientifique UMR 8256, Paris, France
- Emmanuel Brouillet
- Sorbonne Université, Centre National de la Recherche Scientifique UMR 8256, Paris, France
- Thi-Thanh-Yen Nguyen
- Université Paris Cité, MAP5 (Centre National de la Recherche Scientifique UMR 8145), Paris, France
- Olivier Bouaziz
- Université Paris Cité, MAP5 (Centre National de la Recherche Scientifique UMR 8145), Paris, France
- Antoine Chambaz
- Université Paris Cité, MAP5 (Centre National de la Recherche Scientifique UMR 8145), Paris, France
- Christian Néri
- Sorbonne Université, Centre National de la Recherche Scientifique UMR 8256, Paris, France
- Christian Néri
|
35
|
Shakyawar S, Southekal S, Guda C. mintRULS: Prediction of miRNA–mRNA Target Site Interactions Using Regularized Least Square Method. Genes (Basel) 2022; 13:1528. [PMID: 36140696 PMCID: PMC9498445 DOI: 10.3390/genes13091528] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/16/2022] [Revised: 08/19/2022] [Accepted: 08/22/2022] [Indexed: 11/16/2022]
Abstract
Identification of miRNA–mRNA interactions is critical to understand the new paradigms in gene regulation. Existing methods show suboptimal performance owing to inappropriate feature selection and limited integration of intuitive biological features of both miRNAs and mRNAs. The present regularized least square-based method, mintRULS, employs features of miRNAs and their target sites using pairwise similarity metrics based on free energy, sequence and repeat identities, and target site accessibility to predict miRNA-target site interactions. We hypothesized that miRNAs sharing similar structural and functional features are more likely to target the same mRNA, and conversely, mRNAs with similar features can be targeted by the same miRNA. Our prediction model achieved an impressive AUC of 0.93 and 0.92 in LOOCV and LmiTOCV settings, respectively. In comparison, other popular tools such as miRDB, TargetScan, MBSTAR, RPmirDIP, and STarMir scored AUCs at 0.73, 0.77, 0.55, 0.84, and 0.67, respectively, in LOOCV setting. Similarly, mintRULS outperformed other methods using metrics such as accuracy, sensitivity, specificity, and MCC. Our method also demonstrated high accuracy when validated against experimentally derived data from condition- and cell-specific studies and expression studies of miRNAs and target genes, both in human and mouse.
Affiliation(s)
- Sushil Shakyawar
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Siddesh Southekal
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Chittibabu Guda
- Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Center for Biomedical Informatics Research and Innovation (CBIRI), University of Nebraska Medical Center, Omaha, NE 68198, USA
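The mintRULS abstract above compares tools by AUC. As a reading aid, the following is a minimal sketch of how an AUC can be computed from prediction scores, using the rank interpretation (the probability that a randomly chosen true interaction scores higher than a randomly chosen non-interaction). It is not the authors' evaluation code; the function name `roc_auc` and the example scores and labels are hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six candidate miRNA-target pairs (1 = true interaction).
example_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(example_scores, example_labels))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a sorting-based formulation would be preferable for large benchmark sets.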
|
36
|
Sun Y, Xiong F, Sun Y, Zhao Y, Cao Y. A miRNA Target Prediction Model Based on Distributed Representation Learning and Deep Learning. Computational and Mathematical Methods in Medicine 2022; 2022:4490154. [PMID: 35924115 PMCID: PMC9343202 DOI: 10.1155/2022/4490154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 05/16/2022] [Accepted: 06/07/2022] [Indexed: 11/18/2022]
Abstract
MicroRNAs (miRNAs) are a kind of noncoding RNA, which plays an essential role in gene regulation by binding to messenger RNAs (mRNAs). Accurate and rapid identification of miRNA target genes is helpful to reveal the mechanism of transcriptome regulation, which is of great significance for the study of cancer and other diseases. Many bioinformatics methods have been proposed to solve this problem, but the previous research did not further study the encoding of the nucleotide sequence. In this paper, we developed a novel method combining word embedding and deep learning for human miRNA targets at the site-level prediction, which is inspired by the similarity between natural language and biological sequences. First, the word2vec model was used to mine the distribution representation of miRNAs and mRNAs. Then, the embedding is extracted automatically via the stacked bidirectional long short-term memory (BiLSTM) network. By testing, our method can effectively improve the accuracy, sensitivity, specificity, and F-measure of other methods. Through our research, it is proved that the distributed representation can improve the accuracy of the deep learning model and better solve the miRNA target site prediction problem.
Affiliation(s)
- Yuzhuo Sun
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
- Fei Xiong
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
- Yongke Sun
- College of Material Science and Engineering, Southwest Forestry University, Kunming, China
- Youjie Zhao
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
- Yong Cao
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
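The abstract by Sun et al. above treats nucleotide sequences like natural-language text before applying word2vec. One common way to obtain "words" from a sequence is to split it into overlapping k-mers; the sketch below illustrates that step only. Whether the authors used k-mers, and with which `k` or `stride`, is not stated in the abstract, so the function `kmer_tokens` and its defaults are assumptions for illustration.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split an RNA sequence into overlapping k-mers ("words").

    DNA-style input is normalised to the RNA alphabet (T -> U).
    Sequences shorter than k yield an empty list.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# A short illustrative sequence, tokenised into 3-mers.
print(kmer_tokens("AUGGCU"))
```

The resulting token lists could then be passed, one list per sequence, to an embedding model that expects tokenised sentences.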
|
37
|
Recent Deep Learning Methodology Development for RNA–RNA Interaction Prediction. Symmetry (Basel) 2022. [DOI: 10.3390/sym14071302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Genetic regulation of organisms involves complicated RNA–RNA interactions (RRIs) among messenger RNA (mRNA), microRNA (miRNA), and long non-coding RNA (lncRNA). Detecting RRIs is beneficial for discovering biological mechanisms as well as designing new drugs. In recent years, with more and more experimentally verified RNA–RNA interactions being deposited into databases, statistical machine learning, especially recent deep-learning-based automatic algorithms, have been widely applied to RRI prediction with remarkable success. This paper first gives a brief introduction to the traditional machine learning methods applied on RRI prediction and benchmark databases for training the models, and then provides a recent methodology overview of deep learning models in the prediction of microRNA (miRNA)–mRNA interactions and long non-coding RNA (lncRNA)–miRNA interactions.
|
38
|
Xie W, Zheng Z, Zhang W, Huang L, Lin Q, Wong KC. SRG-vote: Predicting miRNA-gene relationships via embedding and LSTM ensemble. IEEE J Biomed Health Inform 2022; 26:4335-4344. [PMID: 35471879 DOI: 10.1109/jbhi.2022.3169542] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
Targeted therapy for one or a set of genes has made it possible to apply precision medicine to different patients, given the existence of tumor heterogeneity. However, how to regulate those genes is still problematic. One of the natural regulators of genes is microRNAs. Thus, a better understanding of the miRNA-gene interaction mechanism might contribute to future diagnosis, prevention, and cancer therapy. The interactions between microRNAs and genes play an essential role in molecular genetics. The in-vivo experiments validating the relationships between them are time-consuming, costly, and labor-intensive. With the development of high-throughput technology, we now deal with large volumes of biological data. However, extracting features from tremendous raw data and building a mathematical model is still a challenging topic. Machine learning and deep learning algorithms have become powerful tools in dealing with biological data. Inspired by this, in this paper, we propose a model that combines feature/embedding extraction methods, deep learning algorithms, and a voting system. We leverage doc2vec to generate sequential embeddings from molecular sequences. Geometrical embeddings were generated with role2vec, GCN, and GMM from the complex network built from similarity and pair-wise datasets. For the deep learning algorithms, we leveraged LSTM and Bi-LSTM according to the different embeddings and features. Finally, we adopted a voting system to balance results from different data sources. The results show that our voting system could achieve a higher AUC than the existing benchmark. The case studies demonstrate that our model could reveal potential relationships between miRNAs and genes. The source code, features, and predictive results can be downloaded at https://github.com/Xshelton/SRG-vote.
|
39
|
Zhang B, Zhou Z, Cao W, Qi X, Xu C, Wen W. A New Few-Shot Learning Method of Bacterial Colony Counting Based on the Edge Computing Device. Biology 2022; 11:156. [PMID: 35205023 PMCID: PMC8869218 DOI: 10.3390/biology11020156] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/28/2021] [Revised: 01/16/2022] [Accepted: 01/17/2022] [Indexed: 04/09/2023]
Abstract
Bacterial colony counting is a time-consuming but important task in many fields, such as food quality testing and pathogen detection, which have a high demand for accurate on-site testing. However, bacterial colonies often overlap or adhere to each other and are difficult to process precisely with traditional algorithms. The development of deep learning has brought new possibilities for bacterial colony counting, but deep learning networks usually require a large amount of training data and highly configured test equipment. Culturing and annotating bacteria is costly in time, and professional deep learning workstations are too expensive and too large to meet portability requirements. To solve these problems, we propose a lightweight improved YOLOv3 network based on the few-shot learning strategy, which is able to achieve high detection accuracy with only five raw images and can be deployed on a low-cost edge device. Compared with the traditional methods, our method improved the average accuracy from 64.3% to 97.4% and decreased the False Negative Rate from 32.1% to 1.5%. Our method could greatly improve the detection accuracy, enable portable on-site testing, and reduce the cost of data collection and annotation by over 80%, which brings more potential for bacterial colony counting.
Affiliation(s)
- Beini Zhang
- Advanced Materials Thrust, Department of Physics, The Hong Kong University of Science and Technology, Hong Kong
- Zhentao Zhou
- Clearwaterbay Biomaterials Ltd., Shenzhen 518100, China
- Wenbin Cao
- Clearwaterbay Biomaterials Ltd., Shenzhen 518100, China
- Xirui Qi
- Department of Physics, The Hong Kong University of Science and Technology, Hong Kong
- Chen Xu
- Department of Physics, The Hong Kong University of Science and Technology, Hong Kong
- Weijia Wen
- Advanced Materials Thrust, Department of Physics, The Hong Kong University of Science and Technology, Hong Kong
|
40
|
Machine Learning Based Methods and Best Practices of microRNA-Target Prediction and Validation. Advances in Experimental Medicine and Biology 2022; 1385:109-131. [DOI: 10.1007/978-3-031-08356-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
41
|
Biological features between miRNAs and their targets are unveiled from deep learning models. Sci Rep 2021; 11:23825. [PMID: 34893648 PMCID: PMC8664955 DOI: 10.1038/s41598-021-03215-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2021] [Accepted: 11/08/2021] [Indexed: 12/02/2022]
Abstract
MicroRNAs (miRNAs) are ubiquitous gene regulators of ~22 nucleotides. They modulate a broad range of essential cellular processes linked to human health and diseases. Consequently, identifying miRNA targets and understanding how they function are critical for treating miRNA-associated diseases. In our earlier work, a hybrid deep learning-based approach (miTAR) was developed for predicting miRNA targets. It performs substantially better than the existing methods. The approach integrates two major types of deep learning algorithms: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). However, the features in miRNA:target interactions learned by miTAR have not been investigated. In the current study, we demonstrated that miTAR captures known features, including the involvement of the seed region and the free energy, as well as multiple novel features, in the miRNA:target interactions. Interestingly, the CNN and RNN layers of the model perform differently at capturing the free energy feature: the units in the RNN layer are more unique at capturing the feature, but collectively the CNN layer is more efficient at capturing it. Although deep learning models are commonly thought of as “black boxes”, our discoveries support that the biological features in miRNA:target interactions can be unveiled from deep learning models, which will be beneficial to the understanding of the mechanisms in miRNA:target interactions.
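The miTAR abstract above names the seed region as one of the known features of miRNA:target interactions. The sketch below illustrates what a canonical seed match looks like: the reverse complement of miRNA nucleotides 2-8 is searched for in a target sequence. This is a simplified illustration, not part of miTAR; it assumes perfect Watson-Crick pairing only (no G:U wobble, no free-energy term), and the helper names `seed_site` and `find_seed_sites` as well as the toy target sequence are hypothetical. The miRNA used is the let-7a-5p sequence.

```python
from typing import List

# Watson-Crick complement in the RNA alphabet.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the target-side motif pairing with miRNA positions start..end (1-based)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(mirna: str, target: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in the target."""
    site = seed_site(mirna)
    tgt = target.upper().replace("T", "U")
    return [i for i in range(len(tgt) - len(site) + 1) if tgt[i:i + len(site)] == site]


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a-5p
TOY_TARGET = "AAACUACCUCAAA"        # hypothetical 3' UTR fragment
print(seed_site(LET7A), find_seed_sites(LET7A, TOY_TARGET))
```

Real prediction tools combine such sequence matches with further features (for example accessibility or duplex free energy), which this sketch deliberately leaves out.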
|
42
|
Kaczmarek E, Pyman B, Nanayakkara J, Tuschl T, Tyryshkin K, Renwick N, Mousavi P. Discriminating Neoplastic from Nonneoplastic Tissues Using an miRNA-Based Deep Cancer Classifier. The American Journal of Pathology 2021; 192:344-352. [PMID: 34774515 DOI: 10.1016/j.ajpath.2021.10.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/15/2021] [Revised: 10/07/2021] [Accepted: 10/13/2021] [Indexed: 10/19/2022]
Abstract
Next-generation sequencing has enabled the collection of large biological data sets, allowing novel molecular-based classification methods to be developed for increased understanding of disease. miRNAs are small regulatory RNA molecules that can be quantified using next-generation sequencing and are excellent classificatory markers. Herein, we adapt a deep cancer classifier (DCC) to differentiate neoplastic from nonneoplastic samples using comprehensive miRNA expression profiles from 1031 human breast and skin tissue samples. The classifier was fine-tuned and evaluated using 750 neoplastic and 281 nonneoplastic breast and skin tissue samples. Performance of the DCC was compared with two machine-learning classifiers: support vector machine and random forests. In addition, performance of feature extraction through the DCC was also compared with a developed feature selection algorithm, cancer specificity. The DCC had the highest performance of area under the receiver operating curve and high performance in both sensitivity and specificity, unlike machine-learning and feature selection models, which often performed well in one metric compared with the other. In particular, deep learning was shown to have noticeable advantages with highly heterogeneous data sets. In addition, our cancer specificity algorithm identified candidate biomarkers for differentiating neoplastic and nonneoplastic tissue samples (eg, miR-144 and miR-375 in breast cancer and miR-375 and miR-451 in skin cancer).
Affiliation(s)
- Emily Kaczmarek
- Medical Informatics Laboratory, School of Computing, Queen's University, Kingston, Ontario, Canada
- Blake Pyman
- Medical Informatics Laboratory, School of Computing, Queen's University, Kingston, Ontario, Canada
- Jina Nanayakkara
- Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, Kingston, Ontario, Canada
- Thomas Tuschl
- Laboratory of RNA Molecular Biology, Rockefeller University, New York, New York
- Kathrin Tyryshkin
- Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, Kingston, Ontario, Canada
- Neil Renwick
- Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, Kingston, Ontario, Canada
- Parvin Mousavi
- Medical Informatics Laboratory, School of Computing, Queen's University, Kingston, Ontario, Canada
|
43
|
Yang Q, Ji H, Fan X, Zhang Z, Lu H. Retention time prediction in hydrophilic interaction liquid chromatography with graph neural network and transfer learning. J Chromatogr A 2021; 1656:462536. [PMID: 34563892 DOI: 10.1016/j.chroma.2021.462536] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/01/2021] [Revised: 09/02/2021] [Accepted: 09/03/2021] [Indexed: 01/04/2023]
Abstract
The combination of retention time (RT), accurate mass and tandem mass spectra can improve the structural annotation in untargeted metabolomics. However, the incorporation of RT for metabolite identification has received less attention because of the limitation of available RT data, especially for hydrophilic interaction liquid chromatography (HILIC). Here, the Graph Neural Network-based Transfer Learning (GNN-TL) is proposed to train a model for HILIC RTs prediction. The graph neural network was pre-trained using an in silico HILIC RT dataset (pseudo-labeling dataset) with ∼306 K molecules. Then, the weights of dense layers in the pre-trained GNN (pre-GNN) model were fine-tuned by transfer learning using a small number of experimental HILIC RTs from the target chromatographic system. The GNN-TL outperformed the methods in Retip, including the Random Forest (RF), Bayesian-regularized neural network (BRNN), XGBoost, light gradient-boosting machine (LightGBM), and Keras. It achieved the lowest mean absolute error (MAE) of 38.6 s on the test set and 33.4 s on an additional test set. It has the best ability to generalize with a small performance difference between training, test, and additional test sets. Furthermore, the predicted RTs can filter out nearly 60% false positive candidates on average, which is valuable for the identification of compounds complementary to mass spectrometry.
Affiliation(s)
- Qiong Yang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Hongchao Ji
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Xiaqiong Fan
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
|
44
|
Tsugawa H, Rai A, Saito K, Nakabayashi R. Metabolomics and complementary techniques to investigate the plant phytochemical cosmos. Nat Prod Rep 2021; 38:1729-1759. [PMID: 34668509 DOI: 10.1039/d1np00014d] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Indexed: 12/14/2022]
Abstract
Covering: up to 2021. Plants and their associated microbial communities are known to produce millions of metabolites, a majority of which are still not characterized and are speculated to possess novel bioactive properties. In addition to their role in plant physiology, these metabolites are also relevant as existing and next-generation medicine candidates. Elucidation of the plant metabolite diversity is thus valuable for the successful exploitation of natural resources for humankind. Herein, we present a comprehensive review on recent metabolomics approaches to illuminate molecular networks in plants, including chemical isolation and enzymatic production as well as the modern metabolomics approaches such as stable isotope labeling, ultrahigh-resolution mass spectrometry, metabolome imaging (spatial metabolomics), single-cell analysis, cheminformatics, and computational mass spectrometry. Mass spectrometry-based strategies to characterize plant metabolomes through metabolite identification and annotation are described in detail. We also highlight the use of phytochemical genomics to mine genes associated with specialized metabolites' biosynthesis. Understanding the metabolic diversity through biotechnological advances is fundamental to elucidate the functions of the plant-derived specialized metabolome.
Affiliation(s)
- Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Nakamachi, Koganei, Tokyo 184-8588, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Amit Rai
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Plant Molecular Science Center, Chiba University, 1-8-1 Inohana, Chuo-ku, Chiba 260-8675, Japan
- Kazuki Saito
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Plant Molecular Science Center, Chiba University, 1-8-1 Inohana, Chuo-ku, Chiba 260-8675, Japan
- Ryo Nakabayashi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
45
|
Khatun MS, Alam MA, Shoombuatong W, Mollah MNH, Kurata H, Hasan MM. Recent development of bioinformatics tools for microRNA target prediction. Curr Med Chem 2021; 29:865-880. [PMID: 34348604 DOI: 10.2174/0929867328666210804090224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/11/2021] [Revised: 06/10/2021] [Accepted: 06/15/2021] [Indexed: 11/22/2022]
Abstract
MicroRNAs (miRNAs) are central players that regulate the post-transcriptional processes of gene expression. Binding of miRNAs to target mRNAs can repress their expression by inducing the degradation or by inhibiting the translation of the target mRNAs. High-throughput experimental approaches for miRNA target identification are costly and time-consuming, depending on various factors. It is vitally important to develop bioinformatics methods for accurately predicting miRNA targets. With the increase of RNA sequences in the post-genomic era, bioinformatics methods are being developed for miRNA studies, especially for miRNA target prediction. This review summarizes the current development of state-of-the-art bioinformatics tools for miRNA target prediction and points out the progress and limitations of the available miRNA databases, as well as their working principles. Finally, we discuss the caveats and perspectives of the next-generation algorithms for the prediction of miRNA targets.
Affiliation(s)
- Mst Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Md Ashad Alam
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, United States
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Md Nurul Haque Mollah
- Laboratory of Bioinformatics, Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
|
46
|
Abstract
Interpreting the effects of genetic variants is key to understanding individual susceptibility to disease and designing personalized therapeutic approaches. Modern experimental technologies are enabling the generation of massive compendia of human genome sequence data and associated molecular and phenotypic traits, together with genome-scale expression, epigenomics and other functional genomic data. Integrative computational models can leverage these data to understand variant impact, elucidate the effect of dysregulated genes on biological pathways in specific disease and tissue contexts, and interpret disease risk beyond what is feasible with experiments alone. In this Review, we discuss recent developments in machine learning algorithms for genome interpretation and for integrative molecular-level modelling of cells, tissues and organs relevant to disease. More specifically, we highlight existing methods and key challenges and opportunities in identifying specific disease-causing genetic variants and linking them to molecular pathways and, ultimately, to disease phenotypes.
|
47
|
Ben Or G, Veksler-Lublinsky I. Comprehensive machine-learning-based analysis of microRNA-target interactions reveals variable transferability of interaction rules across species. BMC Bioinformatics 2021; 22:264. [PMID: 34030625 PMCID: PMC8146624 DOI: 10.1186/s12859-021-04164-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/05/2020] [Accepted: 05/04/2021] [Indexed: 12/18/2022]
Abstract
BACKGROUND MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally via base-pairing with complementary sequences on messenger RNAs (mRNAs). Due to the technical challenges involved in the application of high-throughput experimental methods, datasets of direct bona fide miRNA targets exist only for a few model organisms. Machine learning (ML)-based target prediction models were successfully trained and tested on some of these datasets. There is a need to further apply the trained models to organisms in which experimental training data are unavailable. However, it is largely unknown how the features of miRNA-target interactions evolve and whether some features have remained fixed during evolution, raising questions regarding the general, cross-species applicability of currently available ML methods. RESULTS We examined the evolution of miRNA-target interaction rules and used data science and ML approaches to investigate whether these rules are transferable between species. We analyzed eight datasets of direct miRNA-target interactions in four species (human, mouse, worm, cattle). Using ML classifiers, we achieved high accuracy for intra-dataset classification and found that the most influential features of all datasets overlap significantly. To explore the relationships between datasets, we measured the divergence of their miRNA seed sequences and evaluated the performance of cross-dataset classification. We found that both measures coincide with the evolutionary distance between the compared species. CONCLUSIONS The transferability of miRNA-targeting rules between species depends on several factors, the most associated factors being the composition of seed families and evolutionary distance. Furthermore, our feature-importance results suggest that some miRNA-target features have evolved while others remained fixed during the evolution of the species. 
Our findings lay the foundation for the future development of target prediction tools that could be applied to "non-model" organisms for which minimal experimental data are available. AVAILABILITY AND IMPLEMENTATION The code is freely available at https://github.com/gbenor/TPVOD.
Affiliation(s)
- Gilad Ben Or
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Isana Veksler-Lublinsky
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
48
Zhao Y, Tian S, Yu L, Zhang Z, Zhang W. Analysis and Classification of Hepatitis Infections Using Raman Spectroscopy and Multiscale Convolutional Neural Networks. JOURNAL OF APPLIED SPECTROSCOPY 2021; 88:441-451. [PMID: 33972806 PMCID: PMC8099702 DOI: 10.1007/s10812-021-01192-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 05/14/2023]
Abstract
Hepatitis infections represent a major health concern worldwide. Numerous computer-aided approaches have been devised for the early detection of hepatitis. In this study, we propose a method for the analysis and classification of cases of hepatitis B virus (HBV), hepatitis C virus (HCV), and healthy subjects using Raman spectroscopy and a multiscale convolutional neural network (MSCNN). In particular, serum samples of HBV-infected patients (435 cases), HCV-infected patients (374 cases), and healthy persons (499 cases) are analyzed via Raman spectroscopy. The differences between Raman peaks in the measured serum spectra indicate specific biomolecular differences among the three classes. The dimensionality of the spectral data is reduced through principal component analysis. Subsequently, features are extracted, and then feature normalization is applied. Next, the extracted features are used to train different classifiers, namely MSCNN, a single-scale convolutional neural network, and other traditional classifiers. Among these classifiers, the MSCNN model achieved the best outcomes, with a precision of 98.89%, sensitivity of 97.44%, specificity of 94.54%, and accuracy of 94.92%. Overall, the results demonstrate that Raman spectral analysis and MSCNN can be effectively utilized for rapid screening of hepatitis B and C cases.
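For readers unfamiliar with how the four reported metrics relate to one another, the sketch below computes precision, sensitivity, specificity, and accuracy for one class of a multi-class problem in a one-vs-rest fashion. The abstract does not say how the metrics were aggregated over the three classes, so the one-vs-rest treatment, the function name, and the toy labels are assumptions for illustration only.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Precision, sensitivity, specificity, and accuracy for one class vs. the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Toy labels (hypothetical, not results from the study).
y_true = ["HBV", "HBV", "HCV", "healthy", "healthy", "HCV"]
y_pred = ["HBV", "HCV", "HCV", "healthy", "HBV", "HCV"]
hbv_metrics = one_vs_rest_metrics(y_true, y_pred, positive="HBV")
```

Here `hbv_metrics` treats HBV as the positive class and pools HCV and healthy samples as negatives; repeating the call for each class gives per-class values that can then be averaged.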
Affiliation(s)
- Y. Zhao
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000 China
- Sh. Tian
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000 China
- L. Yu
- College of Software Engineering at Xin Jiang University, Urumqi, 830000 China
- Zh. Zhang
- The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830000 China
- W. Zhang
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000 China
49
Auslander N, Gussow AB, Koonin EV. Incorporating Machine Learning into Established Bioinformatics Frameworks. Int J Mol Sci 2021; 22:2903. [PMID: 33809353 PMCID: PMC8000113 DOI: 10.3390/ijms22062903] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 02/15/2021] [Revised: 03/08/2021] [Accepted: 03/10/2021] [Indexed: 12/23/2022]
Abstract
The exponential growth of biomedical data in recent years has spurred the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling automatic feature extraction, feature selection, and generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework with techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges posed for machine learning, and, in particular, deep learning in biomedicine, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
50
Gu T, Zhao X, Barbazuk WB, Lee JH. miTAR: a hybrid deep learning-based approach for predicting miRNA targets. BMC Bioinformatics 2021; 22:96. [PMID: 33639834 PMCID: PMC7912887 DOI: 10.1186/s12859-021-04026-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/17/2020] [Accepted: 02/14/2021] [Indexed: 02/08/2023]
Abstract
BACKGROUND: MicroRNAs (miRNAs) have been shown to play essential roles in a wide range of biological processes. Many computational methods have been developed to identify targets of miRNAs. However, the majority of these methods depend on pre-defined features that require considerable effort and resources to compute and often prove suboptimal at predicting miRNA targets.
RESULTS: We developed a novel hybrid deep learning-based (DL-based) approach that is capable of predicting miRNA targets with higher accuracy. This approach integrates convolutional neural networks (CNNs), which excel at learning spatial features, and recurrent neural networks (RNNs), which discern sequential features. Our approach therefore has the advantage of learning both the intrinsic spatial and sequential features of miRNA:target pairs. The inputs for our approach are raw sequences of miRNAs and genes, which are easy to obtain. We applied our approach to two human datasets from recent miRNA target prediction studies and trained two models. We demonstrated that the two models consistently outperform the previous methods according to evaluation metrics on test datasets. Comparing our approach with currently available alternatives on independent datasets shows that our approach delivers substantial improvements in performance. We also show with multiple lines of evidence that our approach is more robust than other methods on small datasets. Our study is the first to compare multiple existing DL-based approaches for miRNA target prediction. Furthermore, we examined the contribution of a max pooling layer between the CNN and RNN and demonstrated that it improves the performance of all our models. Finally, a unified model was developed that is robust in fitting different input datasets.
CONCLUSIONS: We present a new DL-based approach for predicting miRNA targets and demonstrate that our approach outperforms the current alternatives. We supplied an easy-to-use tool, miTAR, at https://github.com/tjgu/miTAR. Furthermore, our analysis results support that max pooling generally benefits the hybrid models and potentially prevents overfitting for hybrid models.
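Two of the building blocks mentioned in this abstract, feeding raw sequences to a network and pooling between the convolutional and recurrent stages, can be illustrated without any deep learning library. The sketch below one-hot encodes an RNA sequence and applies non-overlapping 1D max pooling to a feature vector. It is an illustration only: the encoding scheme, pool size, and function names are assumptions and do not reproduce the miTAR implementation.

```python
ALPHABET = "ACGU"


def one_hot(seq):
    """One-hot encode an RNA sequence; unknown bases (e.g. N) become all zeros."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == ref else 0.0 for ref in ALPHABET] for base in seq]


def max_pool_1d(values, pool_size=2):
    """Non-overlapping 1D max pooling; a trailing partial window is dropped."""
    last_start = len(values) - pool_size + 1
    return [max(values[i:i + pool_size]) for i in range(0, last_start, pool_size)]


# Toy inputs (hypothetical): a short sequence and a made-up convolution output.
encoded = one_hot("AUGN")
pooled = max_pool_1d([1.0, 3.0, 2.0, 5.0, 4.0], pool_size=2)
```

Pooling halves the length of the feature vector here, which is the same reduction in sequence length a recurrent layer would see when a pooling layer with a window of 2 sits between it and the convolutional layer.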
Affiliation(s)
- Tongjun Gu
- Bioinformatics, Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL, USA; Division of Quantitative Sciences, University of Florida Health Cancer Center, University of Florida, Gainesville, FL, USA
- Xiwu Zhao
- Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
- William Bradley Barbazuk
- Bioinformatics, Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL, USA; Department of Biology, University of Florida, Gainesville, FL, USA; Genetics Institute, University of Florida, Gainesville, FL, USA
- Ji-Hyun Lee
- Division of Quantitative Sciences, University of Florida Health Cancer Center, University of Florida, Gainesville, FL, USA; Department of Biostatistics, University of Florida, Gainesville, FL, USA