1
|
Wang J, Fan Y, Hong L, Hu Z, Li Y. Deep learning for RNA structure prediction. Curr Opin Struct Biol 2025; 91:102991. [PMID: 39933218 DOI: 10.1016/j.sbi.2025.102991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2024] [Revised: 11/27/2024] [Accepted: 01/04/2025] [Indexed: 02/13/2025]
Abstract
Predicting RNA structures from sequences with computational approaches is of vital importance in RNA biology considering the high costs of experimental determination. AI methods have revolutionized this field in recent years, enabling RNA structure prediction with increasingly higher accuracy and efficiency. With an increase in the number of models proposed for this task, this review presents a timely summary of the applications of AI, particularly deep learning, in RNA structure prediction, highlighting their methodology advances as well as the challenges and opportunities for further work in this field.
Collapse
Affiliation(s)
- Jiuming Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Yimin Fan
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Liang Hong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Zhihang Hu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Yu Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China.
| |
Collapse
|
2
|
Li Y, Pan X, Shen H, Yang Y. DRAG: design RNAs as hierarchical graphs with reinforcement learning. Brief Bioinform 2025; 26:bbaf106. [PMID: 40079262 PMCID: PMC11904406 DOI: 10.1093/bib/bbaf106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2024] [Revised: 02/10/2025] [Accepted: 02/25/2025] [Indexed: 03/15/2025] Open
Abstract
The rapid development of RNA vaccines and therapeutics puts forward intensive requirements on the sequence design of RNAs. RNA sequence design, or RNA inverse folding, aims to generate RNA sequences that can fold into specific target structures. To date, efficient and high-accuracy prediction models for secondary structures of RNAs have been developed. They provide a basis for computational RNA sequence design methods. Especially, reinforcement learning (RL) has emerged as a promising approach for RNA design due to its ability to learn from trial and error in generation tasks and work without ground truth data. However, existing RL methods are limited in considering complex hierarchical structures in RNA design environments. To address the above limitation, we propose DRAG, an RL method that builds design environments for target secondary structures with hierarchical division based on graph neural networks. Through extensive experiments on benchmark datasets, DRAG exhibits remarkable performance compared with current machine-learning approaches for RNA sequence design. This advantage is particularly evident in long and intricate tasks involving structures with significant depth.
Collapse
Affiliation(s)
- Yichong Li
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
- Key Laboratory of System Control and Information Processing, Ministry of Education of China, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
| | - Hongbin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
- Key Laboratory of System Control and Information Processing, Ministry of Education of China, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
| | - Yang Yang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
- Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Minhang District, Shanghai 200240, China
| |
Collapse
|
3
|
Cao X, Zhang Y, Ding Y, Wan Y. Identification of RNA structures and their roles in RNA functions. Nat Rev Mol Cell Biol 2024; 25:784-801. [PMID: 38926530 DOI: 10.1038/s41580-024-00748-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/28/2024] [Indexed: 06/28/2024]
Abstract
The development of high-throughput RNA structure profiling methods in the past decade has greatly facilitated our ability to map and characterize different aspects of RNA structures transcriptome-wide in cell populations, single cells and single molecules. The resulting high-resolution data have provided insights into the static and dynamic nature of RNA structures, revealing their complexity as they perform their respective functions in the cell. In this Review, we discuss recent technical advances in the determination of RNA structures, and the roles of RNA structures in RNA biogenesis and functions, including in transcription, processing, translation, degradation, localization and RNA structure-dependent condensates. We also discuss the current understanding of how RNA structures could guide drug design for treating genetic diseases and battling pathogenic viruses, and highlight existing challenges and future directions in RNA structure research.
Collapse
Affiliation(s)
- Xinang Cao
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, Singapore, Singapore
| | - Yueying Zhang
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, UK
| | - Yiliang Ding
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, UK.
| | - Yue Wan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, Singapore, Singapore.
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
| |
Collapse
|
4
|
Yang TH. DEBFold: Computational Identification of RNA Secondary Structures for Sequences across Structural Families Using Deep Learning. J Chem Inf Model 2024; 64:3756-3766. [PMID: 38648189 PMCID: PMC11094721 DOI: 10.1021/acs.jcim.4c00458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2024] [Revised: 04/09/2024] [Accepted: 04/09/2024] [Indexed: 04/25/2024]
Abstract
It is now known that RNAs play more active roles in cellular pathways beyond simply serving as transcription templates. These biological mechanisms might be mediated by higher RNA stereo conformations, triggering the need to understand RNA secondary structures first. However, experimental protocols for solving RNA structures are unavailable for large-scale investigation due to their high costs and time-consuming nature. Various computational tools were thus developed to predict the RNA secondary structures from sequences. Recently, deep networks have been investigated to help predict RNA structures directly from their sequences. However, existing deep-learning-based tools are more or less suffering from model overfitting due to their complicated problem formulation and defective model training processes, limiting their applications across sequences from different structural families. In this research, we designed a two-stage RNA structure prediction strategy called DEBFold (deep ensemble boosting and folding) based on convolution encoding/decoding and self-attention mechanisms to enhance the existing thermodynamic structure models. Moreover, the model training process followed rigorous steps to achieve an acceptable prediction generalization. On the family-wise reserved test sets and the PDB-derived test set, DEBFold achieves better structure prediction performance over traditional tools and existing deep-learning methods. In summary, we obtained a cutting-edge deep-learning-based structure prediction tool with supreme across-family generalization performance. The DEBFold tool can be accessed at https://cobis.bme.ncku.edu.tw/DEBFold/.
Collapse
Affiliation(s)
- Tzu-Hsien Yang
- Department
of Biomedical Engineering, National Cheng
Kung University, No.1, University Road, Tainan 701, Taiwan
- Medical
Device Innovation Center, National Cheng
Kung University, No.1,
University Road, Tainan 701, Taiwan
| |
Collapse
|
5
|
Rinaldi S, Moroni E, Rozza R, Magistrato A. Frontiers and Challenges of Computing ncRNAs Biogenesis, Function and Modulation. J Chem Theory Comput 2024; 20:993-1018. [PMID: 38287883 DOI: 10.1021/acs.jctc.3c01239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2024]
Abstract
Non-coding RNAs (ncRNAs), generated from nonprotein coding DNA sequences, constitute 98-99% of the human genome. Non-coding RNAs encompass diverse functional classes, including microRNAs, small interfering RNAs, PIWI-interacting RNAs, small nuclear RNAs, small nucleolar RNAs, and long non-coding RNAs. With critical involvement in gene expression and regulation across various biological and physiopathological contexts, such as neuronal disorders, immune responses, cardiovascular diseases, and cancer, non-coding RNAs are emerging as disease biomarkers and therapeutic targets. In this review, after providing an overview of non-coding RNAs' role in cell homeostasis, we illustrate the potential and the challenges of state-of-the-art computational methods exploited to study non-coding RNAs biogenesis, function, and modulation. This can be done by directly targeting them with small molecules or by altering their expression by targeting the cellular engines underlying their biosynthesis. Drawing from applications, also taken from our work, we showcase the significance and role of computer simulations in uncovering fundamental facets of ncRNA mechanisms and modulation. This information may set the basis to advance gene modulation tools and therapeutic strategies to address unmet medical needs.
Collapse
Affiliation(s)
- Silvia Rinaldi
- National Research Council of Italy (CNR) - Institute of Chemistry of OrganoMetallic Compounds (ICCOM), c/o Area di Ricerca CNR di Firenze Via Madonna del Piano 10, 50019 Sesto Fiorentino, Florence, Italy
| | - Elisabetta Moroni
- National Research Council of Italy (CNR) - Institute of Chemical Sciences and Technologies (SCITEC), via Mario Bianco 9, 20131 Milano, Italy
| | - Riccardo Rozza
- National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
| | - Alessandra Magistrato
- National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
| |
Collapse
|
6
|
Zhang J, Lang M, Zhou Y, Zhang Y. Predicting RNA structures and functions by artificial intelligence. Trends Genet 2023; 40:S0168-9525(23)00229-9. [PMID: 39492264 DOI: 10.1016/j.tig.2023.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 08/22/2023] [Accepted: 10/03/2023] [Indexed: 11/05/2024]
Abstract
RNA functions by interacting with its intended targets structurally. However, due to the dynamic nature of RNA molecules, RNA structures are difficult to determine experimentally or predict computationally. Artificial intelligence (AI) has revolutionized many biomedical fields and has been progressively utilized to deduce RNA structures, target binding, and associated functionality. Integrating structural and target binding information could also help improve the robustness of AI-based RNA function prediction and RNA design. Given the rapid development of deep learning (DL) algorithms, AI will provide an unprecedented opportunity to elucidate the sequence-structure-function relation of RNAs.
Collapse
Affiliation(s)
- Jun Zhang
- National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, 518060, China
| | - Mei Lang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China
| | - Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China.
| | - Yang Zhang
- School of Science, Harbin Institute of Technology, Shenzhen, Guangdong, 518055, China.
| |
Collapse
|