1
Chen X, Wang L, Xie J, Nowak JS, Luo B, Zhang C, Jia G, Zou J, Huang D, Glatt S, Yang Y, Su Z. RNA sample optimization for cryo-EM analysis. Nat Protoc 2025; 20:1114-1157. [PMID: 39548288 DOI: 10.1038/s41596-024-01072-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 09/12/2024] [Indexed: 11/17/2024]
Abstract
RNAs play critical roles in most biological processes. Although the three-dimensional (3D) structures of RNAs primarily determine their functions, it remains challenging to experimentally determine these 3D structures due to their conformational heterogeneity and intrinsic dynamics. Cryogenic electron microscopy (cryo-EM) has recently played an emerging role in resolving dynamic conformational changes and understanding structure-function relationships of RNAs including ribozymes, riboswitches and bacterial and viral noncoding RNAs. A variety of methods and pipelines have been developed to facilitate cryo-EM structure determination of challenging RNA targets with small molecular weights at subnanometer to near-atomic resolutions. While a wide range of conditions have been used to prepare RNAs for cryo-EM analysis, correlations between the variables in these conditions and cryo-EM visualizations and reconstructions remain underexplored, which continue to hinder optimizations of RNA samples for high-resolution cryo-EM structure determination. Here we present a protocol that describes rigorous screenings and iterative optimizations of RNA preparation conditions that facilitate cryo-EM structure determination, supplemented by cryo-EM data processing pipelines that resolve RNA dynamics and conformational changes and RNA modeling algorithms that generate atomic coordinates based on moderate- to high-resolution cryo-EM density maps. The current protocol is designed for users with basic skills and experience in RNA biochemistry, cryo-EM and RNA modeling. The expected time to carry out this protocol may range from 3 days to more than 3 weeks, depending on the many variables described in the protocol. For particularly challenging RNA targets, this protocol could also serve as a starting point for further optimizations.
Affiliation(s)
- Xingyu Chen
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
- Liu Wang
  - The State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, National Center for Stomatology, Department of Cardiology and Endodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Jiahao Xie
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
- Jakub S Nowak
  - Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Bingnan Luo
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Chong Zhang
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
- Guowen Jia
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
- Jian Zou
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China
- Dingming Huang
  - The State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, National Center for Stomatology, Department of Cardiology and Endodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Sebastian Glatt
  - Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department for Biological Sciences and Pathobiology, University of Veterinary Medicine Vienna, Vienna, Austria
- Yang Yang
  - Department of Prosthodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Zhaoming Su
  - The State Key Laboratory of Biotherapy, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, China.
2
Li J, Tan Y, Lu R, Liang P, Liu H, Yao X. Artificial intelligence for RNA-ligand interaction prediction: advances and prospects. Drug Discov Today 2025; 30:104366. [PMID: 40286982 DOI: 10.1016/j.drudis.2025.104366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2025] [Revised: 04/17/2025] [Accepted: 04/22/2025] [Indexed: 04/29/2025]
Abstract
Accurate prediction of RNA-ligand interactions is vital for understanding biological processes and advancing RNA-targeted drug discovery. Given their complexity, artificial intelligence (AI) is revolutionizing the study of RNA-ligand interactions, offering insights into the complex dynamics and therapeutic potential of RNA. In this review, we highlight advances in AI-driven RNA-ligand binding site identification, structure modeling, binding mode and binding affinity prediction, and virtual screening (VS). We also discuss key challenges, such as data set scarcity and modeling RNA flexibility. Future directions emphasize integrating cutting-edge AI techniques with physics-based models and expanding experimental data sets to enhance RNA-ligand interaction predictions.
Affiliation(s)
- Jing Li
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Yi Tan
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Ruiqiang Lu
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Pengyu Liang
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China
- Huanxiang Liu
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China.
- Xiaojun Yao
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, 999078 Macao, China.
3
Wang J, Fan Y, Hong L, Hu Z, Li Y. Deep learning for RNA structure prediction. Curr Opin Struct Biol 2025; 91:102991. [PMID: 39933218 DOI: 10.1016/j.sbi.2025.102991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Revised: 11/27/2024] [Accepted: 01/04/2025] [Indexed: 02/13/2025]
Abstract
Predicting RNA structures from sequences with computational approaches is of vital importance in RNA biology considering the high costs of experimental determination. AI methods have revolutionized this field in recent years, enabling RNA structure prediction with increasingly higher accuracy and efficiency. With an increase in the number of models proposed for this task, this review presents a timely summary of the applications of AI, particularly deep learning, in RNA structure prediction, highlighting their methodology advances as well as the challenges and opportunities for further work in this field.
Affiliation(s)
- Jiuming Wang
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Yimin Fan
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Liang Hong
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Zhihang Hu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Yu Li
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China.
4
Leitão AL, Enguita FJ. The Unpaved Road of Non-Coding RNA Structure-Function Relationships: Current Knowledge, Available Methodologies, and Future Trends. Noncoding RNA 2025; 11:20. [PMID: 40126344 PMCID: PMC11932211 DOI: 10.3390/ncrna11020020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Revised: 01/31/2025] [Accepted: 02/26/2025] [Indexed: 03/25/2025] Open
Abstract
The genomes from complex eukaryotes are enriched in non-coding genes whose transcription products (non-coding RNAs) are involved in the regulation of genomic output at different levels. Non-coding RNA action is predominantly driven by sequence and structural motifs that interact with specific functional partners. Despite the exponential growth in primary RNA sequence data facilitated by next-generation sequencing studies, the availability of tridimensional RNA data is comparatively more limited. The subjacent reasons for this relative lack of information regarding RNA structure are related to the specific chemical nature of RNA molecules and the limitations of the currently available methods for structural characterization of biomolecules. In this review, we describe and analyze the different structural motifs involved in non-coding RNA function and the wet-lab and computational methods used to characterize their structure-function relationships, highlighting the current need for detailed structural studies to explore the molecular determinants of non-coding RNA function.
Affiliation(s)
- Ana Lúcia Leitão
  - Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa, Campus da Caparica, 2829-516 Caparica, Portugal
- Francisco J. Enguita
  - Faculdade de Medicina, Universidade de Lisboa, Av. Prof. Egas Moniz, 1649-028 Lisboa, Portugal
5
McCann H, Meade C, Williams L, Petrov A, Johnson P, Simon A, Hoksza D, Nawrocki E, Chan P, Lowe T, Ribas C, Sweeney B, Madeira F, Anyango S, Appasamy S, Deshpande M, Varadi M, Velankar S, Zirbel C, Naiden A, Jossinet F, Petrov A. R2DT: a comprehensive platform for visualizing RNA secondary structure. Nucleic Acids Res 2025; 53:gkaf032. [PMID: 39921562 PMCID: PMC11806352 DOI: 10.1093/nar/gkaf032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2024] [Revised: 12/17/2024] [Accepted: 01/14/2025] [Indexed: 02/10/2025] Open
Abstract
RNA secondary (2D) structure visualization is an essential tool for understanding RNA function. R2DT is a software package designed to visualize RNA 2D structures in consistent, recognizable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms or SHAPE reactivities. It also offers a new template-free mode allowing visualization of RNAs without pre-existing templates, alongside a constrained folding mode and support for animated visualizations. Users can interactively modify R2DT diagrams, either manually or using natural language prompts, to generate new templates or create publication-quality images. Additionally, R2DT features faster performance, an expanded template library, and a growing collection of compatible tools and utilities. Already integrated into multiple biological databases, R2DT has evolved into a comprehensive platform for RNA 2D visualization, accessible at https://r2dt.bio.
Affiliation(s)
- Holly McCann
  - NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
  - Department of Biomedical Data Science, Stanford University, Palo Alto, CA, 94305-5102, United States
- Caeden D Meade
  - NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
- Loren Dean Williams
  - NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
- Anton S Petrov
  - NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, United States
- Philip Z Johnson
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, United States
- Anne E Simon
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, United States
- David Hoksza
  - Department of Software Engineering, Charles University, Prague 118 00, Czech Republic
- Eric P Nawrocki
  - National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
- Patricia P Chan
  - Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, United States
- Todd M Lowe
  - Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, United States
- Carlos Eduardo Ribas
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Blake A Sweeney
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Fábio Madeira
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Stephen Anyango
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Sri Devan Appasamy
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mandar Deshpande
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mihaly Varadi
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Sameer Velankar
  - European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, United Kingdom
- Craig L Zirbel
  - Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, United States
- Fabrice Jossinet
  - Faculty of Life Sciences, University of Strasbourg, Strasbourg 67000, France
- Anton I Petrov
  - Riboscope Ltd, 23 King Street, Cambridge CB1 1AH, United Kingdom
6
Bu F, Adam Y, Adamiak RW, Antczak M, de Aquino BRH, Badepally NG, Batey RT, Baulin EF, Boinski P, Boniecki MJ, Bujnicki JM, Carpenter KA, Chacon J, Chen SJ, Chiu W, Cordero P, Das NK, Das R, Dawson WK, DiMaio F, Ding F, Dock-Bregeon AC, Dokholyan NV, Dror RO, Dunin-Horkawicz S, Eismann S, Ennifar E, Esmaeeli R, Farsani MA, Ferré-D'Amaré AR, Geniesse C, Ghanim GE, Guzman HV, Hood IV, Huang L, Jain DS, Jaryani F, Jin L, Joshi A, Karelina M, Kieft JS, Kladwang W, Kmiecik S, Koirala D, Kollmann M, Kretsch RC, Kurciński M, Li J, Li S, Magnus M, Masquida B, Moafinejad SN, Mondal A, Mukherjee S, Nguyen THD, Nikolaev G, Nithin C, Nye G, Pandaranadar Jeyeram IPN, Perez A, Pham P, Piccirilli JA, Pilla SP, Pluta R, Poblete S, Ponce-Salvatierra A, Popenda M, Popenda L, Pucci F, Rangan R, Ray A, Ren A, Sarzynska J, Sha CM, Stefaniak F, Su Z, Suddala KC, Szachniuk M, Townshend R, Trachman RJ, Wang J, Wang W, Watkins A, Wirecki TK, Xiao Y, Xiong P, Xiong Y, Yang J, Yesselman JD, Zhang J, Zhang Y, Zhang Z, Zhou Y, Zok T, Zhang D, Zhang S, Żyła A, Westhof E, Miao Z. RNA-Puzzles Round V: blind predictions of 23 RNA structures. Nat Methods 2025; 22:399-411. [PMID: 39623050 PMCID: PMC11810798 DOI: 10.1038/s41592-024-02543-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 10/29/2024] [Indexed: 01/16/2025]
Abstract
RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA three-dimensional structure prediction. With agreement from structural biologists, RNA structures are predicted by modeling groups before publication of the experimental structures. We report a large-scale set of predictions by 18 groups for 23 RNA-Puzzles: 4 RNA elements, 2 Aptamers, 4 Viral elements, 5 Ribozymes and 8 Riboswitches. We describe automatic assessment protocols for comparisons between prediction and experiment. Our analyses reveal some critical steps to be overcome to achieve good accuracy in modeling RNA structures: identification of helix-forming pairs and of non-Watson-Crick modules, correct coaxial stacking between helices and avoidance of entanglements. Three of the top four modeling groups in this round also ranked among the top four in the CASP15 contest.
Grants
- T32 GM066706 NIGMS NIH HHS
- NSFC T2225007 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM134919 NIGMS NIH HHS
- R35GM145409 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R35 GM145409 NIGMS NIH HHS
- 32270707 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM122579 NIGMS NIH HHS
- R35 GM134864 NIGMS NIH HHS
- T32 grant GM066706 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- P20GM121342 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R21 CA219847 NCI NIH HHS
- 32171191 National Natural Science Foundation of China (National Science Foundation of China)
- P20 GM121342 NIGMS NIH HHS
- R35 GM152029 NIGMS NIH HHS
- R01 GM073850 NIGMS NIH HHS
- F32 GM112294 NIGMS NIH HHS
- ZIA DK075136 Intramural NIH HHS
- Z.M. is supported by Major Projects of Guangzhou National Laboratory, (Grant No. GZNL2023A01006, GZNL2024A01002, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102), the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903). This work is part of the ITI 2021-2028 program and supported by IdEx Unistra (ANR-10-IDEX-0002 to E.W.), SFRI-STRAT’US project (ANR-20-SFRI-0012) and EUR IMCBio (IMCBio ANR-17-EURE-0023 to E.W.) under the framework of the French Investments for the Future Program.
- E.W. also acknowledges support from Wenzhou Institute, University of Chinese Academy of Sciences (WIUCASQD2024002).
- E.F.B. was additionally supported by European Molecular Biology Organization (EMBO) fellowship (ALTF 525-2022).
- Boniecki’s research was supported by the Polish National Science Center Poland (NCN) (grant 2016/23/B/ST6/03433 to Michal J. Boniecki). Predictions were performed using computational resources of the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw (ICM) (grant G66-9).
- J.M.B. is supported by the National Science Centre in Poland (NCN grants: 2017/26/A/NZ1/01083 to J.M.B., 2021/43/D/NZ1/03360 to S.M., 2020/39/B/NZ2/03127 to F.S., 2020/39/D/NZ2/02837 to T.K.W.). J.M.B. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: ACK Cyfronet AGH, PCSS, CI TASK, WCSS) for providing computer facilities and support within the computational grant PLG/2023/016080.
- S.J.C. is supported by the National Institutes of Health under Grant R35-GM134919.
- R.D. is supported by Stanford Bio-X (to R.D., R.O.D., R.C.K., and S.E.); Stanford Gerald J. Lieberman Fellowship (to R.R.); the National Institutes of Health (R21 CA219847 and R35 GM122579 to R.D.), the Howard Hughes Medical Institute (HHMI, to R.D.); Consejo Nacional de Ciencia y Tecnología CONACyT Fellowship 312765 (P.C.); the Ruth L. Kirschstein National Research Service Award Postdoctoral Fellowships GM112294 (to J.D.Y.); National Science Foundation Graduate Research Fellowships (R.J.L.T. and R.R.); the National Library of Medicine T15 Training Grant (NLM T15007033 to K.A.C.); the U.S. Department of Energy, Office of Science Graduate Student Research program (R.J.L.T.).
- The National Institutes of Health grants 1R35 GM134864 and the Passan Foundation.
- R.O.D. is supported by the U.S. Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) program (R.O.D.); Intel (R.O.D.).
- A.F.D. is supported, in part, by the intramural program of the National Heart, Lung and Blood Institute, National Institutes of Health, USA.
- Guangdong Science and Technology Department (2022A1515010328, 2023B1212060013, 2020B1212030004), Fundamental Research Funds for the Central Universities, Sun Yat-sen University (23ptpy41).
- D.K. is supported by the NSF CAREER award MCB-2236996, and start-up, SURFF, and START awards from the University of Maryland Baltimore County to D.K.
- B.M. is supported by the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program at the University of Strasbourg, CNRS and Inserm, by IdEx Unistra (ANR-10-IDEX-0002), and EUR (IMCBio ANR-17-EUR-0023), under the framework of the French Investments Program for the Future.
- T.H.D.N. is supported by UKRI-Medical Research Council grant MC_UP_1201/19.
- C.N. and M.K. acknowledge funding from the National Science Centre, Poland [OPUS 2019/33/B/NZ2/02100]; S.P.P. acknowledges funding from the National Science Centre, Poland [OPUS 2020/39/B/NZ2/01301]; S.K. acknowledges funding from the National Science Centre, Poland [Sheng 2021/40/Q/NZ2/00078]; C.N. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: PCSS, ACK Cyfronet AGH, CI TASK, WCSS) for providing computer facilities and support within the computational grants PLG/2022/016043, PLG/2022/015327 and PLG/2020/013424.
- A.P. is supported by NSF CAREER award CHE-2235785.
- A.R. is supported by grants from the Natural Science Foundation of China (32325029, 32022039, 91940302, and 91640104), the National Key Research and Development Project of China (2021YFC2300300 and 2023YFC2604300).
- Marta Szachniuk is supported by the National Science Centre, Poland (2019/35/B/ST6/03074 to M.S.), the statutory funds of IBCH PAS and Poznan University of Technology.
- J.W. is supported by the Penn State College of Medicine’s Artificial Intelligence and Biomedical Informatics Program.
- J.Z. is supported by the Intramural Research Program of the NIH, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (ZIADK075136 to J.Z.), and an NIH Deputy Director for Intramural Research (DDIR) Challenge Award to J.Z.
Affiliation(s)
- Fan Bu
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Yagoub Adam
  - Inter-institutional Graduate Program on Bioinformatics, Department of Computer Science and Mathematics, FFCLRP, University of São Paulo, Ribeirão Preto, Brazil
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Nigeria
- Ryszard W Adamiak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Maciej Antczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Belisa Rebeca H de Aquino
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Nagendar Goud Badepally
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Robert T Batey
  - Department of Biochemistry, University of Colorado at Boulder, Boulder, CO, USA
- Eugene F Baulin
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Pawel Boinski
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Michal J Boniecki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Kristy A Carpenter
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Jose Chacon
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Department of Cell and Developmental Biology, University of California San Diego, San Diego, CA, USA
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Wah Chiu
  - Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
- Pablo Cordero
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Stripe, South San Francisco, CA, USA
- Naba Krishna Das
  - Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
  - Biophysics program, Stanford University, Stanford, CA, USA
- Wayne K Dawson
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Frank DiMaio
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Feng Ding
  - Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
- Anne-Catherine Dock-Bregeon
  - Laboratory of Integrative Biology of Marine Models (LBI2M), Sorbonne University-CNRS UMR8227, Roscoff, France
- Nikolay V Dokholyan
  - Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
- Ron O Dror
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Department of Structural Biology, Stanford University, Stanford, CA, USA
  - Department of Molecular and Cellular Physiology, Stanford University, Stanford, CA, USA
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
- Stanisław Dunin-Horkawicz
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Stephan Eismann
  - Department of Applied Physics, Stanford University, Stanford, CA, USA
  - Atomic AI, South San Francisco, CA, USA
| | - Eric Ennifar
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Masoud Amiri Farsani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Adrian R Ferré-D'Amaré
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
| | - Caleb Geniesse
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - George E Ghanim
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Horacio V Guzman
- Instituto de Ciencia de Materials de Barcelona, ICMAB-CSIC, Bellaterra E-08193, Spain & Departamento de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, Madrid, Spain
| | - Iris V Hood
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Lin Huang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University Guangzhou, Guangdong, China
| | - Dharm Skandh Jain
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Lei Jin
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Astha Joshi
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Masha Karelina
- Biophysics program, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jeffrey S Kieft
- Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
- New York Structural Biology Center, New York, NY, USA
| | - Wipapat Kladwang
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Sebastian Kmiecik
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Deepak Koirala
- Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
| | - Markus Kollmann
- Department of Computer Science, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
| | | | - Mateusz Kurciński
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Jun Li
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Shuang Li
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Marcin Magnus
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
| | - Benoît Masquida
- UMR 7156, CNRS - Université de Strasbourg, IPCB, Strasbourg, France
| | - S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | | | - Grigory Nikolaev
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Grace Nye
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Iswarya P N Pandaranadar Jeyeram
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Phillip Pham
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Joseph A Piccirilli
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA
- Department of Chemistry, The University of Chicago, Chicago, IL, USA
| | - Smita Priyadarshini Pilla
- Laboratory of Computational Biology, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
| | - Radosław Pluta
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Simón Poblete
- Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
- Centro BASAL Ciencia & Vida, Universidad San Sebastián, Santiago, Chile
| | - Almudena Ponce-Salvatierra
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Lukasz Popenda
- NanoBioMedical Centre, Adam Mickiewicz University, Poznan, Poland
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Ramya Rangan
- Biophysics program, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
| | - Angana Ray
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Aiming Ren
- Life Sciences Institute, Zhejiang University, Hangzhou, China
| | - Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Congzhou Mike Sha
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
| | - Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Zhaoming Su
- The State Key Laboratory of Biotherapy, West China Hospital, Chengdu, China
| | - Krishna C Suddala
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Raphael Townshend
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
| | - Robert J Trachman
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
| | - Jian Wang
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
| | - Wenkai Wang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Andrew Watkins
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Prescient Design, Genentech Research and Early Development, South San Francisco, CA, USA
| | - Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Yi Xiao
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Peng Xiong
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Department of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Yiduo Xiong
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Joseph David Yesselman
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Department of Chemistry, University of Nebraska, Lincoln, NE, USA
| | - Jinwei Zhang
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Yi Zhang
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Zhenzhen Zhang
- Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
| | - Yuanzhe Zhou
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Dong Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Sicheng Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Adriana Żyła
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Eric Westhof
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France.
- Engineering Research Center of Clinical Functional Materials and Diagnosis & Treatment Devices of Zhejiang Province, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, China.
| | - Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China.
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China.
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, UK.
7
Bernard C, Postic G, Ghannay S, Tahi F. Has AlphaFold3 achieved success for RNA? Acta Crystallogr D Struct Biol 2025; 81:49-62. [PMID: 39868559 PMCID: PMC11804252 DOI: 10.1107/s2059798325000592] [Received: 10/18/2024] [Accepted: 01/21/2025] [Indexed: 01/28/2025] Open
Abstract
Predicting the 3D structure of RNA is a significant challenge despite ongoing advancements in the field. Although AlphaFold has successfully addressed this problem for proteins, RNA structure prediction raises difficulties due to the fundamental differences between proteins and RNA, which hinder its direct adaptation. The latest release of AlphaFold, AlphaFold3, has broadened its scope to include multiple different molecules such as DNA, ligands and RNA. While the AlphaFold3 article discussed the results for the last CASP-RNA data set, the scope of its performance and the limitations for RNA are unclear. In this article, we provide a comprehensive analysis of the performance of AlphaFold3 in the prediction of 3D structures of RNA. Through an extensive benchmark over five different test sets, we discuss the performance and limitations of AlphaFold3. We also compare its performance with ten existing state-of-the-art ab initio, template-based and deep-learning approaches. Our results are freely available on the EvryRNA platform at https://evryrna.ibisc.univ-evry.fr/evryrna/alphafold3/.
Affiliation(s)
- Clément Bernard
- Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
- LISN – CNRS/Université Paris-Saclay, 91400 Orsay, France
| | - Guillaume Postic
- Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
| | - Sahar Ghannay
- LISN – CNRS/Université Paris-Saclay, 91400 Orsay, France
| | - Fariza Tahi
- Université Paris-Saclay, Université Evry, IBISC, 91020 Evry-Courcouronnes, France
8
Tarafder S, Bhattacharya D. RNAbpFlow: Base pair-augmented SE(3)-flow matching for conditional RNA 3D structure generation. bioRxiv 2025:2025.01.24.634669. [PMID: 39896539 PMCID: PMC11785242 DOI: 10.1101/2025.01.24.634669] [Indexed: 02/04/2025]
Abstract
Motivation: Despite the groundbreaking advances in deep learning-enabled methods for biomolecular modeling, predicting accurate three-dimensional (3D) structures of RNA remains challenging due to the highly flexible nature of RNA molecules combined with the limited availability of evolutionary sequences or structural homology. Results: We introduce RNAbpFlow, a novel sequence- and base-pair-conditioned SE(3)-equivariant flow matching model for generating RNA 3D structural ensembles. Leveraging a nucleobase center representation, RNAbpFlow enables end-to-end generation of all-atom RNA structures without the explicit or implicit use of evolutionary information or homologous structural templates. Experimental results show that base pairing conditioning leads to broadly generalizable performance improvements over current approaches for RNA topology sampling and predictive modeling in large-scale benchmarking. Availability: RNAbpFlow is freely available at https://github.com/Bhattacharya-Lab/RNAbpFlow.
Affiliation(s)
- Sumit Tarafder
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, 24061, USA
9
Wirecki TK, Lach G, Badepally NG, Moafinejad S, Jaryani F, Klaudel G, Nec K, Baulin EF, Bujnicki JM. DesiRNA: structure-based design of RNA sequences with a replica exchange Monte Carlo approach. Nucleic Acids Res 2025; 53:gkae1306. [PMID: 39831304 PMCID: PMC11744100 DOI: 10.1093/nar/gkae1306] [Received: 05/29/2024] [Revised: 12/15/2024] [Accepted: 12/25/2024] [Indexed: 01/22/2025] Open
Abstract
Designing RNA sequences that form a specific structure remains a challenge. Current computational methods often struggle with the complexity of RNA structures, especially when considering pseudoknots or restrictions related to RNA function. We developed DesiRNA, a computational tool for the design of RNA sequences based on the Replica Exchange Monte Carlo approach. It finds sequences that minimize a multiobjective scoring function, fulfill user-defined constraints and minimize the violation of restraints. DesiRNA handles pseudoknots, designs RNA-RNA complexes and sequences with alternative structures, prevents oligomerization of monomers, prevents folding into undesired structures and allows users to specify nucleotide composition preferences. In benchmarking tests, DesiRNA with a default simple scoring function solved all 100 puzzles in the Eterna100 benchmark within 24 h, outperforming all existing RNA design programs. With its ability to address complex RNA design challenges, DesiRNA holds promise for a range of applications in RNA research and therapeutic development.
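As a rough illustration of the replica exchange Monte Carlo idea named in the abstract, the sketch below shows the standard Metropolis criterion for swapping configurations between two replicas held at different temperatures. This is generic textbook replica exchange, not DesiRNA's actual code: the function name `swap_probability`, the score values, and the temperatures are all hypothetical.

```python
import math


def swap_probability(score_i, score_j, temp_i, temp_j):
    """Probability of accepting a swap between replicas i and j.

    Standard replica-exchange criterion:
        min(1, exp((1/T_i - 1/T_j) * (E_i - E_j)))
    where lower scores are better. Hypothetical helper, not DesiRNA's API.
    """
    delta = (1.0 / temp_i - 1.0 / temp_j) * (score_i - score_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


# Cold replica (T=1) holds a worse score than the hot one (T=2): always swap.
p_favourable = swap_probability(score_i=5.0, score_j=3.0, temp_i=1.0, temp_j=2.0)
# Cold replica already holds the better score: swap only occasionally.
p_unfavourable = swap_probability(score_i=3.0, score_j=5.0, temp_i=1.0, temp_j=2.0)
print(p_favourable, round(p_unfavourable, 4))
```

The point of the exchange step is that good candidates found by hot, exploratory replicas migrate toward cold replicas, where they are refined.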
Affiliation(s)
- Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - Grzegorz Lach
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
- Institute of Theoretical Physics, Faculty of Physics (FUW), University of Warsaw, ul. Hoża 69, 00-681 Warsaw, Poland
| | - Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - Gaja Klaudel
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
- Institute of Theoretical Physics, Faculty of Physics (FUW), University of Warsaw, ul. Hoża 69, 00-681 Warsaw, Poland
| | - Kalina Nec
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - Eugene F Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
| | - Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
10
Leonarski F, Henning-Knechtel A, Kirmizialtin S, Ennifar E, Auffinger P. Principles of ion binding to RNA inferred from the analysis of a 1.55 Å resolution bacterial ribosome structure - Part I: Mg2+. Nucleic Acids Res 2025; 53:gkae1148. [PMID: 39791453 PMCID: PMC11724316 DOI: 10.1093/nar/gkae1148] [Received: 04/09/2024] [Revised: 07/22/2024] [Accepted: 11/01/2024] [Indexed: 01/12/2025] Open
Abstract
The importance of Mg2+ ions for RNA structure and function cannot be overstated. Several attempts were made to establish a comprehensive Mg2+ binding site classification. However, such descriptions were hampered by poorly modelled ion binding sites as observed in a recent cryo-EM 1.55 Å Escherichia coli ribosome structure where incomplete ion assignments blurred our understanding of their binding patterns. We revisited this model to establish general binding principles applicable to any RNA of sufficient resolution. These principles rely on the 2.9 Å distance separating two water molecules bound in cis to Mg2+. By applying these rules, we could assign all Mg2+ ions bound with 2-4 non-water oxygens. We also uncovered unanticipated motifs where up to five adjacent nucleotides wrap around a single ion. The formation of such motifs involves a hierarchical Mg2+ ion dehydration process that plays a significant role in ribosome biogenesis and in the folding of large RNAs. In addition, we established a classification of the Mg2+…Mg2+ and Mg2+…K+ ion pairs observed in this ribosome. Overall, the uncovered binding principles enhance our understanding of the roles of ions in RNA structure and will help refine the solvation shell of other RNA systems.
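The 2.9 Å figure is consistent with simple geometry if one assumes an ideal octahedral coordination shell, in which two cis ligands subtend 90° at the ion. The sketch below checks this with the law of cosines. The Mg2+ to water-oxygen distance of 2.07 Å is an assumed typical value, not a number taken from the abstract, and the helper name is illustrative.

```python
import math

MG_O_DISTANCE = 2.07  # Å; assumed typical Mg2+...O(water) distance, not from the abstract


def ligand_separation(bond_length, angle_deg):
    """Distance between two ligands at the same bond length from a central ion,
    separated by the given ligand-ion-ligand angle (law of cosines)."""
    angle = math.radians(angle_deg)
    return math.sqrt(2.0 * bond_length ** 2 * (1.0 - math.cos(angle)))


cis = ligand_separation(MG_O_DISTANCE, 90.0)     # adjacent (cis) waters
trans = ligand_separation(MG_O_DISTANCE, 180.0)  # opposite (trans) waters
print(f"cis: {cis:.2f} Å, trans: {trans:.2f} Å")
```

Under these assumptions the cis separation is the bond length times the square root of two, which is why a tight cis oxygen-oxygen distance can serve as a geometric check on a modelled Mg2+ site.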
Affiliation(s)
- Filip Leonarski
- Swiss Light Source, Paul Scherrer Institut, Forschungsstrasse 111, Villigen PSI 5232, Switzerland
| | - Anja Henning-Knechtel
- Chemistry Program, Science Division, New York University Abu Dhabi, Saadiyat Island, 129188 Abu Dhabi, United Arab Emirates
| | - Serdal Kirmizialtin
- Chemistry Program, Science Division, New York University Abu Dhabi, Saadiyat Island, 129188 Abu Dhabi, United Arab Emirates
- Department of Chemistry, New York University, USA
| | - Eric Ennifar
- Université de Strasbourg, Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, 2 Allée Konrad Roentgen, 67084 Strasbourg, France
| | - Pascal Auffinger
- Université de Strasbourg, Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, 2 Allée Konrad Roentgen, 67084 Strasbourg, France
11
Bernard C, Postic G, Ghannay S, Tahi F. RNA-TorsionBERT: leveraging language models for RNA 3D torsion angles prediction. Bioinformatics 2024; 41:btaf004. [PMID: 39775709 PMCID: PMC11758789 DOI: 10.1093/bioinformatics/btaf004] [Received: 07/08/2024] [Revised: 12/11/2024] [Accepted: 01/07/2025] [Indexed: 01/11/2025] Open
Abstract
Motivation: Predicting the 3D structure of RNA is an ongoing challenge that has yet to be completely addressed despite continuous advancements. RNA 3D structures rely on distances between residues and base interactions but also backbone torsional angles. Knowing the torsional angles for each residue could help reconstruct its global folding, which is what we tackle in this work. This paper presents a novel approach for directly predicting RNA torsional angles from raw sequence data. Our method draws inspiration from the successful application of language models in various domains and adapts them to RNA. Results: We have developed a language-based model, RNA-TorsionBERT, incorporating better sequential interactions for predicting RNA torsional and pseudo-torsional angles from the sequence only. Through extensive benchmarking, we demonstrate that our method improves the prediction of torsional angles compared to state-of-the-art methods. In addition, by using our predictive model, we have inferred a torsion angle-dependent scoring function, called TB-MCQ, that replaces the true reference angles by our model prediction. We show that it accurately evaluates the quality of near-native predicted structures, in terms of RNA backbone torsion angle values. Our work demonstrates promising results, suggesting the potential utility of language models in advancing RNA 3D structure prediction. Availability and implementation: Source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/RNA-TorsionBERT.
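For readers unfamiliar with the MCQ measure that TB-MCQ builds on, the sketch below computes a mean of circular quantities over paired torsion angles, following the commonly used definition: each difference is taken the short way around the circle, and the differences are then averaged through their sines and cosines. The function names and toy angles are illustrative assumptions, and this is not the RNA-TorsionBERT code. In TB-MCQ, per the abstract, the reference angles would be model predictions rather than experimentally derived values.

```python
import math


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def mcq(angles_a, angles_b):
    """Mean of circular quantities over paired torsion angles, in degrees."""
    diffs = [math.radians(angular_difference(a, b)) for a, b in zip(angles_a, angles_b)]
    mean_sin = sum(math.sin(d) for d in diffs) / len(diffs)
    mean_cos = sum(math.cos(d) for d in diffs) / len(diffs)
    return math.degrees(math.atan2(mean_sin, mean_cos))


# Toy example: both pairs differ by 20 degrees once wrap-around is handled.
model = [10.0, 350.0]
reference = [350.0, 10.0]
print(round(mcq(model, reference), 6))
```

Handling the wrap-around explicitly matters for torsion angles, since a naive subtraction would report 340 degrees for two angles that are only 20 degrees apart.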
Affiliation(s)
- Clément Bernard
- Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
- LISN—CNRS/Université Paris-Saclay, Orsay 91400, France
| | - Guillaume Postic
- Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
| | - Sahar Ghannay
- LISN—CNRS/Université Paris-Saclay, Orsay 91400, France
| | - Fariza Tahi
- Université Paris Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France
12
Shen T, Hu Z, Sun S, Liu D, Wong F, Wang J, Chen J, Wang Y, Hong L, Xiao J, Zheng L, Krishnamoorthi T, King I, Wang S, Yin P, Collins JJ, Li Y. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nat Methods 2024; 21:2287-2298. [PMID: 39572716 PMCID: PMC11621015 DOI: 10.1038/s41592-024-02487-0] [Received: 01/31/2024] [Accepted: 09/25/2024] [Indexed: 12/07/2024]
Abstract
Accurate prediction of RNA three-dimensional (3D) structures remains an unsolved challenge. Determining RNA 3D structures is crucial for understanding their functions and informing RNA-targeting drug development and synthetic biology design. The structural flexibility of RNA, which leads to the scarcity of experimentally determined data, complicates computational prediction efforts. Here we present RhoFold+, an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences. By integrating an RNA language model pretrained on ~23.7 million RNA sequences and leveraging techniques to address data scarcity, RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction. Retrospective evaluations on RNA-Puzzles and CASP15 natural RNA targets demonstrate the superiority of RhoFold+ over existing methods, including human expert groups. Its efficacy and generalizability are further validated through cross-family and cross-type assessments, as well as time-censored benchmarks. Additionally, RhoFold+ predicts RNA secondary structures and interhelical angles, providing empirically verifiable features that broaden its applicability to RNA structure and function studies.
Affiliation(s)
- Tao Shen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- Shanghai Zelixir Biotech Company Ltd, Shanghai, China
- Shenzhen Institute of Advanced Technology, Shenzhen, China
| | - Zhihang Hu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Siqi Sun
- Research Institute of Intelligent Complex Systems, Fudan University, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, Shanghai, China.
| | - Di Liu
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA.
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Center for Molecular Design and Biomimetics at the Biodesign Institute, Arizona State University, Tempe, AZ, USA.
- School of Molecular Sciences, Arizona State University, Tempe, AZ, USA.
| | - Felix Wong
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Integrated Biosciences, Redwood City, CA, USA
| | - Jiuming Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
- OneAIM Ltd, Hong Kong SAR, China
| | - Jiayang Chen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Yixuan Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Liang Hong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Jin Xiao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Liangzhen Zheng
- Shanghai Zelixir Biotech Company Ltd, Shanghai, China
- Shenzhen Institute of Advanced Technology, Shenzhen, China
| | - Tejas Krishnamoorthi
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
| | - Irwin King
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Sheng Wang
- Shanghai Zelixir Biotech Company Ltd, Shanghai, China.
- Shenzhen Institute of Advanced Technology, Shenzhen, China.
| | - Peng Yin
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA.
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
| | - James J Collins
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA.
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Yu Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China.
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA.
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- The CUHK Shenzhen Research Institute, Shenzhen, China.
13
Bahai A, Kwoh CK, Mu Y, Li Y. Systematic benchmarking of deep-learning methods for tertiary RNA structure prediction. PLoS Comput Biol 2024; 20:e1012715. [PMID: 39775239 PMCID: PMC11723642 DOI: 10.1371/journal.pcbi.1012715] [Received: 06/11/2024] [Revised: 01/10/2025] [Accepted: 12/10/2024] [Indexed: 01/11/2025] Open
Abstract
The 3D structure of RNA critically influences its functionality, and understanding this structure is vital for deciphering RNA biology. Experimental methods for determining RNA structures are labour-intensive, expensive, and time-consuming. Computational approaches have emerged as valuable tools, leveraging physics-based principles and machine learning to predict RNA structures rapidly. Despite advancements, the accuracy of computational methods remains modest, especially when compared to protein structure prediction. Deep learning methods, while successful in protein structure prediction, have shown some promise for RNA structure prediction as well, but face unique challenges. This study systematically benchmarks state-of-the-art deep learning methods for RNA structure prediction across diverse datasets. Our aim is to identify factors influencing performance variation, such as RNA family diversity, sequence length, RNA type, multiple sequence alignment (MSA) quality, and deep learning model architecture. We show that generally ML-based methods perform much better than non-ML methods on most RNA targets, although the performance difference isn't substantial when working with unseen novel or synthetic RNAs. The quality of the MSA and secondary structure prediction both play an important role and most methods aren't able to predict non-Watson-Crick pairs in the RNAs. Overall, among the automated 3D RNA structure prediction methods, DeepFoldRNA has the best prediction results, followed by DRFold as the second-best method. Finally, we also suggest possible mitigations to improve the quality of the prediction for future method development.
Affiliation(s)
- Akash Bahai
- School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
| | - Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Yuguang Mu
- School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
| | - Yinghui Li
- School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore
14
Tarafder S, Bhattacharya D. lociPARSE: A Locality-aware Invariant Point Attention Model for Scoring RNA 3D Structures. J Chem Inf Model 2024; 64:8655-8664. [PMID: 39523843 PMCID: PMC11600500 DOI: 10.1021/acs.jcim.4c01621] [Received: 09/06/2024] [Revised: 10/17/2024] [Accepted: 10/29/2024] [Indexed: 11/16/2024]
Abstract
A scoring function that can reliably assess the accuracy of a 3D RNA structural model in the absence of experimental structure is not only important for model evaluation and selection but also useful for scoring-guided conformational sampling. However, high-fidelity RNA scoring has proven to be difficult using conventional knowledge-based statistical potentials and currently available machine learning-based approaches. Here, we present lociPARSE, a locality-aware invariant point attention architecture for scoring RNA 3D structures. Unlike existing machine learning methods that estimate superposition-based root-mean-square deviation (RMSD), lociPARSE estimates Local Distance Difference Test (lDDT) scores capturing the accuracy of each nucleotide and its surrounding local atomic environment in a superposition-free manner, before aggregating information to predict global structural accuracy. Tested on multiple datasets including CASP15, lociPARSE significantly outperforms existing statistical potentials (rsRNASP, cgRNASP, DFIRE-RNA, and RASP) and machine learning methods (ARES and RNA3DCNN) across complementary assessment metrics. lociPARSE is freely available at https://github.com/Bhattacharya-Lab/lociPARSE.
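To make the contrast with superposition-based RMSD concrete, the sketch below computes a heavily simplified lDDT-style score: it compares local inter-point distances in a model against a reference, so no structural alignment is needed. It is an illustration only. It uses one hypothetical point per nucleotide and a single global score, whereas a full lDDT works on all atoms and per residue, and lociPARSE itself predicts such scores with a neural network rather than computing them from a known reference. The name `simple_lddt` and the toy coordinates are assumptions.

```python
import math

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # Å; the usual lDDT tolerance set
INCLUSION_RADIUS = 15.0            # Å; only reference pairs closer than this count


def simple_lddt(model, reference):
    """Fraction of local reference distances preserved in the model.

    Both inputs are lists of (x, y, z) points, one per nucleotide, in the same order.
    """
    preserved = 0
    considered = 0
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= INCLUSION_RADIUS:
                continue  # not a local contact in the reference
            d_model = math.dist(model[i], model[j])
            for t in THRESHOLDS:
                considered += 1
                if abs(d_ref - d_model) < t:
                    preserved += 1
    return preserved / considered if considered else 0.0


reference = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
shifted = [(x + 50.0, y - 7.0, z + 3.0) for x, y, z in reference]  # rigid translation
distorted = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]    # last point pulled in
print(simple_lddt(shifted, reference), simple_lddt(distorted, reference))
```

Because only internal distances enter the score, the rigidly translated copy scores the same as the reference itself, while the locally distorted copy is penalised.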
Affiliation(s)
- Sumit Tarafder
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Debswapna Bhattacharya
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
15
Mukherjee S, Moafinejad SN, Badepally NG, Merdas K, Bujnicki JM. Advances in the field of RNA 3D structure prediction and modeling, with purely theoretical approaches, and with the use of experimental data. Structure 2024; 32:1860-1876. [PMID: 39321802 DOI: 10.1016/j.str.2024.08.015] [Received: 07/14/2024] [Revised: 08/08/2024] [Accepted: 08/22/2024] [Indexed: 09/27/2024]
Abstract
Recent advancements in RNA three-dimensional (3D) structure prediction have provided significant insights into RNA biology, highlighting the essential role of RNA in cellular functions and its therapeutic potential. This review summarizes the latest developments in computational methods, particularly the incorporation of artificial intelligence and machine learning, which have improved the efficiency and accuracy of RNA structure predictions. We also discuss the integration of new experimental data types, including cryoelectron microscopy (cryo-EM) techniques and high-throughput sequencing, which have transformed RNA structure modeling. The combination of experimental advances with computational methods represents a significant leap in RNA structure determination. We review the outcomes of RNA-Puzzles and critical assessment of structure prediction (CASP) challenges, which assess the state of the field and limitations of existing methods. Future perspectives are discussed, focusing on the impact of RNA 3D structure prediction on understanding RNA mechanisms and its implications for drug discovery and RNA-targeted therapies, opening new avenues in molecular biology.
Affiliation(s)
- Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Katarzyna Merdas
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland.
16
Bohdan D, Bujnicki J, Baulin E. ARTEMIS: a method for topology-independent superposition of RNA 3D structures and structure-based sequence alignment. Nucleic Acids Res 2024; 52:10850-10861. [PMID: 39258540 PMCID: PMC11472068 DOI: 10.1093/nar/gkae758] [Received: 04/13/2024] [Revised: 08/16/2024] [Accepted: 08/20/2024] [Indexed: 09/12/2024]
Abstract
Non-coding RNAs play a major role in diverse processes in living cells with their sequence and spatial structure serving as the principal determinants of their function. Superposition of RNA 3D structures is the most accurate method for comparative analysis of RNA molecules and for inferring structure-based sequence alignments. Topology-independent superposition is particularly relevant, as evidenced by structurally similar RNAs with sequence permutations such as tRNA and Y RNA. To date, state-of-the-art methods for RNA 3D structure superposition rely on intricate heuristics, and the potential for topology-independent superposition has not been exhausted. Recently, we introduced the ARTEM method for unrestrained pairwise superposition of RNA 3D modules and now we developed it further to solve the global RNA 3D structure alignment problem. Our new tool ARTEMIS significantly outperforms state-of-the-art tools in both sequentially-ordered and topology-independent RNA 3D structure superposition. Using ARTEMIS we discovered a helical packing motif to be preserved within different backbone topology contexts across various non-coding RNAs, including multiple ribozymes and riboswitches. We anticipate that ARTEMIS will be essential for elucidating the landscape of RNA 3D folds and motifs featuring sequence permutations that thus far remained unexplored due to limitations in previous computational approaches.
Affiliation(s)
- Davyd R Bohdan
- International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
- International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Eugene F Baulin
- International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
17
Mackowiak M, Adamczyk B, Szachniuk M, Zok T. RNAtango: Analysing and comparing RNA 3D structures via torsional angles. PLoS Comput Biol 2024; 20:e1012500. [PMID: 39374268 PMCID: PMC11486365 DOI: 10.1371/journal.pcbi.1012500] [Received: 07/24/2024] [Revised: 10/17/2024] [Accepted: 09/18/2024] [Indexed: 10/09/2024]
Abstract
RNA molecules, essential for viruses and living organisms, derive their pivotal functions from intricate 3D structures. To understand these structures, one can analyze torsion and pseudo-torsion angles, which describe rotations around bonds, whether real or virtual, thus capturing the RNA conformational flexibility. Such an analysis has been made possible by RNAtango, a web server introduced in this paper, that provides a trigonometric perspective on RNA 3D structures, giving insights into the variability of examined models and their alignment with reference targets. RNAtango offers comprehensive tools for calculating torsion and pseudo-torsion angles, generating angle statistics, comparing RNA structures based on backbone torsions, and assessing local and global structural similarities using trigonometric functions and angle measures. The system operates in three scenarios: single model analysis, model-versus-target comparison, and model-versus-model comparison, with results output in text and graphical formats. Compatible with all modern web browsers, RNAtango is accessible freely along with the source code. It supports researchers in accurately assessing structural similarities, which contributes to the precision and efficiency of RNA modeling.
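As a rough illustration of the torsion-angle analysis described in this abstract, the sketch below computes a signed dihedral angle from four points and a wrap-around difference between two angles. It is a generic textbook calculation in plain Python, not RNAtango's code. The function names `torsion_deg` and `angular_diff_deg` are hypothetical, and the choice of which four atoms (real or virtual bonds) to pass in is left to the caller.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_deg(p1, p2, p3, p4):
    """Signed dihedral angle (degrees, in [-180, 180]) about the p2-p3 axis."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def angular_diff_deg(a, b):
    """Smallest absolute difference between two angles, respecting wrap-around."""
    return abs((a - b + 180.0) % 360.0 - 180.0)
```

The wrap-around difference matters because torsions of 170 and -170 degrees are only 20 degrees apart, which a plain subtraction would report as 340.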
Affiliation(s)
- Marta Mackowiak
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Bartosz Adamczyk
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
18
McCann H, Meade CD, Williams LD, Petrov AS, Johnson PZ, Simon AE, Hoksza D, Nawrocki EP, Chan PP, Lowe TM, Ribas CE, Sweeney BA, Madeira F, Anyango S, Appasamy SD, Deshpande M, Varadi M, Velankar S, Zirbel CL, Naiden A, Jossinet F, Petrov AI. R2DT: a comprehensive platform for visualising RNA secondary structure. bioRxiv 2024:2024.09.29.611006. [PMID: 39803519 PMCID: PMC11722224 DOI: 10.1101/2024.09.29.611006] [Indexed: 01/24/2025]
Abstract
RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms (SNPs) or SHAPE reactivities. It also offers a new template-free mode allowing visualisation of RNAs without pre-existing templates, alongside a constrained folding mode and support for animated visualisations. Users can interactively modify R2DT diagrams, either manually or using natural language prompts, to generate new templates or create publication-quality images. Additionally, R2DT features faster performance, an expanded template library, and a growing collection of compatible tools and utilities. Already integrated into multiple biological databases, R2DT has evolved into a comprehensive platform for RNA 2D visualisation, accessible at https://r2dt.bio.
Affiliation(s)
- Holly McCann
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Caeden D. Meade
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Loren Dean Williams
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Anton S. Petrov
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
- Philip Z. Johnson
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Anne E. Simon
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- David Hoksza
- Department of Software Engineering, Charles University, Prague, 118 00, Czech Republic
- Eric P. Nawrocki
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Patricia P. Chan
- Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA
- Todd M. Lowe
- Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA
- Carlos Eduardo Ribas
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Blake A. Sweeney
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Fábio Madeira
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Stephen Anyango
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Sri Devan Appasamy
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Mandar Deshpande
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Mihaly Varadi
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Sameer Velankar
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Craig L. Zirbel
- Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA
- Fabrice Jossinet
- Faculty of Life Sciences, University of Strasbourg, Strasbourg, 67000, France
19
Fallah A, Havaei SA, Sedighian H, Kachuei R, Fooladi AAI. Prediction of aptamer affinity using an artificial intelligence approach. J Mater Chem B 2024; 12:8825-8842. [PMID: 39158322 DOI: 10.1039/d4tb00909f] [Indexed: 08/20/2024]
Abstract
Aptamers are oligonucleotide sequences that can connect to particular target molecules, similar to monoclonal antibodies. They can be chosen by systematic evolution of ligands by exponential enrichment (SELEX), and are modifiable and can be synthesized. Even if the SELEX approach has been improved a lot, it is frequently challenging and time-consuming to identify aptamers experimentally. In particular, structure-based methods are the most used in computer-aided design and development of aptamers. For this purpose, numerous web-based platforms have been suggested for the purpose of forecasting the secondary structure and 3D configurations of RNAs and DNAs. Also, molecular docking and molecular dynamics (MD), which are commonly utilized in protein compound selection by structural information, are suitable for aptamer selection. On the other hand, from a large number of sequences, artificial intelligence (AI) may be able to quickly discover the possible aptamer candidates. Conversely, sophisticated machine and deep-learning (DL) models have demonstrated efficacy in forecasting the binding properties between ligands and targets during drug discovery; as such, they may provide a reliable and precise method for forecasting the binding of aptamers to targets. This research looks at advancements in AI pipelines and strategies for aptamer binding ability prediction, such as machine and deep learning, as well as structure-based approaches, molecular dynamics and molecular docking simulation methods.
Affiliation(s)
- Arezoo Fallah
- Department of Bacteriology and Virology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Seyed Asghar Havaei
- Department of Microbiology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
- Hamid Sedighian
- Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
- Reza Kachuei
- Molecular Biology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Abbas Ali Imani Fooladi
- Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
20
Bussi G, Bonomi M, Gkeka P, Sattler M, Al-Hashimi HM, Auffinger P, Duca M, Foricher Y, Incarnato D, Jones AN, Kirmizialtin S, Krepl M, Orozco M, Palermo G, Pasquali S, Salmon L, Schwalbe H, Westhof E, Zacharias M. RNA dynamics from experimental and computational approaches. Structure 2024; 32:1281-1287. [PMID: 39241758 DOI: 10.1016/j.str.2024.07.019] [Received: 04/05/2024] [Revised: 06/21/2024] [Accepted: 07/29/2024] [Indexed: 09/09/2024]
Abstract
Conformational dynamics is crucial for the biological function of RNA molecules and for their potential as therapeutic targets. This meeting report outlines key "take-home" messages that emerged from the presentations and discussions during the CECAM workshop "RNA dynamics from experimental and computational approaches" in Paris, June 26-28, 2023.
Affiliation(s)
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy.
- Massimiliano Bonomi
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France.
- Paraskevi Gkeka
- Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France.
- Michael Sattler
- Technical University of Munich, Munich, Germany; Helmholtz Munich, Munich, Germany.
- Hashim M Al-Hashimi
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Pascal Auffinger
- Université de Strasbourg, Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, 2 Allée Konrad Roentgen, 67084 Strasbourg, France
- Maria Duca
- Université Côte d'Azur, CNRS, Institute of Chemistry of Nice, Nice, France
- Yann Foricher
- Integrated Drug Discovery, Small Molecules Medicinal Chemistry, Sanofi, Vitry-sur-Seine, France
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
- Alisha N Jones
- Department of Chemistry, New York University, New York, NY, USA
- Serdal Kirmizialtin
- Department of Chemistry, New York University, New York, NY, USA; Chemistry Program, Science Division, New York University, Abu Dhabi, United Arab Emirates
- Miroslav Krepl
- Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno 612 00, Czech Republic
- Modesto Orozco
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, and Department of Biochemistry and Biomedicine, University of Barcelona, Barcelona, Spain
- Giulia Palermo
- Department of Bioengineering and Department of Chemistry, The University of California, Riverside, Riverside, CA, USA
- Samuela Pasquali
- Laboratoire Biologie Fonctionnelle et Adaptative, CNRS UMR 8251 INSERM ERL 1133, Université Paris Cité, 35 rue Hélène Brion, 75013 Paris, France
- Loïc Salmon
- Centre de RMN à Très Hauts Champs, UMR 5082 (CNRS, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1), University of Lyon, 69100 Villeurbanne, France
- Harald Schwalbe
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance, Goethe-University Frankfurt, 60438 Frankfurt/Main, Germany
- Eric Westhof
- Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 67084 Strasbourg, France
- Martin Zacharias
- Physics Department and Center of Protein Assemblies, Technical University of Munich, Munich, Germany
21
Zhang S, Li J, Chen SJ. Machine learning in RNA structure prediction: Advances and challenges. Biophys J 2024; 123:2647-2657. [PMID: 38297836 PMCID: PMC11393687 DOI: 10.1016/j.bpj.2024.01.026] [Received: 11/29/2023] [Revised: 01/08/2024] [Accepted: 01/24/2024] [Indexed: 02/02/2024]
Abstract
RNA molecules play a crucial role in various biological processes, with their functionality closely tied to their structures. The remarkable advancements in machine learning techniques for protein structure prediction have shown promise in the field of RNA structure prediction. In this perspective, we discuss the advances and challenges encountered in constructing machine learning-based models for RNA structure prediction. We explore topics including model building strategies, specific challenges involved in predicting RNA secondary (2D) and tertiary (3D) structures, and approaches to these challenges. In addition, we highlight the advantages and challenges of constructing RNA language models. Given the rapid advances of machine learning techniques, we anticipate that machine learning-based models will serve as important tools for predicting RNA structures, thereby enriching our understanding of RNA structures and their corresponding functions.
Affiliation(s)
- Sicheng Zhang
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Jun Li
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Shi-Jie Chen
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri; Department of Biochemistry, University of Missouri, Columbia, Missouri.
22
Linzer JT, Aminov E, Abdullah AS, Kirkup CE, Diaz Ventura RI, Bijoor VR, Jung J, Huang S, Tse CG, Álvarez Toucet E, Onghai HP, Ghosh AP, Grodzki AC, Haines ER, Iyer AS, Khalil MK, Leong AP, Neuhaus MA, Park J, Shahid A, Xie M, Ziembicki JM, Simmerling C, Nagan MC. Accurately Modeling RNA Stem-Loops in an Implicit Solvent Environment. J Chem Inf Model 2024; 64:6092-6104. [PMID: 39002142 PMCID: PMC11584990 DOI: 10.1021/acs.jcim.4c00756] [Indexed: 07/15/2024]
Abstract
Ribonucleic acid (RNA) molecules can adopt a variety of secondary and tertiary structures in solution, with stem-loops being one of the more common motifs. Here, we present a systematic analysis of 15 RNA stem-loop sequences simulated with molecular dynamics simulations in an implicit solvent environment. Analysis of RNA cluster ensembles showed that the stem-loop structures can generally adopt the A-form RNA in the stem region. Loop structures are more sensitive, and experimental structures could only be reproduced with modification of CH···O interactions in the force field, combined with an implicit solvent nonpolar correction to better model base stacking interactions. Accurately modeling RNA with current atomistic physics-based models remains challenging, but the RNA systems studied herein may provide a useful benchmark set for testing other RNA modeling methods in the future.
Affiliation(s)
- Jason T Linzer
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Ethan Aminov
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Aalim S Abdullah
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Colleen E Kirkup
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Rebeca I Diaz Ventura
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Vinay R Bijoor
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Jiyun Jung
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Sophie Huang
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Chi Gee Tse
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Emily Álvarez Toucet
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Hugo P Onghai
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Arghya P Ghosh
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Alex C Grodzki
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Emilee R Haines
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Aditya S Iyer
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Mark K Khalil
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Alexander P Leong
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Michael A Neuhaus
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Joseph Park
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Asir Shahid
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Matthew Xie
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Jan M Ziembicki
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Carlos Simmerling
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Maria C Nagan
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
23
Genna V, Reyes-Fraile L, Iglesias-Fernandez J, Orozco M. Nucleic acids in modern molecular therapies: A realm of opportunities for strategic drug design. Curr Opin Struct Biol 2024; 87:102838. [PMID: 38759298 DOI: 10.1016/j.sbi.2024.102838] [Received: 03/04/2024] [Revised: 04/10/2024] [Accepted: 04/23/2024] [Indexed: 05/19/2024]
Abstract
RNA vaccines have made evident to society what was already known by the scientific community: nucleic acids will be the "drugs of the future." By modifying the genome, interfering in transcription or translation, and by introducing new catalysts into the cell or by mimicking antibody effects, nucleic acids can generate therapeutic activities that are not accessible by any other therapeutic agents. There are, however, challenges that need to be solved in the next few years to make nucleic acids usable in a wide range of therapeutic scenarios. This review illustrates how simulation methods can help achieve this goal.
Affiliation(s)
- Vito Genna
- NBD|Nostrum Biodiscovery, Josep Tarradellas 8-10, Barcelona 08019, Spain. https://twitter.com/_VitoGenna_
- Laura Reyes-Fraile
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, Barcelona 08028, Spain; Sixfold Bioscience Ltd, Translational & Innovation Hub, 84 Wood Ln, London W12 0BZ, United Kingdom
- Modesto Orozco
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, Barcelona 08028, Spain; Department of Biochemistry and Biomedicine, University of Barcelona, Barcelona 08028, Spain.
24
Nithin C, Kmiecik S, Błaszczyk R, Nowicka J, Tuszyńska I. Comparative analysis of RNA 3D structure prediction methods: towards enhanced modeling of RNA-ligand interactions. Nucleic Acids Res 2024; 52:7465-7486. [PMID: 38917327 PMCID: PMC11260495 DOI: 10.1093/nar/gkae541] [Received: 04/04/2024] [Revised: 05/23/2024] [Accepted: 06/16/2024] [Indexed: 06/27/2024]
Abstract
Accurate RNA structure models are crucial for designing small molecule ligands that modulate their functions. This study assesses six standalone RNA 3D structure prediction methods (DeepFoldRNA, RhoFold, BRiQ, FARFAR2, SimRNA and Vfold2), excluding web-based tools due to intellectual property concerns. We focus on reproducing the RNA structure existing in RNA-small molecule complexes, particularly on the ability to model ligand binding sites. Using a comprehensive set of RNA structures from the PDB, which includes diverse structural elements, we found that machine learning (ML)-based methods effectively predict global RNA folds but are less accurate with local interactions. Conversely, non-ML-based methods demonstrate higher precision in modeling intramolecular interactions, particularly with secondary structure restraints. Importantly, ligand-binding site accuracy can remain sufficiently high for practical use, even if the overall model quality is not optimal. With the recent release of AlphaFold 3, we included this advanced method in our tests. Benchmark subsets containing new structures, not used in the training of the tested ML methods, show that AlphaFold 3's performance was comparable to other ML-based methods, albeit with some challenges in accurately modeling ligand binding sites. This study underscores the importance of enhancing binding site prediction accuracy and the challenges in modeling RNA-ligand interactions accurately.
Affiliation(s)
- Chandran Nithin
- Molecure SA, 02-089 Warsaw, Poland
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
- Sebastian Kmiecik
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, 02-089 Warsaw, Poland
25
Steffen FD, Cunha RA, Sigel RKO, Börner R. FRET-guided modeling of nucleic acids. Nucleic Acids Res 2024; 52:e59. [PMID: 38869063 PMCID: PMC11260485 DOI: 10.1093/nar/gkae496] [Received: 06/27/2023] [Accepted: 05/29/2024] [Indexed: 06/14/2024]
Abstract
The functional diversity of RNAs is encoded in their innate conformational heterogeneity. The combination of single-molecule spectroscopy and computational modeling offers new attractive opportunities to map structural transitions within nucleic acid ensembles. Here, we describe a framework to harmonize single-molecule Förster resonance energy transfer (FRET) measurements with molecular dynamics simulations and de novo structure prediction. Using either all-atom or implicit fluorophore modeling, we recreate FRET experiments in silico, visualize the underlying structural dynamics and quantify the reaction coordinates. Using multiple accessible-contact volumes as a post hoc scoring method for fragment assembly in Rosetta, we demonstrate that FRET can be used to filter a de novo RNA structure prediction ensemble by refuting models that are not compatible with in vitro FRET measurement. We benchmark our FRET-assisted modeling approach on double-labeled DNA strands and validate it against an intrinsically dynamic manganese(II)-binding riboswitch. We show that a FRET coordinate describing the assembly of a four-way junction allows our pipeline to recapitulate the global fold of the riboswitch displayed by the crystal structure. We conclude that computational fluorescence spectroscopy facilitates the interpretability of dynamic structural ensembles and improves the mechanistic understanding of nucleic acid interactions.
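The idea of using FRET to reject incompatible models can be sketched with the standard Förster relation E = 1 / (1 + (r / R0)^6). This is a deliberately simplified illustration and not the authors' pipeline: it uses a single dye-to-dye distance per model instead of accessible-contact volumes, and the names `fret_efficiency` and `filter_models`, the Förster radius and the tolerance are all hypothetical values chosen for the example.

```python
def fret_efficiency(distance, r0):
    """Förster transfer efficiency for a donor-acceptor distance and Förster radius R0.

    Both arguments must use the same length unit (for example, angstroms).
    """
    if distance < 0 or r0 <= 0:
        raise ValueError("distance must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (distance / r0) ** 6)


def filter_models(model_distances, measured_efficiency, r0, tolerance):
    """Return the names of models whose predicted efficiency matches the measurement.

    model_distances maps a model name to its dye-to-dye distance; a model is kept
    when |E_predicted - E_measured| <= tolerance.
    """
    return [
        name
        for name, distance in model_distances.items()
        if abs(fret_efficiency(distance, r0) - measured_efficiency) <= tolerance
    ]


if __name__ == "__main__":
    # Hypothetical numbers: R0 = 54 A, measured efficiency 0.5, tolerance 0.1.
    candidates = {"model_a": 54.0, "model_b": 80.0}
    print(filter_models(candidates, measured_efficiency=0.5, r0=54.0, tolerance=0.1))
```

Because efficiency falls off with the sixth power of distance, models are discriminated most sharply when their dye separation is close to R0.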
Affiliation(s)
- Fabio D Steffen
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Richard A Cunha
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Roland K O Sigel
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Richard Börner
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
26
Tarafder S, Bhattacharya D. lociPARSE: a locality-aware invariant point attention model for scoring RNA 3D structures. bioRxiv 2024:2023.11.04.565599. [PMID: 37961488 PMCID: PMC10635153 DOI: 10.1101/2023.11.04.565599] [Indexed: 11/15/2023]
Abstract
A scoring function that can reliably assess the accuracy of a 3D RNA structural model in the absence of experimental structure is not only important for model evaluation and selection but also useful for scoring-guided conformational sampling. However, high-fidelity RNA scoring has proven to be difficult using conventional knowledge-based statistical potentials and currently-available machine learning-based approaches. Here we present lociPARSE, a locality-aware invariant point attention architecture for scoring RNA 3D structures. Unlike existing machine learning methods that estimate superposition-based root mean square deviation (RMSD), lociPARSE estimates Local Distance Difference Test (lDDT) scores capturing the accuracy of each nucleotide and its surrounding local atomic environment in a superposition-free manner, before aggregating information to predict global structural accuracy. Tested on multiple datasets including CASP15, lociPARSE significantly outperforms existing statistical potentials (rsRNASP, cgRNASP, DFIRE-RNA, and RASP) and machine learning methods (ARES and RNA3DCNN) across complementary assessment metrics. lociPARSE is freely available at https://github.com/Bhattacharya-Lab/lociPARSE.
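To make the superposition-free lDDT target concrete, here is a minimal sketch of how such a score can be computed when a reference structure is available. It is not lociPARSE, which predicts these values with a neural network and without a reference. The simplifications are assumptions made for the example: one representative point per nucleotide, a 15 Å inclusion radius, and the commonly used 0.5, 1, 2 and 4 Å tolerance thresholds. The function name `lddt` is hypothetical.

```python
import math

# Distance-difference tolerances (in angstroms) averaged over by the score.
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)


def lddt(reference, model, inclusion_radius=15.0):
    """Return (per-nucleotide scores, global score) for two equal-length point lists.

    For every nucleotide i, each partner j within inclusion_radius of i in the
    reference is checked: the reference distance counts as preserved at a
    threshold when the model distance differs from it by less than that threshold.
    No superposition of the two structures is needed.
    """
    if len(reference) != len(model):
        raise ValueError("reference and model must have the same length")
    if not reference:
        raise ValueError("structures must contain at least one point")

    per_site = []
    for i in range(len(reference)):
        preserved = 0
        total = 0
        for j in range(len(reference)):
            if i == j:
                continue
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= inclusion_radius:
                continue  # partner is outside the local environment of i
            diff = abs(d_ref - math.dist(model[i], model[j]))
            total += len(THRESHOLDS)
            preserved += sum(diff < t for t in THRESHOLDS)
        per_site.append(preserved / total if total else 0.0)

    return per_site, sum(per_site) / len(per_site)
```

Because only intra-structure distances are compared, a rigid rotation or translation of the model leaves the score unchanged, which is what distinguishes this family of scores from superposition-based RMSD.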
Collapse
Affiliation(s)
- Sumit Tarafder
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, 24061, USA
|
27
|
Moafinejad SN, de Aquino BRH, Boniecki M, Pandaranadar Jeyeram IN, Nikolaev G, Magnus M, Farsani M, Badepally N, Wirecki T, Stefaniak F, Bujnicki J. SimRNAweb v2.0: a web server for RNA folding simulations and 3D structure modeling, with optional restraints and enhanced analysis of folding trajectories. Nucleic Acids Res 2024; 52:W368-W373. [PMID: 38738621 PMCID: PMC11223799 DOI: 10.1093/nar/gkae356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024] Open
Abstract
Research on ribonucleic acid (RNA) structures and functions benefits from easy-to-use tools for computational prediction and analyses of RNA three-dimensional (3D) structure. The SimRNAweb server version 2.0 offers an enhanced, user-friendly platform for RNA 3D structure prediction and analysis of RNA folding trajectories based on the SimRNA method. SimRNA employs a coarse-grained model, Monte Carlo sampling and statistical potentials to explore RNA conformational space, optionally guided by spatial restraints. Recognized for its accuracy in RNA 3D structure prediction in RNA-Puzzles and CASP competitions, SimRNA is particularly useful for incorporating restraints based on experimental data. The new server version introduces performance optimizations and extends user control over simulations and the processing of results. It allows the application of various hard and soft restraints, accommodating alternative structures involving canonical and noncanonical base pairs and unpaired residues, while also integrating data from chemical probing methods. Enhanced features include an improved analysis of folding trajectories, offering advanced clustering options and multiple analyses of the generated trajectories. These updates provide comprehensive tools for detailed RNA structure analysis. SimRNAweb v2.0 significantly broadens the scope of RNA modeling, emphasizing flexibility and user-defined parameter control. The web server is available at https://genesilico.pl/SimRNAweb.
Affiliation(s)
- S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Belisa R H de Aquino
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Michał J Boniecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Iswarya P N Pandaranadar Jeyeram
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Grigory Nikolaev
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Marcin Magnus
- Department of Molecular and Cellular Biology, Harvard University, 52 Oxford St, Cambridge, MA 02138, USA
- Masoud Amiri Farsani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
|
28
|
Bernard C, Postic G, Ghannay S, Tahi F. State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction. NAR Genom Bioinform 2024; 6:lqae048. [PMID: 38745991 PMCID: PMC11091930 DOI: 10.1093/nargab/lqae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/05/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024] Open
Abstract
RNAs are essential molecules involved in numerous biological functions. Understanding RNA functions requires knowledge of their 3D structures. Computational methods have been developed for over two decades to predict 3D conformations from RNA sequences. These computational methods have been widely used and are usually categorised as either ab initio or template-based. Their performance remains to be improved. Recently, the rise of deep learning has changed the outlook for novel approaches. Deep learning methods are promising, but their adaptation to RNA 3D structure prediction remains difficult. In this paper, we give a brief review of the ab initio, template-based and novel deep learning approaches. We highlight the different available tools and provide a benchmark on nine methods using the RNA-Puzzles dataset. We provide an online dashboard that shows the predictions made by benchmarked methods, freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/state_of_the_rnart/.
Affiliation(s)
- Clément Bernard
- Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
- LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Guillaume Postic
- Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
- LISN - CNRS/Université Paris-Saclay, 91400 Orsay, France
- Fariza Tahi
- Université Paris-Saclay, Univ. Evry, IBISC, 91020 Evry-Courcouronnes, France
|
29
|
Peterson JM, Becker ST, O'Leary CA, Juneja P, Yang Y, Moss WN. Structure of the SARS-CoV-2 Frameshift Stimulatory Element with an Upstream Multibranch Loop. Biochemistry 2024; 63:1287-1296. [PMID: 38727003 DOI: 10.1021/acs.biochem.3c00716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) frameshift stimulatory element (FSE) is necessary for programmed -1 ribosomal frameshifting (-1 PRF) and optimized viral efficacy. The FSE has an abundance of context-dependent alternate conformations, but two of the structures most crucial to -1 PRF are an attenuator hairpin and a three-stem H-type pseudoknot structure. A crystal structure of the pseudoknot alone features three RNA stems in a helically stacked linear structure, whereas a 6.9 Å cryo-EM structure including the upstream heptameric slippery site resulted in a bend between two stems. Our previous research alluded to an extended upstream multibranch loop that includes both the attenuator hairpin and the slippery site, a conformation not previously modeled. We aim to provide further context to the SARS-CoV-2 FSE via computational and medium-resolution cryo-EM approaches, by presenting a 6.1 Å cryo-EM structure featuring a linear pseudoknot structure and a dynamic upstream multibranch loop.
Affiliation(s)
- Jake M Peterson
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Scott T Becker
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Collin A O'Leary
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Puneet Juneja
- Cryo-EM Facility, Iowa State University, Ames, Iowa 50011, United States
- Yang Yang
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Walter N Moss
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
|
30
|
Ramakers J, Blum CF, König S, Harmeling S, Kollmann M. De novo prediction of RNA 3D structures with deep generative models. PLoS One 2024; 19:e0297105. [PMID: 38358972 PMCID: PMC10868834 DOI: 10.1371/journal.pone.0297105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 12/24/2023] [Indexed: 02/17/2024] Open
Abstract
We present a Deep Learning approach to predict 3D folding structures of RNAs from their nucleic acid sequence. Our approach combines an autoregressive Deep Generative Model, Monte Carlo Tree Search, and a score model to find and rank the most likely folding structures for a given RNA sequence. We show that RNA de novo structure prediction by deep learning is possible at atom resolution, despite the low number of experimentally measured structures that can be used for training. We confirm the predictive power of our approach by achieving competitive results in a retrospective evaluation of the RNA-Puzzles prediction challenges, without using structural contact information from multiple sequence alignments or additional data from chemical probing experiments. Blind predictions for recent RNA-Puzzle challenges under the name "Dfold" further support the competitive performance of our approach.
Affiliation(s)
- Julius Ramakers
- Department of Computer Science, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
- Sabrina König
- Department of Computer Science, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
- Stefan Harmeling
- Department of Computer Science, Technical University Dortmund, Dortmund, Germany
- Markus Kollmann
- Department of Computer Science, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
|
31
|
Loyer G, Reinharz V. Concurrent prediction of RNA secondary structures with pseudoknots and local 3D motifs in an integer programming framework. Bioinformatics 2024; 40:btae022. [PMID: 38230755 PMCID: PMC10868335 DOI: 10.1093/bioinformatics/btae022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/30/2023] [Accepted: 01/12/2024] [Indexed: 01/18/2024] Open
Abstract
MOTIVATION: The prediction of RNA structure canonical base pairs from a single sequence, especially pseudoknotted ones, remains challenging in thermodynamic models that approximate the energy of the local 3D motifs joining canonical stems. It has become more and more apparent in recent years that the structural motifs in the loops, composed of noncanonical interactions, are essential for the final shape of the molecule enabling its multiple functions. Our capacity to predict accurate 3D structures is also limited when it comes to the organization of the large intricate network of interactions that form inside those loops. RESULTS: We previously developed the integer programming framework RNA Motifs over Integer Programming (RNAMoIP) to reconcile RNA secondary structure and local 3D motif information available in databases. We further develop our model to now simultaneously predict the canonical base pairs (with pseudoknots) from base pair probability matrices with or without alignment. We benchmarked our new method over all nonredundant RNAs below 150 nucleotides. We show that the joint prediction of canonical base pairs structure and local conserved motifs (i) improves the ratio of well-predicted interactions in the secondary structure, (ii) predicts well canonical and Wobble pairs at the location where motifs are inserted, (iii) is greatly improved with evolutionary information, and (iv) noncanonical motifs at kink-turn locations. AVAILABILITY AND IMPLEMENTATION: The source code of the framework is available at https://gitlab.info.uqam.ca/cbe/RNAMoIP and an interactive web server at https://rnamoip.cbe.uqam.ca/.
Affiliation(s)
- Gabriel Loyer
- Department of Computer Science, Université du Québec à Montréal, Montréal, QC H2X 3Y7, Canada
- Vladimir Reinharz
- Department of Computer Science, Université du Québec à Montréal, Montréal, QC H2X 3Y7, Canada
|
32
|
Thiel BC, Poblete S, Hofacker IL. The Multiscale Ernwin/SPQR RNA Structure Prediction Pipeline. Methods Mol Biol 2024; 2726:377-399. [PMID: 38780739 DOI: 10.1007/978-1-0716-3519-3_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Aside from the well-known role in protein synthesis, RNA can perform catalytic, regulatory, and other essential biological functions which are determined by its three-dimensional structure. In this regard, a great effort has been made during the past decade to develop computational tools for the prediction of the structure of RNAs from the knowledge of their sequence, incorporating experimental data to refine or guide the modeling process. Nevertheless, this task can become exceptionally challenging when dealing with long noncoding RNAs, constituted by more than 200 nucleotides, due to their large size and the specific interactions involved. In this chapter, we describe a multiscale approach to predict such structures, incorporating SAXS experimental data into a hierarchical procedure which couples two coarse-grained representations: Ernwin, a helix-based approach, which deals with the global arrangement of secondary structure elements, and SPQR, a nucleotide-centered coarse-grained model, which corrects and refines the structures predicted at the coarser level. We describe the methodology through its application on the Braveheart long noncoding RNA, starting from the SAXS and secondary structure data to propose a refined, all-atom structure.
Affiliation(s)
- Bernhard C Thiel
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, Austria
- Simón Poblete
- Instituto de Ciencias Físicas y Matemáticas, Universidad Austral de Chile, Valdivia, Chile
- Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
- Ivo L Hofacker
- Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, Austria
- Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
|
33
|
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. Proteins 2023; 91:1747-1770. [PMID: 37876231 PMCID: PMC10841292 DOI: 10.1002/prot.26602] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 04/25/2023] [Revised: 08/21/2023] [Accepted: 09/07/2023] [Indexed: 10/26/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and x-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as noncanonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
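The Z-score rankings mentioned in this abstract can be illustrated with a small Python sketch. It is a toy version under stated assumptions, not the CASP15 assessors' procedure: scores are treated as higher-is-better (a metric such as RMSD would need its sign flipped first), negative Z-scores are floored at 0 before summing, and the group and target names are placeholders. The functions `z_scores` and `rank_groups` are hypothetical names.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise a list of scores; returns zeros if all values are equal."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def rank_groups(scores_by_target, floor=0.0):
    """Rank groups by the sum of their per-target Z-scores.

    scores_by_target: {target: {group: score}}, higher score = better model.
    floor: Z-scores below this value are raised to it (an assumption made
           here so that one poor target does not dominate a group's total).
    """
    totals = {}
    for group_scores in scores_by_target.values():
        groups = list(group_scores)
        zs = z_scores([group_scores[g] for g in groups])
        for group, z in zip(groups, zs):
            totals[group] = totals.get(group, 0.0) + max(z, floor)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder values, not CASP15 data.
    toy = {
        "target_1": {"group_A": 0.9, "group_B": 0.5, "group_C": 0.1},
        "target_2": {"group_A": 0.8, "group_B": 0.6, "group_C": 0.4},
    }
    for group, total in rank_groups(toy):
        print(group, round(total, 3))
```

Standardising within each target before summing keeps easy and hard targets on a comparable scale, so a group is rewarded for doing better than its peers rather than for the raw score alone.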
Affiliation(s)
- Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Biophysics Program, Stanford University School of Medicine, CA USA
- Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
- Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
- Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
- Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
- Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
- Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
|
34
|
Peterson JM, O'Leary CA, Coppenbarger EC, Tompkins VS, Moss WN. Discovery of RNA secondary structural motifs using sequence-ordered thermodynamic stability and comparative sequence analysis. MethodsX 2023; 11:102275. [PMID: 37448951 PMCID: PMC10336498 DOI: 10.1016/j.mex.2023.102275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 06/28/2023] [Indexed: 07/18/2023] Open
Abstract
Major advances in RNA secondary structural motif prediction have been achieved in the last few years; however, few methods harness the predictive power of multiple approaches to deliver in-depth characterizations of local RNA motifs and their potential functionality. Additionally, most available methods do not predict RNA pseudoknots. This work combines complementary bioinformatic systems into one robust discovery pipeline where:
- RNA sequences are folded to search for thermodynamically favorable motifs utilizing ScanFold.
- Motifs are expanded and refolded into alternate pseudoknot conformations by Knotty/Iterative HFold.
- All conformations are evaluated for covariance via the cm-builder pipeline (Infernal and R-scape).
|
35
|
Sarzynska J, Popenda M, Antczak M, Szachniuk M. RNA tertiary structure prediction using RNAComposer in CASP15. Proteins 2023; 91:1790-1799. [PMID: 37615316 DOI: 10.1002/prot.26578] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/30/2023] [Revised: 06/14/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
As CASP15 participants, in the new category of 3D RNA structure prediction, we applied expert modeling with the support of our proprietary system RNAComposer. Although RNAComposer is primarily known as an automated web server, its features allow it to be used interactively, for example, for homology-based modeling or assembling models from user-provided structural elements. In the paper, we present various scenarios of applying the system to predict the 3D RNA structures that we employed. Their combination with expert input, comparative analysis of models, and routines to select representative resultant structures form a ready-for-reuse workflow. With selected examples, we demonstrate its application for the in silico modeling of natural and synthetic RNA molecules targeted in CASP15.
Affiliation(s)
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Maciej Antczak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
|
36
|
Kretsch RC, Andersen ES, Bujnicki JM, Chiu W, Das R, Luo B, Masquida B, McRae EK, Schroeder GM, Su Z, Wedekind JE, Xu L, Zhang K, Zheludev IN, Moult J, Kryshtafovych A. RNA target highlights in CASP15: Evaluation of predicted models by structure providers. Proteins 2023; 91:1600-1615. [PMID: 37466021 PMCID: PMC10792523 DOI: 10.1002/prot.26550] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 04/27/2023] [Revised: 06/16/2023] [Accepted: 06/26/2023] [Indexed: 07/20/2023]
Abstract
The first RNA category of the Critical Assessment of Techniques for Structure Prediction competition was only made possible because of the scientists who provided experimental structures to challenge the predictors. In this article, these scientists offer a unique and valuable analysis of both the successes and areas for improvement in the predicted models. All 10 RNA-only targets yielded predictions topologically similar to experimentally determined structures. For one target, experimentalists were able to phase their x-ray diffraction data by molecular replacement, showing a potential application of structure predictions for RNA structural biologists. Recommended areas for improvement include enhancing the accuracy of local interaction predictions and giving more consideration to experimental conditions such as multimerization, structure determination method, and time along folding pathways. The prediction of RNA-protein complexes remains the most significant challenge. Finally, given the intrinsic flexibility of many RNAs, we propose the consideration of ensemble models.
Affiliation(s)
- Rachael C. Kretsch
- Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
- Ebbe S. Andersen
- Interdisciplinary Nanoscience Center and Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Janusz M. Bujnicki
- International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Wah Chiu
- Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
- Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
- Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Rhiju Das
- Biophysics Program, Stanford University School of Medicine, Stanford, CA, USA
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford, CA, USA
- Bingnan Luo
- The State Key Laboratory of Biotherapy, Frontiers Medical Center of Tianfu Jincheng Laboratory, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610044, Sichuan, China
- Benoît Masquida
- UMR 7156, CNRS – Universite de Strasbourg, Strasbourg, France
- Ewan K.S. McRae
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX 77030, USA
- Griffin M. Schroeder
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Zhaoming Su
- The State Key Laboratory of Biotherapy, Frontiers Medical Center of Tianfu Jincheng Laboratory, Department of Geriatrics and National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610044, Sichuan, China
- Joseph E. Wedekind
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY, 14642, USA
- Lily Xu
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Kaiming Zhang
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230027, China
- Ivan N. Zheludev
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- John Moult
- Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
|
37
|
Baulin EF, Mukherjee S, Moafinejad SN, Wirecki TK, Badepally NG, Jaryani F, Stefaniak F, Amiri Farsani M, Ray A, Rocha de Moura T, Bujnicki JM. RNA tertiary structure prediction in CASP15 by the GeneSilico group: Folding simulations based on statistical potentials and spatial restraints. Proteins 2023; 91:1800-1810. [PMID: 37622458 DOI: 10.1002/prot.26575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 07/06/2023] [Accepted: 07/31/2023] [Indexed: 08/26/2023]
Abstract
Ribonucleic acid (RNA) molecules serve as master regulators of cells by encoding their biological function in the ribonucleotide sequence, particularly their ability to interact with other molecules. To understand how RNA molecules perform their biological tasks and to design new sequences with specific functions, it is of great benefit to be able to computationally predict how RNA folds and interacts in the cellular environment. Our workflow for computational modeling of the 3D structures of RNA and its interactions with other molecules uses a set of methods developed in our laboratory, including MeSSPredRNA for predicting canonical and non-canonical base pairs, PARNASSUS for detecting remote homology based on comparisons of sequences and secondary structures, ModeRNA for comparative modeling, the SimRNA family of programs for modeling RNA 3D structure and its complexes with other molecules, and QRNAS for model refinement. In this study, we present the results of testing this workflow in predicting RNA 3D structures in the CASP15 experiment. The overall high score of the computational models predicted by our group demonstrates the robustness of our workflow and its individual components in terms of predicting RNA 3D structures of acceptable quality that are close to the target structures. However, the variance in prediction quality is still quite high, and the results are still too far from the level of protein 3D structure predictions. This exercise led us to consider several improvements, especially to better predict and enforce stacking interactions and non-canonical base pairs.
Affiliation(s)
- Eugene F Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Masoud Amiri Farsani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Angana Ray
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Tales Rocha de Moura
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
|
38
|
Wang W, Feng C, Han R, Wang Z, Ye L, Du Z, Wei H, Zhang F, Peng Z, Yang J. trRosettaRNA: automated prediction of RNA 3D structure with transformer network. Nat Commun 2023; 14:7266. [PMID: 37945552 PMCID: PMC10636060 DOI: 10.1038/s41467-023-42528-4] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 06/08/2023] [Accepted: 10/13/2023] [Indexed: 11/12/2023] Open
Abstract
RNA 3D structure prediction is a long-standing challenge. Inspired by the recent breakthrough in protein structure prediction, we developed trRosettaRNA, an automated deep learning-based approach to RNA 3D structure prediction. The trRosettaRNA pipeline comprises two major steps: 1D and 2D geometries prediction by a transformer network; and 3D structure folding by energy minimization. Benchmark tests suggest that trRosettaRNA outperforms traditional automated methods. In the blind tests of the 15th Critical Assessment of Structure Prediction (CASP15) and the RNA-Puzzles experiments, the automated trRosettaRNA predictions for the natural RNAs are competitive with the top human predictions. trRosettaRNA also outperforms other deep learning-based methods in CASP15 when measured by the Z-score of the Root-Mean-Square Deviation. Nevertheless, it remains challenging to predict accurate structures for synthetic RNAs with an automated approach. We hope this work could be a good start toward solving the hard problem of RNA structure prediction with deep learning.
Affiliation(s)
- Wenkai Wang
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Chenjie Feng
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
  - School of Science, Ningxia Medical University, Yinchuan, 750004, China
- Renmin Han
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Ziyi Wang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Lisha Ye
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Zongyang Du
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Hong Wei
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Zhenling Peng
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Jianyi Yang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
39
Zhang S, Liu Y, Xie L. A universal framework for accurate and efficient geometric deep learning of molecular systems. Sci Rep 2023; 13:19171. [PMID: 37932352 PMCID: PMC10628308 DOI: 10.1038/s41598-023-46382-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 10/31/2023] [Indexed: 11/08/2023]
Abstract
Molecular sciences address a wide range of problems involving molecules of different types and sizes and their complexes. Recently, geometric deep learning, especially Graph Neural Networks, has shown promising performance in molecular science applications. However, most existing works often impose targeted inductive biases to a specific molecular system, and are inefficient when applied to macromolecules or large-scale tasks, thereby limiting their applications to many real-world problems. To address these challenges, we present PAMNet, a universal framework for accurately and efficiently learning the representations of three-dimensional (3D) molecules of varying sizes and types in any molecular system. Inspired by molecular mechanics, PAMNet induces a physics-informed bias to explicitly model local and non-local interactions and their combined effects. As a result, PAMNet can reduce expensive operations, making it time and memory efficient. In extensive benchmark studies, PAMNet outperforms state-of-the-art baselines regarding both accuracy and efficiency in three diverse learning tasks: small molecule properties, RNA 3D structures, and protein-ligand binding affinities. Our results highlight the potential for PAMNet in a broad range of molecular science applications.
Affiliation(s)
- Shuo Zhang
  - Department of Computer Science, Hunter College, The City University of New York, New York, 10065, USA
  - Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, Cornell University, New York, 10065, USA
- Yang Liu
  - Department of Computer Science, Hunter College, The City University of New York, New York, 10065, USA
- Lei Xie
  - Department of Computer Science, Hunter College, The City University of New York, New York, 10065, USA
  - Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, Cornell University, New York, 10065, USA
  - Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, 10016, USA
40
Malhotra S, Mulvaney T, Cragnolini T, Sidhu H, Joseph A, Beton J, Topf M. RIBFIND2: Identifying rigid bodies in protein and nucleic acid structures. Nucleic Acids Res 2023; 51:9567-9575. [PMID: 37670532 PMCID: PMC10570027 DOI: 10.1093/nar/gkad721] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/06/2022] [Revised: 08/10/2023] [Accepted: 08/21/2023] [Indexed: 09/07/2023]
Abstract
Molecular structures are often fitted into cryo-EM maps by flexible fitting. When this requires large conformational changes, identifying rigid bodies can help optimize the model-map fit. Tools for identifying rigid bodies in protein structures exist; however, an equivalent for nucleic acid structures is lacking. With the increase in cryo-EM maps containing RNA and progress in RNA structure prediction, there is a need for such tools. We previously developed RIBFIND, a program for clustering protein secondary structures into rigid bodies. In RIBFIND2, this approach is extended to nucleic acid structures. RIBFIND2 can identify biologically relevant rigid bodies in important groups of complex RNA structures, capturing a wide range of dynamics, including large rigid-body movements. The usefulness of RIBFIND2-assigned rigid bodies in cryo-EM model refinement was demonstrated on three examples, with two conformations each: Group II Intron complexed with IEP, Internal Ribosome Entry Site and the Processome, using cryo-EM maps at 2.7-5 Å resolution. A hierarchical refinement approach, performed on progressively smaller sets of RIBFIND2 rigid bodies, was clearly shown to have an advantage over classical all-atom refinement. RIBFIND2 is available via a web server with structure visualization and as a standalone tool.
Affiliation(s)
- Sony Malhotra
  - Science and Technology Facilities Council, Scientific Computing, Research Complex at Harwell, Didcot OX11 0FA, UK
- Thomas Mulvaney
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
  - Universitätsklinikum Hamburg Eppendorf (UKE), Hamburg 20246, Germany
- Tristan Cragnolini
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK
- Haneesh Sidhu
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK
- Agnel P Joseph
  - Science and Technology Facilities Council, Scientific Computing, Research Complex at Harwell, Didcot OX11 0FA, UK
- Joseph G Beton
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
- Maya Topf
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
  - Universitätsklinikum Hamburg Eppendorf (UKE), Hamburg 20246, Germany
41
Schneider B, Sweeney BA, Bateman A, Cerny J, Zok T, Szachniuk M. When will RNA get its AlphaFold moment? Nucleic Acids Res 2023; 51:9522-9532. [PMID: 37702120 PMCID: PMC10570031 DOI: 10.1093/nar/gkad726] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 05/10/2023] [Revised: 08/13/2023] [Accepted: 08/22/2023] [Indexed: 09/14/2023]
Abstract
The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that there are challenges preventing the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the challenges are the limited number of structures and alignments making data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, as they are often of insufficient quality, highly biased and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but it will require solving several data quality and volume issues, usage of data beyond simple sequence alignments, or the development of new less data-hungry machine learning methods.
Affiliation(s)
- Bohdan Schneider
  - Institute of Biotechnology of the Czech Academy of Sciences, Prumyslova 595, CZ-252 50 Vestec, Czech Republic
- Blake Alexander Sweeney
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Jiri Cerny
  - Institute of Biotechnology of the Czech Academy of Sciences, Prumyslova 595, CZ-252 50 Vestec, Czech Republic
- Tomasz Zok
  - Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Marta Szachniuk
  - Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
42
Das R, Kretsch RC, Simpkin AJ, Mulvaney T, Pham P, Rangan R, Bu F, Keegan RM, Topf M, Rigden DJ, Miao Z, Westhof E. Assessment of three-dimensional RNA structure prediction in CASP15. bioRxiv 2023:2023.04.25.538330. [PMID: 37162955 PMCID: PMC10168427 DOI: 10.1101/2023.04.25.538330] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/11/2023]
Abstract
The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than these top ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and X-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as non-canonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
Affiliation(s)
- Rhiju Das
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
  - Biophysics Program, Stanford University School of Medicine, CA USA
  - Howard Hughes Medical Institute, Stanford University, CA USA
- Adam J. Simpkin
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Thomas Mulvaney
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Phillip Pham
  - Department of Biochemistry, Stanford University School of Medicine, CA USA
- Ramya Rangan
  - Biophysics Program, Stanford University School of Medicine, CA USA
- Fan Bu
  - Guangzhou Laboratory, Guangzhou International Bio Island, Guangzhou 510005, China
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230036, Anhui, China
- Ronan M. Keegan
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
  - Life Science, Diamond Light Source, Harwell Science, UK
- Maya Topf
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV)
  - University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Daniel J. Rigden
  - Institute of Systems, Molecular & Integrative Biology, The University of Liverpool, UK
- Zhichao Miao
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University
  - Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People’s Hospital, School of Medicine, Tongji University, Shanghai 200434, China
- Eric Westhof
  - Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, F-67084, Strasbourg, France
43
Li Y, Zhang C, Feng C, Pearce R, Freddolino PL, Zhang Y. Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction. Nat Commun 2023; 14:5745. [PMID: 37717036 PMCID: PMC10505173 DOI: 10.1038/s41467-023-41303-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/18/2023] [Accepted: 08/22/2023] [Indexed: 09/18/2023]
Abstract
RNAs are fundamental in living cells and perform critical functions determined by their tertiary architectures. However, accurate modeling of 3D RNA structure remains a challenging problem. We present a novel method, DRfold, to predict RNA tertiary structures by simultaneous learning of local frame rotations and geometric restraints from experimentally solved RNA structures, where the learned knowledge is converted into a hybrid energy potential to guide RNA structure assembly. The method significantly outperforms previous approaches by >73.3% in TM-score on a sequence-nonredundant dataset containing recently released structures. Detailed analyses showed that the major contribution to the improvements arises from the deep end-to-end learning supervised with the atom coordinates and the composite energy function integrating complementary information from geometry restraints and end-to-end learning models. The open-source DRfold program with fast training protocol allows large-scale application of high-resolution RNA structure modeling and can be further improved with future expansion of RNA structure databases.
Affiliation(s)
- Yang Li
  - Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore, Singapore
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT, 06511, USA
- Chenjie Feng
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - School of Science, Ningxia Medical University, Yinchuan, 750004, China
- Robin Pearce
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computer Science, School of Computing, National University of Singapore, 117417, Singapore, Singapore
- P Lydia Freddolino
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Yang Zhang
  - Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore, Singapore
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computer Science, School of Computing, National University of Singapore, 117417, Singapore, Singapore
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 117596, Singapore, Singapore
44
Taubert O, von der Lehr F, Bazarova A, Faber C, Knechtges P, Weiel M, Debus C, Coquelin D, Basermann A, Streit A, Kesselheim S, Götz M, Schug A. RNA contact prediction by data efficient deep learning. Commun Biol 2023; 6:913. [PMID: 37674020 PMCID: PMC10482910 DOI: 10.1038/s42003-023-05244-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 08/14/2023] [Indexed: 09/08/2023]
Abstract
On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatial adjacencies ("contact maps") as a proxy for 3D structure. Our model, BARNACLE, combines the utilization of unlabeled data through self-supervised pre-training and efficient use of the sparse labeled data through an XGBoost classifier. BARNACLE shows a considerable improvement over both the established classical baseline and a deep neural network. In order to demonstrate that our approach can be applied to tasks with similar data constraints, we show that our findings generalize to the related setting of accessible surface area prediction.
Affiliation(s)
- Oskar Taubert
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
- Fabrice von der Lehr
  - Institute for Software Technology (SC), German Aerospace Centre (DLR), 51147, Köln, Germany
- Alina Bazarova
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428, Jülich, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Christian Faber
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428, Jülich, Germany
- Philipp Knechtges
  - Institute for Software Technology (SC), German Aerospace Centre (DLR), 51147, Köln, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Marie Weiel
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Charlotte Debus
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Daniel Coquelin
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Achim Basermann
  - Institute for Software Technology (SC), German Aerospace Centre (DLR), 51147, Köln, Germany
- Achim Streit
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
- Stefan Kesselheim
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428, Jülich, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Markus Götz
  - Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
  - Helmholtz AI, 81675, Munich, Germany
- Alexander Schug
  - Jülich Supercomputing Centre, Forschungszentrum Jülich, 52428, Jülich, Germany
  - Faculty of Biology, University of Duisburg-Essen, 45117, Essen, Germany
45
Chojnowski G, Zaborowski R, Magnus M, Mukherjee S, Bujnicki JM. RNA 3D structure modeling by fragment assembly with small-angle X-ray scattering restraints. Bioinformatics 2023; 39:btad527. [PMID: 37647627 PMCID: PMC10474949 DOI: 10.1093/bioinformatics/btad527] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/21/2023] [Revised: 07/14/2023] [Accepted: 08/28/2023] [Indexed: 09/01/2023]
Abstract
SUMMARY: Structure determination is a key step in the functional characterization of many non-coding RNA molecules. High-resolution RNA 3D structure determination efforts, however, are not keeping up with the pace of discovery of new non-coding RNA sequences. This increases the importance of computational approaches and low-resolution experimental data, such as those from small-angle X-ray scattering experiments. We present RNA Masonry, a computer program and a web service for fully automated modeling of RNA 3D structures. It assembles RNA fragments into geometrically plausible models that meet user-provided secondary structure constraints, restraints on tertiary contacts, and small-angle X-ray scattering data. We illustrate the method description with detailed benchmarks and its application to structural studies of viral RNAs with SAXS restraints. AVAILABILITY AND IMPLEMENTATION: The program web server is available at http://iimcb.genesilico.pl/rnamasonry. The source code is available at https://gitlab.com/gchojnowski/rnamasonry.
Affiliation(s)
- Grzegorz Chojnowski
  - International Institute of Molecular and Cell Biology, Warsaw 02-109, Poland
  - European Molecular Biology Laboratory, Hamburg Unit, Hamburg 22607, Germany
- Rafał Zaborowski
  - International Institute of Molecular and Cell Biology, Warsaw 02-109, Poland
- Marcin Magnus
  - ReMedy International Research Agenda Unit, IMol Polish Academy of Sciences, Warsaw, Poland
- Sunandan Mukherjee
  - International Institute of Molecular and Cell Biology, Warsaw 02-109, Poland
- Janusz M Bujnicki
  - International Institute of Molecular and Cell Biology, Warsaw 02-109, Poland
46
Deng J, Fang X, Huang L, Li S, Xu L, Ye K, Zhang J, Zhang K, Zhang QC. RNA structure determination: From 2D to 3D. Fundamental Research 2023; 3:727-737. [PMID: 38933295 PMCID: PMC11197651 DOI: 10.1016/j.fmre.2023.06.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/19/2023] [Revised: 06/04/2023] [Accepted: 06/05/2023] [Indexed: 06/28/2024]
Abstract
RNA molecules serve a wide range of functions that are closely linked to their structures. The basic structural units of RNA consist of single- and double-stranded regions. In order to carry out advanced functions such as catalysis and ligand binding, certain types of RNAs can adopt higher-order structures. The analysis of RNA structures has progressed alongside advancements in structural biology techniques, but it comes with its own set of challenges and corresponding solutions. In this review, we will discuss recent advances in RNA structure analysis techniques, including structural probing methods, X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and small-angle X-ray scattering. Often, a combination of multiple techniques is employed for the integrated analysis of RNA structures. We also survey important RNA structures that have been recently determined using various techniques.
Affiliation(s)
- Jie Deng
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou 510120, China
- Xianyang Fang
  - Beijing Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Key Laboratory of RNA Biology, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Lin Huang
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou 510120, China
- Shanshan Li
  - MOE Key Laboratory for Cellular Dynamics and Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230027, China
- Lilei Xu
  - Beijing Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Keqiong Ye
  - Key Laboratory of RNA Biology, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jinsong Zhang
  - MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
- Kaiming Zhang
  - MOE Key Laboratory for Cellular Dynamics and Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230027, China
- Qiangfeng Cliff Zhang
  - MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
47
Wang X, Yu S, Lou E, Tan YL, Tan ZJ. RNA 3D Structure Prediction: Progress and Perspective. Molecules 2023; 28:5532. [PMID: 37513407 PMCID: PMC10386116 DOI: 10.3390/molecules28145532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/05/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023]
Abstract
Ribonucleic acid (RNA) molecules play vital roles in numerous important biological functions such as catalysis and gene regulation. The functions of RNAs are strongly coupled to their structures or proper structure changes, and RNA structure prediction has received much attention in the last two decades. Some computational models have been developed to predict RNA three-dimensional (3D) structures in silico, and these models generally consist of three steps: predicting an RNA 3D structure ensemble, identifying near-native RNAs within the structure ensemble, and refining the identified RNAs. In this review, we give a comprehensive overview of the recent advances in RNA 3D structure modeling, including structure ensemble prediction, evaluation, and refinement. Finally, we emphasize some insights and perspectives in modeling RNA 3D structures.
Affiliation(s)
- Xunxun Wang
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Shixiong Yu
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- En Lou
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Ya-Lan Tan
  - School of Bioengineering and Health, Wuhan Textile University, Wuhan 430200, China
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, China
- Zhi-Jie Tan
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
48
Wu KE, Zou JY, Chang H. Machine learning modeling of RNA structures: methods, challenges and future perspectives. Brief Bioinform 2023; 24:bbad210. [PMID: 37280185 DOI: 10.1093/bib/bbad210] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2023] [Revised: 05/12/2023] [Accepted: 05/17/2023] [Indexed: 06/08/2023]
Abstract
The three-dimensional structure of RNA molecules plays a critical role in a wide range of cellular processes encompassing functions from riboswitches to epigenetic regulation. These RNA structures are incredibly dynamic and can indeed be described aptly as an ensemble of structures that shifts in distribution depending on different cellular conditions. Thus, the computational prediction of RNA structure poses a unique challenge, even as computational protein folding has seen great advances. In this review, we focus on a variety of machine learning-based methods that have been developed to predict RNA molecules' secondary structure, as well as more complex tertiary structures. We survey commonly used modeling strategies, and how many are inspired by or incorporate thermodynamic principles. We discuss the shortcomings that various design decisions entail and propose future directions that could build off these methods to yield more robust, accurate RNA structure predictions.
Affiliation(s)
- Kevin E Wu
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- James Y Zou
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Howard Chang
- Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
49
Watson ZL, Knudson IJ, Ward FR, Miller SJ, Cate JHD, Schepartz A, Abramyan AM. Atomistic simulations of the Escherichia coli ribosome provide selection criteria for translationally active substrates. Nat Chem 2023; 15:913-921. [PMID: 37308707 PMCID: PMC10322701 DOI: 10.1038/s41557-023-01226-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/01/2022] [Accepted: 04/28/2023] [Indexed: 06/14/2023]
Abstract
As genetic code expansion advances beyond L-α-amino acids to backbone modifications and new polymerization chemistries, delineating what substrates the ribosome can accommodate remains a challenge. The Escherichia coli ribosome tolerates non-L-α-amino acids in vitro, but few structural insights that explain how are available, and the boundary conditions for efficient bond formation are so far unknown. Here we determine a high-resolution cryogenic electron microscopy structure of the E. coli ribosome containing α-amino acid monomers and use metadynamics simulations to define energy surface minima and understand incorporation efficiencies. Reactive monomers across diverse structural classes favour a conformational space where the aminoacyl-tRNA nucleophile is <4 Å from the peptidyl-tRNA carbonyl with a Bürgi-Dunitz angle of 76-115°. Monomers with free energy minima that fall outside this conformational space do not react efficiently. This insight should accelerate the in vivo and in vitro ribosomal synthesis of sequence-defined, non-peptide heterooligomers.
Affiliation(s)
- Zoe L Watson
- Department of Chemistry, University of California, Berkeley, CA, USA
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
- Isaac J Knudson
- Department of Chemistry, University of California, Berkeley, CA, USA
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- Fred R Ward
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- Department of Molecular and Cellular Biology, University of California, Berkeley, CA, USA
- Scott J Miller
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- Department of Chemistry, Yale University, New Haven, CT, USA
- Jamie H D Cate
- Department of Chemistry, University of California, Berkeley, CA, USA
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- Department of Molecular and Cellular Biology, University of California, Berkeley, CA, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Alanna Schepartz
- Department of Chemistry, University of California, Berkeley, CA, USA
- Center for Genetically Encoded Materials, University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
- Department of Molecular and Cellular Biology, University of California, Berkeley, CA, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
50
Zhang D, Qiao L, Lei X, Dong X, Tong Y, Wang J, Wang Z, Zhou R. Mutagenesis and structural studies reveal the basis for the specific binding of SARS-CoV-2 SL3 RNA element with human TIA1 protein. Nat Commun 2023; 14:3715. [PMID: 37349329 PMCID: PMC10287707 DOI: 10.1038/s41467-023-39410-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/04/2022] [Accepted: 06/12/2023] [Indexed: 06/24/2023] Open
Abstract
Viral RNA-host protein interactions are indispensable during RNA virus transcription and replication, but their detailed structural and dynamical features remain largely elusive. Here, we characterize the binding interface for the SARS-CoV-2 stem-loop 3 (SL3) cis-acting element to human TIA1 protein with combined theoretical and experimental approaches. The highly structured SARS-CoV-2 SL3 has a high binding affinity to TIA1 protein, in which aromatic stacking, hydrogen bonds, and hydrophobic interactions collectively direct this specific binding. Further mutagenesis studies validate our proposed 3D binding model and reveal that two SL3 variants have enhanced binding affinities to TIA1. Disruptions of the identified RNA-protein interactions with designed antisense oligonucleotides dramatically reduce SARS-CoV-2 infection in cells. Finally, TIA1 protein could interact with conserved SL3 RNA elements within other betacoronavirus lineages. These findings open an avenue to explore the viral RNA-host protein interactions and provide a pioneering structural basis for RNA-targeting antiviral drug design.
Affiliation(s)
- Dong Zhang
- Institute of Quantitative Biology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Lulu Qiao
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Xiaobo Lei
- NHC Key Laboratory of Systems Biology of Pathogens and Christophe Mérieux Laboratory, Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100730, China
- Xiaojing Dong
- NHC Key Laboratory of Systems Biology of Pathogens and Christophe Mérieux Laboratory, Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100730, China
- Yunguang Tong
- College of Life Sciences, China Jiliang University, Hangzhou, Zhejiang, 310018, China
- Department of Pharmacy, China Jiliang University, Hangzhou, Zhejiang, 310018, China
- Jianwei Wang
- NHC Key Laboratory of Systems Biology of Pathogens and Christophe Mérieux Laboratory, Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100730, China
- Zhiye Wang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Ruhong Zhou
- Institute of Quantitative Biology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310058, China