1
Huang Y, Zhang Z, Zou Z, Zhang L, Chen Y, Wan J, Zhu Z, Yu S, Zuo H, Lin YCD, Huang HY, Huang HD. RegRNA 3.0: expanding regulatory RNA analysis with new features for motif, interaction, and annotation. Nucleic Acids Res 2025:gkaf405. [PMID: 40396374 DOI: 10.1093/nar/gkaf405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2025] [Revised: 04/17/2025] [Accepted: 05/19/2025] [Indexed: 05/22/2025] Open
Abstract
Functional RNA molecules are crucial for biological processes from gene regulation to protein synthesis, and analyzing functional motifs and elements is essential for understanding RNA regulation. Building on RegRNA 1.0 and 2.0, we present RegRNA 3.0, a sophisticated meta-workflow that integrates 26 computational tools and 28 databases for annotation, enabling one-step and customizable RNA motif predictions. RegRNA streamlines multi-step analysis and enhances result interpretation with interactive visualizations and comprehensive reporting tools. When provided with an RNA sequence, RegRNA 3.0 generates predictions for RNA functional motifs, RNA interaction motifs, and comprehensive RNA annotations. Specifically, RNA functional motifs include core promoter elements, RNA decay, G-quadruplex, and 14 previous types. RNA interaction motifs include newly added RNA-ligand interactions and RNA-binding protein predictions, along with three previous types. RNA annotation includes RNA family classification, blood exosomes RNA, subcellular localizations, A-to-I editing events, modifications, and 3D structures, along with four previously supported features. RegRNA 3.0 accelerates gene regulation and RNA biology discoveries by offering a user-friendly platform for identifying and analyzing RNA motifs and interactions. The web interface has been improved for intuitive visualizations of predicted motifs and structures, with flexible download options in multiple formats. It is available at http://awi.cuhk.edu.cn/~RegRNA/.
Affiliation(s)
- Yixian Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Zhiyong Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Zhengkai Zou
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Lingquan Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Yigang Chen
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Jingting Wan
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Zihao Zhu
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Sicong Yu
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Huali Zuo
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Yang-Chi-Dung Lin
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Hsi-Yuan Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Hsien-Da Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen, Guangdong 518172, China
- Department of Endocrinology, Key Laboratory of Endocrinology of National Health Commission, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, PR China
2
Vallet T, Vignuzzi M. Self-Amplifying RNA: Advantages and Challenges of a Versatile Platform for Vaccine Development. Viruses 2025; 17:566. [PMID: 40285008 PMCID: PMC12031284 DOI: 10.3390/v17040566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2025] [Revised: 04/09/2025] [Accepted: 04/10/2025] [Indexed: 04/29/2025] Open
Abstract
Self-amplifying RNA (saRNA) is a synthetic nucleic acid engineered to replicate within cells without generating viral particles. Derived from alphavirus genomes, saRNA retains the non-structural elements essential for replication while replacing the structural elements with an antigen of interest. By enabling efficient intracellular amplification, saRNA offers a promising alternative to conventional mRNA vaccines, enhancing antigen expression while requiring lower doses. However, this advantage comes with challenges. In this review, we highlight the key limitations of saRNA technology and explore potential strategies to overcome them. By identifying these challenges, we aim to provide insights that can guide the future design of saRNA-based therapeutics, extending their potential beyond vaccine applications.
Affiliation(s)
- Thomas Vallet
- A*STAR Infectious Diseases Labs (A*IDL), Agency for Science, Technology and Research (A*STAR), Singapore 138634, Singapore
- Infectious Diseases Translational Research Programme, Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 118420, Singapore
- Marco Vignuzzi
- A*STAR Infectious Diseases Labs (A*IDL), Agency for Science, Technology and Research (A*STAR), Singapore 138634, Singapore
- Infectious Diseases Translational Research Programme, Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 118420, Singapore
3
Pan S, Wang H, Zhang H, Tang Z, Xu L, Yan Z, Hu Y. UTR-Insight: integrating deep learning for efficient 5' UTR discovery and design. BMC Genomics 2025; 26:107. [PMID: 39905334 PMCID: PMC11796101 DOI: 10.1186/s12864-025-11269-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2024] [Accepted: 01/21/2025] [Indexed: 02/06/2025] Open
Abstract
The 5' UTR is critical for mRNA stability and translation efficiency in therapeutics. We developed UTR-Insight, a model integrating a pretrained language model with a CNN-Transformer architecture, explaining 89.1% of the mean ribosome load (MRL) variation in random 5' UTRs and 82.8% in endogenous 5' UTRs, surpassing existing models. Using UTR-Insight, we performed high-throughput in silico screening of hundreds of thousands of endogenous 5' UTRs from primates, mice, and viruses. The screened sequences increased protein expression by up to 319% compared to the human α-globin 5' UTR, and UTR-Insight-designed sequences achieved even greater expression levels than high-performing endogenous 5' UTRs.
Affiliation(s)
- Saichao Pan
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Hanyu Wang
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Hang Zhang
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Zan Tang
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Lianqiang Xu
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Zhixiang Yan
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
- Yong Hu
- Shenzhen Rhegen Biotechnology Co. Ltd, Shenzhen, Guangdong, China
4
Khorshid Sokhangouy S, Behzadi M, Rezaei S, Farjami M, Haghshenas M, Sefidbakht Y, Mozaffari-Jovin S. mRNA Vaccines: Design Principles, Mechanisms, and Manufacturing-Insights From COVID-19 as a Model for Combating Infectious Diseases. Biotechnol J 2025; 20:e202400596. [PMID: 39989260 DOI: 10.1002/biot.202400596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 01/24/2025] [Accepted: 02/09/2025] [Indexed: 02/25/2025]
Abstract
The full approval of two SARS-CoV-2 mRNA vaccines, Comirnaty and Spikevax, has greatly accelerated the development of numerous mRNA vaccine candidates targeting infectious diseases and cancer. mRNA vaccines provide a rapid, safe, and versatile manufacturing process while eliciting strong humoral and cellular immune responses, making them particularly beneficial for addressing emerging pandemics. Recent advancements in modified nucleotides and lipid nanoparticle delivery systems have further emphasized the potential of this vaccine platform. Despite these transformative opportunities, significant improvements are needed to enhance vaccine efficacy, stability, and immunogenicity. This review outlines the fundamentals of mRNA vaccine design, the manufacturing process, and administration strategies, along with various optimization approaches. It also offers a comprehensive overview of the mRNA vaccine candidates developed since the onset of the COVID-19 pandemic, the challenges posed by emerging SARS-CoV-2 variants, and current strategies to address these variants. Finally, we discuss the potential of broad-spectrum and combined mRNA vaccines and examine the challenges and future prospects of the mRNA vaccine platform.
Affiliation(s)
- Saeideh Khorshid Sokhangouy
- Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Department of Medical Biotechnology, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Matine Behzadi
- Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Shokuh Rezaei
- Protein Research Center, Shahid Beheshti University, Tehran, Iran
- Mahsa Farjami
- Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Maryam Haghshenas
- Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Yahya Sefidbakht
- Protein Research Center, Shahid Beheshti University, Tehran, Iran
- Sina Mozaffari-Jovin
- Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
5
Asim MN, Ibrahim MA, Asif T, Dengel A. RNA sequence analysis landscape: A comprehensive review of task types, databases, datasets, word embedding methods, and language models. Heliyon 2025; 11:e41488. [PMID: 39897847 PMCID: PMC11783440 DOI: 10.1016/j.heliyon.2024.e41488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2024] [Revised: 12/23/2024] [Accepted: 12/24/2024] [Indexed: 02/04/2025] Open
Abstract
Deciphering information of RNA sequences reveals their diverse roles in living organisms, including gene regulation and protein synthesis. Aberrations in RNA sequence such as dysregulation and mutations can drive a diverse spectrum of diseases including cancers, genetic disorders, and neurodegenerative conditions. Furthermore, researchers are harnessing RNA's therapeutic potential for transforming traditional treatment paradigms into personalized therapies through the development of RNA-based drugs and gene therapies. To gain insights into biological functions, to detect diseases at early stages, and to develop potent therapeutics, researchers are performing diverse types of RNA sequence analysis tasks. RNA sequence analysis through conventional wet-lab methods is expensive, time-consuming, and error-prone. To enable large-scale RNA sequence analysis, empowerment of wet-lab experimental methods with Artificial Intelligence (AI) applications necessitates scientists to have a comprehensive knowledge of both DNA and AI fields. While molecular biologists encounter challenges in understanding AI methods, computer scientists often lack basic foundations of RNA sequence analysis tasks. Considering the absence of a comprehensive literature that bridges this research gap and promotes the development of AI-driven RNA sequence analysis applications, the contributions of this manuscript are manifold:
- It equips AI researchers with biological foundations of 47 distinct RNA sequence analysis tasks.
- It sets a stage for development of benchmark datasets related to 47 distinct RNA sequence analysis tasks by facilitating cruxes of 64 different biological databases.
- It presents word embeddings and language models applications across 47 distinct RNA sequence analysis tasks.
- It streamlines the development of new predictors by providing a comprehensive survey of 58 word embeddings and 70 language models based predictive pipelines performance values as well as top performing traditional sequence encoding based predictors and their performances across 47 RNA sequence analysis tasks.
Affiliation(s)
- Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Ali Ibrahim
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
- Tayyaba Asif
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
- Andreas Dengel
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
6
Kumar A, Dixit S, Srinivasan K, M D, Vincent PMDR. Personalized cancer vaccine design using AI-powered technologies. Front Immunol 2024; 15:1357217. [PMID: 39582860 PMCID: PMC11581883 DOI: 10.3389/fimmu.2024.1357217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Accepted: 09/24/2024] [Indexed: 11/26/2024] Open
Abstract
Immunotherapy has ushered in a new era of cancer treatment, yet cancer remains a leading cause of global mortality. Among various therapeutic strategies, cancer vaccines have shown promise by activating the immune system to specifically target cancer cells. While current cancer vaccines are primarily prophylactic, advancements in targeting tumor-associated antigens (TAAs) and neoantigens have paved the way for therapeutic vaccines. The integration of artificial intelligence (AI) into cancer vaccine development is revolutionizing the field by enhancing various aspects of design and delivery. This review explores how AI facilitates precise epitope design, optimizes mRNA and DNA vaccine instructions, and enables personalized vaccine strategies by predicting patient responses. By utilizing AI technologies, researchers can navigate complex biological datasets and uncover novel therapeutic targets, thereby improving the precision and efficacy of cancer vaccines. Despite the promise of AI-powered cancer vaccines, significant challenges remain, such as tumor heterogeneity and genetic variability, which can limit the effectiveness of neoantigen prediction. Moreover, ethical and regulatory concerns surrounding data privacy and algorithmic bias must be addressed to ensure responsible AI deployment. The future of cancer vaccine development lies in the seamless integration of AI to create personalized immunotherapies that offer targeted and effective cancer treatments. This review underscores the importance of interdisciplinary collaboration and innovation in overcoming these challenges and advancing cancer vaccine development.
Affiliation(s)
- Anant Kumar
- School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Shriniket Dixit
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Kathiravan Srinivasan
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Dinakaran M
- School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
- P. M. Durai Raj Vincent
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India
7
Feng S, Chen T, Zhang Y, Lu C. mRNA Fragmentation Pattern Detected by SHAPE. Curr Issues Mol Biol 2024; 46:10249-10258. [PMID: 39329962 PMCID: PMC11431040 DOI: 10.3390/cimb46090610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 09/12/2024] [Accepted: 09/14/2024] [Indexed: 09/28/2024] Open
Abstract
The success of messenger RNA (mRNA) vaccines in controlling COVID-19 has warranted further developments in new technology. Currently, their quality control process largely relies on low-resolution electrophoresis for detecting chain breaks. Here, we present an approach using multi-primer reverse transcription sequencing (MPRT-seq) to identify degradation fragments in mRNA products. Using this in-house-made mRNA containing two antigens and untranslated regions (UTRs), we analyzed the mRNA completeness and degradation pattern at a nucleotide resolution. We then analyzed the sensitive base sequence and its correlation with the secondary structure. Our MPRT-seq mapping shows that certain sequences on the 5' of bulge-stem-loop structures can result in preferential chain breaks. Our results agree with commonly used capillary electrophoresis (CE) integrity analysis but at a much higher resolution, and can improve mRNA stability by providing information to remove sensitive structures or sequences in the mRNA sequence design.
Affiliation(s)
- Changrui Lu
- College of Biological Science and Medical Engineering, Donghua University, Shanghai 201620, China; (S.F.); (T.C.); (Y.Z.)
8
Wu Z, Sun W, Qi H. Recent Advancements in mRNA Vaccines: From Target Selection to Delivery Systems. Vaccines (Basel) 2024; 12:873. [PMID: 39203999 PMCID: PMC11359327 DOI: 10.3390/vaccines12080873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 07/31/2024] [Accepted: 07/31/2024] [Indexed: 09/03/2024] Open
Abstract
mRNA vaccines are leading a medical revolution. mRNA technologies utilize the host's own cells as bio-factories to produce proteins that serve as antigens. This revolutionary approach circumvents the complicated processes involved in traditional vaccine production and empowers vaccines with the ability to respond to emerging or mutated infectious diseases rapidly. Additionally, the robust cellular immune response elicited by mRNA vaccines has shown significant promise in cancer treatment. However, the inherent instability of mRNA and the complexity of tumor immunity have limited its broader application. Although the emergence of pseudouridine and ionizable cationic lipid nanoparticles (LNPs) made the clinical application of mRNA possible, there remains substantial potential for further improvement of the immunogenicity of delivered antigens and preventive or therapeutic effects of mRNA technology. Here, we review the latest advancements in mRNA vaccines, including but not limited to target selection and delivery systems. This review offers a multifaceted perspective on this rapidly evolving field.
Affiliation(s)
- Zhongyan Wu
- Newish Biological R&D Center, Beijing 100101, China
- Weilu Sun
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
- Hailong Qi
- Newish Biological R&D Center, Beijing 100101, China
9
He S, Huang R, Townley J, Kretsch RC, Karagianes TG, Cox DBT, Blair H, Penzar D, Vyaltsev V, Aristova E, Zinkevich A, Bakulin A, Sohn H, Krstevski D, Fukui T, Tatematsu F, Uchida Y, Jang D, Lee JS, Shieh R, Ma T, Martynov E, Shugaev MV, Bukhari HST, Fujikawa K, Onodera K, Henkel C, Ron S, Romano J, Nicol JJ, Nye GP, Wu Y, Choe C, Reade W, Das R. Ribonanza: deep learning of RNA structure through dual crowdsourcing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.24.581671. [PMID: 38464325 PMCID: PMC10925082 DOI: 10.1101/2024.02.24.581671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Prediction of RNA structure from sequence remains an unsolved problem, and progress has been slowed by a paucity of experimental data. Here, we present Ribonanza, a dataset of chemical mapping measurements on two million diverse RNA sequences collected through Eterna and other crowdsourced initiatives. Ribonanza measurements enabled solicitation, training, and prospective evaluation of diverse deep neural networks through a Kaggle challenge, followed by distillation into a single, self-contained model called RibonanzaNet. When fine tuned on auxiliary datasets, RibonanzaNet achieves state-of-the-art performance in modeling experimental sequence dropout, RNA hydrolytic degradation, and RNA secondary structure, with implications for modeling RNA tertiary structure.
Affiliation(s)
- Shujun He
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Rui Huang
- Department of Biochemistry, Stanford CA, USA
- David B T Cox
- Department of Biochemistry, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Dmitry Penzar
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Valeriy Vyaltsev
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Elizaveta Aristova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Arsenii Zinkevich
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Artemy Bakulin
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Hoyeol Sohn
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
- Daniel Krstevski
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
- Donghoon Jang
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- Roger Shieh
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
- Tom Ma
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
- Eduard Martynov
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Maxim V Shugaev
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Shlomo Ron
- Department of Chemical Engineering, Texas A&M University, TX, USA
- Department of Biochemistry, Stanford CA, USA
- Eterna Massive Open Laboratory
- Biophysics Program, Stanford CA, USA
- Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Department of Mathematics, Stanford CA, USA
- AIRI, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- GO Inc., Tokyo, Japan
- Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- DeltaX, Seoul, Republic of Korea
- Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Vergesense, CA
- DeNA, Tokyo, Japan
- NVIDIA, Tokyo, Japan
- NVIDIA, Munich
- Howard Hughes Medical Institute
- Department of Bioengineering, Stanford CA, USA
- Kaggle, San Francisco CA, USA
- Jonathan Romano
- Eterna Massive Open Laboratory
- Howard Hughes Medical Institute
- Grace P Nye
- Department of Biochemistry, Stanford CA, USA
- Yuan Wu
- Department of Biochemistry, Stanford CA, USA
- Howard Hughes Medical Institute
- Rhiju Das
- Department of Biochemistry, Stanford CA, USA
- Biophysics Program, Stanford CA, USA
- Howard Hughes Medical Institute
10
Kim YA, Mousavi K, Yazdi A, Zwierzyna M, Cardinali M, Fox D, Peel T, Coller J, Aggarwal K, Maruggi G. Computational design of mRNA vaccines. Vaccine 2024; 42:1831-1840. [PMID: 37479613 DOI: 10.1016/j.vaccine.2023.07.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 06/23/2023] [Accepted: 07/10/2023] [Indexed: 07/23/2023]
Abstract
mRNA technology has emerged as a successful vaccine platform that offered a swift response to the COVID-19 pandemic. Accumulating evidence shows that vaccine efficacy, thermostability, and other important properties, are largely impacted by intrinsic properties of the mRNA molecule, such as RNA sequence and structure, both of which can be optimized. Designing mRNA sequence for vaccines presents a combinatorial problem due to an extremely large selection space. For instance, due to the degeneracy of the genetic code, there are over 10^632 possible mRNA sequences that could encode the spike protein, the COVID-19 vaccines' target. Moreover, designing different elements of the mRNA sequence simultaneously against multiple objectives such as translational efficiency, reduced reactogenicity, and improved stability requires an efficient and sophisticated optimization strategy. Recently, there has been a growing interest in utilizing computational tools to redesign mRNA sequences to improve vaccine characteristics and expedite discovery timelines. In this review, we explore important biophysical features of mRNA to be considered for vaccine design and discuss how computational approaches can be applied to rapidly design mRNA sequences with desirable characteristics.
Affiliation(s)
- Jeff Coller
- Johns Hopkins University, Baltimore, MD, USA
11
Goyal F, Chattopadhyay A, Navik U, Jain A, Reddy PH, Bhatti GK, Bhatti JS. Advancing Cancer Immunotherapy: The Potential of mRNA Vaccines As a Promising Therapeutic Approach. Advanced Therapeutics 2024; 7. [DOI: 10.1002/adtp.202300255] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Indexed: 01/11/2025]
Abstract
mRNA vaccines have long been recognized for their ability to induce robust immune responses. The discovery that mRNA vaccines may also contribute to antitumor immunity has made them a promising therapeutic approach against cancer. Recent advances in understanding of immune system are precious in developing therapeutic strategies that target pathways involved in tumor survival and progression, leading to the most reliable therapeutic strategies in cancer treatment history. Among all traditional cancer treatments, cancer immunotherapies are less toxic and more effective, even in advanced or recurrent stages of cancer. Recent advancements in genomics and machine learning algorithms give new insight into vaccine development. mRNA vaccines are designed to interfere with stimulator of interferon genes (STING) and tumor‐infiltrating lymphocytes pathways, activating more CD8+ T‐cells involved in destroying tumor cells and inhibiting tumor growth. A stronger immune response can be achieved by incorporating immunological adjuvants alongside mRNA. Nonformulated or vehicle‐based mRNA vaccines, when combined with adjuvants, efficiently express tumor antigens through antigen‐presenting cells and stimulate both innate and adaptive immune responses. Codelivery with additional immunotherapeutic agents, such as checkpoint inhibitors, further enhances the efficacy of mRNA vaccines. This article focuses on the current clinical approaches and challenges to consider when developing mRNA‐based vaccine technology for cancer treatment.
Affiliation(s)
- Falak Goyal
- Laboratory of Translational Medicine and Nanotherapeutics Department of Human Genetics and Molecular Medicine School of Health Sciences Central University of Punjab Bathinda 151401 India
- Anandini Chattopadhyay
- Laboratory of Translational Medicine and Nanotherapeutics Department of Human Genetics and Molecular Medicine School of Health Sciences Central University of Punjab Bathinda 151401 India
- Umashanker Navik
- Department of Pharmacology School of Health Sciences Central University of Punjab Bathinda 151401 India
- Aklank Jain
- Department of Zoology Central University of Punjab Bathinda Punjab 151401 India
- P. Hemachandra Reddy
- Department of Internal Medicine Texas Tech University Health Sciences Center Lubbock TX 79430 USA
- Department of Pharmacology and Neuroscience and Garrison Institute on Aging Texas Tech University Health Sciences Center Lubbock TX 79430 USA
- Department of Public Health Graduate School of Biomedical Sciences Texas Tech University Health Sciences Center Lubbock TX 79430 USA
- Department of Neurology Texas Tech University Health Sciences Center Lubbock TX 79430 USA
- Department of Speech Language, and Hearing Sciences Texas Tech University Health Sciences Center Lubbock TX 79430 USA
- Gurjit Kaur Bhatti
- Department of Medical Lab Technology University Institute of Applied Health Sciences Chandigarh University Mohali 140413 India
- Jasvinder Singh Bhatti
- Laboratory of Translational Medicine and Nanotherapeutics Department of Human Genetics and Molecular Medicine School of Health Sciences Central University of Punjab Bathinda 151401 India
12
He S, Gao B, Sabnis R, Sun Q. Nucleic Transformer: Classifying DNA Sequences with Self-Attention and Convolutions. ACS Synth Biol 2023; 12:3205-3214. [PMID: 37916871 PMCID: PMC10863451 DOI: 10.1021/acssynbio.3c00154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 11/03/2023]
Abstract
Much work has been done to apply machine learning and deep learning to genomics tasks, but these applications usually require extensive domain knowledge, and the resulting models provide very limited interpretability. Here, we present the Nucleic Transformer, a conceptually simple but effective and interpretable model architecture that excels in the classification of DNA sequences. The Nucleic Transformer employs self-attention and convolutions on nucleic acid sequences, leveraging two prominent deep learning strategies commonly used in computer vision and natural language analysis. We demonstrate that the Nucleic Transformer can be trained without much domain knowledge to achieve high performance in Escherichia coli promoter classification, viral genome identification, enhancer classification, and chromatin profile predictions.
Affiliation(s)
- Shujun He
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Baizhen Gao
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Rushant Sabnis
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
- Qing Sun
- Department of Chemical Engineering, Texas A&M University, College Station, Texas 77840, United States
13
Choi SR, Lee M. Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review. Biology 2023; 12:1033. [PMID: 37508462 PMCID: PMC10376273 DOI: 10.3390/biology12071033] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/20/2023] [Revised: 07/18/2023] [Accepted: 07/21/2023] [Indexed: 07/30/2023]
Abstract
The emergence and rapid development of deep learning, specifically transformer-based architectures and attention mechanisms, have had transformative implications across several domains, including bioinformatics and genome data analysis. The analogous nature of genome sequences to language texts has enabled the application of techniques that have exhibited success in fields ranging from natural language processing to genomic data. This review provides a comprehensive analysis of the most recent advancements in the application of transformer architectures and attention mechanisms to genome and transcriptome data. The focus of this review is on the critical evaluation of these techniques, discussing their advantages and limitations in the context of genome data analysis. With the swift pace of development in deep learning methodologies, it becomes vital to continually assess and reflect on the current standing and future direction of the research. Therefore, this review aims to serve as a timely resource for both seasoned researchers and newcomers, offering a panoramic view of the recent advancements and elucidating the state-of-the-art applications in the field. Furthermore, this review paper serves to highlight potential areas of future investigation by critically evaluating studies from 2019 to 2023, thereby acting as a stepping-stone for further research endeavors.
Affiliation(s)
- Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea