1
He S, Huang R, Townley J, Kretsch RC, Karagianes TG, Cox DBT, Blair H, Penzar D, Vyaltsev V, Aristova E, Zinkevich A, Bakulin A, Sohn H, Krstevski D, Fukui T, Tatematsu F, Uchida Y, Jang D, Lee JS, Shieh R, Ma T, Martynov E, Shugaev MV, Bukhari HST, Fujikawa K, Onodera K, Henkel C, Ron S, Romano J, Nicol JJ, Nye GP, Wu Y, Choe C, Reade W, Eterna Participants, Das R. Ribonanza: deep learning of RNA structure through dual crowdsourcing. bioRxiv 2024:2024.02.24.581671. [PMID: 38464325 PMCID: PMC10925082 DOI: 10.1101/2024.02.24.581671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Prediction of RNA structure from sequence remains an unsolved problem, and progress has been slowed by a paucity of experimental data. Here, we present Ribonanza, a dataset of chemical mapping measurements on two million diverse RNA sequences collected through Eterna and other crowdsourced initiatives. Ribonanza measurements enabled solicitation, training, and prospective evaluation of diverse deep neural networks through a Kaggle challenge, followed by distillation into a single, self-contained model called RibonanzaNet. When fine tuned on auxiliary datasets, RibonanzaNet achieves state-of-the-art performance in modeling experimental sequence dropout, RNA hydrolytic degradation, and RNA secondary structure, with implications for modeling RNA tertiary structure.
Affiliation(s)
- Shujun He
  - Department of Chemical Engineering, Texas A&M University, TX, USA
- Rui Huang
  - Department of Biochemistry, Stanford CA, USA
- David B T Cox
  - Department of Biochemistry, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
- Dmitry Penzar
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Valeriy Vyaltsev
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Elizaveta Aristova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Arsenii Zinkevich
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Artemy Bakulin
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
- Hoyeol Sohn
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Daniel Krstevski
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Donghoon Jang
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
- Roger Shieh
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Tom Ma
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Eduard Martynov
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
- Maxim V Shugaev
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
- Shlomo Ron
  - Department of Chemical Engineering, Texas A&M University, TX, USA
  - Department of Biochemistry, Stanford CA, USA
  - Eterna Massive Open Laboratory
  - Biophysics Program, Stanford CA, USA
  - Department of Medicine, Division of Hematology, and Department of Biochemistry, Stanford CA, USA
  - Department of Mathematics, Stanford CA, USA
  - AIRI, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow 119991, Russia
  - Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russian Federation
  - GO Inc., Tokyo, Japan
  - Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea
  - DeltaX, Seoul, Republic of Korea
  - Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russian Federation
  - Department of Materials Science and Engineering, University of Virginia, Charlottesville, VA 22904-4745, USA
  - Vergesense, CA
  - DeNA, Tokyo, Japan
  - NVIDIA, Tokyo, Japan
  - NVIDIA, Munich
  - Howard Hughes Medical Institute
  - Department of Bioengineering, Stanford CA, USA
  - Kaggle, San Francisco CA, USA
- Jonathan Romano
  - Eterna Massive Open Laboratory
  - Howard Hughes Medical Institute
- Grace P Nye
  - Department of Biochemistry, Stanford CA, USA
- Yuan Wu
  - Department of Biochemistry, Stanford CA, USA
  - Howard Hughes Medical Institute
- Rhiju Das
  - Department of Biochemistry, Stanford CA, USA
  - Biophysics Program, Stanford CA, USA
  - Howard Hughes Medical Institute
2
Choe C, Andreasson JOL, Melaine F, Kladwang W, Wu MJ, Portela F, Wellington-Oguri R, Nicol JJ, Wayment-Steele HK, Gotrik M, Eterna Participants, Khatri P, Greenleaf WJ, Das R. Compact RNA sensors for increasingly complex functions of multiple inputs. bioRxiv 2024:2024.01.04.572289. [PMID: 38260323 PMCID: PMC10802310 DOI: 10.1101/2024.01.04.572289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Designing single molecules that compute general functions of input molecular partners represents a major unsolved challenge in molecular design. Here, we demonstrate that high-throughput, iterative experimental testing of diverse RNA designs crowdsourced from Eterna yields sensors of increasingly complex functions of input oligonucleotide concentrations. After designing single-input RNA sensors with activation ratios beyond our detection limits, we created logic gates, including challenging XOR and XNOR gates, and sensors that respond to the ratio of two inputs. Finally, we describe the OpenTB challenge, which elicited 85-nucleotide sensors that compute a score for diagnosing active tuberculosis, based on the ratio of products of three gene segments. Building on OpenTB design strategies, we created an algorithm Nucleologic that produces similarly compact sensors for the three-gene score based on RNA and DNA. These results open new avenues for diverse applications of compact, single molecule sensors previously limited by design complexity.
Affiliation(s)
- Christian Choe
  - Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
- Johan O. L. Andreasson
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
  - Current address: Airity Technologies, Redwood City, CA, USA
- Feriel Melaine
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
- Wipapat Kladwang
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Current address: Inceptive, Palo Alto, CA, USA
- Michelle J. Wu
  - Program in Biomedical Informatics, Stanford University School of Medicine, Stanford, CA, USA
  - Current address: Verily Life Sciences, South San Francisco, CA, USA
- Fernando Portela
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Eterna Massive Open Laboratory
- Roger Wellington-Oguri
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Eterna Massive Open Laboratory
- John J. Nicol
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Eterna Massive Open Laboratory
- Michael Gotrik
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Current address: Protillion Biosciences, Burlingame, CA, USA
- Purvesh Khatri
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
  - Stanford Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA, USA
- William J. Greenleaf
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Program in Biomedical Informatics, Stanford University School of Medicine, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
3
Leppek K, Byeon GW, Kladwang W, Wayment-Steele HK, Kerr CH, Xu AF, Kim DS, Topkar VV, Choe C, Rothschild D, Tiu GC, Wellington-Oguri R, Fujii K, Sharma E, Watkins AM, Nicol JJ, Romano J, Tunguz B, Diaz F, Cai H, Guo P, Wu J, Meng F, Shi S, Eterna Participants, Dormitzer PR, Solórzano A, Barna M, Das R. Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics. Nat Commun 2022; 13:1536. [PMID: 35318324 PMCID: PMC8940940 DOI: 10.1038/s41467-022-28776-w] [Citation(s) in RCA: 74] [Impact Index Per Article: 37.0] [Received: 04/08/2021] [Accepted: 02/07/2022] [Indexed: 02/07/2023] [Open access]
Abstract
Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop an RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that highly structured “superfolder” mRNAs can be designed to improve both stability and expression with further enhancement through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines. The authors develop an RNA sequencing-based platform, PERSIST-seq, to simultaneously delineate in-cell mRNA stability, ribosome load, and in-solution stability of a diverse mRNA library to derive design principles for improved mRNA therapeutics.
Affiliation(s)
- Kathrin Leppek
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Gun Woo Byeon
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Wipapat Kladwang
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- Craig H Kerr
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Adele F Xu
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Do Soon Kim
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- Ved V Topkar
  - Program in Biophysics, Stanford University, Stanford, CA, 94305, USA
- Christian Choe
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Daphna Rothschild
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Gerald C Tiu
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Kotaro Fujii
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Eesha Sharma
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- Andrew M Watkins
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- John J Nicol
  - Eterna Massive Open Laboratory, Stanford University, Stanford, CA, 94305, USA
- Jonathan Romano
  - Eterna Massive Open Laboratory, Stanford University, Stanford, CA, 94305, USA
  - Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, New York, 14260, USA
- Bojan Tunguz
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - NVIDIA Corporation, 2788 San Tomas Expy, Santa Clara, CA, 95051, USA
- Fernando Diaz
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Hui Cai
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Pengbo Guo
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Jiewei Wu
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Fanyu Meng
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Shuai Shi
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
- Eterna Participants
  - Eterna Massive Open Laboratory, Stanford University, Stanford, CA, 94305, USA
- Philip R Dormitzer
  - Pfizer Vaccine Research and Development, Pearl River, NY, USA
  - GlaxoSmithKline, 1000 Winter St., Waltham, MA, 02453, USA
- Maria Barna
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Program in Biophysics, Stanford University, Stanford, CA, 94305, USA
  - Eterna Massive Open Laboratory, Stanford University, Stanford, CA, 94305, USA
4
Wayment-Steele HK, Kladwang W, Watkins AM, Kim DS, Tunguz B, Reade W, Demkin M, Romano J, Wellington-Oguri R, Nicol JJ, Gao J, Onodera K, Fujikawa K, Mao H, Vandewiele G, Tinti M, Steenwinckel B, Ito T, Noumi T, He S, Ishi K, Lee Y, Öztürk F, Chiu KY, Öztürk E, Amer K, Fares M, Das R. Deep learning models for predicting RNA degradation via dual crowdsourcing. Nat Mach Intell 2022; 4:1174-1184. [PMID: 36567960 PMCID: PMC9771809 DOI: 10.1038/s42256-022-00571-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/14/2021] [Accepted: 10/21/2022] [Indexed: 12/16/2022]
Abstract
Medicines based on messenger RNA (mRNA) hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ('Stanford OpenVaccine') on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102-130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1,588 nucleotides) with improved accuracy compared with previously published models. These results indicate that such models can represent in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for dataset creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.
Affiliation(s)
- Hannah K. Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, CA, USA
  - Eterna Massive Open Laboratory, Stanford, CA, USA
- Wipapat Kladwang
  - Eterna Massive Open Laboratory, Stanford, CA, USA
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
- Andrew M. Watkins
  - Eterna Massive Open Laboratory, Stanford, CA, USA
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Prescient Design, Genentech, San Francisco, CA, USA
- Do Soon Kim
  - Eterna Massive Open Laboratory, Stanford, CA, USA
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
- Bojan Tunguz
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - NVIDIA Corporation, Santa Clara, CA, USA
- Jonathan Romano
  - Eterna Massive Open Laboratory, Stanford, CA, USA
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA
- John J. Nicol
  - Eterna Massive Open Laboratory, Stanford, CA, USA
- Gilles Vandewiele
  - IDLab, Ghent University, Technologiepark-Zwijnaarde, Gent, Belgium
- Michele Tinti
  - The Wellcome Centre for Anti-Infectives Research, College of Life Sciences, University of Dundee, Dundee, UK
- Bram Steenwinckel
  - IDLab, Ghent University, Technologiepark-Zwijnaarde, Gent, Belgium
- Taiga Noumi
  - Keyence Corporation, 1-3-14, Higashi-Nakajima, Higashi-Yodogawa-ku, Osaka, Japan
- Shujun He
  - Department of Chemical Engineering, Texas A&M University, College Station, TX, USA
- Youhan Lee
  - Korea Atomic Energy Research Institute, Daejeon, Republic of Korea
  - Kakao Brain Corp, Seongnam, Gyeonggi-do, Republic of Korea
- Karim Amer
  - Center for Informatics Science, Nile University, Sheikh Zayed, Giza, Egypt
- Mohamed Fares
  - Center for Informatics Science, Nile University, Sheikh Zayed, Giza, Egypt
  - National Research Centre, Dokki, Cairo, Egypt
- Rhiju Das
  - Eterna Massive Open Laboratory, Stanford, CA, USA
  - Department of Biochemistry, Stanford University, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
5
Wayment-Steele HK, Kim DS, Choe CA, Nicol JJ, Wellington-Oguri R, Watkins AM, Parra Sperberg RA, Huang PS, Eterna Participants, Das R. Correction to 'Theoretical basis for stabilizing messenger RNA through secondary structure design'. Nucleic Acids Res 2021; 49:11405. [PMID: 34591967 PMCID: PMC8565324 DOI: 10.1093/nar/gkab911] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] [Open access]
Affiliation(s)
- Hannah K Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - Eterna Massive Open Laboratory
- Do Soon Kim
  - Eterna Massive Open Laboratory
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Christian A Choe
  - Eterna Massive Open Laboratory
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Andrew M Watkins
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Po-Ssu Huang
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Rhiju Das
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
  - Department of Physics, Stanford University, Stanford, CA 94305, USA
6
Wayment-Steele HK, Kladwang W, Watkins AM, Kim DS, Tunguz B, Reade W, Demkin M, Romano J, Wellington-Oguri R, Nicol JJ, Gao J, Onodera K, Fujikawa K, Mao H, Vandewiele G, Tinti M, Steenwinckel B, Ito T, Noumi T, He S, Ishi K, Lee Y, Öztürk F, Chiu A, Öztürk E, Amer K, Fares M, Eterna Participants, Das R. Deep learning models for predicting RNA degradation via dual crowdsourcing. ArXiv 2021:2110.07531. [PMID: 34671698 PMCID: PMC8528079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 04/22/2022] [Indexed: 12/31/2022]
Abstract
Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6043 102-130-nucleotide diverse RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1588 nucleotides) with improved accuracy compared to previously published models. Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RNA secondary structure. These results indicate that such models are capable of representing in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for data set creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.
Affiliation(s)
- Hannah K. Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
  - Eterna Massive Open Laboratory
- Wipapat Kladwang
  - Department of Biochemistry, Stanford University, California 94305, USA
  - Eterna Massive Open Laboratory
- Andrew M. Watkins
  - Department of Biochemistry, Stanford University, California 94305, USA
  - Eterna Massive Open Laboratory
- Do Soon Kim
  - Department of Biochemistry, Stanford University, California 94305, USA
  - Eterna Massive Open Laboratory
- Bojan Tunguz
  - Department of Biochemistry, Stanford University, California 94305, USA
  - NVIDIA Corporation, Santa Clara, California 95051
- Jonathan Romano
  - Department of Biochemistry, Stanford University, California 94305, USA
  - Eterna Massive Open Laboratory
  - Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, New York, 14260, USA
- Jiayang Gao
  - High-flyer AI, Hangzhou, Zhejiang, China, 310000
- Hanfei Mao
  - Yanfu Investments, Shanghai, China, 200000
- Gilles Vandewiele
  - IDLab, Ghent University, Technologiepark-Zwijnaarde, Gent, Belgium, B-9052
- Michele Tinti
  - College of Life Sciences, University of Dundee, Dundee DD1 4HN, United Kingdom
- Bram Steenwinckel
  - IDLab, Ghent University, Technologiepark-Zwijnaarde, Gent, Belgium, B-9052
- Takuya Ito
  - Universal Knowledge Inc., Tokyo 150-0013, Japan
- Taiga Noumi
  - Keyence Corporation, 1-3-14, Higashi-Nakajima, Higashi-Yodogawa-ku, Osaka, 533-8555, Japan
- Shujun He
  - Department of Chemical Engineering, Texas A&M University, College Station, TX 77843
- Youhan Lee
  - Kakao Brain, Seongnam, Gyeonggi-do, Republic of Korea
- Karim Amer
  - Center for Informatics Science, Nile University, Sheikh Zayed, Giza, Egypt, 12588
- Mohamed Fares
  - National Research Centre, Dokki, Cairo, Egypt, 12622
- Rhiju Das
  - Department of Biochemistry, Stanford University, California 94305, USA
  - Eterna Massive Open Laboratory
  - Department of Physics, Stanford University, California 94305, USA
7
Wayment-Steele HK, Kim DS, Choe CA, Nicol JJ, Wellington-Oguri R, Watkins AM, Parra Sperberg RA, Huang PS, Eterna Participants, Das R. Theoretical basis for stabilizing messenger RNA through secondary structure design. Nucleic Acids Res 2021; 49:10604-10617. [PMID: 34520542 PMCID: PMC8499941 DOI: 10.1093/nar/gkab764] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 03/10/2021] [Revised: 08/17/2021] [Accepted: 08/27/2021] [Indexed: 01/08/2023] [Open access]
Abstract
RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Here, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall hydrolysis rate. To characterize the stabilization achievable through structure design, we compare AUP optimization by conventional mRNA design methods to results from more computationally sophisticated algorithms and crowdsourcing through the OpenVaccine challenge on the Eterna platform. We find that rational design on Eterna and the more sophisticated algorithms lead to constructs with low AUP, which we term 'superfolder' mRNAs. These designs exhibit a wide diversity of sequence and structure features that may be desirable for translation, biophysical size, and immunogenicity. Furthermore, their folding is robust to temperature, computer modeling method, choice of flanking untranslated regions, and changes in target protein sequence, as illustrated by rapid redesign of superfolder mRNAs for B.1.351, P.1 and B.1.1.7 variants of the prefusion-stabilized SARS-CoV-2 spike protein. Increases in in vitro mRNA half-life by at least two-fold appear immediately achievable.
MESH Headings
- Algorithms
- Base Pairing
- Base Sequence
- COVID-19/prevention & control
- Humans
- Hydrolysis
- RNA Stability
- RNA, Double-Stranded/chemistry
- RNA, Double-Stranded/genetics
- RNA, Double-Stranded/immunology
- RNA, Messenger/chemistry
- RNA, Messenger/genetics
- RNA, Messenger/immunology
- RNA, Viral/chemistry
- RNA, Viral/genetics
- RNA, Viral/immunology
- SARS-CoV-2/genetics
- SARS-CoV-2/immunology
- Spike Glycoprotein, Coronavirus/genetics
- Spike Glycoprotein, Coronavirus/immunology
- Thermodynamics
Affiliation(s)
- Hannah K Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - Eterna Massive Open Laboratory
- Do Soon Kim
  - Eterna Massive Open Laboratory
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Christian A Choe
  - Eterna Massive Open Laboratory
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Andrew M Watkins
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Po-Ssu Huang
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Rhiju Das
  - Eterna Massive Open Laboratory
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
  - Department of Physics, Stanford University, Stanford, CA 94305, USA
|
8
|
Leppek K, Byeon GW, Kladwang W, Wayment-Steele HK, Kerr CH, Xu AF, Kim DS, Topkar VV, Choe C, Rothschild D, Tiu GC, Wellington-Oguri R, Fujii K, Sharma E, Watkins AM, Nicol JJ, Romano J, Tunguz B, Participants E, Barna M, Das R. Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics. bioRxiv 2021:2021.03.29.437587. [PMID: 33821271 PMCID: PMC8020971 DOI: 10.1101/2021.03.29.437587] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022]
Abstract
Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop a new RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that "superfolder" mRNAs can be designed to improve both stability and expression that are further enhanced through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines.
Affiliation(s)
- Kathrin Leppek
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Gun Woo Byeon
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Wipapat Kladwang
  - Department of Biochemistry, Stanford University, California 94305, USA
- Craig H Kerr
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Adele F Xu
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Do Soon Kim
  - Department of Biochemistry, Stanford University, California 94305, USA
- Ved V Topkar
  - Program in Biophysics, Stanford University, Stanford, California 94305, USA
- Christian Choe
  - Department of Bioengineering, Stanford University, Stanford, California 94305, USA
- Daphna Rothschild
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Gerald C Tiu
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Kotaro Fujii
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Eesha Sharma
  - Department of Biochemistry, Stanford University, California 94305, USA
- Andrew M Watkins
  - Department of Biochemistry, Stanford University, California 94305, USA
- Jonathan Romano
  - Eterna Massive Open Laboratory
  - Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, New York, 14260, USA
- Bojan Tunguz
  - Department of Biochemistry, Stanford University, California 94305, USA
- Maria Barna
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Rhiju Das
  - Department of Biochemistry, Stanford University, California 94305, USA
|
9
|
Wayment-Steele HK, Kim DS, Choe CA, Nicol JJ, Wellington-Oguri R, Watkins AM, Sperberg RAP, Huang PS, Participants E, Das R. Theoretical basis for stabilizing messenger RNA through secondary structure design. bioRxiv 2021:2020.08.22.262931. [PMID: 32869022 PMCID: PMC7457604 DOI: 10.1101/2020.08.22.262931] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/15/2022]
Abstract
RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery, and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Here, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall hydrolysis rate. To characterize the stabilization achievable through structure design, we compare AUP optimization by conventional mRNA design methods to results from more computationally sophisticated algorithms and crowdsourcing through the OpenVaccine challenge on the Eterna platform. These computational tests were carried out on both model mRNAs and COVID-19 mRNA vaccine candidates. We find that rational design on Eterna and the more sophisticated algorithms lead to constructs with low AUP, which we term 'superfolder' mRNAs. These designs exhibit wide diversity of sequence and structure features that may be desirable for translation, biophysical size, and immunogenicity, and their folding is robust to temperature, choice of flanking untranslated regions, and changes in target protein sequence, as illustrated by rapid redesign of superfolder mRNAs for B.1.351, P.1, and B.1.1.7 variants of the prefusion-stabilized SARS-CoV-2 spike protein. Increases in in vitro mRNA half-life by at least two-fold appear immediately achievable.
Affiliation(s)
- Hannah K Wayment-Steele
  - Department of Chemistry, Stanford University, Stanford, CA, 94305
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
- Do Soon Kim
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, 60208
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305
- Christian A Choe
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305
- John J Nicol
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
- Andrew M Watkins
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305
- Po-Ssu Huang
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305
- Rhiju Das
  - Eterna Massive Open Laboratory. Consortium authors listed in Table S1
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305
  - Department of Physics, Stanford University, Stanford, CA, 94305
|
10
|
Nicol JJ, Hoagland RL, Heitlinger LA. The prevalence of nausea and vomiting in pediatric patients receiving home parenteral nutrition. Nutr Clin Pract 1995; 10:189-92. [PMID: 8552012 DOI: 10.1177/0115426595010005189] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/31/2023]
Abstract
We have observed that many home parenteral nutrition (HPN) recipients experience nausea, vomiting, or both during cyclic parenteral nutrition infusions. The current investigation was performed to determine the prevalence and course of these symptoms and effectiveness of therapeutic maneuvers. Eighty-nine recipients of HPN were contacted and 53 families (60%) responded. Thirty-five patients (66%) reported complaints of nausea, vomiting, or both associated with their HPN infusion. Patients with cancer (82%) or cystic fibrosis (83%) reported symptoms at similar rates, while patients with gastrointestinal disease (46%) reported symptoms less often (p < .05, chi-square). Within each diagnostic group, prevalence of symptoms did not vary with age. The majority of patients were symptomatic in the morning when being weaned or soon after completing the HPN infusion. Response rates to a variety of therapies were also similar. In conclusion, nausea and vomiting associated with cyclic HPN infusions appear to be common. The precipitating events and efficacy of interventions await identification and prospective evaluation.
|