1
Fan Q, Zhao X, Li J, Liu R, Liu M, Feng Q, Long Y, Fu Y, Zhai J, Pan Q, Li Y. De novo non-canonical nanopore basecalling enables private communication using heavily-modified DNA data at single-molecule level. Nat Commun 2025; 16:4099. [PMID: 40316536 PMCID: PMC12048662 DOI: 10.1038/s41467-025-59357-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Accepted: 04/16/2025] [Indexed: 05/04/2025] Open
Abstract
Hiding messages in DNA molecules by employing chemical modifications has been suggested for private data storage and transmission at high information density. However, rapidly decoding these "molecular keys" with corresponding basecallers remains challenging. We present DeepSME, a nanopore sequencing and deep-learning based framework towards single-molecule encryption, demonstrated by using 5-hydroxymethylcytosine (5hmC) substitution for individual nucleotide recognition rather than sequential interactions. This non-natural, motif-insensitive methylation disrupts ion current, resulting in a readout failure of 67.2%-100%, concealing the private information within the DNA. We further develop an alignment-free DeepSME basecaller as a key to reconstitute the digital information. Our three-stage training pipeline expands k-mer size from 4^6 to 4^9, achieving over 92% precision and recall from scratch. DeepSME deciphers fully 5hmC-concealed text and images within 16× coverage depth with an F1-score of 86.4%, surpassing all the state-of-the-art basecallers. Demonstrated on edge computing devices, DeepSME holds supreme potential for DNA-based private communications and broader bioengineering and medical applications.
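The precision, recall, and F1 figures quoted in the abstract above are related by the standard harmonic-mean formula, F1 = 2PR / (P + R). The following is a minimal sketch of that relationship only; `prf1` is a hypothetical helper name and the counts are made-up illustration values, not data from the paper.

```python
from typing import Tuple


def prf1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only (assumed, not taken from the paper).
p, r, f1 = prf1(tp=90, fp=10, fn=20)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall and is pulled toward the lower of the two.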
Affiliation(s)
- Qingyuan Fan
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Xuyang Zhao
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Junyao Li
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Ronghui Liu
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Ming Liu
  - School of Medicine, Southern University of Science and Technology, Shenzhen, China
- Qishun Feng
  - National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, The Second Affiliated Hospital of Southern University of Science and Technology, Shenzhen, China
- Yanping Long
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Yang Fu
  - School of Medicine, Southern University of Science and Technology, Shenzhen, China
- Jixian Zhai
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Qing Pan
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
- Yi Li
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
2
Spangenberg J, Mündnich S, Busch A, Pastore S, Wierczeiko A, Goettsch W, Dietrich V, Pryszcz LP, Cruciani S, Novoa EM, Joshi K, Perera R, Di Giorgio S, Arrubarrena P, Tellioglu I, Poon CL, Wan YK, Göke J, Hildebrandt A, Dieterich C, Helm M, Marz M, Gerber S, Alagna N. The RMaP challenge of predicting RNA modifications by nanopore sequencing. Commun Chem 2025; 8:115. [PMID: 40221591 PMCID: PMC11993749 DOI: 10.1038/s42004-025-01507-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Accepted: 03/24/2025] [Indexed: 04/14/2025] Open
Abstract
The field of epitranscriptomics is undergoing a technology-driven revolution. During the past decades, RNA modifications like N6-methyladenosine (m6A), pseudouridine (ψ), and 5-methylcytosine (m5C) became acknowledged for playing critical roles in cellular processes. Direct RNA sequencing by Oxford Nanopore Technologies (ONT) enabled the detection of modifications in native RNA by detecting noncanonical RNA nucleoside properties in raw data. Consequently, the field's cutting edge has a heavy component in computer science, opening new avenues of cooperation across the community, as exchanging data is as impactful as exchanging samples. Therefore, we seize the occasion to bring scientists together within the RNA Modification and Processing (RMaP) challenge to advance solutions for RNA modification detection and discuss ideas, problems and approaches. We show several computational methods to detect the most researched mRNA modifications (m6A, ψ, and m5C). Results demonstrate that a low prediction error and a high prediction accuracy can be achieved on these modifications across different approaches and algorithms. The RMaP challenge marks a substantial step towards improving algorithms' comparability, reliability, and consistency in RNA modification prediction. It points out the deficits in this young field that need to be addressed in further challenges.
Affiliation(s)
- Jannes Spangenberg
  - RNA Bioinformatics, Friedrich-Schiller-University Jena, Leutragraben 1, 07743, Jena, Germany
- Stefan Mündnich
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128, Mainz, Germany
- Anne Busch
  - Institute for Informatics, Johannes Gutenberg-University Mainz, 55128, Mainz, Germany
- Stefan Pastore
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128, Mainz, Germany
  - Institute for Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Anna Wierczeiko
  - Institute for Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Winfried Goettsch
  - RNA Bioinformatics, Friedrich-Schiller-University Jena, Leutragraben 1, 07743, Jena, Germany
  - Fritz Lipmann Institute-Leibniz Institute on Aging, 07745, Jena, Germany
- Vincent Dietrich
  - Institute for Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Leszek P Pryszcz
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Sonia Cruciani
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Eva Maria Novoa
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra, Barcelona, 08003, Spain
  - ICREA, Pg Lluis Companys 23, Barcelona, 08010, Spain
- Kandarp Joshi
  - Department of Neurosurgery, Oncology, Sidney Kimmel Comprehensive Cancer Center, School of Medicine, Johns Hopkins University, 1650 Orleans St, Baltimore, MD, 21231, USA
  - Johns Hopkins All Children's Hospital, 600 5th St. South, St. Petersburg, FL, 33701, USA
- Ranjan Perera
  - Department of Neurosurgery, Oncology, Sidney Kimmel Comprehensive Cancer Center, School of Medicine, Johns Hopkins University, 1650 Orleans St, Baltimore, MD, 21231, USA
  - Johns Hopkins All Children's Hospital, 600 5th St. South, St. Petersburg, FL, 33701, USA
- Salvatore Di Giorgio
  - Division of Immune Diversity, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Paola Arrubarrena
  - Department of Mathematics at Imperial College London, London, SW7 2AZ, UK
  - The Alan Turing Institute, London, NW1 2DB, UK
- Irem Tellioglu
  - Division of Immune Diversity, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
  - Graduate Program of the Faculty of Biosciences, Heidelberg University, Heidelberg, 69120, Germany
- Chi-Lam Poon
  - Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Yuk Kei Wan
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, 138672, Republic of Singapore
- Jonathan Göke
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, 138672, Republic of Singapore
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore, Republic of Singapore
- Andreas Hildebrandt
  - Institute for Informatics, Johannes Gutenberg-University Mainz, 55128, Mainz, Germany
- Christoph Dieterich
  - Klaus Tschira Institute for Integrative Computational Cardiology, University Hospital Heidelberg, Im Neuenheimer Feld 669, 69120, Heidelberg, Germany
- Mark Helm
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128, Mainz, Germany
- Manja Marz
  - RNA Bioinformatics, Friedrich-Schiller-University Jena, Leutragraben 1, 07743, Jena, Germany
  - Fritz Lipmann Institute-Leibniz Institute on Aging, 07745, Jena, Germany
  - Balance of the Microverse, Fürstengraben 1, 07743, Jena, Germany
- Susanne Gerber
  - Institute for Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - Institute for Quantitative and Computational Biosciences (IQCB), Mainz, Germany
- Nicolo Alagna
  - Institute for Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
3
Monzó C, Liu T, Conesa A. Transcriptomics in the era of long-read sequencing. Nat Rev Genet 2025:10.1038/s41576-025-00828-z. [PMID: 40155769 DOI: 10.1038/s41576-025-00828-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/20/2025] [Indexed: 04/01/2025]
Abstract
Transcriptome sequencing revolutionized the analysis of gene expression, providing an unbiased approach to gene detection and quantification that enabled the discovery of novel isoforms, alternative splicing events and fusion transcripts. However, although short-read sequencing technologies have surpassed the limited dynamic range of previous technologies such as microarrays, they have limitations, for example, in resolving full-length transcripts and complex isoforms. Over the past 5 years, long-read sequencing technologies have matured considerably, with improvements in instrumentation and analytical methods, enabling their application to RNA sequencing (RNA-seq). Benchmarking studies are beginning to identify the strengths and limitations of long-read RNA-seq, although there remains a need for comprehensive resources to guide newcomers through the intricacies of this approach. In this Review, we provide a comprehensive overview of the long-read RNA-seq workflow, from library preparation and sequencing challenges to core data processing, downstream analyses and emerging developments. We present an extensive inventory of experimental and analytical methods and discuss current challenges and prospects.
Affiliation(s)
- Carolina Monzó
  - Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Tianyuan Liu
  - Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Ana Conesa
  - Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
4
Faucher-Giguère L, de Préval BS, Rivera A, Scott MS, Elela SA. Small nucleolar RNAs: the hidden precursors of cancer ribosomes. Philos Trans R Soc Lond B Biol Sci 2025; 380:20230376. [PMID: 40045787 PMCID: PMC11883439 DOI: 10.1098/rstb.2023.0376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 08/16/2024] [Accepted: 08/28/2024] [Indexed: 03/09/2025] Open
Abstract
Ribosomes are heterogeneous in terms of their constituent proteins, structural RNAs and ribosomal RNA (rRNA) modifications, resulting in diverse potential translatomes. rRNA modifications, guided by small nucleolar RNAs (snoRNAs), enable fine-tuning of ribosome function and translation profiles. Recent studies have begun linking dysregulation of snoRNAs, via rRNA modifications, to tumourigenesis. Deciphering the specific contributions of individual rRNA modifications to cancer hallmarks and identifying snoRNAs with oncogenic potential could lead to novel therapeutic strategies. These strategies might target snoRNAs or exploit the dependence of cancer cells on specific rRNA modification sites, potentially disrupting aberrant ribosomal translation programs and hindering tumour growth. This review discusses current evidence and challenges in linking changes in snoRNA expression to rRNA modification and cancer biology. This article is part of the discussion meeting issue 'Ribosome diversity and its impact on protein synthesis, development and disease'.
Affiliation(s)
- Laurence Faucher-Giguère
  - Department of Microbiology and Infectiology, University of Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Baudouin S. de Préval
  - Department of Biochemistry and Functional Genomics, University of Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Andrea Rivera
  - Department of Microbiology and Infectiology, University of Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Michelle S. Scott
  - Department of Biochemistry and Functional Genomics, University of Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Sherif Abou Elela
  - Department of Microbiology and Infectiology, University of Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
5
Wu K, Li Y, Yi Y, Yu Y, Wang Y, Zhang L, Cao Q, Chen K. The detection, function, and therapeutic potential of RNA 2'-O-methylation. The Innovation Life 2024; 3:100112. [PMID: 40206865 PMCID: PMC11981644 DOI: 10.59717/j.xinn-life.2024.100112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2025]
Abstract
RNA modifications play crucial roles in shaping RNA structure, function, and metabolism. Their dysregulation has been associated with many diseases, including cancer, developmental disorders, cardiovascular diseases, as well as neurological and immune-related conditions. A particular type of RNA modification, 2'-O-methylation (Nm), stands out due to its widespread occurrence on all four types of nucleotides (A, U, G, C) and in most RNA categories, e.g., mRNA, rRNA, tRNA, miRNA, snRNA, snoRNA, and viral RNA. Nm is the addition of a methyl group to the 2' hydroxyl of the ribose moiety of a nucleoside. Given its great biological significance and reported association with many diseases, we first review the occurrences and functional implications of Nm in various RNA species. We then summarize the reported Nm detection methods, ranging from biochemical techniques in the 1970s and 1980s to recent methods based on Illumina RNA sequencing, artificial intelligence (AI) models for computational prediction, and the latest nanopore sequencing methods currently under active development. Moreover, we discuss the applications of Nm in the realm of RNA medicine, highlighting its therapeutic potential. Finally, we present perspectives on potential research directions, aiming to offer insights for future investigations on Nm modification.
Affiliation(s)
- Kaiyuan Wu
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
  - Department of Bioengineering, Rice University, Houston 77005, USA
  - Department of Computational Biology and Bioinformatics, School of Medicine, Duke University, Durham 27708, USA
  - These authors contributed equally to this work
- Yanqiang Li
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
  - These authors contributed equally to this work
- Yang Yi
  - Department of Urology, Feinberg School of Medicine, Northwestern University, Chicago 60611, USA
  - Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago 60611, USA
- Yang Yu
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
- Yunxia Wang
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
- Lili Zhang
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
- Qi Cao
  - Department of Urology, Feinberg School of Medicine, Northwestern University, Chicago 60611, USA
  - Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago 60611, USA
- Kaifu Chen
  - Basic and Translational Research Division, Department of Cardiology, Boston Children’s Hospital, Boston 02215, USA
  - Department of Pediatrics, Harvard Medical School, Boston 02215, USA
  - Broad Institute of MIT and Harvard, Boston 02215, USA
  - Dana-Farber / Harvard Cancer Center, Boston 02215, USA
6
Huang S, Wylder AC, Pan T. Simultaneous nanopore profiling of mRNA m6A and pseudouridine reveals translation coordination. Nat Biotechnol 2024; 42:1831-1835. [PMID: 38321115 PMCID: PMC11300707 DOI: 10.1038/s41587-024-02135-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 05/02/2023] [Accepted: 01/10/2024] [Indexed: 02/08/2024]
Abstract
N6-methyladenosine (m6A) and pseudouridine (Ψ) are the two most abundant modifications in mammalian messenger RNA, but the coordination of their biological functions remains poorly understood. We develop a machine learning-based nanopore direct RNA sequencing method (NanoSPA) that simultaneously analyzes m6A and Ψ in the human transcriptome. Applying NanoSPA to polysome profiling, we reveal opposing transcriptomic co-occurrence of m6A and Ψ and synergistic, hierarchical effects of m6A and Ψ on the polysome.
Affiliation(s)
- Sihao Huang
  - Department of Biochemistry & Molecular Biology, University of Chicago, Chicago, IL, USA
- Adam C Wylder
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL, USA
- Tao Pan
  - Department of Biochemistry & Molecular Biology, University of Chicago, Chicago, IL, USA
7
Delgado-Tejedor A, Medina R, Begik O, Cozzuto L, López J, Blanco S, Ponomarenko J, Novoa EM. Native RNA nanopore sequencing reveals antibiotic-induced loss of rRNA modifications in the A- and P-sites. Nat Commun 2024; 15:10054. [PMID: 39613750 PMCID: PMC11607429 DOI: 10.1038/s41467-024-54368-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 11/05/2024] [Indexed: 12/01/2024] Open
Abstract
The biological relevance and dynamics of mRNA modifications have been extensively studied; however, whether rRNA modifications are dynamically regulated, and under which conditions, remains unclear. Here, we systematically characterize bacterial rRNA modifications upon exposure to diverse antibiotics using native RNA nanopore sequencing. To identify significant rRNA modification changes, we develop NanoConsensus, a novel pipeline that is robust across RNA modification types, stoichiometries and coverage, with very low false positive rates, outperforming all individual algorithms tested. We then apply NanoConsensus to characterize the rRNA modification landscape upon antibiotic exposure, finding that rRNA modification profiles are altered in the vicinity of A and P-sites of the ribosome, in an antibiotic-specific manner, possibly contributing to antibiotic resistance. Our work demonstrates that rRNA modification profiles can be rapidly altered in response to environmental exposures, and provides a robust workflow to study rRNA modification dynamics in any species, in a scalable and reproducible manner.
Affiliation(s)
- Anna Delgado-Tejedor
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Rebeca Medina
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Oguzhan Begik
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Luca Cozzuto
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Judith López
  - Molecular Mechanisms Program, Centro de Investigación del Cáncer and Instituto de Biología Molecular y Celular del Cáncer, Consejo Superior de Investigaciones Científicas (CSIC)-University of Salamanca, Salamanca, Spain
  - Instituto de Investigación Biomédica de Salamanca (IBSAL), Hospital Universitario de Salamanca, Salamanca, Spain
- Sandra Blanco
  - Molecular Mechanisms Program, Centro de Investigación del Cáncer and Instituto de Biología Molecular y Celular del Cáncer, Consejo Superior de Investigaciones Científicas (CSIC)-University of Salamanca, Salamanca, Spain
  - Instituto de Investigación Biomédica de Salamanca (IBSAL), Hospital Universitario de Salamanca, Salamanca, Spain
- Julia Ponomarenko
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Eva Maria Novoa
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
  - ICREA, Pg. Lluís Companys 23, Barcelona, Spain
8
Wang Z, Fang Y, Liu Z, Hao N, Zhang HH, Sun X, Que J, Ding H. Adapting nanopore sequencing basecalling models for modification detection via incremental learning and anomaly detection. Nat Commun 2024; 15:7148. [PMID: 39169028 PMCID: PMC11339354 DOI: 10.1038/s41467-024-51639-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/21/2023] [Accepted: 08/12/2024] [Indexed: 08/23/2024] Open
Abstract
We leverage machine learning approaches to adapt nanopore sequencing basecallers for nucleotide modification detection. We first apply the incremental learning (IL) technique to improve the basecalling of modification-rich sequences, which are usually of high biological interest. With sequence backbones resolved, we further run anomaly detection (AD) on individual nucleotides to determine their modification status. By this means, our pipeline promises the single-molecule, single-nucleotide, and sequence context-free detection of modifications. We benchmark the pipeline using control oligos, further apply it in the basecalling of densely-modified yeast tRNAs and E. coli genomic DNAs, the cross-species detection of N6-methyladenosine (m6A) in mammalian mRNAs, and the simultaneous detection of N1-methyladenosine (m1A) and m6A in human mRNAs. Our IL-AD workflow is available at: https://github.com/wangziyuan66/IL-AD.
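To illustrate the general idea of per-nucleotide anomaly detection mentioned in the abstract above, here is a minimal sketch using a plain z-score test against a canonical reference distribution. This is a generic stand-in and not the IL-AD implementation; the function name `flag_anomalies`, the threshold `z_cut`, and the signal values are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(canonical: Sequence[float],
                   observed: Sequence[float],
                   z_cut: float = 3.0) -> List[bool]:
    """Flag observed signal levels that deviate from the canonical
    (unmodified) reference by more than z_cut standard deviations."""
    mu = mean(canonical)
    sigma = stdev(canonical)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        raise ValueError("canonical reference has zero variance")
    return [abs(x - mu) / sigma > z_cut for x in observed]


# Hypothetical current levels for one nucleotide position (arbitrary units).
canonical_levels = [100.0, 101.0, 99.0, 100.5, 99.5]
observed_levels = [100.2, 108.0, 99.1]
flags = flag_anomalies(canonical_levels, observed_levels)
print(flags)  # True marks a read whose signal looks non-canonical
```

A real detector would typically use richer signal features and a learned model of the canonical class; the z-score is used here only because it is the simplest one-class test.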
Affiliation(s)
- Ziyuan Wang
  - Department of Pharmacy Practice and Science, University of Arizona, Tucson, AZ, USA
- Yinshan Fang
  - Columbia Center for Human Development, Department of Medicine, Columbia University Medical Center, New York, NY, USA
- Ziyang Liu
  - Department of Pharmacy Practice and Science, University of Arizona, Tucson, AZ, USA
  - Statistics and Data Science GIDP, University of Arizona, Tucson, AZ, USA
- Ning Hao
  - Statistics and Data Science GIDP, University of Arizona, Tucson, AZ, USA
  - Department of Mathematics, University of Arizona, Tucson, AZ, USA
- Hao Helen Zhang
  - Statistics and Data Science GIDP, University of Arizona, Tucson, AZ, USA
  - Department of Mathematics, University of Arizona, Tucson, AZ, USA
- Xiaoxiao Sun
  - Statistics and Data Science GIDP, University of Arizona, Tucson, AZ, USA
  - Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, USA
- Jianwen Que
  - Columbia Center for Human Development, Department of Medicine, Columbia University Medical Center, New York, NY, USA
- Hongxu Ding
  - Department of Pharmacy Practice and Science, University of Arizona, Tucson, AZ, USA
  - Statistics and Data Science GIDP, University of Arizona, Tucson, AZ, USA
9
Bortoletto E, Rosani U. Bioinformatics for Inosine: Tools and Approaches to Trace This Elusive RNA Modification. Genes (Basel) 2024; 15:996. [PMID: 39202357 PMCID: PMC11353476 DOI: 10.3390/genes15080996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 07/23/2024] [Accepted: 07/25/2024] [Indexed: 09/03/2024] Open
Abstract
Inosine is a nucleotide resulting from the deamination of adenosine in RNA. This chemical modification process, known as RNA editing, is typically mediated by a family of double-stranded RNA binding proteins named Adenosine Deaminase Acting on dsRNA (ADAR). While the presence of ADAR orthologs has been traced throughout the evolution of metazoans, the existence and extension of RNA editing have been characterized in a more limited number of animals so far. Undoubtedly, ADAR-mediated RNA editing plays a vital role in physiology, organismal development and disease, making the understanding of the evolutionary conservation of this phenomenon pivotal to a deep characterization of relevant biological processes. However, the lack of direct high-throughput methods to reveal RNA modifications at single nucleotide resolution limited an extended investigation of RNA editing. Nowadays, these methods have been developed, and appropriate bioinformatic pipelines are required to fully exploit this data, which can complement existing approaches to detect ADAR editing. Here, we review the current literature on the "bioinformatics for inosine" subject and we discuss future research avenues in the field.
Affiliation(s)
- Umberto Rosani
  - Department of Biology, University of Padova, 35131 Padova, Italy
10
Zhang Y, Yan H, Wei Z, Hong H, Huang D, Liu G, Qin Q, Rong R, Gao P, Meng J, Ying B. NanoMUD: Profiling of pseudouridine and N1-methylpseudouridine using Oxford Nanopore direct RNA sequencing. Int J Biol Macromol 2024; 270:132433. [PMID: 38759861 DOI: 10.1016/j.ijbiomac.2024.132433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 05/13/2024] [Accepted: 05/14/2024] [Indexed: 05/19/2024]
Abstract
Nanopore direct RNA sequencing provides a promising solution for unraveling the landscapes of modifications on single RNA molecules. Here, we propose NanoMUD, a computational framework for predicting the RNA pseudouridine modification (Ψ) and its methylated analog N1-methylpseudouridine (m1Ψ), which have critical applications in mRNA vaccination, at single-base and single-molecule resolution from direct RNA sequencing data. Electric signal features were fed into a bidirectional LSTM neural network to achieve improved accuracy and predictive capabilities. Motif-specific models (NNUNN, N = A, C, U or G) were trained based on features extracted from a designed dataset and achieved superior performance on molecule-level modification prediction (Ψ models: min AUC = 0.86, max AUC = 0.99; m1Ψ models: min AUC = 0.87, max AUC = 0.99). We then aggregated read-level predictions for site stoichiometry estimation. Given the observed sequence-dependent bias in model performance, we trained regression models based on the distribution of modification probabilities for sites with known stoichiometry. The distribution-based site stoichiometry estimation method allows unbiased comparison between different contexts. To demonstrate the feasibility of our work, three case studies on both in vitro and in vivo transcribed RNAs were presented. NanoMUD offers a powerful tool to facilitate research on modified therapeutic IVT RNAs and provides useful insight into the landscape and stoichiometry of pseudouridine and N1-methylpseudouridine on in vivo transcribed RNA species.
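The step of aggregating read-level predictions into a site-level stoichiometry can be sketched in its simplest form as the fraction of reads whose modification probability passes a cutoff. This is only the naive baseline, assuming a hypothetical helper `naive_site_stoichiometry` and made-up probabilities; the abstract above states that NanoMUD instead fits regression models on the distribution of probabilities to correct sequence-dependent bias, which this sketch does not attempt.

```python
from typing import Sequence


def naive_site_stoichiometry(read_probs: Sequence[float],
                             threshold: float = 0.5) -> float:
    """Estimate the modified fraction at one site as the share of reads
    whose predicted modification probability is >= threshold."""
    if not read_probs:
        raise ValueError("need at least one read covering the site")
    modified = sum(1 for p in read_probs if p >= threshold)
    return modified / len(read_probs)


# Hypothetical per-read modification probabilities at a single site.
probs = [0.95, 0.10, 0.80, 0.40, 0.60]
estimate = naive_site_stoichiometry(probs)
print(f"estimated stoichiometry: {estimate:.2f}")
```

A hard threshold like this inherits any per-motif bias in the read-level classifier, which is presumably why a distribution-based calibration against sites of known stoichiometry is preferable.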
Affiliation(s)
- Yuxin Zhang
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
- Huayuan Yan
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
- Zhen Wei
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
- Haifeng Hong
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
- Daiyun Huang
  - Wisdom Lake Academy of Pharmacy, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Guopeng Liu
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
- Qianshan Qin
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
- Rong Rong
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Peng Gao
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
- Jia Meng
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - AI University Research Centre, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
- Bo Ying
  - Suzhou Abogen Biosciences Co., Ltd., Suzhou 215123, China
11
Huang E, Frydman C, Xiao X. Navigating the landscape of epitranscriptomics and host immunity. Genome Res 2024; 34:515-529. [PMID: 38702197 PMCID: PMC11146601 DOI: 10.1101/gr.278412.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2024]
Abstract
RNA modifications, also termed epitranscriptomic marks, encompass chemical alterations to individual nucleotides, including processes such as methylation and editing. These marks contribute to a wide range of biological processes, many of which are related to host immune system defense. The functions of immune-related RNA modifications can be categorized into three main groups: regulation of immunogenic RNAs, control of genes involved in innate immune response, and facilitation of adaptive immunity. Here, we provide an overview of recent research findings that elucidate the contributions of RNA modifications to each of these processes. We also discuss relevant methods for genome-wide identification of RNA modifications and their immunogenic substrates. Finally, we highlight recent advances in cancer immunotherapies that aim to reduce cancer cell viability by targeting the enzymes responsible for RNA modifications. Our presentation of these dynamic research avenues sets the stage for future investigations in this field.
Affiliation(s)
- Elaine Huang
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
| | - Clara Frydman
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
| | - Xinshu Xiao
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Molecular Biology Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, California 90095, USA
| |
|
12
|
Wu Y, Shao W, Yan M, Wang Y, Xu P, Huang G, Li X, Gregory BD, Yang J, Wang H, Yu X. Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing. Nat Commun 2024; 15:4049. [PMID: 38744925 PMCID: PMC11094168 DOI: 10.1038/s41467-024-48437-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Accepted: 04/26/2024] [Indexed: 05/16/2024] Open
Abstract
Nanopore direct RNA sequencing (DRS) has emerged as a powerful tool for RNA modification identification. However, concurrently detecting multiple types of modifications in a single DRS sample remains a challenge. Here, we develop TandemMod, a transferable deep learning framework capable of detecting multiple types of RNA modifications in single DRS data. To train high-performance TandemMod models, we generate in vitro epitranscriptome datasets from cDNA libraries, containing thousands of transcripts labeled with various types of RNA modifications. We validate the performance of TandemMod on both in vitro transcripts and in vivo human cell lines, confirming its high accuracy for profiling m6A and m5C modification sites. Furthermore, we perform transfer learning for identifying other modifications such as m7G, Ψ, and inosine, significantly reducing training data size and running time without compromising performance. Finally, we apply TandemMod to identify 3 types of RNA modifications in rice grown in different environments, demonstrating its applicability across species and conditions. In summary, we provide a resource with ground-truth labels that can serve as benchmark datasets for nanopore-based modification identification methods, and TandemMod for identifying diverse RNA modifications using a single DRS sample.
Affiliation(s)
- You Wu
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Wenna Shao
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Mengxiao Yan
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
| | - Yuqin Wang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
| | - Pengfei Xu
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Guoqiang Huang
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xiaofei Li
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Brian D Gregory
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jun Yang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China.
- Chenshan Scientific Research Center of CAS Center for Excellence in Molecular Plant Sciences, Shanghai, 201602, China.
| | - Hongxia Wang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China.
- Chenshan Scientific Research Center of CAS Center for Excellence in Molecular Plant Sciences, Shanghai, 201602, China.
| | - Xiang Yu
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
| |
|
13
|
Honeycutt E, Kizito F, Karn J, Sweet T. Direct Analysis of HIV mRNA m6A Methylation by Nanopore Sequencing. Methods Mol Biol 2024; 2807:209-227. [PMID: 38743231 PMCID: PMC12120845 DOI: 10.1007/978-1-0716-3862-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The post-transcriptional processing and chemical modification of HIV RNA are understudied aspects of HIV virology, primarily due to the limited ability to accurately map and quantify RNA modifications. Modification-specific antibodies or modification-sensitive endonucleases coupled with short-read RNA sequencing technologies have allowed for low-resolution or limited mapping of important regulatory modifications of HIV RNA such as N6-methyladenosine (m6A). However, a high-resolution map of where these sites occur on HIV transcripts is needed for detailed mechanistic understanding. This has recently become possible with new sequencing technologies. Here, we describe the direct RNA sequencing of HIV transcripts using an Oxford Nanopore Technologies sequencer and the use of this technique to map m6A at near single nucleotide resolution. This technology also provides the ability to identify splice variants with long RNA reads and thus, can provide high-resolution RNA modification maps that distinguish between overlapping splice variants. The protocols outlined here for m6A also provide a powerful paradigm for studying any other RNA modifications that can be detected on the nanopore platform.
Affiliation(s)
- Ethan Honeycutt
- Department of Molecular Biology and Microbiology, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
| | - Fredrick Kizito
- Department of Molecular Biology and Microbiology, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
| | - Jonathan Karn
- Department of Molecular Biology and Microbiology, School of Medicine, Case Western Reserve University, Cleveland, OH, USA.
| | - Thomas Sweet
- Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
| |
|
14
|
Rodell R, Robalin N, Martinez NM. Why U matters: detection and functions of pseudouridine modifications in mRNAs. Trends Biochem Sci 2024; 49:12-27. [PMID: 38097411 PMCID: PMC10976346 DOI: 10.1016/j.tibs.2023.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 10/24/2023] [Accepted: 10/25/2023] [Indexed: 01/07/2024]
Abstract
The uridine modifications pseudouridine (Ψ), dihydrouridine, and 5-methyluridine are present in eukaryotic mRNAs. Many uridine-modifying enzymes are associated with human disease, underscoring the importance of uncovering the functions of uridine modifications in mRNAs. These modified uridines have chemical properties distinct from those of canonical uridines, which impact RNA structure and RNA-protein interactions. Ψ, the most abundant of these uridine modifications, is present across (pre-)mRNAs. Recent work has shown that many Ψs are present at intermediate to high stoichiometries that are likely conducive to function and at locations that are poised to influence pre-/mRNA processing. Technological innovations and mechanistic investigations are unveiling the functions of uridine modifications in pre-mRNA splicing, translation, and mRNA stability, which are discussed in this review.
Affiliation(s)
- Rebecca Rodell
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
| | - Nicolas Robalin
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA
| | - Nicole M Martinez
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Sarafan ChEM-H Institute, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA.
| |
|
15
|
Hassan D, Ariyur A, Daulatabad SV, Mir Q, Janga SC. Nm-Nano: a machine learning framework for transcriptome-wide single-molecule mapping of 2´-O-methylation (Nm) sites in nanopore direct RNA sequencing datasets. RNA Biol 2024; 21:1-15. [PMID: 38758523 PMCID: PMC11110688 DOI: 10.1080/15476286.2024.2352192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 03/25/2024] [Accepted: 05/01/2024] [Indexed: 05/18/2024] Open
Abstract
2´-O-methylation (Nm) is one of the most abundant modifications found in both mRNAs and noncoding RNAs. It contributes to many biological processes, such as the normal functioning of tRNA, the protection of mRNA against degradation by the decapping and exoribonuclease (DXO) protein, and the biogenesis and specificity of rRNA. Recent advancements in single-molecule sequencing techniques for long read RNA sequencing data offered by Oxford Nanopore Technologies have enabled the direct detection of RNA modifications from sequencing data. In this study, we propose a bio-computational framework, Nm-Nano, for predicting the presence of Nm sites in direct RNA sequencing data generated from two human cell lines. The Nm-Nano framework integrates two supervised machine learning (ML) models for predicting Nm sites: Extreme Gradient Boosting (XGBoost) and Random Forest (RF) with K-mer embedding. Evaluation on benchmark datasets from direct RNA sequencing of HeLa and HEK293 cell lines demonstrates high accuracy (99% with XGBoost and 92% with RF) in identifying Nm sites. Deploying Nm-Nano on HeLa and HEK293 cell lines reveals genes that are frequently modified with Nm. In HeLa cell lines, 125 genes are identified as frequently Nm-modified, showing enrichment in 30 ontologies related to immune response and cellular processes. In HEK293 cell lines, 61 genes are identified as frequently Nm-modified, with enrichment in processes like glycolysis and protein localization. These findings underscore the diverse regulatory roles of Nm modifications in metabolic pathways, protein degradation, and cellular processes. The source code of Nm-Nano can be freely accessed at https://github.com/Janga-Lab/Nm-Nano.
Affiliation(s)
- Doaa Hassan
- Department of Biohealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis (IUI), Indianapolis, Indiana, USA
- Computers and Systems Department, National Telecommunication Institute, Cairo, Egypt
| | - Aditya Ariyur
- Department of Biohealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis (IUI), Indianapolis, Indiana, USA
| | - Swapna Vidhur Daulatabad
- Department of Biohealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis (IUI), Indianapolis, Indiana, USA
| | - Quoseena Mir
- Department of Biohealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis (IUI), Indianapolis, Indiana, USA
| | - Sarath Chandra Janga
- Department of Biohealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Indianapolis (IUI), Indianapolis, Indiana, USA
- Centre for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana
| |
|
16
|
Wang Z, Fang Y, Liu Z, Hao N, Zhang HH, Sun X, Que J, Ding H. Adapting Nanopore Sequencing Basecalling Models for Modification Detection via Incremental Learning and Anomaly Detection. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.19.572449. [PMID: 38187611 PMCID: PMC10769248 DOI: 10.1101/2023.12.19.572431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
We leverage machine learning approaches to adapt nanopore sequencing basecallers for nucleotide modification detection. We first apply the incremental learning technique to improve the basecalling of modification-rich sequences, which are usually of high biological interest. With sequence backbones resolved, we further run anomaly detection on individual nucleotides to determine their modification status. By this means, our pipeline promises the single-molecule, single-nucleotide and sequence context-free detection of modifications. We benchmark the pipeline using control oligos, and further apply it to the basecalling of densely-modified yeast tRNAs and E. coli genomic DNAs, the cross-species detection of N6-methyladenosine (m6A) in mammalian mRNAs, and the simultaneous detection of N1-methyladenosine (m1A) and m6A in human mRNAs. Our IL-AD workflow is available at: https://github.com/wangziyuan66/IL-AD.
Affiliation(s)
- Ziyuan Wang
- Department of Pharmacy Practice and Science, University of Arizona, Tucson, Arizona, USA
- These authors contributed equally to this work
| | - Yinshan Fang
- Columbia Center for Human Development, Department of Medicine, Columbia University Medical Center, New York, New York, USA
- These authors contributed equally to this work
| | - Ziyang Liu
- Department of Pharmacy Practice and Science, University of Arizona, Tucson, Arizona, USA
- Statistics and Data Science GIDP, University of Arizona, Tucson, Arizona, USA
| | - Ning Hao
- Statistics and Data Science GIDP, University of Arizona, Tucson, Arizona, USA
- Department of Mathematics, University of Arizona, Tucson, Arizona, USA
| | - Hao Helen Zhang
- Statistics and Data Science GIDP, University of Arizona, Tucson, Arizona, USA
- Department of Mathematics, University of Arizona, Tucson, Arizona, USA
| | - Xiaoxiao Sun
- Statistics and Data Science GIDP, University of Arizona, Tucson, Arizona, USA
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, Arizona, USA
| | - Jianwen Que
- Columbia Center for Human Development, Department of Medicine, Columbia University Medical Center, New York, New York, USA
| | - Hongxu Ding
- Department of Pharmacy Practice and Science, University of Arizona, Tucson, Arizona, USA
- Statistics and Data Science GIDP, University of Arizona, Tucson, Arizona, USA
| |
|
17
|
Peng L, Zhang X, Du Y, Li F, Han J, Liu O, Dai S, Zhang X, Liu GE, Yang L, Zhou Y. New insights into transcriptome variation during cattle adipocyte adipogenesis by direct RNA sequencing. iScience 2023; 26:107753. [PMID: 37692285 PMCID: PMC10492216 DOI: 10.1016/j.isci.2023.107753] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/19/2023] [Revised: 07/31/2023] [Accepted: 08/24/2023] [Indexed: 09/12/2023] Open
Abstract
We performed direct RNA sequencing (DRS) together with PCR-amplified cDNA long and short read sequencing for cattle adipocytes at different stages. We showed that DRS has the advantage of avoiding artificial transcripts and questionable exitrons. In total, we obtained 68,124 transcripts with information on alternative splicing, poly(A) length and mRNA modification. The number of transcripts for adipogenesis was expanded by alternative splicing, which makes the regulatory mechanisms far more complex than previously known. We detected 891 differentially expressed genes (DEGs). However, 62.78% of the transcripts of DEGs were not significantly differentially expressed, and 248 transcripts changed in the opposite direction to their genes. The poly(A) tail became globally shorter in differentiated adipocytes than in primary adipocytes, and had a weak negative correlation with gene/transcript expression. Moreover, the study of different mRNA modifications implied their potential roles in gene expression and alternative splicing. Overall, our study promotes a better understanding of adipogenesis mechanisms in cattle adipocytes.
Affiliation(s)
- Lingwei Peng
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Xiaolian Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Yuqin Du
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Fan Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Jiazheng Han
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Oujin Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Shoulu Dai
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Xiang Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - George E. Liu
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, MD 20705, USA
| | - Liguo Yang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| | - Yang Zhou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
| |
|
18
|
Lee M. Machine learning for small interfering RNAs: a concise review of recent developments. Front Genet 2023; 14:1226336. [PMID: 37519887 PMCID: PMC10372481 DOI: 10.3389/fgene.2023.1226336] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/24/2023] [Accepted: 07/04/2023] [Indexed: 08/01/2023] Open
Abstract
The advent of machine learning and its subsequent integration into small interfering RNA (siRNA) research heralds a new epoch in the field of RNA interference (RNAi). This review emphasizes the urgency and relevance of assimilating the plethora of contributions and advancements in this domain, particularly focusing on the period of 2019-2023. Given the rapid progression of deep learning technologies, our synthesis of recent research is paramount to staying apprised of the state-of-the-art methods being utilized. It not only offers a comprehensive insight into the confluence of machine learning and siRNA but also serves as a beacon, guiding future explorations in this intersectional research field. Our rigorous examination of studies promises a discerning perspective on the contemporary landscape of machine learning applications in siRNA design and function. This review is an effort to foster further discourse and propel academic inquiry in this multifaceted domain.
|
19
|
Acera Mateos P, Zhou Y, Zarnack K, Eyras E. Concepts and methods for transcriptome-wide prediction of chemical messenger RNA modifications with machine learning. Brief Bioinform 2023; 24:7150742. [PMID: 37139545 DOI: 10.1093/bib/bbad163] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/06/2023] [Revised: 03/03/2023] [Indexed: 05/05/2023] Open
Abstract
The expanding field of epitranscriptomics might rival the epigenome in the diversity of biological processes impacted. In recent years, the development of new high-throughput experimental and computational techniques has been a key driving force in discovering the properties of RNA modifications. Machine learning applications, such as for classification, clustering or de novo identification, have been critical in these advances. Nonetheless, various challenges remain before the full potential of machine learning for epitranscriptomics can be leveraged. In this review, we provide a comprehensive survey of machine learning methods to detect RNA modifications using diverse input data sources. We describe strategies to train and test machine learning methods and to encode and interpret features that are relevant for epitranscriptomics. Finally, we identify some of the current challenges and open questions about RNA modification analysis, including the ambiguity in predicting RNA modifications in transcript isoforms or in single nucleotides, or the lack of complete ground truth sets to test RNA modifications. We believe this review will inspire and benefit the rapidly developing field of epitranscriptomics in addressing the current limitations through the effective use of machine learning.
Affiliation(s)
- Pablo Acera Mateos
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, Australia
- The Shine-Dalgarno Centre for RNA Innovation, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- The Centre for Computational Biomedical Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
| | - You Zhou
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Institute of Molecular Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
| | - Kathi Zarnack
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Institute of Molecular Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
| | - Eduardo Eyras
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, Australia
- The Shine-Dalgarno Centre for RNA Innovation, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- The Centre for Computational Biomedical Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
| |
|
20
|
Ueda H, Dasgupta B, Yu BY. RNA Modification Detection Using Nanopore Direct RNA Sequencing and nanoDoc2. Methods Mol Biol 2023; 2632:299-319. [PMID: 36781737 DOI: 10.1007/978-1-0716-2996-3_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
RNA modifications regulate multiple aspects of cellular function including RNA splicing, translation, export, decay, stability, and phase separation. One comprehensive way to detect such modifications is the recently developed direct RNA sequencing from Oxford Nanopore Technologies (ONT). However, this method produces a large amount of highly complex data in the form of raw current signal, which poses a new informatics challenge for accurately detecting those modifications. Here, we provide nanoDoc2, software for detecting multiple types of RNA modification from nanopore direct RNA sequencing data. nanoDoc2 includes a novel signal segmentation algorithm based on the trace value, a base probability feature that is added by the Guppy basecalling program from ONT during processing of the raw signal. The core of nanoDoc2 is a machine learning algorithm in which a 6-mer segmented raw current signal is analyzed by deep one-class classification using a WaveNet-based neural network. As an output, an RNA modification is detected by a statistical score at each candidate position. Herein, we give detailed instructions on how to use nanoDoc2 for signal segmentation, train/test the neural network, and finally predict RNA modifications present in nanopore direct RNA sequencing data.
Affiliation(s)
- Hiroki Ueda
- Biological data Science Division, Research Center for Advanced Science and Technologies, The University of Tokyo, Tokyo, Japan.
| | - Bhaskar Dasgupta
- Biological data Science Division, Research Center for Advanced Science and Technologies, The University of Tokyo, Tokyo, Japan
| | - Bo-Yi Yu
- Biological data Science Division, Research Center for Advanced Science and Technologies, The University of Tokyo, Tokyo, Japan
| |
|
21
|
Catacalos C, Krohannon A, Somalraju S, Meyer KD, Janga SC, Chakrabarti K. Epitranscriptomics in parasitic protists: Role of RNA chemical modifications in posttranscriptional gene regulation. PLoS Pathog 2022; 18:e1010972. [PMID: 36548245 PMCID: PMC9778586 DOI: 10.1371/journal.ppat.1010972] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/24/2022] Open
Abstract
"Epitranscriptomics" is the new RNA code that represents an ensemble of posttranscriptional RNA chemical modifications, which can precisely coordinate gene expression and biological processes. Several RNA base modifications, such as N6-methyladenosine (m6A), 5-methylcytosine (m5C), and pseudouridine (Ψ), play pivotal roles in fine-tuning gene expression in almost all eukaryotes, and emerging evidence suggests that parasitic protists are no exception. In this review, we primarily focus on m6A, which is the most abundant epitranscriptomic mark and regulates numerous cellular processes, including nuclear export, mRNA splicing, polyadenylation, stability, and translation. We highlight the universal features of spatiotemporal m6A RNA modifications in eukaryotic phylogeny, their homologs, and unique processes in 3 unicellular parasites (Plasmodium sp., Toxoplasma sp., and Trypanosoma sp.), as well as some technological advances in this rapidly developing research area that can significantly improve our understanding of gene expression regulation in parasites.
Affiliation(s)
- Cassandra Catacalos
- Department of Biological Sciences, University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
| | - Alexander Krohannon
- Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, Indiana, United States of America
| | - Sahiti Somalraju
- Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, Indiana, United States of America
| | - Kate D. Meyer
- Department of Biochemistry, Duke University School of Medicine, Durham, North Carolina, United States of America
| | - Sarath Chandra Janga
- Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, Indiana, United States of America
| | - Kausik Chakrabarti
- Department of Biological Sciences, University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
| |
|
22
|
Begik O, Mattick JS, Novoa EM. Exploring the epitranscriptome by native RNA sequencing. RNA (NEW YORK, N.Y.) 2022; 28:1430-1439. [PMID: 36104106 PMCID: PMC9745831 DOI: 10.1261/rna.079404.122] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Indexed: 05/29/2023]
Abstract
Chemical RNA modifications, collectively referred to as the "epitranscriptome," are essential players in fine-tuning gene expression. Our ability to analyze RNA modifications has improved rapidly in recent years, largely due to the advent of high-throughput sequencing methodologies, which typically consist of coupling modification-specific reagents, such as antibodies or enzymes, to next-generation sequencing. Recently, it also became possible to map RNA modifications directly by sequencing native RNAs using nanopore technologies, which has been applied for the detection of a number of RNA modifications, such as N6-methyladenosine (m6A), pseudouridine (Ψ), and inosine (I). However, the signal modulations caused by most RNA modifications are yet to be determined. A global effort is needed to determine the signatures of the full range of RNA modifications to avoid the technical biases that have so far limited our understanding of the epitranscriptome.
Affiliation(s)
- Oguzhan Begik
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
| | - John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales 2052, Australia
| | - Eva Maria Novoa
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Universitat Pompeu Fabra, Barcelona 08002, Spain
| |
|
23
|
White LK, Hesselberth JR. Modification mapping by nanopore sequencing. Front Genet 2022; 13:1037134. [PMID: 36386798 PMCID: PMC9650216 DOI: 10.3389/fgene.2022.1037134] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 09/05/2022] [Accepted: 10/07/2022] [Indexed: 06/26/2024] Open
Abstract
Next generation sequencing (NGS) has provided biologists with an unprecedented view into biological processes and their regulation over the past 2 decades, fueling a wave of development of high throughput methods based on short read DNA and RNA sequencing. For nucleic acid modifications, NGS has been coupled with immunoprecipitation, chemical treatment, enzymatic treatment, and/or the use of reverse transcriptase enzymes with fortuitous activities to enrich for and to identify covalent modifications of RNA and DNA. However, the majority of nucleic acid modifications lack commercial monoclonal antibodies, and mapping techniques that rely on chemical or enzymatic treatments to manipulate modification signatures add additional technical complexities to library preparation. Moreover, such approaches tend to be specific to a single class of RNA or DNA modification, and generate only indirect readouts of modification status. Third generation sequencing technologies such as the commercially available "long read" platforms from Pacific Biosciences and Oxford Nanopore Technologies are an attractive alternative for high throughput detection of nucleic acid modifications. While the former can indirectly sense modified nucleotides through changes in the kinetics of reverse transcription reactions, nanopore sequencing can in principle directly detect any nucleic acid modification that produces a signal distortion as the nucleic acid passes through a nanopore sensor embedded within a charged membrane. To date, more than a dozen endogenous DNA and RNA modifications have been interrogated by nanopore sequencing, as well as a number of synthetic nucleic acid modifications used in metabolic labeling, structure probing, and other emerging applications. 
This review is intended to introduce the reader to nanopore sequencing and key principles underlying its use in direct detection of nucleic acid modifications in unamplified DNA or RNA samples, and outline current approaches for detecting and quantifying nucleic acid modifications by nanopore sequencing. As this technology matures, we anticipate advances in both sequencing chemistry and analysis methods will lead to rapid improvements in the identification and quantification of these epigenetic marks.
Affiliation(s)
- Jay R. Hesselberth
- Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO, United States
|
24
|
Barozzi C, Zacchini F, Asghar S, Montanaro L. Ribosomal RNA Pseudouridylation: Will Newly Available Methods Finally Define the Contribution of This Modification to Human Ribosome Plasticity? Front Genet 2022; 13:920987. [PMID: 35719370 PMCID: PMC9198423 DOI: 10.3389/fgene.2022.920987] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/15/2022] [Accepted: 05/09/2022] [Indexed: 12/05/2022] Open
Abstract
In human rRNA, at least 104 specific uridine residues are modified to pseudouridine. Many of these pseudouridylation sites are located within functionally important ribosomal domains and can influence ribosomal functional features. Until recently, available methods failed to reliably quantify the level of modification at each specific rRNA site. Therefore, information obtained so far only partially explained the degree of regulation of pseudouridylation in different physiological and pathological conditions. In this focused review, we provide a summary of the methods that are now available for the study of rRNA pseudouridylation, discussing the perspectives that newly developed approaches are offering.
Affiliation(s)
- Chiara Barozzi
- Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale (DIMES), Alma Mater Studiorum—Università di Bologna, Bologna, Italy
- Centro di Ricerca Biomedica Applicata, CRBA, Università di Bologna, Policlinico di Sant’Orsola, Bologna, Italy
- Federico Zacchini
- Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale (DIMES), Alma Mater Studiorum—Università di Bologna, Bologna, Italy
- Centro di Ricerca Biomedica Applicata, CRBA, Università di Bologna, Policlinico di Sant’Orsola, Bologna, Italy
- Sidra Asghar
- Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale (DIMES), Alma Mater Studiorum—Università di Bologna, Bologna, Italy
- Centro di Ricerca Biomedica Applicata, CRBA, Università di Bologna, Policlinico di Sant’Orsola, Bologna, Italy
- Lorenzo Montanaro
- Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale (DIMES), Alma Mater Studiorum—Università di Bologna, Bologna, Italy
- Centro di Ricerca Biomedica Applicata, CRBA, Università di Bologna, Policlinico di Sant’Orsola, Bologna, Italy
- Departmental Program in Laboratory Medicine, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
|