1
Raden M, Miladi M. How to do RNA-RNA Interaction Prediction? A Use-Case Driven Handbook Using IntaRNA. Methods Mol Biol 2024; 2726:209-234. [PMID: 38780733 DOI: 10.1007/978-1-0716-3519-3_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Computational prediction of RNA-RNA interactions (RRI) is a central methodology for the specific investigation of inter-molecular RNA interactions and regulatory effects of non-coding RNAs like eukaryotic microRNAs or prokaryotic small RNAs. Available methods can be classified according to their underlying prediction strategies, each implicating specific capabilities and restrictions often not transparent to the non-expert user. Within this work, we review seven classes of RRI prediction strategies and discuss the advantages and limitations of respective tools, since such knowledge is essential for selecting the right tool in the first place. Among the RRI prediction strategies, accessibility-based approaches have been shown to provide the most reliable predictions. Here, we describe how IntaRNA, as one of the state-of-the-art accessibility-based tools, can be applied in various use cases for the task of computational RRI prediction. Detailed hands-on examples for individual RRI predictions as well as large-scale target prediction scenarios are provided. We illustrate the flexibility and capabilities of IntaRNA through the examples. Each example is designed using real-life data from the literature and is accompanied by instructions on interpreting the respective results from IntaRNA output. Our use-case driven instructions enable non-expert users to comprehensively understand and utilize IntaRNA's features for effective RRI predictions.
Affiliation(s)
- Martin Raden: Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Milad Miladi: Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
2
Ibéné M, Legendre A, Postic G, Angel E, Tahi F. C-RCPred: a multi-objective algorithm for interactive secondary structure prediction of RNA complexes integrating user knowledge and SHAPE data. Brief Bioinform 2023:bbad225. [PMID: 37337745 DOI: 10.1093/bib/bbad225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 04/12/2023] [Accepted: 05/26/2023] [Indexed: 06/21/2023] Open
Abstract
RNAs can interact with other molecules in their environment, such as ions, proteins or other RNAs, to form complexes with important biological roles. The prediction of the structure of these complexes is therefore an important issue and a difficult task. We are interested in RNA complexes composed of several (more than two) interacting RNAs. We show how available knowledge on the considered RNAs can help predict their secondary structure. We propose an interactive tool for the prediction of RNA complexes, called C-RCPred, that considers user knowledge and probing data (which can be generated experimentally or artificially). C-RCPred is based on a multi-objective optimization algorithm. Through an extensive benchmarking procedure, which includes state-of-the-art methods, we show the efficiency of the multi-objective approach and the positive impact of considering user knowledge and probing data on the prediction results. C-RCPred is freely available as an open-source program and web server on the EvryRNA website (https://evryrna.ibisc.univ-evry.fr).
Affiliation(s)
- Mandy Ibéné: Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry-Courcouronnes, France
- Audrey Legendre: Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry-Courcouronnes, France
- Guillaume Postic: Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry-Courcouronnes, France
- Eric Angel: Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry-Courcouronnes, France
- Fariza Tahi: Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry-Courcouronnes, France
3
Yang SL, Ponti RD, Wan Y, Huber RG. Computational and Experimental Approaches to Study the RNA Secondary Structures of RNA Viruses. Viruses 2022; 14:1795. [PMID: 36016417 PMCID: PMC9415818 DOI: 10.3390/v14081795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/12/2022] [Accepted: 08/13/2022] [Indexed: 11/16/2022] Open
Abstract
Most pandemics of recent decades can be traced to RNA viruses, including HIV, SARS, influenza, dengue, Zika, and SARS-CoV-2. These RNA viruses impose considerable social and economic burdens on our society, resulting in a high number of deaths and high treatment costs. As these RNA viruses utilize an RNA genome, which is important for different stages of the viral life cycle, including replication, translation, and packaging, studying how the genome folds is important to understand virus function. In this review, we summarize recent advances in computational and high-throughput RNA structure-mapping approaches and their use in understanding structures within RNA virus genomes. In particular, we focus on the genome structures of the dengue, Zika, and SARS-CoV-2 viruses due to recent significant outbreaks of these viruses around the world.
Affiliation(s)
- Siwy Ling Yang: Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Riccardo Delli Ponti: Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
- Yue Wan: Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Roland G. Huber: Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore 138671, Singapore
4
Wongsurawat T, Jenjaroenpun P, Wanchai V, Nookaew I. Native RNA or cDNA Sequencing for Transcriptomic Analysis: A Case Study on Saccharomyces cerevisiae. Front Bioeng Biotechnol 2022; 10:842299. [PMID: 35497361 PMCID: PMC9039254 DOI: 10.3389/fbioe.2022.842299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Accepted: 03/01/2022] [Indexed: 11/13/2022] Open
Abstract
Direct sequencing of single molecules through nanopores allows for accurate quantification and full-length characterization of native RNA or complementary DNA (cDNA) without amplification. Both nanopore-based native RNA and cDNA approaches involve complex transcriptome procedures at a lower cost. However, there are several differences between the two approaches. In this study, we perform matched native RNA sequencing and cDNA sequencing to enable relevant comparisons and evaluation. Using Saccharomyces cerevisiae, a eukaryotic model organism widely used in industrial biotechnology, two different growing conditions are considered for comparison, including the poly-A messenger RNA isolated from yeast cells grown in minimum media under respirofermentative conditions supplemented with glucose (glucose growth conditions) and from cells that had shifted to ethanol as a carbon source (ethanol growth conditions). Library preparation for direct RNA sequencing is shorter than that for direct cDNA sequencing. The sequence characteristics of the two methods were different, such as sequence yields, quality score of reads, read length distribution, and the ability of reads to map to the reference. However, differential gene expression analyses derived from the two approaches are comparable. The unique feature of direct RNA sequencing is its ability to capture RNA modification; we found that the RNA modification at the 5' end of a transcript was underestimated due to the 3' bias of direct RNA sequencing. Our comprehensive evaluation from this work could help researchers make informed choices when selecting an appropriate long-read sequencing method for understanding gene functions, pathways, and detailed functional characterization.
Affiliation(s)
- Thidathip Wongsurawat: Division of Bioinformatics and Data Management for Research, Research Group and Research Network Division, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Piroon Jenjaroenpun: Division of Bioinformatics and Data Management for Research, Research Group and Research Network Division, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Visanu Wanchai: Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Intawat Nookaew: Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
5
Gong J, Xu K, Ma Z, Lu ZJ, Zhang QC. A deep learning method for recovering missing signals in transcriptome-wide RNA structure profiles from probing experiments. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00412-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
6
Zeng M, Wu Y, Lu C, Zhang F, Wu FX, Li M. DeepLncLoc: a deep learning framework for long non-coding RNA subcellular localization prediction based on subsequence embedding. Brief Bioinform 2021; 23:6366323. [PMID: 34498677 DOI: 10.1093/bib/bbab360] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 06/04/2021] [Revised: 08/04/2021] [Accepted: 08/16/2021] [Indexed: 11/14/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are a class of RNA molecules with more than 200 nucleotides. A growing amount of evidence reveals that subcellular localization of lncRNAs can provide valuable insights into their biological functions. Existing computational methods for predicting lncRNA subcellular localization use k-mer features to encode lncRNA sequences. However, the sequence order information is lost by using only k-mer features. We proposed a deep learning framework, DeepLncLoc, to predict lncRNA subcellular localization. In DeepLncLoc, we introduced a new subsequence embedding method that keeps the order information of lncRNA sequences. The subsequence embedding method first divides a sequence into several consecutive subsequences, then extracts the patterns of each subsequence, and finally combines these patterns to obtain a complete representation of the lncRNA sequence. After that, a text convolutional neural network is employed to learn high-level features and perform the prediction task. Compared with traditional machine learning models, popular representation methods and existing predictors, DeepLncLoc achieved better performance, which shows that DeepLncLoc could effectively predict lncRNA subcellular localization. Our study not only presented a novel computational model for predicting lncRNA subcellular localization but also introduced a new subsequence embedding method which is expected to be applied in other sequence-based prediction tasks. The DeepLncLoc web server is freely accessible at http://bioinformatics.csu.edu.cn/DeepLncLoc/, and source code and datasets can be downloaded from https://github.com/CSUBioGroup/DeepLncLoc.
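The abstract's point that plain k-mer features discard sequence order can be illustrated with a small sketch. This is not the DeepLncLoc implementation: it uses raw k-mer counts as a stand-in for the learned patterns, and the helper names (`kmer_counts`, `split_consecutive`, `subsequence_features`) and the two toy sequences are hypothetical, chosen only for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq (an order-free bag of k-mers)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def split_consecutive(seq, n_parts):
    """Split seq into n_parts consecutive, non-overlapping pieces."""
    bounds = [i * len(seq) // n_parts for i in range(n_parts + 1)]
    return [seq[bounds[i]:bounds[i + 1]] for i in range(n_parts)]


def subsequence_features(seq, k, n_parts):
    """One k-mer count table per consecutive piece, kept in sequence order."""
    return [kmer_counts(part, k) for part in split_consecutive(seq, n_parts)]


# Two toy sequences (hypothetical, not from the paper) that contain the
# same 2-mers arranged in a different order.
seq_a = "ACGAUGA"
seq_b = "AUGACGA"

# Whole-sequence 2-mer counts cannot tell the two sequences apart ...
same_global = kmer_counts(seq_a, 2) == kmer_counts(seq_b, 2)
# ... while per-subsequence counts, kept in order, can.
same_ordered = subsequence_features(seq_a, 2, 2) == subsequence_features(seq_b, 2, 2)
print(same_global, same_ordered)
```

The ordered list of per-piece features is what retains coarse positional information; how many pieces to use and how each piece is encoded are design choices this sketch does not take from the paper.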
Affiliation(s)
- Min Zeng: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, 410083, China
- Yifan Wu: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, 410083, China
- Chengqian Lu: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, 410083, China
- Fuhao Zhang: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, 410083, China
- Fang-Xiang Wu: Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, 410083, China
7
Xu B, Meng Y, Jin Y. RNA structures in alternative splicing and back-splicing. Wiley Interdiscip Rev RNA 2020; 12:e1626. [PMID: 32929887 DOI: 10.1002/wrna.1626] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/27/2020] [Revised: 08/14/2020] [Accepted: 08/22/2020] [Indexed: 12/12/2022]
Abstract
Alternative splicing greatly expands the transcriptomic and proteomic diversities related to physiological and developmental processes in higher eukaryotes. Splicing of long noncoding RNAs, and back- and trans-splicing further expanded the regulatory repertoire of alternative splicing. RNA structures were shown to play an important role in regulating alternative splicing and back-splicing. Application of novel sequencing technologies made it possible to identify genome-wide RNA structures and interaction networks, which might provide new insights into RNA splicing regulation from in vitro to in vivo. The emerging transcription-folding-splicing paradigm is changing our understanding of RNA alternative splicing regulation. Here, we review the insights into the roles and mechanisms of RNA structures in alternative splicing and back-splicing, as well as how disruption of these structures affects alternative splicing and then leads to human diseases. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems.
Affiliation(s)
- Bingbing Xu: MOE Laboratory of Biosystems Homeostasis & Protection and Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Zhejiang, Hangzhou, China
- Yijun Meng: College of Life and Environmental Sciences, Hangzhou Normal University, Zhejiang, Hangzhou, China
- Yongfeng Jin: MOE Laboratory of Biosystems Homeostasis & Protection and Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Zhejiang, Hangzhou, China
8
Bliss N, Bindewald E, Shapiro BA. Predicting RNA SHAPE scores with deep learning. RNA Biol 2020; 17:1324-1330. [PMID: 32476596 PMCID: PMC7549691 DOI: 10.1080/15476286.2020.1760534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2019] [Revised: 03/22/2020] [Accepted: 03/24/2020] [Indexed: 11/15/2022] Open
Abstract
Secondary structure prediction approaches typically rely on models of equilibrium free energies that are themselves based on in vitro physical chemistry. Recent transcriptome-wide experiments of in vivo RNA structure based on SHAPE-MaP experiments provide important information that may make it possible to extend current in vitro-based RNA folding models in order to improve the accuracy of computational RNA folding simulations with respect to the experimentally measured in vivo RNA secondary structure. Here we present a machine learning approach that utilizes RNA secondary structure prediction results and nucleotide sequence in order to predict in vivo SHAPE scores. We show that this approach has a higher Pearson correlation coefficient with experimental SHAPE scores than thermodynamic folding. This could be an important step towards augmenting experimental results with computational predictions and help with RNA secondary structure predictions that inherently take in vivo folding properties into account.
Affiliation(s)
- Noah Bliss: RNA Biology Laboratory, National Cancer Institute, Frederick, MD, USA
- Eckart Bindewald: Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Bruce A. Shapiro: RNA Biology Laboratory, National Cancer Institute, Frederick, MD, USA
9
Raden M, Müller T, Mautner S, Gelhausen R, Backofen R. The impact of various seed, accessibility and interaction constraints on sRNA target prediction - a systematic assessment. BMC Bioinformatics 2020; 21:15. [PMID: 31931703 PMCID: PMC6956497 DOI: 10.1186/s12859-019-3143-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/03/2019] [Accepted: 10/09/2019] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: Seed and accessibility constraints are core features to enable highly accurate sRNA target screens based on RNA-RNA interaction prediction. Currently, available tools provide different (sets of) constraints and default parameter sets. Thus, it is hard to impossible for users to estimate the influence of individual restrictions on the prediction results. RESULTS: Here, we present a systematic assessment of the impact of established and new constraints on sRNA target prediction both on a qualitative as well as computational level. This is done exemplarily based on the performance of IntaRNA, one of the most exact sRNA target prediction tools. IntaRNA provides various ways to constrain considered seed interactions, e.g. based on seed length, its accessibility, minimal unpaired probabilities, or energy thresholds, besides analogous constraints for the overall interaction. Thus, our results reveal the impact of individual constraints and their combinations. CONCLUSIONS: This provides both a guide for users what is important and recommendations for existing and upcoming sRNA target prediction approaches. We show on a large sRNA target screen benchmark data set that only by altering the parameter set, IntaRNA recovers 30% more verified interactions while becoming 5-times faster. This exemplifies the potential of seed, accessibility and interaction constraints for sRNA target prediction.
Affiliation(s)
- Martin Raden: Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, Freiburg, 79110, Germany
- Teresa Müller: Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, Freiburg, 79110, Germany
- Stefan Mautner: Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, Freiburg, 79110, Germany
- Rick Gelhausen: Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, Freiburg, 79110, Germany
- Rolf Backofen: Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, Freiburg, 79110, Germany; Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, Freiburg, 79104, Germany