1
|
Yamamura K, Asai K, Iwakiri J. Consistent features observed in structural probing data of eukaryotic RNAs. NAR Genom Bioinform 2025; 7:lqaf001. [PMID: 39885881 PMCID: PMC11780854 DOI: 10.1093/nargab/lqaf001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2024] [Revised: 12/25/2024] [Accepted: 01/09/2025] [Indexed: 02/01/2025] Open
Abstract
Understanding RNA structure is crucial for elucidating its regulatory mechanisms. With the recent commercialization of messenger RNA vaccines, the profound impact of RNA structure on stability and translation efficiency has become increasingly evident, underscoring the importance of understanding RNA structure. Chemical probing of RNA has emerged as a powerful technique for investigating RNA structure in living cells. This approach utilizes chemical probes that selectively react with accessible regions of RNA, and by measuring reactivity, the openness and potential of RNA for protein binding or base pairing can be inferred. Extensive experimental data generated using RNA chemical probing have significantly contributed to our understanding of RNA structure in cells. However, it is crucial to acknowledge potential biases in chemical probing data to ensure an accurate interpretation. In this study, we comprehensively analyzed transcriptome-scale RNA chemical probing data in eukaryotes and report common features. Notably, in all experiments, the number of bases modified in probing was small, the bases showing the top 10% reactivity well reflected the known secondary structure, bases with high reactivity were more likely to be exposed to solvent and low reactivity did not reflect solvent exposure, which is important information for the analysis of RNA chemical probing data.
Collapse
Affiliation(s)
- Kazuteru Yamamura
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
| | - Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
| | - Junichi Iwakiri
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa, Chiba 277-8561, Japan
| |
Collapse
|
2
|
Yang TH. DEBFold: Computational Identification of RNA Secondary Structures for Sequences across Structural Families Using Deep Learning. J Chem Inf Model 2024; 64:3756-3766. [PMID: 38648189 PMCID: PMC11094721 DOI: 10.1021/acs.jcim.4c00458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2024] [Revised: 04/09/2024] [Accepted: 04/09/2024] [Indexed: 04/25/2024]
Abstract
It is now known that RNAs play more active roles in cellular pathways beyond simply serving as transcription templates. These biological mechanisms might be mediated by higher RNA stereo conformations, triggering the need to understand RNA secondary structures first. However, experimental protocols for solving RNA structures are unavailable for large-scale investigation due to their high costs and time-consuming nature. Various computational tools were thus developed to predict the RNA secondary structures from sequences. Recently, deep networks have been investigated to help predict RNA structures directly from their sequences. However, existing deep-learning-based tools are more or less suffering from model overfitting due to their complicated problem formulation and defective model training processes, limiting their applications across sequences from different structural families. In this research, we designed a two-stage RNA structure prediction strategy called DEBFold (deep ensemble boosting and folding) based on convolution encoding/decoding and self-attention mechanisms to enhance the existing thermodynamic structure models. Moreover, the model training process followed rigorous steps to achieve an acceptable prediction generalization. On the family-wise reserved test sets and the PDB-derived test set, DEBFold achieves better structure prediction performance over traditional tools and existing deep-learning methods. In summary, we obtained a cutting-edge deep-learning-based structure prediction tool with supreme across-family generalization performance. The DEBFold tool can be accessed at https://cobis.bme.ncku.edu.tw/DEBFold/.
Collapse
Affiliation(s)
- Tzu-Hsien Yang
- Department
of Biomedical Engineering, National Cheng
Kung University, No.1, University Road, Tainan 701, Taiwan
- Medical
Device Innovation Center, National Cheng
Kung University, No.1,
University Road, Tainan 701, Taiwan
| |
Collapse
|
3
|
Zhang J, Fei Y, Sun L, Zhang QC. Advances and opportunities in RNA structure experimental determination and computational modeling. Nat Methods 2022; 19:1193-1207. [PMID: 36203019 DOI: 10.1038/s41592-022-01623-y] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2022] [Accepted: 08/23/2022] [Indexed: 11/09/2022]
Abstract
Beyond transferring genetic information, RNAs are molecules with diverse functions that include catalyzing biochemical reactions and regulating gene expression. Most of these activities depend on RNAs' specific structures. Therefore, accurately determining RNA structure is integral to advancing our understanding of RNA functions. Here, we summarize the state-of-the-art experimental and computational technologies developed to evaluate RNA secondary and tertiary structures. We also highlight how the rapid increase of experimental data facilitates the integrative modeling approaches for better resolving RNA structures. Finally, we provide our thoughts on the latest advances and challenges in RNA structure determination methods, as well as on future directions for both experimental approaches and artificial intelligence-based computational tools to model RNA structure. Ultimately, we hope the technological advances will deepen our understanding of RNA biology and facilitate RNA structure-based biomedical research such as designing specific RNA structures for therapeutics and deploying RNA-targeting small-molecule drugs.
Collapse
Affiliation(s)
- Jinsong Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.,Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China.,Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Yuhan Fei
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.,Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China.,Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Lei Sun
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China. .,Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China. .,Tsinghua-Peking Center for Life Sciences, Beijing, China.
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China. .,Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China. .,Tsinghua-Peking Center for Life Sciences, Beijing, China.
| |
Collapse
|
4
|
RNA secondary structure packages evaluated and improved by high-throughput experiments. Nat Methods 2022; 19:1234-1242. [PMID: 36192461 PMCID: PMC9839360 DOI: 10.1038/s41592-022-01605-0] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2020] [Accepted: 08/10/2022] [Indexed: 01/17/2023]
Abstract
Despite the popularity of computer-aided study and design of RNA molecules, little is known about the accuracy of commonly used structure modeling packages in tasks sensitive to ensemble properties of RNA. Here, we demonstrate that the EternaBench dataset, a set of more than 20,000 synthetic RNA constructs designed on the RNA design platform Eterna, provides incisive discriminative power in evaluating current packages in ensemble-oriented structure prediction tasks. We find that CONTRAfold and RNAsoft, packages with parameters derived through statistical learning, achieve consistently higher accuracy than more widely used packages in their standard settings, which derive parameters primarily from thermodynamic experiments. We hypothesized that training a multitask model with the varied data types in EternaBench might improve inference on ensemble-based prediction tasks. Indeed, the resulting model, named EternaFold, demonstrated improved performance that generalizes to diverse external datasets including complete messenger RNAs, viral genomes probed in human cells and synthetic designs modeling mRNA vaccines.
Collapse
|
5
|
Rouse WB, O'Leary CA, Booher NJ, Moss WN. Expansion of the RNAStructuromeDB to include secondary structural data spanning the human protein-coding transcriptome. Sci Rep 2022; 12:14515. [PMID: 36008510 PMCID: PMC9403969 DOI: 10.1038/s41598-022-18699-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Accepted: 08/17/2022] [Indexed: 11/22/2022] Open
Abstract
RNA plays vital functional roles in almost every component of biology, and these functional roles are often influenced by its folding into secondary and tertiary structures. An important role of RNA secondary structure is in maintaining proper gene regulation; therefore, making accurate predictions of the structures involved in these processes is important. In this study, we have expanded on our previous work that led to the creation of the RNAStructuromeDB. Unlike this previous study that analyzed the human genome at low resolution, we have now scanned the protein-coding human transcriptome at high (single nt) resolution. This provides more robust structure predictions for over 100,000 isoforms of known protein-coding genes. Notably, we also utilize the motif identification tool, ScanFold, to model structures with high propensity for ordered/evolved stability. All data have been uploaded to the RNAStructuromeDB, allowing for easy searching of transcripts, visualization of data tracks (via the Integrative Genomics Viewer or IGV), and download of ScanFold data—including unique highly-ordered motifs. Herein, we provide an example analysis of MAT2A to demonstrate the utility of ScanFold at finding known and novel secondary structures, highlighting regions of potential functionality, and guiding generation of functional hypotheses through use of the data.
Collapse
Affiliation(s)
- Warren B Rouse
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
| | - Collin A O'Leary
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
| | - Nicholas J Booher
- Infrastructure and Research IT Services, Iowa State University, Ames, IA, 50011, USA
| | - Walter N Moss
- Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, 50011, USA.
| |
Collapse
|
6
|
Tsybulskyi V, Meyer IM. ShapeSorter: a fully probabilistic method for detecting conserved RNA structure features supported by SHAPE evidence. Nucleic Acids Res 2022; 50:e85. [PMID: 35641016 PMCID: PMC9410897 DOI: 10.1093/nar/gkac405] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2022] [Accepted: 05/09/2022] [Indexed: 12/20/2022] Open
Abstract
There is an increased interest in the determination of RNA structures in vivo as it is now possible to probe them in a high-throughput manner, e.g. using SHAPE protocols. By now, there exist a range of computational methods that integrate experimental SHAPE-probing evidence into computational RNA secondary structure prediction. The state-of-the-art in this field is currently provided by computational methods that employ the minimum-free energy strategy for prediction RNA secondary structures with SHAPE-probing evidence. These methods, however, rely on the assumption that transcripts in vivo fold into the thermodynamically most stable configuration and ignore evolutionary evidence for conserved RNA structure features. We here present a new computational method, ShapeSorter, that predicts RNA structure features without employing the thermodynamic strategy. Instead, ShapeSorter employs a fully probabilistic framework to identify RNA structure features that are supported by evolutionary and SHAPE-probing evidence. Our method can capture RNA structure heterogeneity, pseudo-knotted RNA structures as well as transient and mutually exclusive RNA structure features. Moreover, it estimates P-values for the predicted RNA structure features which allows for easy filtering and ranking. We investigate the merits of our method in a comprehensive performance benchmarking and conclude that ShapeSorter has a significantly superior performance for predicting base-pairs than the existing state-of-the-art methods.
Collapse
Affiliation(s)
- Volodymyr Tsybulskyi
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany.,Freie Universität Berlin, Department of Mathematics and Computer Science, Institute of Computer Science, Takustraße 9, 14195 Berlin, Germany
| | - Irmtraud M Meyer
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany.,Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Institute of Chemistry and Biochemistry, Thielallee 63, 14195 Berlin, Germany.,Freie Universität Berlin, Department of Mathematics and Computer Science, Institute of Computer Science, Takustraße 9, 14195 Berlin, Germany
| |
Collapse
|
7
|
Aviran S, Incarnato D. Computational approaches for RNA structure ensemble deconvolution from structure probing data. J Mol Biol 2022; 434:167635. [PMID: 35595163 DOI: 10.1016/j.jmb.2022.167635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 12/15/2022]
Abstract
RNA structure probing experiments have emerged over the last decade as a straightforward way to determine the structure of RNA molecules in a number of different contexts. Although powerful, the ability of RNA to dynamically interconvert between, and to simultaneously populate, alternative structural configurations, poses a nontrivial challenge to the interpretation of data derived from these experiments. Recent efforts aimed at developing computational methods for the reconstruction of coexisting alternative RNA conformations from structure probing data are paving the way to the study of RNA structure ensembles, even in the context of living cells. In this review, we critically discuss these methods, their limitations and possible future improvements.
Collapse
Affiliation(s)
- Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA.
| | - Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands.
| |
Collapse
|
8
|
Yang TH. An Aggregation Method to Identify the RNA Meta-Stable Secondary Structure and its Functionally Interpretable Structure Ensemble. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:75-86. [PMID: 34014829 DOI: 10.1109/tcbb.2021.3082396] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
RNA can provide vital cellular functions through its secondary or tertiary structure. Due to the low-throughput nature of experimental approaches, studies on RNA structures mainly resort to computational methods. However, current existing tools fail to consider RNA structure ensembles and do not provide ways to decipher functional hypotheses for the new predictions. In this research, a novel method was proposed to identify the functionally interpretable structure ensemble of a given RNA sequence and provide the meta-stable structure, or the most frequently observed functional RNA cellular conformation, based on the ensemble. In the prediction of meta-stable structures, the proposed method outperformed existing tools on a yeast test set. The inferred functional aspects were then manually checked and demonstrated a micro-averaging F1 value of 0.92. Further, a biological example of the yeast ASH1-E1 element was discussed to articulate that these functional aspects can also suggest testable hypotheses. Then the proposed method was verified to be well applicable to other species through a human test set. Finally, the proposed method was demonstrated to show resistance to sequence length-dependent performance deterioration.
Collapse
|
9
|
Radecki P, Uppuluri R, Deshpande K, Aviran S. Accurate detection of RNA stem-loops in structurome data reveals widespread association with protein binding sites. RNA Biol 2021; 18:521-536. [PMID: 34606413 PMCID: PMC8677038 DOI: 10.1080/15476286.2021.1971382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022] Open
Abstract
RNA molecules are known to fold into specific structures which often play a central role in their functions and regulation. In silico folding of RNA transcripts, especially when assisted with structure profiling (SP) data, is capable of accurately elucidating relevant structural conformations. However, such methods scale poorly to the swaths of SP data generated by transcriptome-wide experiments, which are becoming more commonplace and advancing our understanding of RNA structure and its regulation at global and local levels. This has created a need for tools capable of rapidly deriving structural assessments from SP data in a scalable manner. One such tool we previously introduced that aims to process such data is patteRNA, a statistical learning algorithm capable of rapidly mining big SP datasets for structural elements. Here, we present a reformulation of patteRNA's pattern recognition scheme that sees significantly improved precision without major compromises to computational overhead. Specifically, we developed a data-driven logistic classifier which interprets patteRNA's statistical characterizations of SP data in addition to local sequence properties as measured with a nearest neighbour thermodynamic model. Application of the classifier to human structurome data reveals a marked association between detected stem-loops and RNA binding protein (RBP) footprints. The results of our application demonstrate that upwards of 30% of RBP footprints occur within loops of stable stem-loop elements. Overall, our work arrives at a rapid and accurate method for automatically detecting families of RNA structure motifs and demonstrates the functional relevance of identifying them transcriptome-wide.
Collapse
Affiliation(s)
- Pierce Radecki
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
| | - Rahul Uppuluri
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
| | - Kaustubh Deshpande
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
| | - Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
| |
Collapse
|
10
|
Radecki P, Uppuluri R, Aviran S. Rapid structure-function insights via hairpin-centric analysis of big RNA structure probing datasets. NAR Genom Bioinform 2021; 3:lqab073. [PMID: 34447931 PMCID: PMC8384053 DOI: 10.1093/nargab/lqab073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2021] [Revised: 07/14/2021] [Accepted: 08/03/2021] [Indexed: 12/23/2022] Open
Abstract
The functions of RNA are often tied to its structure, hence analyzing structure is of significant interest when studying cellular processes. Recently, large-scale structure probing (SP) studies have enabled assessment of global structure-function relationships via standard data summarizations or local folding. Here, we approach structure quantification from a hairpin-centric perspective where putative hairpins are identified in SP datasets and used as a means to capture local structural effects. This has the advantage of rapid processing of big (e.g. transcriptome-wide) data as RNA folding is circumvented, yet it captures more information than simple data summarizations. We reformulate a statistical learning algorithm we previously developed to significantly improve precision of hairpin detection, then introduce a novel nucleotide-wise measure, termed the hairpin-derived structure level (HDSL), which captures local structuredness by accounting for the presence of likely hairpin elements. Applying HDSL to data from recent studies recapitulates, strengthens and expands on their findings which were obtained by more comprehensive folding algorithms, yet our analyses are orders of magnitude faster. These results demonstrate that hairpin detection is a promising avenue for global and rapid structure-function analysis, furthering our understanding of RNA biology and the principal features which drive biological insights from SP data.
Collapse
Affiliation(s)
- Pierce Radecki
- Biomedical Engineering Department and Genome Center, University of California at Davis, Davis, CA 95616, USA
| | - Rahul Uppuluri
- Biomedical Engineering Department and Genome Center, University of California at Davis, Davis, CA 95616, USA
| | - Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California at Davis, Davis, CA 95616, USA
| |
Collapse
|
11
|
Zhou Y, Li J, Hurst T, Chen SJ. SHAPER: A Web Server for Fast and Accurate SHAPE Reactivity Prediction. Front Mol Biosci 2021; 8:721955. [PMID: 34395533 PMCID: PMC8355595 DOI: 10.3389/fmolb.2021.721955] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Accepted: 07/13/2021] [Indexed: 11/13/2022] Open
Abstract
Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) chemical probing serves as a convenient and efficient experiment technique for providing information about RNA local flexibility. The local structural information contained in SHAPE reactivity data can be used as constraints in 2D/3D structure predictions. Here, we present SHAPE predictoR (SHAPER), a web server for fast and accurate SHAPE reactivity prediction. The main purpose of the SHAPER web server is to provide a portal that uses experimental SHAPE data to refine 2D/3D RNA structure selection. Input structures for the SHAPER server can be obtained through experimental or computational modeling. The SHAPER server can accept RNA structures with single or multiple conformations, and the predicted SHAPE profile and correlation with experimental SHAPE data (if provided) for each conformation can be freely downloaded through the web portal. The SHAPER web server is available at http://rna.physics.missouri.edu/shaper/.
Collapse
Affiliation(s)
- Yuanzhe Zhou
- Department of Physics and Astronomy, University of Missouri, Columbia, MO, United States
| | - Jun Li
- Department of Physics and Astronomy, University of Missouri, Columbia, MO, United States
| | - Travis Hurst
- Department of Physics and Astronomy, University of Missouri, Columbia, MO, United States
| | - Shi-Jie Chen
- Department of Physics and Astronomy, University of Missouri, Columbia, MO, United States.,Department of Biochemistry, University of Missouri, Columbia, MO, United States.,Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO, United States
| |
Collapse
|
12
|
Yang TH, Wang CY, Tsai HC, Liu CT. Human IRES Atlas: an integrative platform for studying IRES-driven translational regulation in humans. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2021:6263636. [PMID: 33942874 PMCID: PMC8094437 DOI: 10.1093/database/baab025] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/18/2020] [Revised: 04/16/2021] [Accepted: 04/23/2021] [Indexed: 11/13/2022]
Abstract
It is now known that cap-independent translation initiation facilitated by internal ribosome entry sites (IRESs) is vital in selective cellular protein synthesis under stress and different physiological conditions. However, three problems make it hard to understand transcriptome-wide cellular IRES-mediated translation initiation mechanisms: (i) complex interplay between IRESs and other translation initiation–related information, (ii) reliability issue of in silico cellular IRES investigation and (iii) labor-intensive in vivo IRES identification. In this research, we constructed the Human IRES Atlas database for a comprehensive understanding of cellular IRESs in humans. First, currently available and suitable IRES prediction tools (IRESfinder, PatSearch and IRESpy) were used to obtain transcriptome-wide human IRESs. Then, we collected eight genres of translation initiation–related features to help study the potential molecular mechanisms of each of the putative IRESs. Three functional tests (conservation, structural RNA–protein scores and conditional translation efficiency) were devised to evaluate the functionality of the identified putative IRESs. Moreover, an easy-to-use interface and an IRES–translation initiation interaction map for each gene transcript were implemented to help understand the interactions between IRESs and translation initiation–related features. Researchers can easily search/browse an IRES of interest using the web interface and deduce testable mechanism hypotheses of human IRES-driven translation initiation based on the integrated results. In summary, Human IRES Atlas integrates putative IRES elements and translation initiation–related experiments for better usage of these data and deduction of mechanism hypotheses. Database URL: http://cobishss0.im.nuk.edu.tw/Human_IRES_Atlas/
Collapse
Affiliation(s)
- Tzu-Hsien Yang
- Department of Information Management, National University of Kaohsiung, 700, Kaohsiung University Rd., Nanzih District, Kaohsiung, Taiwan 811, Republic of China
| | - Chung-Yu Wang
- Department of Information Management, National University of Kaohsiung, 700, Kaohsiung University Rd., Nanzih District, Kaohsiung, Taiwan 811, Republic of China
| | - Hsiu-Chun Tsai
- Department of Information Management, National University of Kaohsiung, 700, Kaohsiung University Rd., Nanzih District, Kaohsiung, Taiwan 811, Republic of China
| | - Cheng-Tse Liu
- Department of Information Management, National University of Kaohsiung, 700, Kaohsiung University Rd., Nanzih District, Kaohsiung, Taiwan 811, Republic of China
| |
Collapse
|
13
|
Hurst T, Chen SJ. Sieving RNA 3D Structures with SHAPE and Evaluating Mechanisms Driving Sequence-Dependent Reactivity Bias. J Phys Chem B 2021; 125:1156-1166. [PMID: 33497570 DOI: 10.1021/acs.jpcb.0c11365] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemical probing provides local RNA flexibility information at single-nucleotide resolution. In general, SHAPE is thought of as a secondary structure (2D) technology, but we find evidence that robust tertiary structure (3D) information is contained in SHAPE data. Here, we report a new model that achieves a higher correlation between SHAPE data and native RNA 3D structures than the previous 3D structure-SHAPE relationship model. Furthermore, we demonstrate that the new model improves our ability to discern between SHAPE-compatible and -incompatible structures on model decoys. After identifying sequence-dependent bias in SHAPE experiments, we propose a mechanism driving sequence-dependent bias in SHAPE experiments, using replica-exchange umbrella sampling simulations to confirm that the SHAPE sequence bias is largely explained by the stability of the unreacted SHAPE reagent in the binding pocket. Taken together, this work represents multiple practical advances in our mechanistic and predictive understanding of SHAPE technology.
Collapse
Affiliation(s)
- Travis Hurst
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| |
Collapse
|
14
|
|
15
|
Saaidi A, Allouche D, Regnier M, Sargueil B, Ponty Y. IPANEMAP: integrative probing analysis of nucleic acids empowered by multiple accessibility profiles. Nucleic Acids Res 2020; 48:8276-8289. [PMID: 32735675 PMCID: PMC7470984 DOI: 10.1093/nar/gkaa607] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Revised: 07/03/2020] [Accepted: 07/29/2020] [Indexed: 11/13/2022] Open
Abstract
The manual production of reliable RNA structure models from chemical probing experiments benefits from the integration of information derived from multiple protocols and reagents. However, the interpretation of multiple probing profiles remains a complex task, hindering the quality and reproducibility of modeling efforts. We introduce IPANEMAP, the first automated method for the modeling of RNA structure from multiple probing reactivity profiles. Input profiles can result from experiments based on diverse protocols, reagents, or collection of variants, and are jointly analyzed to predict the dominant conformations of an RNA. IPANEMAP combines sampling, clustering and multi-optimization, to produce secondary structure models that are both stable and well-supported by experimental evidences. The analysis of multiple reactivity profiles, both publicly available and produced in our study, demonstrates the good performances of IPANEMAP, even in a mono probing setting. It confirms the potential of integrating multiple sources of probing data, informing the design of informative probing assays.
Collapse
Affiliation(s)
- Afaf Saaidi
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| | - Delphine Allouche
- CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
| | - Mireille Regnier
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| | - Bruno Sargueil
- CNRS UMR 8038, CitCoM, Université de Paris, 4 avenue de l'observatoire, 75006 Paris, France
| | - Yann Ponty
- CNRS UMR 7161, LIX, Ecole Polytechnique, Institut Polytechnique de Paris, 1 rue Estienne d'Orves, 91120 Palaiseau, France
| |
Collapse
|
16
|
Yu B, Lu Y, Zhang QC, Hou L. Prediction and differential analysis of RNA secondary structure. QUANTITATIVE BIOLOGY 2020. [DOI: 10.1007/s40484-020-0205-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
17
|
Shi S, Zhang XL, Yang L, Du W, Zhao XL, Wang YJ. Prediction of RNA Secondary Structure Using Quantum-inspired Genetic Algorithms. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190916154103] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Background:
The prediction of RNA secondary structure using optimization algorithms
is key to understand the real structure of an RNA. Evolutionary algorithms (EAs) are popular
strategies for RNA secondary structure prediction. However, compared to most state-of-the-art
software based on DPAs, the performances of EAs are a bit far from satisfactory.
Objective:
Therefore, a more powerful strategy is required to improve the performances of EAs
when applied to the prediciton of RNA secondary structures.
Methods:
The idea of quantum computing is introduced here yielding a new strategy to find all
possible legal paired-bases with the constraint of minimum free energy. The sate of a stem pool
with size N is encoded as a population of QGA, which is represented by N quantum bits but not
classical bits. The updating of populations is accomplished by so-called quantum crossover
operations, quantum mutation operations and quantum rotation operations.
Results:
The numerical results show that the performances of traditional EAs are significantly
improved by using QGA with regard to not only prediction accuracy and sensitivity but also
complexity. Moreover, for RNA sequences with middle-short length, QGA even improves the
state-of-art software based on DPAs in terms of both prediction accuracy and sensitivity.
Conclusion:
This work sheds an interesting light on the applications of quantum computing on
RNA structure prediction.
Collapse
Affiliation(s)
- Sha Shi
- Engineering Research Centre of Molecular and Neuro Imaging Ministry of Education, School of life Science and Technology, Xidian University, Xi’an, China
| | - Xin-Li Zhang
- Xinxiang Medical University, Xinxiang, Henan, China
| | - Le Yang
- The First Affiliated Hospical of Xi’an Jiaotong University, Xi’an, China
| | - Wei Du
- The First Affiliated Hospical of Zhengzhou University, Zhengzhou, China
| | - Xian-Li Zhao
- Northwestern Women and Children’s Hospital, Xi'an, China
| | - Yun-Jiang Wang
- State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, China
| |
Collapse
|
18
|
Kai Z, Yuting W, Yulin L, Jun L, Juanjuan H. An efficient simulated annealing algorithm for the RNA secondary structure prediction with Pseudoknots. BMC Genomics 2019; 20:979. [PMID: 31881969 PMCID: PMC6933665 DOI: 10.1186/s12864-019-6300-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND RNA pseudoknot structures play an important role in biological processes. However, existing RNA secondary structure prediction algorithms cannot predict the pseudoknot structure efficiently. Although random matching can improve the number of base pairs, these non-consecutive base pairs cannot make contributions to reduce the free energy. RESULT In order to improve the efficiency of searching procedure, our algorithm take consecutive base pairs as the basic components. Firstly, our algorithm calculates and archive all the consecutive base pairs in triplet data structure, if the number of consecutive base pairs is greater than given minimum stem length. Secondly, the annealing schedule is adapted to select the optimal solution that has minimum free energy. Finally, the proposed algorithm is evaluated with the real instances in PseudoBase. CONCLUSION The experimental results have been demonstrated to provide a competitive and oftentimes better performance when compared against some chosen state-of-the-art RNA structure prediction algorithms.
Collapse
Affiliation(s)
- Zhang Kai
- School of Computer Science, Wuhan University of Science and Technology, Wuhan, 430081, China
- Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, 430081, China
| | - Wang Yuting
- School of Computer Science, Wuhan University of Science and Technology, Wuhan, 430081, China
| | - Lv Yulin
- School of Computer Science, Wuhan University of Science and Technology, Wuhan, 430081, China
| | - Liu Jun
- School of Computer Science, Wuhan University of Science and Technology, Wuhan, 430081, China
- Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, 430081, China
| | - He Juanjuan
- School of Computer Science, Wuhan University of Science and Technology, Wuhan, 430081, China.
| |
Collapse
|
19
|
Shi S, Zhang XL, Zhao XL, Yang L, Du W, Wang YJ. Prediction of the RNA Secondary Structure Using a Multi-Population Assisted Quantum Genetic Algorithm. Hum Hered 2019; 84:1-8. [PMID: 31461710 DOI: 10.1159/000501480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Accepted: 06/13/2019] [Indexed: 12/15/2022] Open
Abstract
Quantum-inspired genetic algorithms (QGAs) were recently introduced for the prediction of RNA secondary structures, and they showed some superiority over the existing popular strategies. In this paper, for RNA secondary structure prediction, we introduce a new QGA named multi-population assisted quantum genetic algorithm (MAQGA). In contrast to the existing QGAs, our strategy involves multi-populations which evolve together in a cooperative way in each iteration, and the genetic exchange between various populations is performed by an operator transfer operation. The numerical results show that the performances of existing genetic algorithms (evolutionary algorithms [EAs]), including traditional EAs and QGAs, can be significantly improved by using our approach. Moreover, for RNA sequences with middle-short length, the MAQGA improves even this state-of-the-art software in terms of both prediction accuracy and sensitivity.
Collapse
Affiliation(s)
- Sha Shi
- Engineering Research Center of Molecular and Neuroimaging, Ministry of Education of China, and School of Life Science and Technology, Xidian University, Xi'an, China
| | | | - Xian-Li Zhao
- Northwestern Women and Children's Hospital, Xi'an, China
| | - Le Yang
- The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an Jiaotong University, Xi'an, China
| | - Wei Du
- The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Yun-Jiang Wang
- The State Key Laboratory of Integrated Services Network (ISN), Xidian University, Xi'an, China,
| |
Collapse
|
20
|
Spasic A, Assmann SM, Bevilacqua PC, Mathews DH. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Res 2019; 46:314-323. [PMID: 29177466 PMCID: PMC5758915 DOI: 10.1093/nar/gkx1057] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2017] [Accepted: 10/30/2017] [Indexed: 12/22/2022] Open
Abstract
RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight about RNA function. The accuracy of structure prediction is known to be improved using experimental mapping data that provide information about the pairing status of single nucleotides, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data focused on predicting structures for sequences assuming that they populate a single structure. Most RNAs populate multiple structures, however, where the ensemble of strands populates structures with different sets of canonical base pairs. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
Collapse
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA.,Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Philip C Bevilacqua
- Department of Chemistry, Department of Biochemistry & Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA.,Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA.,Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
Collapse
|
21
|
Choudhary K, Lai YH, Tran EJ, Aviran S. dStruct: identifying differentially reactive regions from RNA structurome profiling data. Genome Biol 2019; 20:40. [PMID: 30791935 PMCID: PMC6385470 DOI: 10.1186/s13059-019-1641-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2018] [Accepted: 01/24/2019] [Indexed: 12/16/2022] Open
Abstract
RNA biology is revolutionized by recent developments of diverse high-throughput technologies for transcriptome-wide profiling of molecular RNA structures. RNA structurome profiling data can be used to identify differentially structured regions between groups of samples. Existing methods are limited in scope to specific technologies and/or do not account for biological variation. Here, we present dStruct which is the first broadly applicable method for differential analysis accounting for biological variation in structurome profiling data. dStruct is compatible with diverse profiling technologies, is validated with experimental data and simulations, and outperforms existing methods.
Collapse
Affiliation(s)
- Krishna Choudhary
- Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
| | - Yu-Hsuan Lai
- Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
| | - Elizabeth J. Tran
- Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
- Purdue University Center for Cancer Research, Purdue University, Hansen Life Sciences Research Building, Room 141, 201 S. University Street, West Lafayette, 47907-2064 IN USA
| | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
| |
Collapse
|
22
|
Extracting information from RNA SHAPE data: Kalman filtering approach. PLoS One 2018; 13:e0207029. [PMID: 30462682 PMCID: PMC6248965 DOI: 10.1371/journal.pone.0207029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2018] [Accepted: 10/23/2018] [Indexed: 01/26/2023] Open
Abstract
RNA SHAPE experiments have become important and successful sources of information for RNA structure prediction. In such experiments, chemical reagents are used to probe RNA backbone flexibility at the nucleotide level, which in turn provides information on base pairing and therefore secondary structure. Little is known, however, about the statistics of such SHAPE data. In this work, we explore different representations of noise in SHAPE data and propose a statistically sound framework for extracting reliable reactivity information from multiple SHAPE replicates. Our analyses of RNA SHAPE experiments underscore that a normal noise model is not adequate to represent their data. We propose instead a log-normal representation of noise and discuss its relevance. Under this assumption, we observe that processing simulated SHAPE data by directly averaging different replicates leads to bias. Such bias can be reduced by analyzing the data following a log transformation, either by log-averaging or Kalman filtering. Application of Kalman filtering has the additional advantage that a prior on the nucleotide reactivities can be introduced. We show that the performance of Kalman filtering is then directly dependent on the quality of that prior. We conclude the paper with guidelines on signal processing of RNA SHAPE data.
Collapse
|
23
|
Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures. Genes (Basel) 2018; 9:genes9060300. [PMID: 29904019 PMCID: PMC6027059 DOI: 10.3390/genes9060300] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2018] [Revised: 06/04/2018] [Accepted: 06/13/2018] [Indexed: 02/03/2023] Open
Abstract
High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA, an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA. We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.
Collapse
|
24
|
Ledda M, Aviran S. PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures. Genome Biol 2018; 19:28. [PMID: 29495968 PMCID: PMC5833111 DOI: 10.1186/s13059-018-1399-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2017] [Accepted: 01/30/2018] [Indexed: 02/08/2023] Open
Abstract
Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.
Collapse
Affiliation(s)
- Mirko Ledda
- Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
- Integrative Genetics and Genomics Graduate Group, UC Davis, 1 Shields Ave, Davis, 95616 USA
| | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
| |
Collapse
|
25
|
Tian S, Kladwang W, Das R. Allosteric mechanism of the V. vulnificus adenine riboswitch resolved by four-dimensional chemical mapping. eLife 2018; 7:29602. [PMID: 29446752 PMCID: PMC5847336 DOI: 10.7554/elife.29602] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2017] [Accepted: 02/13/2018] [Indexed: 12/23/2022] Open
Abstract
The structural interconversions that mediate the gene regulatory functions of RNA molecules may be different from classic models of allostery, but the relevant structural correlations have remained elusive in even intensively studied systems. Here, we present a four-dimensional expansion of chemical mapping called lock-mutate-map-rescue (LM2R), which integrates multiple layers of mutation with nucleotide-resolution chemical mapping. This technique resolves the core mechanism of the adenine-responsive V. vulnificus add riboswitch, a paradigmatic system for which both Monod-Wyman-Changeux (MWC) conformational selection models and non-MWC alternatives have been proposed. To discriminate amongst these models, we locked each functionally important helix through designed mutations and assessed formation or depletion of other helices via compensatory rescue evaluated by chemical mapping. These LM2R measurements give strong support to the pre-existing correlations predicted by MWC models, disfavor alternative models, and suggest additional structural heterogeneities that may be general across ligand-free riboswitches.
Collapse
Affiliation(s)
- Siqi Tian
- Department of Biochemistry, Stanford University, Stanford, United States
| | - Wipapat Kladwang
- Department of Biochemistry, Stanford University, Stanford, United States
| | - Rhiju Das
- Department of Physics, Stanford University, Stanford, United States
| |
Collapse
|
26
|
Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes. Nat Commun 2018; 9:606. [PMID: 29426922 PMCID: PMC5807309 DOI: 10.1038/s41467-018-02923-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2017] [Accepted: 01/09/2018] [Indexed: 11/23/2022] Open
Abstract
RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies. Different experimental and computational approaches can be used to study RNA structures. Here, the authors present a computational method for data-directed reconstruction of complex RNA structure landscapes, which predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data.
Collapse
|
27
|
Tan Z, Sharma G, Mathews DH. Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data. Biophys J 2017; 113:330-338. [PMID: 28735622 DOI: 10.1016/j.bpj.2017.06.039] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2017] [Revised: 06/07/2017] [Accepted: 06/19/2017] [Indexed: 10/19/2022] Open
Abstract
Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated though the incorporation of experimentally obtained structural information, for instance using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data is currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefits not only the prediction for the specific sequence that was mapped but also other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize basepairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided basepairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improves structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package.
Collapse
Affiliation(s)
- Zhen Tan
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
| | - Gaurav Sharma
- Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Electrical and Computer Engineering, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
| | - David H Mathews
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York; Center for RNA Biology, University of Rochester Medical Center, Rochester, New York; Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York.
| |
Collapse
|
28
|
Ignatova Z, Narberhaus F. Systematic probing of the bacterial RNA structurome to reveal new functions. Curr Opin Microbiol 2017; 36:14-19. [DOI: 10.1016/j.mib.2017.01.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2016] [Revised: 01/06/2017] [Accepted: 01/11/2017] [Indexed: 12/15/2022]
|
29
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. QUANTITATIVE BIOLOGY 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Collapse
Affiliation(s)
| | | | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
| |
Collapse
|
30
|
Robust statistical modeling improves sensitivity of high-throughput RNA structure probing experiments. Nat Methods 2016; 14:83-89. [DOI: 10.1038/nmeth.4068] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2016] [Accepted: 10/03/2016] [Indexed: 12/20/2022]
|
31
|
Choudhary K, Shih NP, Deng F, Ledda M, Li B, Aviran S. Metrics for rapid quality control in RNA structure probing experiments. Bioinformatics 2016; 32:3575-3583. [PMID: 27497441 DOI: 10.1093/bioinformatics/btw501] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2015] [Revised: 07/02/2016] [Accepted: 07/26/2016] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION The diverse functionalities of RNA can be attributed to its capacity to form complex and varied structures. The recent proliferation of new structure probing techniques coupled with high-throughput sequencing has helped RNA studies expand in both scope and depth. Despite differences in techniques, most experiments face similar challenges in reproducibility due to the stochastic nature of chemical probing and sequencing. As these protocols expand to transcriptome-wide studies, quality control becomes a more daunting task. General and efficient methodologies are needed to quantify variability and quality in the wide range of current and emerging structure probing experiments. RESULTS We develop metrics to rapidly and quantitatively evaluate data quality from structure probing experiments, demonstrating their efficacy on both small synthetic libraries and transcriptome-wide datasets. We use a signal-to-noise ratio concept to evaluate replicate agreement, which has the capacity to identify high-quality data. We also consider and compare two methods to assess variability inherent in probing experiments, which we then utilize to evaluate the coverage adjustments needed to meet desired quality. The developed metrics and tools will be useful in summarizing large-scale datasets and will help standardize quality control in the field. AVAILABILITY AND IMPLEMENTATION The data and methods used in this article are freely available at: http://bme.ucdavis.edu/aviranlab/SPEQC_software CONTACT: saviran@ucdavis.eduSupplementary information: Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Krishna Choudhary
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
| | - Nathan P Shih
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
| | - Fei Deng
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
| | - Mirko Ledda
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
| | - Bo Li
- Center for RNA Systems Biology, University of California at Berkeley, Berkeley, CA, USA
| | - Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
| |
Collapse
|