1
Marangio P, Law KYT, Sanguinetti G, Granneman S. diffBUM-HMM: a robust statistical modeling approach for detecting RNA flexibility changes in high-throughput structure probing data. Genome Biol 2021; 22:165. [PMID: 34044851 PMCID: PMC8157727 DOI: 10.1186/s13059-021-02379-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/13/2020] [Accepted: 05/10/2021] [Indexed: 11/21/2022]
Abstract
Advancing RNA structural probing techniques with next-generation sequencing has generated demands for complementary computational tools to robustly extract RNA structural information amidst sampling noise and variability. We present diffBUM-HMM, a noise-aware model that enables accurate detection of RNA flexibility and conformational changes from high-throughput RNA structure-probing data. diffBUM-HMM is widely compatible, accounting for sampling variation and sequence coverage biases, and displays higher sensitivity than existing methods while remaining robust against false positives. Our analyses of datasets generated with a variety of RNA probing chemistries demonstrate the value of diffBUM-HMM for quantitatively detecting RNA structural changes and RNA-binding protein binding sites.
Affiliation(s)
- Paolo Marangio
  - School of Informatics, The University of Edinburgh, Edinburgh, UK
  - SISSA Data Science Excellence Department Initiative, Trieste, Italy
- Ka Ying Toby Law
  - Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK
- Guido Sanguinetti
  - Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK
  - School of Informatics, The University of Edinburgh, Edinburgh, UK
  - SISSA Data Science Excellence Department Initiative, Trieste, Italy
- Sander Granneman
  - Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK
2
Abstract
RNA performs and regulates a diverse range of cellular processes, with new functional roles being uncovered at a rapid pace. Interest is growing in how these functions are linked to RNA structures that form in the complex cellular environment. A growing suite of technologies that use advances in RNA structural probes, high-throughput sequencing and new computational approaches to interrogate RNA structure at unprecedented throughput is beginning to provide insights into RNA structures at new spatial, temporal and cellular scales.
Affiliation(s)
- Eric J Strobel
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Angela M Yu
  - Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Julius B Lucks
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
3
Genome-Wide Discovery of DEAD-Box RNA Helicase Targets Reveals RNA Structural Remodeling in Transcription Termination. Genetics 2019; 212:153-174. [PMID: 30902808 DOI: 10.1534/genetics.119.302058] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/26/2019] [Accepted: 03/19/2019] [Indexed: 11/18/2022]
Abstract
RNA helicases are a class of enzymes that unwind RNA duplexes in vitro but whose cellular functions are largely enigmatic. Here, we provide evidence that the DEAD-box protein Dbp2 remodels RNA-protein complex (RNP) structure to facilitate efficient termination of transcription in Saccharomyces cerevisiae via the Nrd1-Nab3-Sen1 (NNS) complex. First, we find that loss of DBP2 results in RNA polymerase II accumulation at the 3' ends of small nucleolar RNAs and a subset of mRNAs. In addition, Dbp2 associates with RNA sequence motifs and regions bound by Nrd1 and can promote its recruitment to NNS-targeted regions. Using Structure-seq, we find altered RNA/RNP structures in dbp2∆ cells that correlate with inefficient termination. We also show a positive correlation between the stability of structures in the 3' ends and a requirement for Dbp2 in termination. Taken together, these studies provide a role for RNA remodeling by Dbp2 and further suggest a mechanism whereby RNA structure is exploited for gene regulation.
4
Choudhary K, Lai YH, Tran EJ, Aviran S. dStruct: identifying differentially reactive regions from RNA structurome profiling data. Genome Biol 2019; 20:40. [PMID: 30791935 PMCID: PMC6385470 DOI: 10.1186/s13059-019-1641-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/22/2018] [Accepted: 01/24/2019] [Indexed: 12/16/2022]
Abstract
RNA biology is being revolutionized by recent developments of diverse high-throughput technologies for transcriptome-wide profiling of molecular RNA structures. RNA structurome profiling data can be used to identify differentially structured regions between groups of samples. Existing methods are limited in scope to specific technologies and/or do not account for biological variation. Here, we present dStruct, which is the first broadly applicable method for differential analysis accounting for biological variation in structurome profiling data. dStruct is compatible with diverse profiling technologies, is validated with experimental data and simulations, and outperforms existing methods.
Affiliation(s)
- Krishna Choudhary
  - Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
- Yu-Hsuan Lai
  - Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
- Elizabeth J. Tran
  - Department of Biochemistry, Purdue University, BCHM 305, 175 S. University Street, West Lafayette, 47907-2063 IN USA
  - Purdue University Center for Cancer Research, Purdue University, Hansen Life Sciences Research Building, Room 141, 201 S. University Street, West Lafayette, 47907-2064 IN USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616 CA USA
5
Extracting information from RNA SHAPE data: Kalman filtering approach. PLoS One 2018; 13:e0207029. [PMID: 30462682 PMCID: PMC6248965 DOI: 10.1371/journal.pone.0207029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/07/2018] [Accepted: 10/23/2018] [Indexed: 01/26/2023]
Abstract
RNA SHAPE experiments have become important and successful sources of information for RNA structure prediction. In such experiments, chemical reagents are used to probe RNA backbone flexibility at the nucleotide level, which in turn provides information on base pairing and therefore secondary structure. Little is known, however, about the statistics of such SHAPE data. In this work, we explore different representations of noise in SHAPE data and propose a statistically sound framework for extracting reliable reactivity information from multiple SHAPE replicates. Our analyses of RNA SHAPE experiments underscore that a normal noise model is not adequate to represent their data. We propose instead a log-normal representation of noise and discuss its relevance. Under this assumption, we observe that processing simulated SHAPE data by directly averaging different replicates leads to bias. Such bias can be reduced by analyzing the data following a log transformation, either by log-averaging or Kalman filtering. Application of Kalman filtering has the additional advantage that a prior on the nucleotide reactivities can be introduced. We show that the performance of Kalman filtering is then directly dependent on the quality of that prior. We conclude the paper with guidelines on signal processing of RNA SHAPE data.
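The averaging bias described in this abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: it assumes multiplicative log-normal noise with an arbitrary sigma of 0.5 and three replicates, and all function names and parameter values are hypothetical. It compares direct averaging with log-averaging only; Kalman filtering is not shown.

```python
import math
import random
import statistics


def arithmetic_average(replicates):
    """Direct average of replicate reactivities."""
    return statistics.fmean(replicates)


def log_average(replicates):
    """Average in log space, then transform back (geometric mean)."""
    return math.exp(statistics.fmean([math.log(r) for r in replicates]))


def simulate_log_bias(true_reactivity=1.0, sigma=0.5, n_replicates=3,
                      n_trials=20000, seed=0):
    """Return the mean log-scale error of each estimator over many trials.

    Assumed noise model (hypothetical parameters, chosen for illustration):
    observed = true * exp(eps), with eps ~ Normal(0, sigma^2).
    """
    rng = random.Random(seed)
    log_true = math.log(true_reactivity)
    direct_errors = []
    log_avg_errors = []
    for _ in range(n_trials):
        # Draw one set of noisy replicates for a single nucleotide.
        replicates = [true_reactivity * rng.lognormvariate(0.0, sigma)
                      for _ in range(n_replicates)]
        direct_errors.append(math.log(arithmetic_average(replicates)) - log_true)
        log_avg_errors.append(math.log(log_average(replicates)) - log_true)
    return statistics.fmean(direct_errors), statistics.fmean(log_avg_errors)


if __name__ == "__main__":
    direct_bias, log_avg_bias = simulate_log_bias()
    print(f"direct averaging, mean log error: {direct_bias:.3f}")
    print(f"log-averaging, mean log error:    {log_avg_bias:.3f}")
```

Under these assumptions, direct averaging gives a positive mean log error, while the log-averaged estimate stays close to zero, which is consistent with the bias the abstract reports for simulated data.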
6
Choudhary K, Ruan L, Deng F, Shih N, Aviran S. SEQualyzer: interactive tool for quality control and exploratory analysis of high-throughput RNA structural profiling data. Bioinformatics 2018; 33:441-443. [PMID: 28172632 DOI: 10.1093/bioinformatics/btw627] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/01/2016] [Revised: 09/25/2016] [Accepted: 09/26/2016] [Indexed: 11/14/2022]
Abstract
Summary: To serve numerous functional roles, RNA must fold into specific structures. Determining these structures is thus of paramount importance. The recent advent of high-throughput sequencing-based structure profiling experiments has provided important insights into RNA structure and widened the scope of RNA studies. However, as a broad range of approaches continues to emerge, a universal framework is needed to quantitatively ensure consistent and high-quality data. We present SEQualyzer, a visual and interactive application that makes it easy and efficient to gauge data quality, screen for transcripts with high-quality information and identify discordant replicates in structure profiling experiments. Our methods rely on features common to a wide range of protocols and can serve as standards for quality control and analyses. Availability and Implementation: SEQualyzer is written in R, is platform-independent, and is freely available at http://bme.ucdavis.edu/aviranlab/SEQualyzer. Contact: saviran@ucdavis.edu. Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Krishna Choudhary
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Luyao Ruan
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Fei Deng
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Nathan Shih
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA, USA
7
Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures. Genes (Basel) 2018; 9:genes9060300. [PMID: 29904019 PMCID: PMC6027059 DOI: 10.3390/genes9060300] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/27/2018] [Revised: 06/04/2018] [Accepted: 06/13/2018] [Indexed: 02/03/2023]
Abstract
High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA, an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA. We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.
8
Watters KE, Choudhary K, Aviran S, Lucks JB, Perry KL, Thompson JR. Probing of RNA structures in a positive sense RNA virus reveals selection pressures for structural elements. Nucleic Acids Res 2018; 46:2573-2584. [PMID: 29294088 PMCID: PMC5861449 DOI: 10.1093/nar/gkx1273] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/11/2017] [Revised: 12/07/2017] [Accepted: 12/18/2017] [Indexed: 12/20/2022]
Abstract
In single stranded (+)-sense RNA viruses, RNA structural elements (SEs) play essential roles in the infection process from replication to encapsidation. Using selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) and covariation analysis, we explore the structural features of the third genome segment of cucumber mosaic virus (CMV), RNA3 (2216 nt), both in vitro and in plant cell lysates. Comparing SHAPE-Seq and covariation analysis results revealed multiple SEs in the coat protein open reading frame and 3' untranslated region. Four of these SEs were mutated and serially passaged in Nicotiana tabacum plants to identify biologically selected changes to the original mutated sequences. After passaging, loop mutants showed partial reversion to their wild-type sequence and SEs that were structurally disrupted by mutations were restored to wild-type-like structures via synonymous mutations in planta. These results support the existence and selection of virus open reading frame SEs in the host organism and provide a framework for further studies on the role of RNA structure in viral infection. Additionally, this work demonstrates the applicability of high-throughput chemical probing in plant cell lysates and presents a new method for calculating SHAPE reactivities from overlapping reverse transcriptase priming sites.
Affiliation(s)
- Kyle E Watters
  - Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Krishna Choudhary
  - Department of Biomedical Engineering and Genome Center, University of California Davis, Davis, CA, USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California Davis, Davis, CA, USA
- Julius B Lucks
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60201, USA
- Keith L Perry
  - Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
- Jeremy R Thompson
  - Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
9
Ledda M, Aviran S. PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures. Genome Biol 2018; 19:28. [PMID: 29495968 PMCID: PMC5833111 DOI: 10.1186/s13059-018-1399-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 09/06/2017] [Accepted: 01/30/2018] [Indexed: 02/08/2023]
Abstract
Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.
Affiliation(s)
- Mirko Ledda
  - Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
  - Integrative Genetics and Genomics Graduate Group, UC Davis, 1 Shields Ave, Davis, 95616 USA
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, UC Davis, 1 Shields Ave, Davis, 95616 USA
10
Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes. Nat Commun 2018; 9:606. [PMID: 29426922 PMCID: PMC5807309 DOI: 10.1038/s41467-018-02923-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 06/09/2017] [Accepted: 01/09/2018] [Indexed: 11/23/2022]
Abstract
RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies. Different experimental and computational approaches can be used to study RNA structures. Here, the authors present a computational method for data-directed reconstruction of complex RNA structure landscapes, which predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data.
11
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. QUANTITATIVE BIOLOGY 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS: We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS: To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
- Sharon Aviran
  - Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA