51
Bonilla SL, Sherlock ME, MacFadden A, Kieft JS. A viral RNA hijacks host machinery using dynamic conformational changes of a tRNA-like structure. Science 2021; 374:955-960. [PMID: 34793227 PMCID: PMC9033304 DOI: 10.1126/science.abe8526] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Indexed: 12/29/2022]
Abstract
Viruses require multifunctional structured RNAs to hijack their host’s biochemistry, but their mechanisms can be obscured by the difficulty of solving conformationally dynamic RNA structures. Using cryo–electron microscopy (cryo-EM), we visualized the structure of the mysterious viral transfer RNA (tRNA)–like structure (TLS) from the brome mosaic virus, which affects replication, translation, and genome encapsidation. Structures in isolation and those bound to tyrosyl-tRNA synthetase (TyrRS) show that this ~55-kilodalton purported tRNA mimic undergoes large conformational rearrangements to bind TyrRS in a form that differs substantially from that of tRNA. Our study reveals how viral RNAs can use a combination of static and dynamic RNA structures to bind host machinery through highly noncanonical interactions, and we highlight the utility of cryo-EM for visualizing small, conformationally dynamic structured RNAs.
Affiliation(s)
- Steve L. Bonilla
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Madeline E. Sherlock
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Andrea MacFadden
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Jeffrey S. Kieft
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
  - RNA BioScience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
52
Li S, Zhang H, Zhang L, Liu K, Liu B, Mathews DH, Huang L. LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2. bioRxiv 2021:2020.11.23.393488. [PMID: 34816262 PMCID: PMC8609897 DOI: 10.1101/2020.11.23.393488] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length, and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally-guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' UTRs (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies novel conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, siRNAs, CRISPR-Cas13 guide RNAs and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies, and will be a useful tool in fighting the current and future pandemics.
SIGNIFICANCE STATEMENT: Conserved RNA structures are critical for designing diagnostic and therapeutic tools for many diseases including COVID-19. However, existing algorithms are much too slow to model the global structures of full-length RNA viral genomes. We present LinearTurboFold, a linear-time algorithm that is orders of magnitude faster, making it the first method to simultaneously fold and align whole genomes of SARS-CoV-2 variants, the longest known RNA virus (∼30 kilobases). Our work enables unprecedented global structural analysis and captures long-range interactions that are out of reach for existing algorithms but crucial for RNA functions. LinearTurboFold is a general technique for full-length genome studies and can help fight the current and future pandemics.
Affiliation(s)
- Sizhen Li
  - School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR
- He Zhang
  - School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR
  - Baidu Research, Sunnyvale, CA
- Liang Zhang
  - School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR
  - Baidu Research, Sunnyvale, CA
- Kaibo Liu
  - School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR
  - Baidu Research, Sunnyvale, CA
- David H. Mathews
  - Department of Biochemistry & Biophysics, Center for RNA Biology, and Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY
- Liang Huang
  - School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR
  - Baidu Research, Sunnyvale, CA
53
Gao W, Jones TA, Rivas E. Discovery of 17 conserved structural RNAs in fungi. Nucleic Acids Res 2021; 49:6128-6143. [PMID: 34086938 PMCID: PMC8216456 DOI: 10.1093/nar/gkab355] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/03/2021] [Revised: 03/25/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Many non-coding RNAs with known functions are structurally conserved: their intramolecular secondary and tertiary interactions are maintained across evolutionary time. Consequently, the presence of conserved structure in multiple sequence alignments can be used to identify candidate functional non-coding RNAs. Here, we present a bioinformatics method that couples iterative homology search with covariation analysis to assess whether a genomic region has evidence of conserved RNA structure. We used this method to examine all unannotated regions of five well-studied fungal genomes (Saccharomyces cerevisiae, Candida albicans, Neurospora crassa, Aspergillus fumigatus, and Schizosaccharomyces pombe). We identified 17 novel structurally conserved non-coding RNA candidates, which include four H/ACA box small nucleolar RNAs, four intergenic RNAs and nine RNA structures located within the introns and untranslated regions (UTRs) of mRNAs. For the two structures in the 3' UTRs of the metabolic genes GLY1 and MET13, we performed experiments that provide evidence against them being eukaryotic riboswitches.
Affiliation(s)
- William Gao
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, USA
- Thomas A Jones
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, USA
  - Howard Hughes Medical Institute, Harvard University, Cambridge, USA
- Elena Rivas
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, USA
54
Chen SC, Olsthoorn RCL, Yu CH. Structural phylogenetic analysis reveals lineage-specific RNA repetitive structural motifs in all coronaviruses and associated variations in SARS-CoV-2. Virus Evol 2021; 7:veab021. [PMID: 34141447 PMCID: PMC8206606 DOI: 10.1093/ve/veab021] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/23/2022] Open
Abstract
In many single-stranded (ss) RNA viruses, the cis-acting packaging signal that confers selective genome packaging usually encompasses short structured RNA repeats. These structural units, termed repetitive structural motifs (RSMs), potentially mediate capsid assembly by specific RNA–protein interactions. However, general knowledge of the conservation and/or the diversity of RSMs in the positive-sense ssRNA coronaviruses (CoVs) is limited. By performing structural phylogenetic analysis, we identified a variety of RSMs in nearly all CoV genomic RNAs, which are exclusively located in the 5′-untranslated regions (UTRs) and/or in the inter-domain regions of poly-protein 1ab coding sequences in a lineage-specific manner. In all alpha- and beta-CoVs, except for Embecovirus spp., two to four copies of 5′-gUUYCGUc-3′ RSMs displaying conserved hexa-loop sequences were generally identified in Stem-loop 5 (SL5) located in the 5′-UTRs of genomic RNAs. In Embecovirus spp., however, two to eight copies of 5′-agc-3′/guAAu RSMs were found in the coding regions of non-structural protein (NSP) 3 and/or NSP15 in open reading frame (ORF) 1ab. In gamma- and delta-CoVs, other types of RSMs were found in several clustered structural elements in 5′-UTRs and/or ORF1ab. The identification of RSM-encompassing structural elements in all CoVs suggests that these RNA elements play fundamental roles in the life cycle of CoVs. In the recently emerged SARS-CoV-2, beta-CoV-specific RSMs are also found in its SL5, displaying two copies of 5′-gUUUCGUc-3′ motifs. However, multiple sequence alignment reveals that the majority of SARS-CoV-2 possesses a variant RSM harboring SL5b C241U, and intriguingly, several variations in the coding sequences of viral proteins, such as Nsp12 P323L, S protein D614G, and N protein R203K-G204R, are concurrently found with such variant RSM. In conclusion, the comprehensive exploration for RSMs reveals phylogenetic insights into the RNA structural elements in CoVs as a whole and provides a new perspective on variations currently found in SARS-CoV-2.
Affiliation(s)
- Shih-Cheng Chen
  - Department of Biochemistry and Molecular Biology, College of Medicine, National Cheng-Kung University, No.1, University Road, Tainan City 701, Taiwan
- René C L Olsthoorn
  - Department of Supramolecular Biomaterials Chemistry, Leiden Institute of Chemistry, Gorlaeus Laboratories, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Chien-Hung Yu
  - Department of Biochemistry and Molecular Biology, College of Medicine, National Cheng-Kung University, No.1, University Road, Tainan City 701, Taiwan
55
Multi-omics annotation of human long non-coding RNAs. Biochem Soc Trans 2021; 48:1545-1556. [PMID: 32756901 DOI: 10.1042/bst20191063] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/14/2020] [Revised: 07/05/2020] [Accepted: 07/07/2020] [Indexed: 12/12/2022]
Abstract
LncRNAs (long non-coding RNAs) are pervasively transcribed in the human genome and also extensively involved in a variety of essential biological processes and human diseases. The comprehensive annotation of human lncRNAs is of great significance in navigating the functional landscape of the human genome and deepening the understanding of the multi-featured RNA world. However, the unique characteristics of lncRNAs as well as their enormous quantity have complicated and challenged the annotation of lncRNAs. Advances in high-throughput sequencing technologies give rise to a large volume of omics data that are generated at an unprecedented rate and scale, providing possibilities in the identification, characterization and functional annotation of lncRNAs. Here, we review the recent important discoveries of human lncRNAs through analysis of various omics data and summarize specialized lncRNA database resources. Moreover, we highlight the multi-omics integrative analysis as a powerful strategy to efficiently discover and characterize the functional lncRNAs and elucidate their potential molecular mechanisms.
56
Andrews RJ, O’Leary CA, Tompkins VS, Peterson JM, Haniff H, Williams C, Disney MD, Moss WN. A map of the SARS-CoV-2 RNA structurome. NAR Genom Bioinform 2021; 3:lqab043. [PMID: 34046592 PMCID: PMC8140738 DOI: 10.1093/nargab/lqab043] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 02/17/2021] [Revised: 04/06/2021] [Accepted: 04/28/2021] [Indexed: 12/11/2022] Open
Abstract
SARS-CoV-2 has exploded throughout the human population. To facilitate efforts to gain insights into SARS-CoV-2 biology and to target the virus therapeutically, it is essential to have a roadmap of likely functional regions embedded in its RNA genome. In this report, we used a bioinformatics approach, ScanFold, to deduce the local RNA structural landscape of the SARS-CoV-2 genome with the highest likelihood of being functional. We recapitulate previously-known elements of RNA structure and provide a model for the folding of an essential frameshift signal. Our results find that SARS-CoV-2 is greatly enriched in unusually stable and likely evolutionarily ordered RNA structure, which provides a large reservoir of potential drug targets for RNA-binding small molecules. Results are enhanced via the re-analyses of publicly-available genome-wide biochemical structure probing datasets that are broadly in agreement with our models. Additionally, ScanFold was updated to incorporate experimental data as constraints in the analysis to facilitate comparisons between ScanFold and other RNA modelling approaches. Ultimately, ScanFold was able to identify eight highly structured/conserved motifs in SARS-CoV-2 that agree with experimental data, without explicitly using these data. All results are made available via a public database (the RNAStructuromeDB: https://structurome.bb.iastate.edu/sars-cov-2) and model comparisons are readily viewable at https://structurome.bb.iastate.edu/sars-cov-2-global-model-comparisons.
Affiliation(s)
- Ryan J Andrews
  - Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Collin A O’Leary
  - Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Van S Tompkins
  - Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Jake M Peterson
  - Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Hafeez S Haniff
  - Department of Chemistry, The Scripps Research Institute, Jupiter, FL 33458, USA
- Matthew D Disney
  - Department of Chemistry, The Scripps Research Institute, Jupiter, FL 33458, USA
- Walter N Moss
  - Roy J. Carver Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA 50011, USA
57
Langeberg CJ, Sherlock ME, MacFadden A, Kieft JS. An expanded class of histidine-accepting viral tRNA-like structures. RNA 2021; 27:653-664. [PMID: 33811147 PMCID: PMC8127992 DOI: 10.1261/rna.078550.120] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/03/2020] [Accepted: 03/30/2021] [Indexed: 05/12/2023]
Abstract
Structured RNA elements are common in the genomes of RNA viruses, often playing critical roles during viral infection. Some viral RNA elements use forms of tRNA mimicry, but the diverse ways this mimicry can be achieved are poorly understood. Histidine-accepting tRNA-like structures (TLSHis) are examples found at the 3' termini of some positive-sense single-stranded RNA (+ssRNA) viruses where they interact with several host proteins, induce histidylation of the RNA genome, and facilitate processes important for infection, including genome replication. As only five TLSHis examples had been reported, we explored the possible larger phylogenetic distribution and diversity of this TLS class using bioinformatic approaches. We identified many new examples of TLSHis, yielding a rigorous consensus sequence and secondary structure model that we validated by chemical probing of representative TLSHis RNAs. We confirmed new examples as authentic TLSHis by demonstrating their ability to be histidylated in vitro, then used mutational analyses to imply a tertiary interaction that is likely analogous to the D- and T-loop interaction found in canonical tRNAs. These results expand our understanding of how diverse RNA sequences achieve tRNA-like structure and function in the context of viral RNA genomes and lay the groundwork for high-resolution structural studies of tRNA mimicry by histidine-accepting TLSs.
Affiliation(s)
- Conner J Langeberg
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Madeline E Sherlock
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Andrea MacFadden
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Jeffrey S Kieft
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
  - RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
58
Conserved long-range base pairings are associated with pre-mRNA processing of human genes. Nat Commun 2021; 12:2300. [PMID: 33863890 PMCID: PMC8052449 DOI: 10.1038/s41467-021-22549-7] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 05/13/2020] [Accepted: 03/20/2021] [Indexed: 02/07/2023] Open
Abstract
The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. Current knowledge on functional RNA structures is focused on locally-occurring base pairs. However, crosslinking and proximity ligation experiments demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete to-date catalog of conserved complementary regions (PCCRs) in human protein-coding genes. PCCRs tend to occur within introns, suppress intervening exons, and obstruct cryptic and inactive splice sites. Double-stranded structure of PCCRs is supported by decreased icSHAPE nucleotide accessibility, high abundance of RNA editing sites, and frequent occurrence of forked eCLIP peaks. Introns with PCCRs show a distinct splicing pattern in response to RNAPII slowdown suggesting that splicing is widely affected by co-transcriptional RNA folding. The enrichment of 3'-ends within PCCRs raises the intriguing hypothesis that coupling between RNA folding and splicing could mediate co-transcriptional suppression of premature pre-mRNA cleavage and polyadenylation.
59
Fremin BJ, Bhatt AS. Comparative genomics identifies thousands of candidate structured RNAs in human microbiomes. Genome Biol 2021; 22:100. [PMID: 33845850 PMCID: PMC8040213 DOI: 10.1186/s13059-021-02319-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/27/2020] [Accepted: 03/19/2021] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Structured RNAs play varied bioregulatory roles within microbes. To date, hundreds of candidate structured RNAs have been predicted using informatic approaches that search for motif structures in genomic sequence data. The human microbiome contains thousands of species and strains of microbes. Yet, much of the metagenomic data from the human microbiome remains unmined for structured RNA motifs primarily due to computational limitations. RESULTS We sought to apply a large-scale, comparative genomics approach to these organisms to identify candidate structured RNAs. With a carefully constructed, though computationally intensive automated analysis, we identify 3161 conserved candidate structured RNAs in intergenic regions, as well as 2022 additional candidate structured RNAs that may overlap coding regions. We validate the RNA expression of 177 of these candidate structures by analyzing small fragment RNA-seq data from four human fecal samples. CONCLUSIONS This approach identifies a wide variety of candidate structured RNAs, including tmRNAs, antitoxins, and likely ribosome protein leaders, from a wide variety of taxa. Overall, our pipeline enables conservative predictions of thousands of novel candidate structured RNAs from human microbiomes.
Affiliation(s)
- Brayon J Fremin
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Ami S Bhatt
  - Department of Genetics, Stanford University, Stanford, CA, 94305, USA
  - Department of Medicine (Hematology), Stanford University, Stanford, CA, 94305, USA
60
Functional and structural basis of extreme conservation in vertebrate 5' untranslated regions. Nat Genet 2021; 53:729-741. [PMID: 33821006 PMCID: PMC8825242 DOI: 10.1038/s41588-021-00830-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/10/2020] [Accepted: 02/26/2021] [Indexed: 01/07/2023]
Abstract
The lack of knowledge about extreme conservation in genomes remains a major gap in our understanding of the evolution of gene regulation. Here, we reveal an unexpected role of extremely conserved 5' untranslated regions (UTRs) in noncanonical translational regulation that is linked to the emergence of essential developmental features in vertebrate species. Endogenous deletion of conserved elements within these 5' UTRs decreased gene expression, and extremely conserved 5' UTRs possess cis-regulatory elements that promote cell-type-specific regulation of translation. We further developed in-cell mutate-and-map (icM2), a new methodology that maps RNA structure inside cells. Using icM2, we determined that an extremely conserved 5' UTR encodes multiple alternative structures and that each single nucleotide within the conserved element maintains the balance of alternative structures important to control the dynamic range of protein expression. These results explain how extreme sequence conservation can lead to RNA-level biological functions encoded in the untranslated regions of vertebrate genomes.
61
Rivas E. Evolutionary conservation of RNA sequence and structure. Wiley Interdiscip Rev RNA 2021; 12:e1649. [PMID: 33754485 PMCID: PMC8250186 DOI: 10.1002/wrna.1649] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/17/2020] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 12/22/2022]
Abstract
An RNA structure prediction from a single-sequence RNA folding program is not evidence for an RNA whose structure is important for function. Random sequences have plausible and complex predicted structures not easily distinguishable from those of structural RNAs. How to tell when an RNA has a conserved structure is a question that requires looking at the evolutionary signature left by the conserved RNA. This question is important not just for long noncoding RNAs which usually lack an identified function, but also for RNA binding protein motifs which can be single stranded RNAs or structures. Here we review recent advances using sequence and structural analysis to determine when RNA structure is conserved or not. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test to measure false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, or misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments which increase power but destroy covariation.
This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution
Affiliation(s)
- Elena Rivas
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
62
Abzhanova A, Hirschi A, Reiter NJ. An exon-biased biophysical approach and NMR spectroscopy define the secondary structure of a conserved helical element within the HOTAIR long non-coding RNA. J Struct Biol 2021; 213:107728. [PMID: 33753203 DOI: 10.1016/j.jsb.2021.107728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/16/2020] [Revised: 02/16/2021] [Accepted: 03/17/2021] [Indexed: 11/16/2022]
Abstract
HOTAIR is a large, multi-exon spliced non-coding RNA proposed to function as a molecular scaffold and to compete with chromatin to bind to histone modification enzymes. Previous sequence analysis and biochemical experiments identified potential conserved regions and characterized the full length HOTAIR secondary structure. Here, we examine the thermodynamic folding properties and structural propensity of the individual exonic regions of HOTAIR using an array of biophysical methods and NMR spectroscopy. We demonstrate that different exons of HOTAIR contain variable degrees of heterogeneity, and identify one exonic region, exon 4, that adopts a stable and compact fold under low magnesium concentrations. Close agreement of NMR spectroscopy and chemical probing unambiguously confirms conserved base pair interactions within the structural element, termed helix 10 of exon 4, located within domain I of human HOTAIR. This combined exon-biased and integrated biophysical approach introduces a new strategy to examine conformational heterogeneity in lncRNAs and emphasizes NMR as a key method to validate base pair interactions and corroborate large RNA secondary structures.
Affiliation(s)
- Ainur Abzhanova
  - Department of Chemistry, Marquette University, Milwaukee 53233, WI, United States
- Alexander Hirschi
  - Department of Biochemistry, Vanderbilt University Medical Center, Nashville 37205-0146, TN, United States
- Nicholas J Reiter
  - Department of Chemistry, Marquette University, Milwaukee 53233, WI, United States
63
Sherlock ME, Hartwick EW, MacFadden A, Kieft JS. Structural diversity and phylogenetic distribution of valyl tRNA-like structures in viruses. RNA 2021; 27:27-39. [PMID: 33008837 PMCID: PMC7749636 DOI: 10.1261/rna.076968.120] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/19/2020] [Accepted: 09/26/2020] [Indexed: 05/26/2023]
Abstract
Viruses commonly use specifically folded RNA elements that interact with both host and viral proteins to perform functions important for diverse viral processes. Examples are found at the 3' termini of certain positive-sense ssRNA virus genomes where they partially mimic tRNAs, including being aminoacylated by host cell enzymes. Valine-accepting tRNA-like structures (TLSVal) are an example that share some clear homology with canonical tRNAs but have several important structural differences. Although many examples of TLSVal have been identified, we lacked a full understanding of their structural diversity and phylogenetic distribution. To address this, we undertook an in-depth bioinformatic and biochemical investigation of these RNAs, guided by recent high-resolution structures of a TLSVal. We cataloged many new examples in plant-infecting viruses but also in unrelated insect-specific viruses. Using biochemical and structural approaches, we verified the secondary structure of representative TLSVal substrates and tested their ability to be valylated, confirming previous observations of structural heterogeneity within this class. In a few cases, large stem-loop structures are inserted within variable regions located in an area of the TLS distal to known host cell factor binding sites. In addition, we identified one virus whose TLS has switched its anticodon away from valine, causing a loss of valylation activity; the implications of this remain unclear. These results refine our understanding of the structural and functional mechanistic details of tRNA mimicry and how this may be used in viral infection.
MESH Headings
- Anticodon/chemistry
- Anticodon/metabolism
- Base Sequence
- Binding Sites
- Computational Biology
- Genetic Variation
- Insect Viruses/classification
- Insect Viruses/genetics
- Insect Viruses/metabolism
- Models, Molecular
- Molecular Mimicry
- Phylogeny
- Plant Viruses/classification
- Plant Viruses/genetics
- Plant Viruses/metabolism
- RNA Folding
- RNA, Transfer, Val/chemistry
- RNA, Transfer, Val/genetics
- RNA, Transfer, Val/metabolism
- RNA, Viral/chemistry
- RNA, Viral/genetics
- RNA, Viral/metabolism
- Sequence Homology, Nucleic Acid
- Valine/metabolism
Affiliation(s)
- Madeline E Sherlock
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Erik W Hartwick
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Andrea MacFadden
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
- Jeffrey S Kieft
  - Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
  - RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA
64
Ramírez-Colmenero A, Oktaba K, Fernandez-Valverde SL. Evolution of Genome-Organizing Long Non-coding RNAs in Metazoans. Front Genet 2020; 11:589697. [PMID: 33329735 PMCID: PMC7734150 DOI: 10.3389/fgene.2020.589697] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/31/2020] [Accepted: 11/09/2020] [Indexed: 12/28/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) have important regulatory functions across eukarya. It is now clear that many of these functions are related to gene expression regulation through their capacity to recruit epigenetic modifiers and establish chromatin interactions. Several lncRNAs have been recently shown to participate in modulating chromatin within the spatial organization of the genome in the three-dimensional space of the nucleus. The identification of lncRNA candidates is challenging, as is their functional characterization. Conservation signatures of lncRNAs are different from those of protein-coding genes, making identifying lncRNAs under selection a difficult task, and the homology between lncRNAs may not be readily apparent. Here, we review the evidence for these higher-order genome organization functions of lncRNAs in animals and the evolutionary signatures they display.
Affiliation(s)
- América Ramírez-Colmenero
- Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
| | - Katarzyna Oktaba
- Unidad Irapuato, Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
| | - Selene L Fernandez-Valverde
- Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, México
| |
|
65
|
Graf J, Kretz M. From structure to function: Route to understanding lncRNA mechanism. Bioessays 2020; 42:e2000027. [PMID: 33164244 DOI: 10.1002/bies.202000027] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 04/16/2020] [Revised: 09/03/2020] [Indexed: 12/13/2022]
Abstract
RNAs have emerged as a major target for diagnostic and therapeutic approaches. Regulatory nonprotein-coding RNAs (ncRNAs) in particular display remarkable versatility. They can fold into complex structures and interact with proteins, DNA, and other RNAs, thus modulating activity, localization, or interactome of multi-protein complexes. Thus, ncRNAs confer regulatory plasticity and represent a new layer of regulatory control. Interestingly, long noncoding RNAs (lncRNAs) tend to acquire complex secondary and tertiary structures and their function, in many cases, is dependent on structural conservation rather than primary sequence conservation. Whereas for many proteins, structure and its associated function are closely connected, for lncRNAs, the structural domains that determine functionality and its interactome are still not well understood. Numerous approaches for analyzing the structural configuration of lncRNAs have been developed recently. Here, we provide an overview of major experimental approaches used in the field, and discuss the potential benefit of using combinatorial strategies to analyze lncRNA modes of action based on structural information.
Affiliation(s)
- Johannes Graf
- Institute of Biochemistry, Genetics and Microbiology, University of Regensburg, Regensburg, Germany
| | - Markus Kretz
- Institute of Biochemistry, Genetics and Microbiology, University of Regensburg, Regensburg, Germany
| |
|
66
|
Rivas E. RNA structure prediction using positive and negative evolutionary information. PLoS Comput Biol 2020; 16:e1008387. [PMID: 33125376 PMCID: PMC7657543 DOI: 10.1371/journal.pcbi.1008387] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 05/22/2020] [Revised: 11/11/2020] [Accepted: 09/24/2020] [Indexed: 12/22/2022] Open
Abstract
Knowing the structure of conserved structural RNAs is important to elucidate their function and mechanism of action. However, predicting a conserved RNA structure remains unreliable, even when using a combination of thermodynamic stability and evolutionary covariation information. Here we present a method to predict a conserved RNA structure that combines the following three features. First, it uses significant covariation due to RNA structure and removes spurious covariation due to phylogeny. Second, it uses negative evolutionary information: basepairs that have variation but no significant covariation are prevented from occurring. Lastly, it uses a battery of probabilistic folding algorithms that incorporate all positive covariation into one structure. The method, named CaCoFold (Cascade variation/covariation Constrained Folding algorithm), predicts a nested structure guided by a maximal subset of positive basepairs, and recursively incorporates all remaining positive basepairs into alternative helices. The alternative helices can be compatible with the nested structure such as pseudoknots, or overlapping such as competing structures, base triplets, or other 3D non-antiparallel interactions. We present evidence that CaCoFold predictions are consistent with structures modeled from crystallography.
The availability of deeper comparative sequence alignments and recent advances in statistical analysis of RNA sequence covariation have made it possible to identify a reliable set of conserved base pairs, as well as a reliable set of non-basepairs (positions that vary without covarying). Predicting an overall consensus secondary structure consistent with a set of individual inferred pairs and non-pairs remains a problem. Current RNA structure prediction algorithms that predict nested secondary structures cannot use the full set of inferred covarying pairs, because covariation analysis also identifies important non-nested pairing interactions such as pseudoknots, base triples, and alternative structures. Moreover, although algorithms for incorporating negative constraints exist, negative information from covariation analysis (inferred non-pairs) has not been systematically exploited. Here I introduce an efficient approximate RNA structure prediction algorithm that incorporates all inferred pairs and excludes all non-pairs. Using this, and an improved visualization tool, I show that the method correctly identifies many non-nested structures in agreement with known crystal structures, and improves many curated consensus secondary structure annotations in RNA sequence alignment databases.
Affiliation(s)
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
| |
|
67
|
Li B, Cao Y, Westhof E, Miao Z. Advances in RNA 3D Structure Modeling Using Experimental Data. Front Genet 2020; 11:574485. [PMID: 33193680 PMCID: PMC7649352 DOI: 10.3389/fgene.2020.574485] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 06/19/2020] [Accepted: 09/02/2020] [Indexed: 12/26/2022] Open
Abstract
RNA is a unique bio-macromolecule that can both record genetic information and perform biological functions in a variety of molecular processes, including transcription, splicing, translation, and even regulating protein function. RNAs adopt specific three-dimensional conformations to enable their functions. Experimental determination of high-resolution RNA structures using x-ray crystallography is laborious and demands expertise, thus hindering our comprehension of RNA structural biology. The computational modeling of RNA structure was a milestone in the birth of bioinformatics. Although computational modeling has been greatly improved over the last decade, showing many successful cases, the accuracy of such computational modeling is not only length-dependent but also varies according to the complexity of the structure. To increase credibility, various experimental data were integrated into computational modeling. In this review, we summarize the experiments that can be integrated into RNA structure modeling as well as the computational methods based on these experimental data. We also demonstrate how computational modeling can help the experimental determination of RNA structure. We highlight the recent advances in computational modeling which can offer reliable structure models using high-throughput experimental data.
Affiliation(s)
- Bing Li
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
| | - Eric Westhof
- Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
| | - Zhichao Miao
- Translational Research Institute of Brain and Brain-Like Intelligence, Department of Anesthesiology, Shanghai Fourth People’s Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
- Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| |
|
68
|
Rangan R, Zheludev IN, Hagey RJ, Pham EA, Wayment-Steele HK, Glenn JS, Das R. RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look. RNA (NEW YORK, N.Y.) 2020; 26:937-959. [PMID: 32398273 PMCID: PMC7373990 DOI: 10.1261/rna.076141.120] [Citation(s) in RCA: 191] [Impact Index Per Article: 38.2] [Received: 04/28/2020] [Accepted: 05/11/2020] [Indexed: 05/11/2023]
Abstract
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Here we present a first look at RNA sequence conservation and structural propensities in the SARS-CoV-2 genome. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nt as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences subsequently reported from the COVID-19 outbreak, and we present a curated list of 30 "SARS-related-conserved" regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 "SARS-CoV-2-conserved-structured" regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the extended 5' UTR, frameshifting stimulation element, and 3' UTR. Lastly, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 "SARS-CoV-2-conserved-unstructured" genomic regions may be most easily accessible by hybridization in primer-based diagnostic strategies.
Affiliation(s)
- Ramya Rangan
- Biophysics Program, Stanford University, Stanford, California 94305, USA
| | - Ivan N Zheludev
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Rachel J Hagey
- Departments of Medicine (Division of Gastroenterology and Hepatology) and Microbiology & Immunology, Stanford School of Medicine, Stanford, California 94305, USA
| | - Edward A Pham
- Departments of Medicine (Division of Gastroenterology and Hepatology) and Microbiology & Immunology, Stanford School of Medicine, Stanford, California 94305, USA
| | | | - Jeffrey S Glenn
- Departments of Medicine (Division of Gastroenterology and Hepatology) and Microbiology & Immunology, Stanford School of Medicine, Stanford, California 94305, USA
- Palo Alto Veterans Administration, Palo Alto, California 94304, USA
| | - Rhiju Das
- Biophysics Program, Stanford University, Stanford, California 94305, USA
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305, USA
- Department of Physics, Stanford University, Stanford, California 94305, USA
| |
|
69
|
Rangan R, Zheludev IN, Das R. RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2020:2020.03.27.012906. [PMID: 32511306 PMCID: PMC7217285 DOI: 10.1101/2020.03.27.012906] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 01/26/2023]
Abstract
As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences reported to date from the current COVID-19 outbreak, and we present a curated list of 30 'SARS-related-conserved' regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 'SARS-CoV-2-conserved-structured' regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the 5' UTR, frame-shifting element, and 3' UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 'SARS-CoV-2-conserved-unstructured' genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
Affiliation(s)
- Ramya Rangan
- Biophysics Program, Stanford University, Stanford CA 94305
| | - Ivan N. Zheludev
- Department of Biochemistry, Stanford University School of Medicine, Stanford CA 94305
| | - Rhiju Das
- Biophysics Program, Stanford University, Stanford CA 94305
- Department of Biochemistry, Stanford University School of Medicine, Stanford CA 94305
- Department of Physics, Stanford University, Stanford CA 94305
| |
|