1
Khan N, Rahaman M, Zhang S. GINClus: RNA structural motif clustering using graph isomorphism network. NAR Genom Bioinform 2025; 7:lqaf050. [PMID: 40290315 PMCID: PMC12034103 DOI: 10.1093/nargab/lqaf050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 04/02/2025] [Accepted: 04/15/2025] [Indexed: 04/30/2025]
Abstract
Ribonucleic acid (RNA) structural motif identification is a crucial step for understanding RNA structure and functionality. Due to the complexity and variations of RNA 3D structures, identifying RNA structural motifs is challenging and time-consuming. In particular, discovering new RNA structural motif families is a hard problem that still largely depends on manual analysis. In this paper, we propose an RNA structural motif clustering tool, named GINClus, which uses a semi-supervised deep learning model to cluster RNA motif candidates (RNA loop regions) based on both base interaction and 3D structure similarities. GINClus converts the base interactions and 3D structures of RNA motif candidates into graph representations. Using a graph isomorphism network (GIN) model in combination with K-means and hierarchical agglomerative clustering, it then clusters the candidates based on their structural similarities. GINClus has a clustering accuracy of 87.88% for known internal loop motifs and 97.69% for known hairpin loop motifs. Using GINClus, we successfully clustered the motifs of the same families together and were able to find 927 new instances of Sarcin-ricin, Kink-turn, Tandem-shear, Hook-turn, E-loop, C-loop, T-loop, and GNRA loop motif families. We also identified 12 new RNA structural motif families with unique structure and base-pair interactions.
Affiliation(s)
- Nabila Shahnaz Khan
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Md Mahfuzur Rahaman
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
2
Herbert A. Triplexes Color the Chromaverse by Modulating Nucleosome Phasing and Anchoring Chromatin Condensates. Int J Mol Sci 2025; 26:4032. [PMID: 40362270 PMCID: PMC12071334 DOI: 10.3390/ijms26094032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2025] [Revised: 04/16/2025] [Accepted: 04/22/2025] [Indexed: 05/15/2025]
Abstract
Genomic sequences that form three-stranded triplexes (TPXs) under physiological conditions (called T-flipons) play an important role in defining DNA nucleosome-free regions (NFRs). Within these NFRs, other flipon types can cycle conformations to actuate gene expression. The transcripts read from the NFR form condensates that engage proteins and small RNAs. The helicases bound then trigger RNA polymerase release by dissociating the 7SK ribonucleoprotein. The TPXs formed usually incorporate RNA as the third strand. TPXs made only from DNA arise mostly during DNA replication. Many small RNA types (sRNAs) and long noncoding RNAs (lncRNAs) can direct TPX formation. TPXs made with circular RNAs have greater stability and specificity than those formed with linear RNAs. LncRNAs can affect local gene expression through TPX formation and transcriptional interference. The condensates seeded by lncRNAs are updated by feedback loops involving proteins and noncoding RNAs from the genes they regulate. Some lncRNAs also target distant loci in a sequence-specific manner. Overall, lncRNAs can rapidly evolve by adding or subtracting sequence motifs that modify the condensates they nucleate. LncRNAs show less sequence conservation than protein-coding sequences. TPXs formed by lncRNAs and sRNAs help place nucleosomes to restrict endogenous retroelement (ERE) expression. The silencing of EREs starts early in embryogenesis and is essential for bootstrapping development. Once the system is set, EREs play a different role, with a notable enrichment of Short Interspersed Nuclear Repeats (SINEs) in Enhancer-Promoter condensates. The highly programmable TPX-dependent processes create a chromaverse capable of many complexities.
Affiliation(s)
- Alan Herbert
  - Discovery, InsideOutBio, Charlestown, MA 02129, USA
3
Rahaman MM, Zhang S. RNAMotifProfile: a graph-based approach to build RNA structural motif profiles. NAR Genom Bioinform 2024; 6:lqae128. [PMID: 39328267 PMCID: PMC11426329 DOI: 10.1093/nargab/lqae128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/24/2024] [Accepted: 09/09/2024] [Indexed: 09/28/2024]
Abstract
RNA structural motifs are the recurrent segments in RNA three-dimensional structures that play a crucial role in the functional diversity of RNAs. Understanding the similarities and variations within these recurrent motif groups is essential for gaining insights into RNA structure and function. While recurrent structural motifs are generally assumed to be composed of the same isosteric base interactions, this consistent pattern is not observed across all examples of these motifs. Existing methods for analyzing and comparing RNA structural motifs may overlook variations in base interactions and associated nucleotides. RNAMotifProfile is a novel profile-to-profile alignment algorithm that generates a comprehensive profile from a group of structural motifs, incorporating all base interactions and associated nucleotides at each position. By structurally aligning input motif instances using a guide-tree-based approach, RNAMotifProfile captures the similarities and variations within recurrent motif groups. Additionally, RNAMotifProfile can function as a motif search tool, enabling the identification of instances of a specific motif family by searching with the corresponding profile. The ability to generate accurate and comprehensive profiles for RNA structural motif families, and to search for these motifs, facilitates a deeper understanding of RNA structure-function relationships and potential applications in RNA engineering and therapeutic design.
Affiliation(s)
- Md Mahfuzur Rahaman
  - Department of Computer Science, University of Central Florida, 4328 Scorpius Street, Orlando, FL 32816-2362, USA
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, 4328 Scorpius Street, Orlando, FL 32816-2362, USA
4
Chol A, Sarrazin-Gendron R, Lécuyer É, Blanchette M, Waldispühl J. PERFUMES: pipeline to extract RNA functional motifs and exposed structures. Bioinformatics 2024; 40:btae056. [PMID: 38291894 PMCID: PMC10868343 DOI: 10.1093/bioinformatics/btae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 11/28/2023] [Accepted: 01/28/2024] [Indexed: 02/01/2024]
Abstract
MOTIVATION Up to 75% of the human genome encodes RNAs. The function of many non-coding RNAs relies on their ability to fold into 3D structures. Specifically, nucleotides inside secondary structure loops form non-canonical base pairs that help stabilize complex local 3D structures. These RNA 3D motifs can promote specific interactions with other molecules or serve as catalytic sites. RESULTS We introduce PERFUMES, a computational pipeline to identify 3D motifs that can be associated with observable features. Given a set of RNA sequences with associated binary experimental measurements, PERFUMES searches for RNA 3D motifs using BayesPairing2 and extracts those that are over-represented in the set of positive sequences. It also conducts a thermodynamics analysis of the structural context that can support the interpretation of the predictions. We illustrate PERFUMES' usage on the SNRPA protein binding site, for which the tool retrieved both previously known binder motifs and new ones. AVAILABILITY AND IMPLEMENTATION PERFUMES is an open-source Python package (https://jwgitlab.cs.mcgill.ca/arnaud_chol/perfumes).
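Editorial illustration: the abstract does not say which statistic PERFUMES uses to decide that a motif is over-represented in the positive sequences. As a minimal sketch of one common way to score such enrichment, the snippet below computes an upper-tail hypergeometric probability. The choice of test, the function name `enrichment_p_value`, and the counts are assumptions made for illustration only and are not taken from the PERFUMES package.

```python
from math import comb


def enrichment_p_value(n_total: int, n_positive: int,
                       n_with_motif: int, n_positive_with_motif: int) -> float:
    """Probability of seeing at least `n_positive_with_motif` positive sequences
    among the `n_with_motif` sequences that carry the motif, if motif carriers
    were drawn at random from `n_total` sequences of which `n_positive` are
    positive (upper tail of the hypergeometric distribution)."""
    upper = min(n_with_motif, n_positive)
    favourable = sum(
        comb(n_positive, i) * comb(n_total - n_positive, n_with_motif - i)
        for i in range(n_positive_with_motif, upper + 1)
    )
    return favourable / comb(n_total, n_with_motif)


# Hypothetical counts: 10 sequences, 5 of them positive; the motif occurs in
# 3 sequences and all 3 of those are positive.
p = enrichment_p_value(10, 5, 3, 3)
print(f"p = {p:.4f}")
```

A small probability suggests the motif occurs in positive sequences more often than chance alone would explain; a real analysis would also need to correct for testing many motifs.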
Affiliation(s)
- Arnaud Chol
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
- Éric Lécuyer
  - Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada
- Mathieu Blanchette
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
5
Rahaman MM, Khan NS, Zhang S. RNAMotifComp: a comprehensive method to analyze and identify structurally similar RNA motif families. Bioinformatics 2023; 39:i337-i346. [PMID: 37387191 DOI: 10.1093/bioinformatics/btad223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION The 3D structures of RNA play a critical role in understanding their functionalities. There exist several computational methods to study RNA 3D structures by identifying structural motifs and categorizing them into several motif families based on their structures. Although the number of such motif families is not limited, a few of them are well-studied. Out of these structural motif families, there exist several families that are visually similar or very close in structure, even with different base interactions. Alternatively, some motif families share a set of base interactions but maintain variation in their 3D formations. These similarities among different motif families, if known, can provide a better insight into the RNA 3D structural motifs as well as their characteristic functions in cell biology. RESULTS In this work, we proposed a method, RNAMotifComp, that analyzes the instances of well-known structural motif families and establishes a relational graph among them. We also have designed a method to visualize the relational graph where the families are shown as nodes and their similarity information is represented as edges. We validated our discovered correlations of the motif families using RNAMotifContrast. Additionally, we used a basic Naïve Bayes classifier to show the importance of RNAMotifComp. The relational analysis explains the functional analogies of divergent motif families and illustrates the situations where the motifs of disparate families are predicted to be of the same family. AVAILABILITY AND IMPLEMENTATION Source code publicly available at https://github.com/ucfcbb/RNAMotifFamilySimilarity.
Affiliation(s)
- Md Mahfuzur Rahaman
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Nabila Shahnaz Khan
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
6
Chen X, Zhang S. CircularSTAR3D: a stack-based RNA 3D structural alignment tool for circular matching. Nucleic Acids Res 2023; 51:e53. [PMID: 36987885 PMCID: PMC10201423 DOI: 10.1093/nar/gkad222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 03/04/2023] [Accepted: 03/28/2023] [Indexed: 03/30/2023]
Abstract
The functions of non-coding RNAs usually depend on their 3D structures. Therefore, comparing RNA 3D structures is critical in analyzing their functions. We noticed an interesting phenomenon that two non-coding RNAs may share similar substructures when rotating their sequence order. To the best of our knowledge, no existing RNA 3D structural alignment tools can detect this type of matching. In this article, we defined the RNA 3D structure circular matching problem and developed a software tool named CircularSTAR3D to solve this problem. CircularSTAR3D first uses the conserved stacks (consecutive base pairs with similar 3D structures) in the input RNAs to identify the circular matched internal loops and multiloops. Then it performs a local extension iteratively to obtain the whole circular matched substructures. The computational experiments conducted on a non-redundant RNA structure dataset show that circular matching is ubiquitous. Furthermore, we demonstrated the utility of CircularSTAR3D by detecting the conserved substructures missed by regular alignment tools, including structural motifs and conserved structures between riboswitches and ribozymes from different classes. We anticipate CircularSTAR3D to be a valuable supplement to the existing RNA 3D structural analysis techniques.
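Editorial illustration: the idea of circular matching, where two RNAs share a substructure only after the sequence order of one of them is rotated, can be shown with a toy sketch. This is not the CircularSTAR3D algorithm, which anchors on conserved stacks and compares 3D structures. The function `find_rotation` and the element labels are hypothetical and only show why an order-preserving comparison can miss a rotated match.

```python
from typing import Optional, Sequence


def find_rotation(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Return the smallest k such that rotating `a` left by k positions
    yields `b`, or None if no rotation makes the two sequences equal."""
    if len(a) != len(b):
        return None
    n = len(a)
    if n == 0:
        return 0
    for k in range(n):
        if all(a[(i + k) % n] == b[i] for i in range(n)):
            return k
    return None


# Hypothetical ordered structural elements (stacks S*, loops L*) of two RNAs.
rna_a = ["S1", "L1", "S2", "L2"]
rna_b = ["S2", "L2", "S1", "L1"]

print(rna_a == rna_b)               # False: a direct, order-preserving comparison fails
print(find_rotation(rna_a, rna_b))  # 2: the elements match after rotating by two positions
```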
Affiliation(s)
- Xiaoli Chen
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
7
Zeke A, Schád É, Horváth T, Abukhairan R, Szabó B, Tantos A. Deep structural insights into RNA-binding disordered protein regions. Wiley Interdiscip Rev RNA 2022; 13:e1714. [PMID: 35098694 PMCID: PMC9539567 DOI: 10.1002/wrna.1714] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 11/06/2021] [Revised: 12/22/2021] [Accepted: 01/07/2022] [Indexed: 12/11/2022]
Abstract
Recent efforts to identify RNA binding proteins in various organisms and cellular contexts have yielded a large collection of proteins that are capable of RNA binding in the absence of conventional RNA recognition domains. Many of the recently identified RNA interaction motifs fall into intrinsically disordered protein regions (IDRs). While the recognition mode and specificity of globular RNA binding elements have been thoroughly investigated and described, much less is known about the way IDRs can recognize their RNA partners. Our aim was to summarize the current state of structural knowledge on the RNA binding modes of disordered protein regions and to propose a classification system based on their sequential and structural properties. Through a detailed structural analysis of the complexes that contain disordered protein regions binding to RNA, we found two major binding modes that represent different recognition strategies and, most likely, functions. We compared these examples with DNA binding disordered proteins and found key differences stemming from the nucleic acids as well as similar binding strategies, implying a broader substrate acceptance by these proteins. Due to the very limited number of known structures, we integrated molecular dynamics simulations in our study, whose results support the proposed structural preferences of specific RNA-binding IDRs. To broaden the scope of our review, we included a brief analysis of RNA-binding small molecules and compared their structural characteristics and RNA recognition strategies to the RNA-binding IDRs. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics, and Chemistry; RNA Interactions with Proteins and Other Molecules > Protein–RNA Recognition; RNA Interactions with Proteins and Other Molecules > Small Molecule–RNA Interactions.
Affiliation(s)
- András Zeke
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Éva Schád
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Tamás Horváth
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Rawan Abukhairan
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Beáta Szabó
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Agnes Tantos
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
8
Oliver C, Mallet V, Philippopoulos P, Hamilton WL, Waldispühl J. Vernal: a tool for mining fuzzy network motifs in RNA. Bioinformatics 2022; 38:970-976. [PMID: 34791045 DOI: 10.1093/bioinformatics/btab768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/01/2021] [Revised: 09/19/2021] [Accepted: 11/09/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION RNA 3D motifs are recurrent substructures, modeled as networks of base pair interactions, which are crucial for understanding structure-function relationships. The task of automatically identifying such motifs is computationally hard, and remains a key challenge in the field of RNA structural biology and network analysis. State-of-the-art methods solve special cases of the motif problem by constraining the structural variability in occurrences of a motif, and narrowing the substructure search space. RESULTS Here, we relax these constraints by posing the motif finding problem as a graph representation learning and clustering task. This framing takes advantage of the continuous nature of graph representations to model the flexibility and variability of RNA motifs in an efficient manner. We propose a set of node similarity functions, clustering methods and motif construction algorithms to recover flexible RNA motifs. Our tool, Vernal, can be easily customized by users to desired levels of motif flexibility, abundance and size. We show that Vernal is able to retrieve and expand known classes of motifs, as well as to propose novel motifs. AVAILABILITY AND IMPLEMENTATION The source code, data and a webserver are available at vernal.cs.mcgill.ca. We also provide a flexible interface and a user-friendly webserver to browse and download our results. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Carlos Oliver
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada; Montreal Institute for Learning Algorithms (MILA), Montréal, QC H2S 3H1, Canada
- Vincent Mallet
  - Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, CNRS UMR3528, C3BI, USR3756, Paris, France; Mines ParisTech, Paris-Sciences-et-Lettres Research University, Center for Computational Biology, Paris 75272, France
- William L Hamilton
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada; Montreal Institute for Learning Algorithms (MILA), Montréal, QC H2S 3H1, Canada
- Jérôme Waldispühl
  - School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
9
Hong X, Zheng J, Xie J, Tong X, Liu X, Song Q, Liu S, Liu S. RR3DD: an RNA global structure-based RNA three-dimensional structural classification database. RNA Biol 2021; 18:738-746. [PMID: 34663179 DOI: 10.1080/15476286.2021.1989200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/27/2023]
Abstract
The three-dimensional (3D) structure of an RNA usually plays an important role in its recognition by RNA-binding proteins. As more RNAs have been discovered, several RNA databases have been developed to study RNA function based on sequence, secondary structure, local 3D structural motifs and global structure. Based on RNA function and structure, different RNAs are classified and stored in SCOR and DARTS, respectively. The classification of RNA structures is useful in RNA structure prediction and function annotation. However, SCOR and DARTS are no longer updated. In this study, we present an RNA classification database, RR3DD, based on RNA folds with global 3D structural similarity. RR3DD includes 13,601 RNA chains from PDB and mmCIF format structures, which are classified into 780 RNA folds. The RNA chains from PDB and mmCIF format structures are aligned and clustered into 675 and 220 RNA folds, respectively. By analysing the RNA structures in RR3DD, we find that there are 11 clusters with more than 50 members. These clusters include rRNAs, riboswitches, tRNAs and so on. By mapping RR3DD to Rfam, we found that some RNAs without Rfam annotation can be annotated through structural alignment. For example, we analysed tRNAs and found that they were successfully grouped in RR3DD, whereas Rfam did not classify them into one family. Finally, we provide a web interface to RR3DD for browsing the database, annotating RNA 3D structures and finding templates for RNA homology modelling.
Affiliation(s)
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jinfang Zheng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Xiaoxue Tong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Xudong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Qi Song
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Sen Liu
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
10
Islam S, Rahaman MM, Zhang S. RNAMotifContrast: a method to discover and visualize RNA structural motif subfamilies. Nucleic Acids Res 2021; 49:e61. [PMID: 33693841 PMCID: PMC8216276 DOI: 10.1093/nar/gkab131] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/11/2020] [Revised: 02/16/2021] [Accepted: 02/18/2021] [Indexed: 01/17/2023]
Abstract
Understanding the 3D structural properties of RNAs will play a critical role in identifying their functional characteristics and designing new RNAs for RNA-based therapeutics and nanotechnology. While several existing computational methods can help in the analysis of RNA properties by recognizing structural motifs, they do not provide the means to compare and contrast those motifs extensively. We have developed a new method, RNAMotifContrast, which focuses on analyzing the similarities and variations of RNA structural motif characteristics. In this method, a graph is formed to represent the similarities among motifs, and a new traversal algorithm is applied to generate visualizations of their structural properties. Analyzing the structural features among motifs, we have recognized and generalized the concept of motif subfamilies. To assess its effectiveness, we have applied RNAMotifContrast to a dataset of known RNA structural motif families. From the results, we observed that the derived subfamilies possess unique structural variations while holding standard features of the families. Overall, the visualization approach of this method presents a new perspective to observe the relation among motifs more closely, and the discovered subfamilies provide opportunities to achieve valuable insights into RNA’s diverse roles.
Affiliation(s)
- Shahidul Islam
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Md Mahfuzur Rahaman
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
11
Chen X, Khan NS, Zhang S. LocalSTAR3D: a local stack-based RNA 3D structural alignment tool. Nucleic Acids Res 2020; 48:e77. [PMID: 32496533 PMCID: PMC7367197 DOI: 10.1093/nar/gkaa453] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/23/2020] [Revised: 05/15/2020] [Accepted: 05/27/2020] [Indexed: 11/29/2022]
Abstract
A fast-growing number of non-coding RNA structures have been resolved and deposited in Protein Data Bank (PDB). In contrast to the wide range of global alignment and motif search tools, there is still a lack of local alignment tools. Among all the global alignment tools for RNA 3D structures, STAR3D has become a valuable tool for its unprecedented speed and accuracy. STAR3D compares the 3D structures of RNA molecules using consecutive base-pairs (stacks) as anchors and generates an optimal global alignment. In this article, we developed a local RNA 3D structural alignment tool, named LocalSTAR3D, which was extended from STAR3D and designed to report multiple local alignments between two RNAs. The benchmarking results show that LocalSTAR3D has better accuracy and coverage than other local alignment tools. Furthermore, the utility of this tool has been demonstrated by rediscovering kink-turn motif instances, conserved domains in group II intron RNAs, and the tRNA mimicry of IRES RNAs.
Affiliation(s)
- Xiaoli Chen
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Nabila Shahnaz Khan
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Shaojie Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
12
Černý J, Božíková P, Svoboda J, Schneider B. A unified dinucleotide alphabet describing both RNA and DNA structures. Nucleic Acids Res 2020; 48:6367-6381. [PMID: 32406923 PMCID: PMC7293047 DOI: 10.1093/nar/gkaa383] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/11/2020] [Revised: 04/11/2020] [Accepted: 04/30/2020] [Indexed: 12/13/2022]
Abstract
By analyzing almost 120 000 dinucleotides in over 2000 nonredundant nucleic acid crystal structures, we define 96+1 diNucleotide Conformers, NtCs, which describe the geometry of RNA and DNA dinucleotides. NtC classes are grouped into 15 codes of the structural alphabet CANA (Conformational Alphabet of Nucleic Acids) to simplify symbolic annotation of the prominent structural features of NAs and their intuitive graphical display. The search for nontrivial patterns of NtCs resulted in the identification of several types of RNA loops, some of them observed for the first time. Over 30% of the nearly six million dinucleotides in the PDB cannot be assigned to any NtC class but we demonstrate that up to a half of them can be re-refined with the help of proper refinement targets. A statistical analysis of the preferences of NtCs and CANA codes for the 16 dinucleotide sequences showed that neither the NtC class AA00, which forms the scaffold of RNA structures, nor BB00, the DNA most populated class, are sequence neutral but their distributions are significantly biased. The reported automated assignment of the NtC classes and CANA codes available at dnatco.org provides a powerful tool for unbiased analysis of nucleic acid structures by structural and molecular biologists.
Affiliation(s)
- Jiří Černý
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
- Paulína Božíková
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
- Jakub Svoboda
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
- Bohdan Schneider
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, CZ-252 50 Vestec, Prague-West, Czech Republic
13
Becquey L, Angel E, Tahi F. BiORSEO: a bi-objective method to predict RNA secondary structures with pseudoknots using RNA 3D modules. Bioinformatics 2020; 36:2451-2457. [DOI: 10.1093/bioinformatics/btz962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2019] [Revised: 11/15/2019] [Accepted: 01/02/2020] [Indexed: 11/12/2022]
Abstract
Motivation
RNA loops have been modelled and clustered from solved 3D structures into ordered collections of recurrent non-canonical interactions called ‘RNA modules’, available in databases. This work explores what information from such modules can be used to improve secondary structure prediction. We propose a bi-objective method for predicting RNA secondary structures by minimizing both an energy-based and a knowledge-based potential. The tool, called BiORSEO, outputs secondary structures corresponding to the optimal solutions from the Pareto set.
Results
We compare several approaches to predict secondary structures using inserted RNA modules information: two module data sources, Rna3Dmotif and the RNA 3D Motif Atlas, and different ways to score the module insertions: module size, module complexity or module probability according to models like JAR3D and BayesPairing. We benchmark them against a large set of known secondary structures, including some state-of-the-art tools, and comment on the usefulness of the half physics-based, half data-based approach.
Availability and implementation
The software is available for download on the EvryRNA website, as well as the datasets.
Supplementary information
Supplementary data are available at Bioinformatics online.
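Editorial illustration: the abstract describes returning the secondary structures that are optimal in the Pareto sense when both an energy-based and a knowledge-based potential are minimized. The sketch below shows only the generic notion of a Pareto set for two minimized objectives. It is not BiORSEO's optimization procedure, and the candidate labels and scores are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

# (label, energy-based score, knowledge-based score); both are to be minimized.
Candidate = Tuple[str, float, float]


def dominates(p: Candidate, q: Candidate) -> bool:
    """True if p is at least as good as q on both objectives and strictly
    better on at least one of them."""
    return p[1] <= q[1] and p[2] <= q[2] and (p[1] < q[1] or p[2] < q[2])


def pareto_set(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate structures with made-up scores.
candidates = [
    ("s1", -12.0, 3.0),
    ("s2", -10.0, 1.0),
    ("s3", -9.0, 2.5),   # dominated by s2 (worse on both objectives)
    ("s4", -12.0, 3.5),  # dominated by s1 (same energy, worse second score)
]
front = pareto_set(candidates)
print([label for label, _, _ in front])  # ['s1', 's2']
```

Neither `s1` nor `s2` is better than the other on both objectives, which is why a bi-objective method reports a set of solutions rather than a single optimum.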
Affiliation(s)
- Louis Becquey
  - Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry, France
- Eric Angel
  - Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry, France
- Fariza Tahi
  - Université Paris-Saclay, Univ Evry, IBISC, 91020, Evry, France
14
Abstract
INTRODUCTION The success of binding site comparisons in drug discovery is based on the recognized fact that many different proteins have similar binding sites. Indeed, binding site comparisons have found many uses in drug development and have the potential to dramatically cut the cost and shorten the time necessary for the development of new drugs. Areas covered: The authors review recent methods for comparing protein binding sites and their use in drug repurposing and polypharmacology. They examine emerging fields including the use of binding site comparisons in precision medicine, the prediction of structured water molecules, the search for targets of natural compounds, and their application in the development of protein-based drugs by loop modeling and for comparison of RNA binding sites. Expert opinion: Binding site comparisons have produced many interesting results in drug development, but relatively little work has been done on protein-protein interaction sites, which are particularly relevant in view of the success of biological drugs. Growth of protein loop modeling for modulating biological drugs is anticipated. The fusion of currently distinct methods for the comparison of RNA and protein binding sites into a single comprehensive approach could allow the search for new selective ribosomal antibiotics and initiate pharmaceutical research into other nucleoproteins.
Affiliation(s)
- Janez Konc
  - Theory Department, National Institute of Chemistry, Ljubljana, Slovenia
  - Faculty of Pharmacy, University of Ljubljana, Ljubljana, Slovenia
  - Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Koper, Slovenia
  - Faculty of Chemistry and Chemical Technology, University of Maribor, Maribor, Slovenia