1. Tong Y, Childs-Disney JL, Disney MD. Targeting RNA with small molecules, from RNA structures to precision medicines: IUPHAR review: 40. Br J Pharmacol 2024; 181:4152-4173. [PMID: 39224931 DOI: 10.1111/bph.17308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 06/10/2024] [Accepted: 07/09/2024] [Indexed: 09/04/2024]
Abstract
RNA plays important roles in regulating both health and disease biology in all kingdoms of life. Notably, RNA can form intricate three-dimensional structures, and their biological functions are dependent on these structures. Targeting the structured regions of RNA with small molecules has gained increasing attention over the past decade, because it provides both chemical probes to study fundamental biology processes and lead medicines for diseases with unmet medical needs. Recent advances in RNA structure prediction and determination and RNA biology have accelerated the rational design and development of RNA-targeted small molecules to modulate disease pathology. However, challenges remain in advancing RNA-targeted small molecules towards clinical applications. This review summarizes strategies to study RNA structures, to identify small molecules recognizing these structures, and to augment the functionality of RNA-binding small molecules. We focus on recent advances in developing RNA-targeted small molecules as potential therapeutics in a variety of diseases, encompassing different modes of actions and targeting strategies. Furthermore, we present the current gaps between early-stage discovery of RNA-binding small molecules and their clinical applications, as well as a roadmap to overcome these challenges in the near future.
Affiliation(s)
- Yuquan Tong
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Jessica L Childs-Disney
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Matthew D Disney
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
2. Mittal A, Ali SE, Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc 2024; 4:e70054. [PMID: 39540715 DOI: 10.1002/cpz1.70054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
The structures of many non-coding RNAs (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. Here, we provide protocols for the use of four programs in the RNAstructure suite to predict conserved structures: Multilign, TurboFold, Dynalign, and PARTS. TurboFold iteratively aligns multiple homologous sequences and estimates the pairing probabilities for the conserved structure. Dynalign, PARTS, and Multilign are dynamic programming algorithms that simultaneously align sequences and identify the common secondary structure. Dynalign uses a pair of homologs and finds the lowest free energy common structure. PARTS uses a pair of homologs and estimates pairing probabilities from the base pairing probabilities estimated for each sequence. Multilign uses two or more homologs and finds the lowest free energy common structure using multiple pairwise calculations with Dynalign. It scales linearly with the number of sequences. We outline the strengths of each program. These programs can be run through web servers, on the command line, or with graphical user interfaces. © 2024 Wiley Periodicals LLC. Basic Protocol 1: Predicting a structure conserved in three or more sequences with the RNAstructure web server Basic Protocol 2: Predicting a structure conserved in two sequences with the RNAstructure web server Alternative Protocol 1: Predicting a structure conserved in multiple sequences in the RNAstructure graphical user interface Alternative Protocol 2: Predicting a structure conserved in two sequences with Dynalign in the RNAstructure graphical user interface Alternative Protocol 3: Running TurboFold on the command line.
Affiliation(s)
- Abhinav Mittal
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Sara E Ali
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
3. Yeh HY, Cox NA, Hinton A, Berrang ME. Detection and Distribution of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in Campylobacter jejuni Isolates from Chicken Livers. J Food Prot 2024; 87:100250. [PMID: 38382707 DOI: 10.1016/j.jfp.2024.100250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 02/12/2024] [Accepted: 02/15/2024] [Indexed: 02/23/2024]
Abstract
Campylobacter jejuni is the leading foodborne bacterial pathogen that causes human gastroenteritis worldwide linked to the consumption of undercooked broiler livers. Application of bacteriophages during poultry production has been used as an alternative approach to reduce contamination of poultry meat by Campylobacter. To make this approach effective, understanding the presence of the bacteriophage sequences in the CRISPR spacers in C. jejuni is critical as they may confer bacterial resistance to bacteriophage treatment. Therefore, in this study, we explored the distribution of the CRISPR arrays from 178 C. jejuni isolated from chicken livers between January and July 2018. Genomic DNA of C. jejuni isolates was extracted, and CRISPR type 1 sequences were amplified by PCR. Amplicons were purified and sequenced by the Sanger dideoxy sequencing method. Direct repeats (DRs) and spacers of CRISPR sequences were identified using the CRISPRFinder program. Further, spacer sequences were submitted to the CRISPRTarget to identify potential homology to bacteriophage types. Even though CRISPR-Cas is reportedly not an active system in Campylobacter, a total of 155 (87%) C. jejuni isolates were found to harbor CRISPR sequences; one type of DR was identified in all 155 isolates. The CRISPR loci lengths ranged from 97 to 431 nucleotides. The numbers of spacers ranged from one to six. A total of 371 spacer sequences were identified in the 155 isolates that could be grouped into 51 distinctive individual sequences. Further comparison of these 51 spacer sequences with those in databases showed that most spacer sequences were homologous to Campylobacter bacteriophage DA10. The results of our study provide important information relative to the development of an effective bacteriophage treatment to mitigate Campylobacter during poultry production.
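The tallies reported in this abstract (isolates carrying CRISPR sequences, total spacers, distinct spacers) are simple counts over per-isolate spacer lists. The following is a minimal Python sketch of that bookkeeping, not the authors' pipeline: the function name `summarize_spacers`, the input layout, and the toy sequences are all assumptions made for illustration.

```python
def summarize_spacers(spacers_by_isolate):
    """Tally CRISPR-positive isolates and spacer counts.

    spacers_by_isolate maps an isolate ID to its list of spacer
    sequences (an empty list means no CRISPR array was found).
    This layout is hypothetical, chosen only for the example.
    """
    if not spacers_by_isolate:
        raise ValueError("no isolates supplied")
    # Isolates with at least one spacer count as CRISPR-positive.
    positive = {iso: sp for iso, sp in spacers_by_isolate.items() if sp}
    all_spacers = [s for sp in positive.values() for s in sp]
    return {
        "isolates": len(spacers_by_isolate),
        "crispr_positive": len(positive),
        "percent_positive": 100.0 * len(positive) / len(spacers_by_isolate),
        "total_spacers": len(all_spacers),
        # Identical sequences seen in several isolates count once.
        "distinct_spacers": len(set(all_spacers)),
    }


# Toy data, invented for illustration; not taken from the study.
toy = {
    "iso1": ["ACGTACGT", "TTGGCCAA"],
    "iso2": ["ACGTACGT"],
    "iso3": [],
}
summary = summarize_spacers(toy)

# The percentage quoted in the abstract follows from its own counts.
reported_percent = round(100 * 155 / 178)
```

With the abstract's figures, 155 CRISPR-positive isolates out of 178 gives the quoted 87% after rounding.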
Affiliation(s)
- Hung-Yueh Yeh
  - U.S. National Poultry Research Center, Agricultural Research Service, United States Department of Agriculture, 950 College Station Road, Athens, GA 30605-2720, USA
- Nelson A Cox
  - U.S. National Poultry Research Center, Agricultural Research Service, United States Department of Agriculture, 950 College Station Road, Athens, GA 30605-2720, USA
- Arthur Hinton
  - U.S. National Poultry Research Center, Agricultural Research Service, United States Department of Agriculture, 950 College Station Road, Athens, GA 30605-2720, USA
- Mark E Berrang
  - U.S. National Poultry Research Center, Agricultural Research Service, United States Department of Agriculture, 950 College Station Road, Athens, GA 30605-2720, USA
4.
Abstract
RNAstructure is a user-friendly program for the prediction and analysis of RNA secondary structure. It is available as a web server, a program with a graphical user interface, or a set of command line tools. The programs are available for Microsoft Windows, macOS, or Linux. This article provides protocols for prediction of RNA secondary structure (using the web server, the graphical user interface, or the command line) and high-affinity oligonucleotide binding sites to a structured RNA target (using the graphical user interface). © 2023 Wiley Periodicals LLC. Basic Protocol 1: Predicting RNA secondary structure using the RNAstructure web server Alternate Protocol 1: Predicting secondary structure and base pair probabilities using the RNAstructure graphical user interface Alternate Protocol 2: Predicting secondary structure and base pair probabilities using the RNAstructure command line interface Basic Protocol 2: Predicting binding affinities of oligonucleotides complementary to an RNA target using OligoWalk.
Affiliation(s)
- Sara E. Ali
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- Abhinav Mittal
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- David H. Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
5. Palit P, Chowdhury FT, Baruah N, Sarkar B, Mou SN, Kamal M, Siddiqua TJ, Noor Z, Ahmed T. A Comprehensive Computational Investigation into the Conserved Virulent Proteins of Shigella species Unveils Potential Small-Interfering RNA Candidates as a New Therapeutic Strategy against Shigellosis. Molecules 2022; 27:1936. [PMID: 35335300 PMCID: PMC8950558 DOI: 10.3390/molecules27061936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Revised: 12/19/2021] [Accepted: 12/27/2021] [Indexed: 11/16/2022]
Abstract
Shigella species account for the second-leading cause of deaths due to diarrheal diseases among children of less than 5 years of age. The emergence of multi-drug-resistant Shigella isolates and the lack of availability of Shigella vaccines have led to the pertinence in the efforts made for the development of new therapeutic strategies against shigellosis. Consequently, designing small-interfering RNA (siRNA) candidates against such infectious agents represents a novel approach to propose new therapeutic candidates to curb the rampant rise of anti-microbial resistance in such pathogens. In this study, we analyzed 264 conserved sequences from 15 different conserved virulence genes of Shigella sp., through extensive rational validation using a plethora of first-generation and second-generation computational algorithms for siRNA designing. Fifty-eight siRNA candidates were obtained by using the first-generation algorithms, out of which only 38 siRNA candidates complied with the second-generation rules of siRNA designing. Further computational validation showed that 16 siRNA candidates were found to have a substantial functional efficiency, out of which 11 siRNA candidates were found to be non-immunogenic. Finally, three siRNA candidates exhibited a sterically feasible three-dimensional structure as exhibited by parameters of nucleic acid geometry such as: the probability of wrong sugar puckers, bad backbone conformations, bad bonds, and bad angles being within the accepted threshold for stable tertiary structure. Although the findings of our study require further wet-lab validation and optimization for therapeutic use in the treatment of shigellosis, the computationally validated siRNA candidates are expected to suppress the expression of the virulence genes, namely: IpgD (siRNA 9) and OspB (siRNA 15 and siRNA 17) and thus act as a prospective tool in the RNA interference (RNAi) pathway.
Affiliation(s)
- Parag Palit
  - International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
- Farhana Tasnim Chowdhury
  - Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- Namrata Baruah
  - Department of Biological Sciences and Bioengineering, Indian Institute of Technology, Kanpur 208016, Uttar Pradesh, India
- Bonoshree Sarkar
  - Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- Sadia Noor Mou
  - Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- Mehnaz Kamal
  - International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
- Towfida Jahan Siddiqua
  - International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
- Zannatun Noor
  - International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
- Tahmeed Ahmed
  - International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka 1212, Bangladesh
6. Hasan M, Ashik AI, Chowdhury MB, Tasnim AT, Nishat ZS, Hossain T, Ahmed S. Computational prediction of potential siRNA and human miRNA sequences to silence orf1ab associated genes for future therapeutics against SARS-CoV-2. Informatics in Medicine Unlocked 2021; 24:100569. [PMID: 33846694 PMCID: PMC8028608 DOI: 10.1016/j.imu.2021.100569] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/17/2020] [Revised: 03/26/2021] [Accepted: 03/31/2021] [Indexed: 12/12/2022]
Abstract
The coronavirus disease 2019 (COVID-19) is an ongoing pandemic caused by an RNA virus termed as severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). SARS-CoV-2 possesses an almost 30kbp long genome. The genome contains open-reading frame 1ab (ORF1ab) gene, the largest one of SARS-CoV-2, encoding polyprotein PP1ab and PP1a responsible for viral transcription and replication. Several vaccines have already been approved by the respective authorities over the world to develop herd immunity among the population. In consonance with this effort, RNA interference (RNAi) technology holds the possibility to strengthen the fight against this virus. Here, we have implemented a computational approach to predict potential short interfering RNAs including small interfering RNAs (siRNAs) and microRNAs (miRNAs), which are presumed to be intrinsically active against SARS-CoV-2. In doing so, we have screened miRNA library and siRNA library targeting the ORF1ab gene. We predicted the potential miRNA and siRNA candidate molecules utilizing an array of bioinformatic tools. By extending the analysis, out of 24 potential pre-miRNA hairpins and 131 siRNAs, 12 human miRNA and 10 siRNA molecules were sorted as potential therapeutic agents against SARS-CoV-2 based on their GC content, melting temperature (Tm), heat capacity (Cp), hybridization and minimal free energy (MFE) of hybridization. This computational study is focused on lessening the extensive time and labor needed in conventional trial and error based wet lab methods and it has the potential to act as a decent base for future researchers to develop a successful RNAi therapeutic.
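GC content is one of the ranking criteria named in this abstract, and it is straightforward to compute. The sketch below is an illustration only and does not reproduce the tools used in the study: the function names, the 30-52% acceptance window, and the 21-nt sequences are assumptions introduced for the example.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def filter_by_gc(candidates, low=30.0, high=52.0):
    """Keep candidates whose GC percentage lies in [low, high].

    The default window is an assumed, illustrative cutoff; the
    abstract does not state which thresholds the authors applied.
    """
    return [s for s in candidates if low <= gc_content(s) <= high]


# Made-up 21-nt sequences for illustration; not from the study.
candidates = [
    "AUGCAUGCAUGCAUGCAUGCA",  # 10 of 21 bases are G or C
    "GGGCCCGGGCCCGGGCCCGGG",  # every base is G or C
    "AAAAUUUUAAAAUUUUAAAAU",  # no G or C at all
]
kept = filter_by_gc(candidates)
```

Only the first toy sequence falls inside the assumed window, so `kept` contains that one candidate. Melting temperature, heat capacity, and hybridization free energy need a thermodynamic model and are left out of this sketch.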
Key Words
- ACE-2, Angiotensin-converting enzyme 2
- COVID-19
- COVID-19, coronavirus disease 2019
- Cp, heat capacity
- Gene silencing
- ORF, open reading frame
- Posttranscriptional regulation
- RNAi Therapeutics
- RNAi, RNA interference
- SARS-CoV-2
- SARS-CoV-2, severe acute respiratory syndrome coronavirus-2
- TMPRSS2, transmembrane protease serine 2
- Tm, melting temperature
- UTR, untranslated region
- hsa-miR, human microRNA
- miRNA
- miRNA, microRNA
- sgRNA, sub-genomic RNA
- siRNA
- siRNA, small interfering RNA
Affiliation(s)
- Mahedi Hasan
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Arafat Islam Ashik
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Md Belal Chowdhury
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Atiya Tahira Tasnim
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Zakia Sultana Nishat
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Tanvir Hossain
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Shamim Ahmed
  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
7. Conserved Structural Motifs of Two Distant IAV Subtypes in Genomic Segment 5 RNA. Viruses 2021; 13:525. [PMID: 33810157 PMCID: PMC8004953 DOI: 10.3390/v13030525] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/22/2021] [Revised: 03/18/2021] [Accepted: 03/19/2021] [Indexed: 12/14/2022]
Abstract
The functionality of RNA is fully dependent on its structure. For the influenza A virus (IAV), there are confirmed structural motifs mediating processes which are important for the viral replication cycle, including genome assembly and viral packaging. Although the RNA of strains originating from distant IAV subtypes might fold differently, some structural motifs are conserved, and thus, are functionally important. Nowadays, NGS-based structure modeling is a source of new in vivo data helping to understand RNA biology. However, for accurate modeling of in vivo RNA structures, these high-throughput methods should be supported with other analyses facilitating data interpretation. In vitro RNA structural models complement such approaches and offer RNA structures based on experimental data obtained in a simplified environment, which are needed for proper optimization and analysis. Herein, we present the secondary structure of the influenza A virus segment 5 vRNA of A/California/04/2009 (H1N1) strain, based on experimental data from DMS chemical mapping and SHAPE using NMIA, supported by base-pairing probability calculations and bioinformatic analyses. A comparison of the available vRNA5 structures among distant IAV strains revealed that a number of motifs present in the A/California/04/2009 (H1N1) vRNA5 model are highly conserved despite sequence differences, located within previously identified packaging signals, and the formation of which in in virio conditions has been confirmed. These results support functional roles of the RNA secondary structure motifs, which may serve as candidates for universal RNA-targeting inhibitory methods.
8. Berrio A, Gartner V, Wray GA. Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function. PeerJ 2020; 8:e10234. [PMID: 33088633 PMCID: PMC7571416 DOI: 10.7717/peerj.10234] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 09/15/2020] [Accepted: 10/04/2020] [Indexed: 12/20/2022]
Abstract
Background:
The emergence of a novel coronavirus (SARS-CoV-2) associated with severe acute respiratory disease (COVID-19) has prompted efforts to understand the genetic basis for its unique characteristics and its jump from non-primate hosts to humans. Tests for positive selection can identify apparently nonrandom patterns of mutation accumulation within genomes, highlighting regions where molecular function may have changed during the origin of a species. Several recent studies of the SARS-CoV-2 genome have identified signals of conservation and positive selection within the gene encoding Spike protein based on the ratio of synonymous to nonsynonymous substitution. Such tests cannot, however, detect changes in the function of RNA molecules.
Methods:
Here we apply a test for branch-specific oversubstitution of mutations within narrow windows of the genome without reference to the genetic code.
Results:
We recapitulate the finding that the gene encoding Spike protein has been a target of both purifying and positive selection. In addition, we find other likely targets of positive selection within the genome of SARS-CoV-2, specifically within the genes encoding Nsp4 and Nsp16. Homology-directed modeling indicates no change in either Nsp4 or Nsp16 protein structure relative to the most recent common ancestor. These SARS-CoV-2-specific mutations may affect molecular processes mediated by the positive or negative RNA molecules, including transcription, translation, RNA stability, and evasion of the host innate immune system. Our results highlight the importance of considering mutations in viral genomes not only from the perspective of their impact on protein structure, but also how they may impact other molecular processes critical to the viral life cycle.
Affiliation(s)
- Valerie Gartner
  - Department of Biology, Duke University, Durham, NC, USA
  - University Program in Genetics and Genomics, Duke University, Durham, NC, USA
- Gregory A. Wray
  - Department of Biology, Duke University, Durham, NC, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
9. Newire E, Aydin A, Juma S, Enne VI, Roberts AP. Identification of a Type IV-A CRISPR-Cas System Located Exclusively on IncHI1B/IncFIB Plasmids in Enterobacteriaceae. Front Microbiol 2020; 11:1937. [PMID: 32903441 PMCID: PMC7434947 DOI: 10.3389/fmicb.2020.01937] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/04/2020] [Accepted: 07/22/2020] [Indexed: 12/15/2022]
Abstract
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are diverse immune systems found in many prokaryotic genomes that target invading foreign DNA such as bacteriophages and plasmids. There are multiple types of CRISPR with arguably the most enigmatic being Type IV. During an investigation of CRISPR carriage in clinical, multi-drug resistant, Klebsiella pneumoniae, a Type IV-A3 CRISPR-Cas system was detected on plasmids from two K. pneumoniae isolates from Egypt (isolated in 2002-2003) and a single K. pneumoniae isolate from the United Kingdom (isolated in 2017). Sequence analysis of all other genomes available in GenBank revealed that this CRISPR-Cas system was present on 28 other plasmids from various Enterobacteriaceae hosts and was never found on a bacterial chromosome. This system is exclusively located on IncHI1B/IncFIB plasmids and is associated with multiple putative transposable elements. Expression of the cas loci was confirmed in the available clinical isolates by RT-PCR. In all cases, the CRISPR-Cas system has a single CRISPR array (CRISPR1) upstream of the cas loci which has several conserved spacers which, amongst other things, match regions within conjugal transfer genes of IncFIIK/IncFIB(K) plasmids. Our results reveal a Type IV-A3 CRISPR-Cas system exclusively located on IncHI1B/IncFIB plasmids in Enterobacteriaceae that is likely to be able to target IncFIIK/IncFIB(K) plasmids, presumably facilitating intracellular, inter-plasmid competition.
Affiliation(s)
- Enas Newire
  - UCL Eastman Dental Institute, University College London, London, United Kingdom
- Alp Aydin
  - Centre for Clinical Microbiology, Royal Free Hospital, University College London, London, United Kingdom
- Samina Juma
  - Centre for Clinical Microbiology, Royal Free Hospital, University College London, London, United Kingdom
- Virve I. Enne
  - Centre for Clinical Microbiology, Royal Free Hospital, University College London, London, United Kingdom
- Adam P. Roberts
  - Department of Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, United Kingdom
  - Centre for Drugs and Diagnostics, Liverpool School of Tropical Medicine, Liverpool, United Kingdom
10. Islam MO, Palit P, Shawon J, Hasan MK, Mahmud A, Mahfuz M, Ahmed T, Mondal D. Exploring novel therapeutic strategies against vivax malaria through an integrated computational investigation to inhibit the merozoite surface protein-1 of Plasmodium vivax. Informatics in Medicine Unlocked 2020. [DOI: 10.1016/j.imu.2020.100471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
11. Agüero-Chapin G, Galpert D, Molina-Ruiz R, Ancede-Gallardo E, Pérez-Machado G, De la Riva GA, Antunes A. Graph Theory-Based Sequence Descriptors as Remote Homology Predictors. Biomolecules 2019; 10:E26. [PMID: 31878100 PMCID: PMC7022958 DOI: 10.3390/biom10010026] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/19/2019] [Revised: 12/16/2019] [Accepted: 12/18/2019] [Indexed: 12/23/2022]
Abstract
Alignment-free (AF) methodologies have increased in popularity in the last decades as alternative tools to alignment-based (AB) algorithms for performing comparative sequence analyses. They have been especially useful to detect remote homologs within the twilight zone of highly diverse gene/protein families and superfamilies. The most popular alignment-free methodologies, as well as their applications to classification problems, have been described in previous reviews. Despite a new set of graph theory-derived sequence/structural descriptors that have been gaining relevance in the detection of remote homology, they have been omitted as AF predictors when the topic is addressed. Here, we first go over the most popular AF approaches used for detecting homology signals within the twilight zone and then bring out the state-of-the-art tools encoding graph theory-derived sequence/structure descriptors and their success for identifying remote homologs. We also highlight the tendency of integrating AF features/measures with the AB ones, either into the same prediction model or by assembling the predictions from different algorithms using voting/weighting strategies, for improving the detection of remote signals. Lastly, we briefly discuss the efforts made to scale up AB and AF features/measures for the comparison of multiple genomes and proteomes. Alongside the achieved experiences in remote homology detection by both the most popular AF tools and other less known ones, we provide our own using the graphical-numerical methodologies, MARCH-INSIDE, TI2BioP, and ProtDCal. We also present a new Python-based tool (SeqDivA) with a friendly graphical user interface (GUI) for delimiting the twilight zone by using several similar criteria.
Affiliation(s)
- Guillermin Agüero-Chapin
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Deborah Galpert
  - Departamento de Ciencia de la Computación, Universidad Central "Marta Abreu" de Las Villas (UCLV), Santa Clara 54830, Cuba
- Reinaldo Molina-Ruiz
  - Centro de Bioactivos Químicos (CBQ), Universidad Central "Marta Abreu" de Las Villas (UCLV), Santa Clara 54830, Cuba
- Evys Ancede-Gallardo
  - Programa de Doctorado en Fisicoquímica Molecular, Facultad de Ciencias Exactas, Universidad Andrés Bello, Av. República 239, Santiago 8370146, Chile
- Gisselle Pérez-Machado
  - EpiDisease S.L. Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Gustavo A. De la Riva
  - Laboratorio de Biotecnología Aplicada S. de R.L. de C.V., GRECA Inc., Carretera La Piedad-Carapán, km 3.5, La Piedad, Michoacán 59300, Mexico
  - Tecnológico Nacional de México, Instituto Tecnológico de la Piedad, Av. Ricardo Guzmán Romero, Santa Fe, La Piedad de Cavadas, Michoacán 59370, Mexico
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
12. Chowdhury FT, Shohan MU, Islam T, Mimu TT, Palit P. A Therapeutic Approach Against Leishmania donovani by Predicting RNAi Molecules Against the Surface Protein, gp63. Curr Bioinform 2019. [DOI: 10.2174/1574893613666180828095737] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023]
Abstract
Background:
Leishmaniasis is a disease caused by Leishmania species and can be classified into two major types: cutaneous and visceral leishmaniasis. Visceral leishmaniasis, the deadlier type, is mediated by Leishmania donovani; it involves the establishment of persistent infection and causes damage to the liver, spleen and bone marrow. With no vaccine yet available against leishmaniasis, and with the current therapeutic drugs being toxic and expensive, an alternative treatment is necessary.
Objective:
The surface glycocalyx protein gp63 plays a major role in the virulence and resulting pathogenicity associated with the disease. Hence, the aim of this study was to silence the gp63 mRNA through the RNA interference system.
Methods:
In this study, two competent siRNAs and three miRNAs were designed against gp63 for five different strains of L. donovani using various computational methods. Target-specific siRNAs were designed using siDirect 2.0, and candidate miRNAs were designed using a tool from IDT (Integrated DNA Technologies). Off-target similarity was screened by BLAST, and the GC contents and secondary structures of the designed RNAs were determined. RNA-RNA interaction was calculated by RNAcofold and IntaRNA, followed by determination of the heat capacity and the duplex concentration using the DNAmelt web server.
Results:
The selected RNAi molecules (two siRNAs and three miRNAs) had no off-targets in the human genome, and those with lower GC content were selected for efficient RNAi function. The selected molecules showed suitable thermodynamic characteristics to suppress expression of the pathogenic gp63 gene.
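The GC-content screen mentioned in the Methods and Results can be illustrated with a minimal Python sketch. The function names and the candidate sequences are hypothetical and chosen only for illustration; they are not taken from the study. Ranking the lowest GC content first is an assumption based on the abstract's statement that candidates with lower GC content were preferred.

```python
from typing import Dict, List, Tuple


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def rank_by_gc(candidates: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank candidate sequences by GC content, lowest first (assumed criterion)."""
    scored = [(name, gc_content(seq)) for name, seq in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical 21-nt guide strands, for illustration only.
    demo = {
        "cand_A": "UUCAAGAUAUCGUAGAUGCUA",
        "cand_B": "GGCCGCGAUCGGCCAGCGGCA",
    }
    for name, gc in rank_by_gc(demo):
        print(f"{name}: {gc:.1f}% GC")
```

A real design pipeline would combine such a screen with the off-target and thermodynamic checks the abstract describes; this sketch covers the GC calculation only.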
Affiliation(s)
- Farhana T. Chowdhury
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Mohammad U.S. Shohan
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Tasmia Islam
- Department of Genetic Engineering and Biotechnology, University of Dhaka, Dhaka, Bangladesh
| | - Taisha T. Mimu
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Parag Palit
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| |
|
13
|
Sullivan R, Adams MC, Naik RR, Milam VT. Analyzing Secondary Structure Patterns in DNA Aptamers Identified via CompELS. Molecules 2019; 24:molecules24081572. [PMID: 31010064 PMCID: PMC6515186 DOI: 10.3390/molecules24081572] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 02/28/2019] [Revised: 04/09/2019] [Accepted: 04/15/2019] [Indexed: 12/12/2022] Open
Abstract
In contrast to sophisticated high-throughput sequencing tools for genomic DNA, analytical tools for comparing secondary structure features between multiple single-stranded DNA sequences are less developed. For single-stranded nucleic acid ligands called aptamers, secondary structure is widely thought to play a pivotal role in driving recognition-based binding activity between an aptamer sequence and its specific target. Here, we employ a competition-based aptamer screening platform called CompELS to identify DNA aptamers for a colloidal target. We then analyze predicted secondary structures of the aptamers and a large population of random sequences to identify sequence features and patterns. Our secondary structure analysis identifies patterns ranging from position-dependent score matrixes of individual structural elements to position-independent consensus domains resulting from global alignment.
Affiliation(s)
- Richard Sullivan
- School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA.
| | - Mary Catherine Adams
- School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA.
| | - Rajesh R Naik
- 711 Human Performance Wing, Air Force Research Laboratory, Wright Patterson AFB, OH 45433, USA.
| | - Valeria T Milam
- School of Materials Science and Engineering, Georgia Institute of Technology, 771 Ferst Dr. NW, Atlanta, GA 30332-0245, USA.
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, 313 Ferst Dr., Atlanta, GA 30332, USA.
- Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, 315 Ferst Dr., Atlanta, GA 30332-0363, USA.
| |
|
14
|
Mobasseri N, Nikzad H, Karimian M. Protective effect of oestrogen receptor α-PvuII transition against idiopathic male infertility: a case-control study and meta-analysis. Reprod Biomed Online 2019; 38:588-598. [PMID: 30738766 DOI: 10.1016/j.rbmo.2019.01.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 01/02/2018] [Revised: 01/17/2019] [Accepted: 01/18/2019] [Indexed: 12/21/2022]
Abstract
RESEARCH QUESTION: Is there a genetic association between the oestrogen receptor alpha [ERα]-PvuII polymorphism and idiopathic male infertility? DESIGN: A total of 226 infertile and 213 fertile men participated in the present case-control study. ERα-PvuII genotyping was performed using the polymerase chain reaction-restriction fragment length polymorphism [PCR-RFLP] method. A meta-analysis was also performed by pooling data collected from seven other eligible studies identified by searches of the PubMed, Embase, Google Scholar, and Science Direct databases. Summary odds ratios were estimated by fixed- or random-effects models. The molecular effects of the ERα-PvuII polymorphism were evaluated using bioinformatics tools. RESULTS: A significant protective association was found between ERα-PvuII and male infertility in the homozygote model [OR=0.54, 95%CI=0.3-0.98, p=0.042]. A similar association was observed in the asthenozoospermia subgroup [OR=0.4, 95%CI=0.18-0.9, p=0.025]. The meta-analysis also revealed that the ERα-PvuII polymorphism was significantly associated with a decreased risk of male infertility in the heterozygote co-dominant model [OR=0.80, 95%CI=0.64-0.99, p=0.042]. Moreover, similar protective results were found in stratified analyses of the Caucasian subgroup in the dominant genetic model [OR=0.66, 95%CI=0.45-0.96, p=0.029] and in the heterozygote co-dominant model [OR=0.62, 95%CI=0.41-0.93, p=0.021]. A significant association was also found in studies with a sample size of fewer than 400 subjects in the heterozygote co-dominant model [OR=0.69, 95%CI=0.50-0.95, p=0.023]. The bioinformatics data indicated that the ERα-PvuII polymorphism could significantly affect the RNA structure of ERα [p=0.004]. CONCLUSION: The ERα-PvuII polymorphism could be considered a possible protective factor against male infertility.
Affiliation(s)
- Narges Mobasseri
- Gametogenesis Research Center, Kashan University of Medical Sciences, Kashan, Iran
| | - Hossein Nikzad
- Gametogenesis Research Center, Kashan University of Medical Sciences, Kashan, Iran.
| | - Mohammad Karimian
- Gametogenesis Research Center, Kashan University of Medical Sciences, Kashan, Iran.
| |
|
15
|
Vijayaraghavan B, Danabal K, Padmanabhan G, Ramanathan K. Study on Regulation of Low Density Lipoprotein Cholesterol Metabolism using PCSK9 Gene Silencing: A computational Approach. Bioinformation 2018; 14:248-251. [PMID: 30108423 PMCID: PMC6077823 DOI: 10.6026/97320630014248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/19/2018] [Revised: 05/05/2018] [Accepted: 05/30/2018] [Indexed: 01/15/2023] Open
Abstract
Combating and preventing abnormalities in lipid metabolism has become a pivotal research goal. Proprotein convertase subtilisin/kexin type 9 (PCSK9) is a circulating protein that promotes the degradation of low-density lipoprotein receptors (LDL-R) and hence increases LDL-C levels. Silencing the PCSK9 gene at the post-transcriptional level with small interfering ribonucleic acid (siRNA) offers new insight and a novel therapeutic route to regulating LDL-C metabolism. The aim was to design and select an efficient siRNA for silencing PCSK9 at the post-transcriptional level through a computational approach. We designed three siRNAs to silence each mRNA of PCSK9 through computational analysis using Invivogen software. Their minimum free energy of hybridization and their secondary structure were obtained using the bioinformatics tool BIBISERV2-RNAHYBRID. Further factors such as GC content, structural linearity and h-b index of the mRNA-siRNA complex were calculated to assess knockdown efficiency. The minimum free energies of hybridization of the three designed siRNAs (siRNA1, siRNA2 and siRNA3) with the target mRNA are -27.1 kcal/mol, -25.7 kcal/mol and -28.8 kcal/mol, respectively. siRNA1, having the least minimum free energy of hybridization, i.e. -27.1 kcal/mol, is predicted to be the most efficient for PCSK9 gene silencing.
Affiliation(s)
| | - Kavitha Danabal
- Department of Botany & Microbiology, AVVM Sri Puspam College (Autonomous), Poondi, Thanjavur, India
| | - Giri Padmanabhan
- Kidney Care, C50, 10TH B Cross, East Thillai Nagar, Tiruchirappalli-620 018, India
| | - Kumaresan Ramanathan
- Department of Medical Biochemistry, Division of Biomedical Sciences, School of Medicine, College of Health Sciences, Mekelle University (Ayder Campus), Mekelle, Ethiopia
| |
|
16
|
Sharif Shohan MU, Paul A, Hossain M. Computational design of potential siRNA molecules for silencing nucleoprotein gene of rabies virus. Future Virol 2018. [DOI: 10.2217/fvl-2017-0117] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/25/2022]
Abstract
Aim: Rabies virus infections are a global threat to human and animal health, yet no progressive curative therapy has been developed. In this study, the nucleoprotein gene of rabies virus which is responsible for viral infection was used as a target to design our desired siRNA. Methods: The conserved regions were analyzed by doing alignment of sequences from different strains. Subsequently, different computational tools were used for designing and validation of siRNA molecules. Results: We identified four probable siRNA molecules from twelve different strains of rabies virus which may silence the nucleoprotein gene and inhibit the multiplication of the virus. Conclusion: Our study may help to take an effective therapeutic approach against rabies virus and lead to better control of rabies in humans.
Affiliation(s)
| | - Anik Paul
- Department of Biochemistry & Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
| | - Motaher Hossain
- Department of Biochemistry & Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
| |
|
17
|
Li Y, Shi X, Liang Y, Xie J, Zhang Y, Ma Q. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation. BMC Bioinformatics 2017; 18:51. [PMID: 28109252 PMCID: PMC5251234 DOI: 10.1186/s12859-017-1481-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/25/2016] [Accepted: 01/10/2017] [Indexed: 01/10/2023] Open
Abstract
Background: RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step in understanding and interpreting their functional relationship. The majority of functional RNAs show conserved secondary structures rather than sequence conservation, so algorithms relying on sequence-based features are usually limited in their prediction performance. Hence, integrating RNA structure features is critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. Alignment-free algorithms for RNA comparison usually have lower time complexity than alignment-based algorithms. Results: An alignment-free RNA comparison algorithm was proposed, in which a novel numerical representation, RNA-TVcurve (triple vector curve representation), of the RNA sequence and corresponding secondary structure features is provided. A multi-scale similarity score for two given RNAs was then designed based on wavelet decomposition of their numerical representations. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of the numerical representation of RNA secondary structure; 2) detection of single-point mutations based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The web server requires RNA primary sequences as input, while the corresponding secondary structures are optional. When only primary sequences are given, the web server can compute the secondary structures by free energy minimization using the RNAfold tool from the Vienna RNA package. Conclusion: RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNA structure comparison. Comparison with two popular RNA comparison tools, RNApdist and RNAdistance, showed that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results are shown in an intuitive graphical manner and can be freely downloaded from the server. RNA-TVcurve, along with test examples and detailed documents, is available at: http://ml.jlu.edu.cn/tvcurve/.
Affiliation(s)
- Ying Li
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
| | - Xiaohu Shi
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
| | - Yanchun Liang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China; Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, 519041, China
| | - Juan Xie
- Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
| | - Yu Zhang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
| | - Qin Ma
- Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
| |
|
18
|
Lenartowicz E, Nogales A, Kierzek E, Kierzek R, Martínez-Sobrido L, Turner DH. Antisense Oligonucleotides Targeting Influenza A Segment 8 Genomic RNA Inhibit Viral Replication. Nucleic Acid Ther 2016; 26:277-285. [PMID: 27463680 PMCID: PMC5067832 DOI: 10.1089/nat.2016.0619] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 01/25/2023] Open
Abstract
Influenza A virus (IAV) affects 5%–10% of the world's population every year. Through genome changes, many IAV strains develop resistance to currently available anti-influenza therapeutics. Therefore, there is an urgent need to find new targets for therapeutics against this important human respiratory pathogen. In this study, 2′-O-methyl and locked nucleic acid antisense oligonucleotides (ASOs) were designed to target internal regions of influenza A/California/04/2009 (H1N1) genomic viral RNA segment 8 (vRNA8) based on a base-pairing model of vRNA8. Ten of 14 tested ASOs showed inhibition of viral replication in Madin-Darby canine kidney cells. The best five ASOs were 11–15 nucleotides long and showed inhibition ranging from 5- to 25-fold. In a cell viability assay they showed no cytotoxicity. The same five ASOs also showed no inhibition of influenza B/Brisbane/60/2008 (Victoria lineage), indicating that they are sequence specific for IAV. Moreover, combinations of ASOs slightly improved anti-influenza activity. These studies establish the accessibility of IAV vRNA for ASOs in regions other than the panhandle formed between the 5′ and 3′ ends. Thus, these regions can provide targets for the development of novel IAV antiviral approaches.
Affiliation(s)
| | - Aitor Nogales
- Department of Microbiology and Immunology, University of Rochester, Rochester, New York
| | - Elzbieta Kierzek
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Ryszard Kierzek
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Luis Martínez-Sobrido
- Department of Microbiology and Immunology, University of Rochester, Rochester, New York
| | - Douglas H Turner
- Department of Chemistry, University of Rochester, Rochester, New York
| |
|
19
|
Yan K, Arfat Y, Li D, Zhao F, Chen Z, Yin C, Sun Y, Hu L, Yang T, Qian A. Structure Prediction: New Insights into Decrypting Long Noncoding RNAs. Int J Mol Sci 2016; 17:ijms17010132. [PMID: 26805815 PMCID: PMC4730372 DOI: 10.3390/ijms17010132] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 10/10/2015] [Revised: 12/18/2015] [Accepted: 01/12/2016] [Indexed: 12/31/2022] Open
Abstract
Long noncoding RNAs (lncRNAs), which form a diverse class of RNAs, remain the least understood type of noncoding RNAs in terms of their nature and identification. Emerging evidence has revealed that a small number of newly discovered lncRNAs perform important and complex biological functions such as dosage compensation, chromatin regulation, genomic imprinting, and nuclear organization. However, understanding the wide range of functions of lncRNAs related to various processes of cellular networks remains a great experimental challenge. Structural versatility is critical for RNAs to perform various functions and provides new insights into probing the functions of lncRNAs. In recent years, the computational method of RNA structure prediction has been developed to analyze the structure of lncRNAs. This novel methodology has provided basic but indispensable information for the rapid, large-scale and in-depth research of lncRNAs. This review focuses on mainstream RNA structure prediction methods at the secondary and tertiary levels to offer an additional approach to investigating the functions of lncRNAs.
Affiliation(s)
- Kun Yan
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Yasir Arfat
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Dijie Li
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Fan Zhao
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Zhihao Chen
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Chong Yin
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Yulong Sun
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Lifang Hu
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| | - Tuanmin Yang
- Department of Bone Disease Oncology, Hong-Hui Hospital, Xi'an Jiaotong University College of Medicine, South Door slightly Friendship Road 555, Xi'an 710054, China.
| | - Airong Qian
- Key Laboratory for Space Bioscience & Biotechnology, Institute of Special Environmental Biophysics, School of Life Sciences, Northwestern Polytechnical University, 127 Youyi Xilu, Xi'an 710072, China.
| |
|
20
|
Abstract
Experimental probing data can be used to improve the accuracy of RNA secondary structure prediction. The software package RNAstructure can take advantage of enzymatic cleavage data, FMN cleavage data, traditional chemical modification reactivity data, and SHAPE reactivity data for secondary structure modeling. This chapter provides protocols for using experimental probing data with RNAstructure to restrain or constrain RNA secondary structure prediction.
Affiliation(s)
- Zhenjiang Zech Xu
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY, 14642, USA.
| |
|
21
|
Abstract
RNA structure is conserved by evolution to a greater extent than sequence. Predicting the conserved structure for multiple homologous sequences can be much more accurate than predicting the structure for a single sequence. RNAstructure is a software package that includes the programs Dynalign, Multilign, TurboFold, and PARTS for predicting conserved RNA secondary structure. This chapter provides protocols for using these programs.
|
22
|
Chatzou M, Magis C, Chang JM, Kemena C, Bussotti G, Erb I, Notredame C. Multiple sequence alignment modeling: methods and applications. Brief Bioinform 2015; 17:1009-1023. [PMID: 26615024 DOI: 10.1093/bib/bbv099] [Citation(s) in RCA: 98] [Impact Index Per Article: 9.8] [Received: 09/10/2015] [Revised: 10/16/2015] [Indexed: 12/20/2022] Open
Abstract
This review provides an overview on the development of Multiple sequence alignment (MSA) methods and their main applications. It is focused on progress made over the past decade. The three first sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method developments. The last part of the review gives an overview on available MSA local reliability estimators and their dependence on various algorithmic properties of available methods.
|
23
|
Nur SM, Hasan MA, Amin MA, Hossain M, Sharmin T. Design of Potential RNAi (miRNA and siRNA) Molecules for Middle East Respiratory Syndrome Coronavirus (MERS-CoV) Gene Silencing by Computational Method. Interdiscip Sci 2015. [PMID: 26223545 PMCID: PMC7090891 DOI: 10.1007/s12539-015-0266-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/18/2022]
Abstract
The Middle East respiratory syndrome coronavirus (MERS-CoV) causes a viral infection that manifests as fever, cough, shortness of breath, renal failure and severe acute pneumonia, and often results in a fatal outcome. MERS-CoV has been shown to spread between people who are in close contact. Transmission from infected patients to healthcare personnel has also been observed and is irredeemable with present technology. Genetic studies on MERS-CoV have shown that ORF1ab encodes replicase polyproteins and plays a foremost role in viral infection. Therefore, the ORF1ab replicase polyprotein may be a suitable target for disease control. Viral activity can be controlled by RNA interference (RNAi) technology, a leading method for post-transcriptional gene silencing in a sequence-specific manner. However, because of genetic inconsistency among viral isolates, it is a great challenge to design potential RNAi (miRNA and siRNA) molecules that silence the respective target genes rather than any other viral gene. In the current study, four effective miRNA and five siRNA molecules for silencing nine different strains of MERS-CoV were rationally designed and corroborated using computational methods, which might knock down the activity of the virus. The siRNA and miRNA molecules were predicted against the ORF1ab gene of different strains of MERS-CoV as effective candidates. Thus, this method may provide insight for the chemical synthesis of antiviral RNA molecules for the treatment of MERS-CoV at the genomic level.
Affiliation(s)
- Suza Mohammad Nur
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Chittagong, Chittagong, 4331, Bangladesh
| | - Md Anayet Hasan
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Chittagong, Chittagong, 4331, Bangladesh.
| | - Mohammad Al Amin
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Chittagong, Chittagong, 4331, Bangladesh
| | - Mehjabeen Hossain
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Chittagong, Chittagong, 4331, Bangladesh
| | - Tahmina Sharmin
- Department of Biotechnology and Genetic Engineering, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh
| |
|
24
|
Saule C, Giegerich R. Pareto optimization in algebraic dynamic programming. Algorithms Mol Biol 2015; 10:22. [PMID: 26150892 PMCID: PMC4491898 DOI: 10.1186/s13015-015-0051-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/28/2014] [Accepted: 05/07/2015] [Indexed: 11/10/2022] Open
Abstract
Pareto optimization combines independent objectives by computing the Pareto front of its search space, defined as the set of all solutions for which no other candidate solution scores better under all objectives. This gives, in a precise sense, better information than an artificial amalgamation of different scores into a single objective, but is more costly to compute. Pareto optimization naturally occurs with genetic algorithms, albeit in a heuristic fashion. Non-heuristic Pareto optimization so far has been used only with a few applications in bioinformatics. We study exact Pareto optimization for two objectives in a dynamic programming framework. We define a binary Pareto product operator [Formula: see text] on arbitrary scoring schemes. Independent of a particular algorithm, we prove that for two scoring schemes A and B used in dynamic programming, the scoring scheme [Formula: see text] correctly performs Pareto optimization over the same search space. We study different implementations of the Pareto operator with respect to their asymptotic and empirical efficiency. Without artificial amalgamation of objectives, and with no heuristics involved, Pareto optimization is faster than computing the same number of answers separately for each objective. For RNA structure prediction under the minimum free energy versus the maximum expected accuracy model, we show that the empirical size of the Pareto front remains within reasonable bounds. Pareto optimization lends itself to the comparative investigation of the behavior of two alternative scoring schemes for the same purpose. For the above scoring schemes, we observe that the Pareto front can be seen as a composition of a few macrostates, each consisting of several microstates that differ in the same limited way. We also study the relationship between abstract shape analysis and the Pareto front, and find that they extract information of a different nature from the folding space and can be meaningfully combined.
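The notion of a Pareto front for two objectives can be illustrated with a minimal Python sketch. This is not the algebraic dynamic programming operator studied in the paper; it simply filters an explicit list of candidates. It assumes both objectives are maximized and uses the common dominance convention (at least as good on both objectives and strictly better on one). The score pairs in the demo are made up.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by decreasing first objective."""
    # Sort by first objective (descending), breaking ties by second (descending).
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front: List[Point] = []
    best_second = float("-inf")
    for p in ordered:
        # Every point seen so far has a first score >= p's, so p survives
        # only if it strictly improves on the best second score so far.
        if p[1] > best_second:
            front.append(p)
            best_second = p[1]
    return front


if __name__ == "__main__":
    # Hypothetical (objective 1, objective 2) scores for six candidates.
    demo = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
    print(pareto_front(demo))
```

Sorting first makes the filter run in O(n log n) time, whereas comparing every pair with `dominates` would take O(n^2); the pairwise version is still useful as a cross-check.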
|
25
|
Fu Y, Sharma G, Mathews DH. Dynalign II: common secondary structure prediction for RNA homologs with domain insertions. Nucleic Acids Res 2015; 42:13939-48. [PMID: 25416799 PMCID: PMC4267632 DOI: 10.1093/nar/gku1172] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 01/22/2023] Open
Abstract
Homologous non-coding RNAs frequently exhibit domain insertions, where a branch of secondary structure is inserted in a sequence with respect to its homologs. Dynamic programming algorithms for common secondary structure prediction of multiple RNA homologs, however, do not account for these domain insertions. This paper introduces a novel dynamic programming algorithm methodology that explicitly accounts for the possibility of inserted domains when predicting common RNA secondary structures. The algorithm is implemented as Dynalign II, an update to the Dynalign software package for predicting the common secondary structure of two RNA homologs. This update is accomplished with negligible increase in computational cost. Benchmarks on ncRNA families with domain insertions validate the method. Over base pairs occurring in inserted domains, Dynalign II improves accuracy over Dynalign, attaining 80.8% sensitivity (compared with 14.4% for Dynalign) and 91.4% positive predictive value (PPV) for tRNA; 66.5% sensitivity (compared with 38.9% for Dynalign) and 57.0% PPV for RNase P RNA; and 50.1% sensitivity (compared with 24.3% for Dynalign) and 58.5% PPV for SRP RNA. Compared with Dynalign, Dynalign II also exhibits statistically significant improvements in overall sensitivity and PPV. Dynalign II is available as a component of RNAstructure, which can be downloaded from http://rna.urmc.rochester.edu/RNAstructure.html.
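The sensitivity and PPV figures quoted above follow the usual definitions for base-pair prediction: sensitivity is the fraction of known pairs that are predicted, and PPV is the fraction of predicted pairs that are known. A minimal Python sketch, assuming exact matching of (i, j) index pairs (published benchmarks may allow some positional tolerance) and using made-up pair lists:

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]


def score_pairs(predicted: Iterable[Pair], known: Iterable[Pair]) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for predicted base pairs against known base pairs."""
    predicted_set, known_set = set(predicted), set(known)
    true_positives = len(predicted_set & known_set)
    # Guard against empty inputs to avoid division by zero.
    sensitivity = true_positives / len(known_set) if known_set else 0.0
    ppv = true_positives / len(predicted_set) if predicted_set else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    # Hypothetical hairpin: four known pairs, three predicted, two correct.
    known = [(1, 10), (2, 9), (3, 8), (4, 7)]
    predicted = [(1, 10), (2, 9), (5, 12)]
    sens, ppv = score_pairs(predicted, known)
    print(f"sensitivity = {sens:.3f}, PPV = {ppv:.3f}")
```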
Affiliation(s)
- Yinghan Fu
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
| | - Gaurav Sharma
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
- Department of Electrical and Computer Engineering, University of Rochester, Hopeman 204, RC Box 270126, Rochester, NY 14627, USA
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 630, Rochester, NY 14642, USA
- To whom correspondence should be addressed. Tel: +1 585 275 1734; Fax: +1 585 275 6007;
| | - David H. Mathews
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 630, Rochester, NY 14642, USA
- To whom correspondence should be addressed. Tel: +1 585 275 1734; Fax: +1 585 275 6007;
| |
Collapse
|
26
|
Nur SM, Hasan MA, Amin MA, Hossain M, Sharmin T. Design of potential RNAi (miRNA and siRNA) molecules for Middle East respiratory syndrome coronavirus (MERS-CoV) gene silencing by computational method. Interdiscip Sci 2014. [PMID: 25373633 DOI: 10.1007/s12539-014-0208-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/08/2014] [Revised: 08/17/2014] [Accepted: 09/22/2014] [Indexed: 06/04/2023]
Abstract
The Middle East respiratory syndrome coronavirus (MERS-CoV) causes a viral infection with fever, cough, shortness of breath, renal failure and severe acute pneumonia, which often results in a fatal outcome. MERS-CoV has been shown to spread between people who are in close contact. Transmission from infected patients to healthcare personnel has also been observed and is irredeemable with present technology. Genetic studies on MERS-CoV have shown that ORF 1ab encodes replicase polyproteins and plays a foremost role in viral infection. Therefore, the ORF 1ab replicase polyprotein may be used as a suitable target for disease control. Viral activity can be controlled by RNA interference (RNAi) technology, a leading method for post-transcriptional gene silencing in a sequence-specific manner. However, because of genetic inconsistency among viral isolates, it is a great challenge to design potential RNAi (miRNA and siRNA) molecules that silence the respective target genes rather than any other viral gene simultaneously. In the current study, four effective miRNA and five siRNA molecules for silencing nine different strains of MERS-CoV were rationally designed and corroborated using computational methods, which might lead to knockdown of viral activity. siRNA and miRNA molecules were predicted against the ORF1ab gene of different strains of MERS-CoV as effective candidates using computational methods. Thus, this method may provide insight for the chemical synthesis of antiviral RNA molecules for the treatment of MERS-CoV at the genomic level.
Affiliation(s)
- Suza Mohammad Nur
- Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Chittagong, Chittagong, 4331, Bangladesh
|
27
|
Nur SM, Hasan MA, Amin MA, Hossain M, Sharmin T. Design of potential RNAi (miRNA and siRNA) molecules for Middle East respiratory syndrome coronavirus (MERS-CoV) gene silencing by computational method. Interdiscip Sci 2014. [PMID: 25519155 DOI: 10.1007/s12539-014-0233-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/08/2014] [Revised: 08/17/2014] [Accepted: 09/22/2014] [Indexed: 06/04/2023]
|
28
|
Affiliation(s)
- David H. Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
|
29
|
Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc Bioinformatics 2014; 46:12.4.1-12.4.22. [PMID: 24939126 DOI: 10.1002/0471250953.bi1204s46] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 11/10/2022]
Abstract
The structures of many non-coding RNA (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. This unit provides protocols for the use of four programs in the RNAstructure suite for prediction of conserved structures, Multilign, TurboFold, Dynalign, and PARTS. These programs can be run via Web servers, on the command line, or with graphical interfaces.
Affiliation(s)
- David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
|
30
|
Rasekhian M, Roohvand F, Teimoori-Toolabi L, Amini S, Azadmanesh K. Application of the 3'-noncoding region of poliovirus RNA for cell-based regulation of mRNA stability: implication for biotechnological applications. Biotechnol Appl Biochem 2014; 61:699-706. [PMID: 24612228 DOI: 10.1002/bab.1218] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/25/2013] [Accepted: 02/12/2014] [Indexed: 11/08/2022]
Abstract
Enrichment of production yield of therapeutic proteins in mammalian cell cultures by modulation of the mRNA stability of the target protein to increase its in vivo half-life is a new strategy in biotechnological applications. The present article describes one of the most novel approaches to modulate mRNA stability by application of the 3'-noncoding region (3'NCR) from an RNA viral genome in the expression constructs. Our data indicated that although utilizing the 3'NCR sequence from poliovirus (PV-3'NCR) downstream of the target gene might generally stabilize the secondary structure of RNA, it influenced the mRNA stability (and thereby the amount of protein production) in a cell type- and time-dependent manner, thus indicating a central role of mRNA-stabilizing binding sites/cellular factors in this process. Our data might be of interest to the biotechnology community to improve recombinant protein production in mammalian cell cultures and RNA-based therapy/vaccination approaches.
Affiliation(s)
- Mahsa Rasekhian
- Virology Department, Pasteur Institute of Iran, Tehran, Iran
|
31
|
Dotu I, Mechery V, Clote P. Energy parameters and novel algorithms for an extended nearest neighbor energy model of RNA. PLoS One 2014; 9:e85412. [PMID: 24586240 PMCID: PMC3931620 DOI: 10.1371/journal.pone.0085412] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/20/2013] [Accepted: 12/04/2013] [Indexed: 11/18/2022] Open
Abstract
We describe the first algorithm and software, RNAenn, to compute the partition function and minimum free energy secondary structure for RNA with respect to an extended nearest neighbor energy model. Our next-nearest-neighbor triplet energy model appears to lead to somewhat more cooperative folding than does the nearest neighbor energy model, as judged by melting curves computed with RNAenn and with two popular software implementations for the nearest-neighbor energy model. A web server is available at http://bioinformatics.bc.edu/clotelab/RNAenn/.
Affiliation(s)
- Ivan Dotu
- Biology Department, Boston College, Chestnut Hill, Massachusetts, United States of America
- Vinodh Mechery
- Hofstra North Shore-LIJ School of Medicine, Hempstead, New York, United States of America
- Peter Clote
- Biology Department, Boston College, Chestnut Hill, Massachusetts, United States of America
|
32
|
Who watches the watchmen? An appraisal of benchmarks for multiple sequence alignment. Methods Mol Biol 2014; 1079:59-73. [PMID: 24170395 DOI: 10.1007/978-1-62703-646-7_4] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022]
Abstract
Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to infer related residues among biological sequences. Thus alignment accuracy is crucial to a vast range of analyses, often in ways difficult to assess in those analyses. To compare the performance of different aligners and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued. Here we present an overview of the main strategies (based on simulation, consistency, protein structure, and phylogeny) and discuss their different advantages and associated risks. We outline a set of desirable characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that there is currently no universally applicable means of benchmarking MSA, and that developers and users of alignment tools should choose a benchmark according to the context of application, with a keen awareness of the assumptions underlying each benchmarking strategy.
|
33
|
Andronescu M, Condon A, Turner DH, Mathews DH. The determination of RNA folding nearest neighbor parameters. Methods Mol Biol 2014; 1097:45-70. [PMID: 24639154 DOI: 10.1007/978-1-62703-709-9_3] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 12/12/2022]
Abstract
The stability of RNA secondary structure can be predicted using a set of nearest neighbor parameters. These parameters are widely used by algorithms that predict secondary structure. This contribution introduces the UV optical melting experiments that are used to determine the folding stability of short RNA strands. It explains how the nearest neighbor parameters are chosen and how the values are fit to the data. A sample nearest neighbor calculation is provided. The contribution concludes with new methods that use the database of sequences with known structures to determine parameter values.
Affiliation(s)
- Mirela Andronescu
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
|
34
|
Andersen ES. The art of editing RNA structural alignments. Methods Mol Biol 2014; 1097:379-394. [PMID: 24639168 DOI: 10.1007/978-1-62703-709-9_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Manual editing of RNA structural alignments may be considered more art than science, since it still requires an expert biologist to take multiple levels of information into account and be slightly creative when constructing high-quality alignments. Even though the task is rather tedious, it is rewarded by great insight into the evolution of structure and function of your favorite RNA molecule. In this chapter I will review the methods and considerations that go into constructing RNA structural alignments at the secondary and tertiary structure level; introduce software, databases, and algorithms that have proven useful in semiautomating the work process; and suggest future directions towards full automatization.
|
35
|
Lai D, Proctor JR, Meyer IM. On the importance of cotranscriptional RNA structure formation. RNA (New York, N.Y.) 2013; 19:1461-1473. [PMID: 24131802 PMCID: PMC3851714 DOI: 10.1261/rna.037390.112] [Citation(s) in RCA: 125] [Impact Index Per Article: 10.4] [Indexed: 05/29/2023]
Abstract
The expression of genes, both coding and noncoding, can be significantly influenced by RNA structural features of their corresponding transcripts. There is by now mounting experimental and some theoretical evidence that structure formation in vivo starts during transcription and that this cotranscriptional folding determines the functional RNA structural features that are being formed. Several decades of research in bioinformatics have resulted in a wide range of computational methods for predicting RNA secondary structures. Almost all state-of-the-art methods in terms of prediction accuracy, however, completely ignore the process of structure formation and focus exclusively on the final RNA structure. This review hopes to bridge this gap. We summarize the existing evidence for cotranscriptional folding and then review the different, currently used strategies for RNA secondary-structure prediction. Finally, we propose a range of ideas on how state-of-the-art methods could be potentially improved by explicitly capturing the process of cotranscriptional structure formation.
|
36
|
Schnattinger T, Schöning U, Marchfelder A, Kestler HA. RNA-Pareto: interactive analysis of Pareto-optimal RNA sequence-structure alignments. Bioinformatics 2013; 29:3102-4. [PMID: 24045774 DOI: 10.1093/bioinformatics/btt536] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Incorporating secondary structure information into the alignment process improves the quality of RNA sequence alignments. Instead of using fixed weighting parameters, sequence and structure components can be treated as different objectives and optimized simultaneously. The result is not a single, but a Pareto-set of equally optimal solutions, which all represent different possible weighting parameters. We now provide the interactive graphical software tool RNA-Pareto, which allows a direct inspection of all feasible results to the pairwise RNA sequence-structure alignment problem and greatly facilitates the exploration of the optimal solution set.
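The idea of a Pareto set can be illustrated with a small sketch. It assumes each candidate alignment has already been scored on two objectives, a sequence score and a structure score, both to be maximized. The function name `pareto_front` and the score tuples are hypothetical; this is not RNA-Pareto's implementation, which works on the alignment problem itself rather than on a precomputed list.

```python
def pareto_front(points):
    """Return the points that no other point dominates.

    Each point is a (sequence_score, structure_score) tuple; higher is
    better for both. A point q dominates p if q is at least as good in
    both objectives and differs from p.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (sequence_score, structure_score) values for six alignments.
alignments = [(10, 2), (8, 5), (6, 6), (7, 4), (5, 5), (10, 1)]
front = pareto_front(alignments)
print(front)
```

Every point in `front` is the best choice for some weighting of the two objectives or a trade-off between such choices, which is why no fixed weighting parameter has to be selected in advance.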
Affiliation(s)
- Thomas Schnattinger
- Institute of Theoretical Computer Science, Medical Systems Biology and Biology II, Ulm University, D-89069 Ulm, Germany
|
37
|
Bussotti G, Notredame C, Enright AJ. Detecting and comparing non-coding RNAs in the high-throughput era. Int J Mol Sci 2013; 14:15423-58. [PMID: 23887659 PMCID: PMC3759867 DOI: 10.3390/ijms140815423] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 05/31/2013] [Revised: 07/16/2013] [Accepted: 07/17/2013] [Indexed: 02/07/2023] Open
Abstract
In recent years there has been a growing interest in the field of non-coding RNA. This surge is a direct consequence of the discovery of a huge number of new non-coding genes and of the finding that many of these transcripts are involved in key cellular functions. In this context, accurately detecting and comparing RNA sequences has become important. Aligning nucleotide sequences is a key requisite when searching for homologous genes. Accurate alignments reveal evolutionary relationships, conserved regions and more generally any biologically relevant pattern. Comparing RNA molecules is, however, a challenging task. The nucleotide alphabet is simpler and therefore less informative than that of amino-acids. Moreover for many non-coding RNAs, evolution is likely to be mostly constrained at the structural level and not at the sequence level. This results in very poor sequence conservation impeding comparison of these molecules. These difficulties define a context where new methods are urgently needed in order to exploit experimental results to their full potential. This review focuses on the comparative genomics of non-coding RNAs in the context of new sequencing technologies and especially dealing with two extremely important and timely research aspects: the development of new methods to align RNAs and the analysis of high-throughput data.
Affiliation(s)
- Giovanni Bussotti
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cedric Notredame
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), Aiguader, 88, 08003 Barcelona, Spain
- Anton J. Enright
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
38
|
Meyer F, Kurtz S, Beckstette M. Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns. BMC Bioinformatics 2013; 14:226. [PMID: 23865810 PMCID: PMC3765529 DOI: 10.1186/1471-2105-14-226] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/27/2013] [Accepted: 07/11/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. RESULTS We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to a factor of 45. Our best new index-based algorithm achieves a speedup of a factor of 560. CONCLUSIONS The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide RaligNAtor, a robust and well-documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator.
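The semi-global alignment with an edit distance threshold described above can be sketched in a much simplified, sequence-only form. The example below is the classic Sellers-style dynamic program over single-base edits; it is an assumption-laden illustration and does not model base-pair edit operations, suffix-array indexing, or chaining, so it is not the RaligNAtor algorithm. The name `approximate_matches` and its parameters are hypothetical.

```python
def approximate_matches(pattern, text, max_cost):
    """Report (end_position, cost) for substrings of `text` that match
    `pattern` within `max_cost` single-base edits.

    Semi-global alignment: the whole pattern must be aligned, but the
    match may start and end anywhere in the text. End positions are
    1-based.
    """
    m = len(pattern)
    # Column for the empty text prefix: aligning pattern[:i] to nothing
    # costs i (one gap per pattern base).
    prev = list(range(m + 1))
    hits = []
    for j, base in enumerate(text, start=1):
        # Row 0 is always 0, so a match may begin at any text position.
        curr = [0]
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (pattern[i - 1] != base)
            skip_text_base = prev[i] + 1
            skip_pattern_base = curr[i - 1] + 1
            curr.append(min(substitution, skip_text_base, skip_pattern_base))
        if curr[m] <= max_cost:
            hits.append((j, curr[m]))
        prev = curr
    return hits


# Hypothetical toy sequences, for illustration only.
exact = approximate_matches("GGAC", "AAGGACUU", max_cost=0)
one_edit = approximate_matches("GGAC", "AAGCACUU", max_cost=1)
print(exact, one_edit)
```

This sketch runs in time proportional to the pattern length times the text length; the abstract's contribution is precisely to avoid that cost on large databases through matrix reuse and index structures.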
Affiliation(s)
- Fernando Meyer
- Center for Bioinformatics, University of Hamburg, Bundesstrasse 43, Hamburg 20146, Germany.
|
39
|
Schnattinger T, Schöning U, Kestler HA. Structural RNA alignment by multi-objective optimization. Bioinformatics 2013; 29:1607-13. [DOI: 10.1093/bioinformatics/btt188] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
|
40
|
Integrating chemical footprinting data into RNA secondary structure prediction. PLoS One 2012; 7:e45160. [PMID: 23091593 PMCID: PMC3473038 DOI: 10.1371/journal.pone.0045160] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Received: 04/17/2012] [Accepted: 08/16/2012] [Indexed: 01/20/2023] Open
Abstract
Chemical and enzymatic footprinting experiments, such as SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension), yield important information about RNA secondary structure. Indeed, since the 2′-hydroxyl is reactive at flexible (loop) regions, but unreactive at base-paired regions, SHAPE yields quantitative data about which RNA nucleotides are base-paired. Recently, low error rates in secondary structure prediction have been reported for three RNAs of moderate size, by including base stacking pseudo-energy terms derived from SHAPE data into the computation of minimum free energy secondary structure. Here, we describe a novel method, RNAsc (RNA soft constraints), which includes pseudo-energy terms for each nucleotide position, rather than only for base stacking positions. We prove that RNAsc is self-consistent, in the sense that the nucleotide-specific probabilities of being unpaired in the low energy Boltzmann ensemble always become more closely correlated with the input SHAPE data after application of RNAsc. From this mathematical perspective, the secondary structure predicted by RNAsc should be ‘correct’, in as much as the SHAPE data is ‘correct’. We benchmark RNAsc against the previously mentioned method for eight RNAs, for which both SHAPE data and native structures are known, to find the same accuracy in 7 out of 8 cases, and an improvement of 25% in one case. Furthermore, we present what appears to be the first direct comparison of SHAPE data and in-line probing data, by comparing yeast asp-tRNA SHAPE data from the literature with data from in-line probing experiments we have recently performed. With respect to several criteria, we find that SHAPE data appear to be more robust than in-line probing data, at least in the case of asp-tRNA.
|
41
|
Wei DD, Shao R, Yuan ML, Dou W, Barker SC, Wang JJ. The multipartite mitochondrial genome of Liposcelis bostrychophila: insights into the evolution of mitochondrial genomes in bilateral animals. PLoS One 2012; 7:e33973. [PMID: 22479490 PMCID: PMC3316519 DOI: 10.1371/journal.pone.0033973] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 01/19/2012] [Accepted: 02/24/2012] [Indexed: 11/18/2022] Open
Abstract
Booklice (order Psocoptera) in the genus Liposcelis are major pests to stored grains worldwide and are closely related to parasitic lice (order Phthiraptera). We sequenced the mitochondrial (mt) genome of Liposcelis bostrychophila and found that the typical single mt chromosome of bilateral animals has fragmented into and been replaced by two medium-sized chromosomes in this booklouse; each of these chromosomes has about half of the genes of the typical mt chromosome of bilateral animals. These mt chromosomes are 8,530 bp (mt chromosome I) and 7,933 bp (mt chromosome II) in size. Intriguingly, mt chromosome I is twice as abundant as chromosome II. It appears that the selection pressure for compact mt genomes in bilateral animals favors small mt chromosomes when small mt chromosomes co-exist with the typical large mt chromosomes. Thus, small mt chromosomes may have selective advantages over large mt chromosomes in bilateral animals. Phylogenetic analyses of mt genome sequences of Psocodea (i.e. Psocoptera plus Phthiraptera) indicate that: 1) the order Psocoptera (booklice and barklice) is paraphyletic; and 2) the order Phthiraptera (the parasitic lice) is monophyletic. Within parasitic lice, however, the suborder Ischnocera is paraphyletic; this differs from the traditional view that each suborder of parasitic lice is monophyletic.
Affiliation(s)
- Dan-Dan Wei
- Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China
- Renfu Shao
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia
- School of Science, Education and Engineering, University of the Sunshine Coast, Maroochydore, Queensland, Australia
- Ming-Long Yuan
- Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China
- Wei Dou
- Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China
- Stephen C. Barker
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia
- Jin-Jun Wang
- Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China
|
42
|
Advances in the study of helminth mitochondrial genomes and their associated applications. CHINESE SCIENCE BULLETIN-CHINESE 2012. [DOI: 10.1007/s11434-011-4748-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/14/2022]
|
43
|
Abstract
RNA is now appreciated to serve numerous cellular roles, and understanding RNA structure is important for understanding a mechanism of action. This contribution discusses the methods available for predicting RNA structure. Secondary structure is the set of the canonical base pairs, and secondary structure can be accurately determined by comparative sequence analysis. Secondary structure can also be predicted. The most commonly used method is free energy minimization. The accuracy of structure prediction is improved either by using experimental mapping data or by predicting a structure conserved in a set of homologous sequences. Additionally, tertiary structure, the three-dimensional arrangement of atoms, can be modeled with guidance from comparative analysis and experimental techniques. New approaches are also available for predicting tertiary structure.
Affiliation(s)
- Matthew G Seetin
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY, USA
|
44
|
Agüero-Chapin G, Sánchez-Rodríguez A, Hidalgo-Yanes PI, Pérez-Castillo Y, Molina-Ruiz R, Marchal K, Vasconcelos V, Antunes A. An alignment-free approach for eukaryotic ITS2 annotation and phylogenetic inference. PLoS One 2011; 6:e26638. [PMID: 22046320 PMCID: PMC3202569 DOI: 10.1371/journal.pone.0026638] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 05/16/2011] [Accepted: 09/29/2011] [Indexed: 02/02/2023] Open
Abstract
The ITS2 gene class shows a high sequence divergence among its members, which has complicated its annotation and its use for reconstructing phylogenies at a higher taxonomical level (beyond species and genus). Several alignment strategies have been implemented to improve the ITS2 annotation quality and its use for phylogenetic inferences. Although alignment-based methods have been exploited to the limit of their complexity to tackle both issues, no alignment-free approaches have been able to successfully address both topics. By contrast, the use of simple alignment-free classifiers, like the topological indices (TIs) containing information about the sequence and structure of ITS2, may prove to be a useful approach for gene prediction and for assessing the phylogenetic relationships of the ITS2 class in eukaryotes. Thus, we used the TI2BioP (Topological Indices to BioPolymers) methodology [1], [2], freely available at http://ti2biop.sourceforge.net/, to calculate two different classes of TIs. One class was derived from the ITS2 artificial 2D structures generated from DNA strings and the other from the secondary structure inferred from RNA folding algorithms. Two alignment-free models based on Artificial Neural Networks were developed for the ITS2 class prediction using the two classes of TIs referred to above. Both models showed similar performances on the training and the test sets, reaching values above 95% in the overall classification. Due to the importance of the ITS2 region for fungi identification, a novel ITS2 genomic sequence was isolated from Petrakia sp. This sequence and the test set were used to comparatively evaluate the conventional classification models based on multiple sequence alignments, like Hidden Markov based approaches, revealing the success of our models in identifying novel ITS2 members. The isolated sequence was assessed using traditional and alignment-free based techniques applied to phylogenetic inference to complement the taxonomy of the Petrakia sp. fungal isolate.
Collapse
Affiliation(s)
- Guillermin Agüero-Chapin
- CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal
- Molecular Simulation and Drug Design (CBQ), Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, Cuba
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Pedro I. Hidalgo-Yanes
- Molecular Simulation and Drug Design (CBQ), Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, Cuba
- Area of Microbiology, University of León, León, Spain
- Yunierkis Pérez-Castillo
- Molecular Simulation and Drug Design (CBQ), Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, Cuba
- Reinaldo Molina-Ruiz
- Molecular Simulation and Drug Design (CBQ), Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, Cuba
- Kathleen Marchal
- CMPG, Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
- Vítor Vasconcelos
- CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Agostinho Antunes
- CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
45
Complete mitochondrial genomes of Baylisascaris schroederi, Baylisascaris ailuri and Baylisascaris transfuga from giant panda, red panda and polar bear. Gene 2011; 482:59-67. [PMID: 21621593 DOI: 10.1016/j.gene.2011.05.004] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Received: 02/19/2011] [Revised: 05/01/2011] [Accepted: 05/10/2011] [Indexed: 11/24/2022]
Abstract
Roundworms of the genus Baylisascaris are the most common parasitic nematodes of the intestinal tracts of wild mammals, and most of them have significant impacts on veterinary and public health. Mitochondrial (mt) genomes provide a foundation for studying the epidemiology and ecology of these parasites and may therefore be used to assist in the control of baylisascariasis. Here, we determined the complete mtDNA sequences of Baylisascaris schroederi, Baylisascaris ailuri and Baylisascaris transfuga, which are 14,778 bp, 14,657 bp and 14,898 bp in size, respectively. Each mtDNA encodes 12 protein-coding genes, 22 transfer RNAs and 2 ribosomal RNAs, as is typical for other chromadorean nematodes. The gene arrangements of the three Baylisascaris species are the same as those of the Ascaridata species, but radically different from those of the Spirurida species. Phylogenetic analysis based on concatenated amino acid sequences of 12 protein-coding genes from nine nematode species indicated that the three Baylisascaris species are more closely related to Ascaris suum than to the three Toxocara species (Toxocara canis, Toxocara cati and Toxocara malaysiensis) and Anisakis simplex, and that B. ailuri is more closely related to B. transfuga than to B. schroederi. The determination of the complete mt genome sequences for these three Baylisascaris species (the first members of the genus Baylisascaris ever sequenced) is important for refining the phylogenetic relationships within the order Ascaridida, and provides new molecular data for population genetic, systematic, epidemiological and ecological studies of parasitic nematodes of socio-economic importance in wildlife.
46
Sahraeian SME, Yoon BJ. PicXAA-R: efficient structural alignment of multiple RNA sequences using a greedy approach. BMC Bioinformatics 2011; 12 Suppl 1:S38. [PMID: 21342569 PMCID: PMC3044294 DOI: 10.1186/1471-2105-12-s1-s38] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 01/30/2023] Open
Abstract
Background: Accurate and efficient structural alignment of non-coding RNAs (ncRNAs) has attracted increasing attention as recent studies have unveiled the significance of ncRNAs in living organisms. Because Sankoff-style structural alignment algorithms cannot efficiently handle multiple sequences, progressive schemes are mostly used to reduce the complexity. However, this approach tends to propagate early-stage errors throughout the entire process, thereby degrading the quality of the final alignment. For multiple protein sequence alignment, we have recently proposed PicXAA, which constructs an accurate alignment in a non-progressive fashion. Results: Here, we propose PicXAA-R as an extension of PicXAA for greedy structural alignment of ncRNAs. PicXAA-R efficiently captures both the folding information within each sequence and the local similarities between sequences. It uses a set of probabilistic consistency transformations to improve the posterior base-pairing and base alignment probabilities using the information from all sequences in the alignment. Using a graph-based scheme, we greedily build up the structural alignment from sequence regions with high base-pairing and base alignment probabilities. Conclusions: Several experiments on datasets with different characteristics confirm that PicXAA-R is one of the fastest algorithms for structural alignment of multiple RNAs and that it consistently yields accurate alignment results, especially for datasets with locally similar sequences. PicXAA-R source code is freely available at: http://www.ece.tamu.edu/~bjyoon/picxaa/.
47
Abstract
RNA localisation is an important mode of delivering proteins to their site of function. Cis-acting signals within the RNAs, which can be thought of as zip-codes, determine the site of localisation. There are few examples of fully characterised RNA signals, but the signals are thought to be defined through a combination of primary, secondary, and tertiary structures. In this chapter, we describe a selection of computational methods for predicting RNA secondary structure, identifying localisation signals, and searching for similar localisation signals on a genome-wide scale. The chapter is aimed at the biologist rather than presenting the details of each of the individual methods.
48
Xu Z, Mathews DH. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences. Bioinformatics 2010; 27:626-32. [PMID: 21193521 DOI: 10.1093/bioinformatics/btq726] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 01/28/2023]
Abstract
MOTIVATION: With recent advances in sequencing, structural and functional studies of RNA lag behind the discovery of sequences. Computational analysis of RNA is increasingly important for revealing structure-function relationships quickly and at low cost. The purpose of this study is to use multiple homologous sequences to infer a conserved RNA structure. RESULTS: A new algorithm, called Multilign, is presented to find the lowest free energy RNA secondary structure common to multiple sequences. Multilign is based on Dynalign, a program that simultaneously aligns and folds two sequences to find the lowest free energy conserved structure. In Multilign, Dynalign is used to progressively construct a conserved structure from multiple pairwise calculations, with one sequence used in all pairwise calculations. A base pair is predicted only if it is contained in the set of low free energy structures predicted by all Dynalign calculations. In this way, Multilign improves prediction accuracy by keeping the genuine base pairs and excluding competing false base pairs. Its computational complexity scales linearly with the number of sequences. Multilign was tested on extensive datasets of sequences with known structure, and its prediction accuracy is among the best of available algorithms. Multilign can run on long sequences (> 1500 nt) and an arbitrarily large number of sequences. AVAILABILITY: The algorithm is implemented in ANSI C++ and can be downloaded as part of the RNAstructure package at: http://rna.urmc.rochester.edu.
Affiliation(s)
- Zhenjiang Xu
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, USA
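The filtering rule stated in the Multilign abstract above (a base pair is kept only if every pairwise calculation involving the shared sequence predicts it) can be illustrated with a small set intersection. This is a minimal sketch, not the RNAstructure implementation: the function name `consensus_pairs` and the example pair sets are hypothetical, and each input set is assumed to already hold the low free energy base pairs that one pairwise run reported for the shared sequence.

```python
from typing import Iterable, Set, Tuple

# A base pair (i, j) with i < j, given as 1-based positions in the shared sequence.
BasePair = Tuple[int, int]


def consensus_pairs(pairwise_predictions: Iterable[Set[BasePair]]) -> Set[BasePair]:
    """Return the base pairs that appear in every pairwise prediction."""
    predictions = list(pairwise_predictions)
    if not predictions:
        return set()
    kept = set(predictions[0])
    for pairs in predictions[1:]:
        kept &= pairs  # drop any pair that this run did not predict
    return kept


# Hypothetical pair sets for the shared sequence from three pairwise runs.
run_a = {(1, 20), (2, 19), (3, 18), (7, 12)}
run_b = {(1, 20), (2, 19), (3, 18), (8, 13)}
run_c = {(1, 20), (2, 19), (5, 15)}

print(sorted(consensus_pairs([run_a, run_b, run_c])))  # [(1, 20), (2, 19)]
```

Pairs such as (7, 12) and (8, 13) compete for overlapping positions and each appears in only one run, so the intersection removes them. That is the sense in which the abstract describes excluding competing false base pairs.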
49
Raasch P, Schmitz U, Patenge N, Vera J, Kreikemeyer B, Wolkenhauer O. Non-coding RNA detection methods combined to improve usability, reproducibility and precision. BMC Bioinformatics 2010; 11:491. [PMID: 20920260 PMCID: PMC2955705 DOI: 10.1186/1471-2105-11-491] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 05/10/2010] [Accepted: 09/29/2010] [Indexed: 11/10/2022] Open
Abstract
Background: Non-coding RNAs are gaining more attention as their diverse roles in many cellular processes are discovered. At the same time, the need for efficient computational prediction of ncRNAs increases with the pace of sequencing technology. Existing tools are based on various approaches and techniques, but none of them yet provides a reliable ncRNA detector. Consequently, a natural approach is to combine existing tools. Because standard input and output formats are lacking, combining and comparing existing tools is difficult. Also, for genomic scans they often need to be incorporated into detection workflows using custom scripts, which decreases transparency and reproducibility. Results: We developed a Java-based framework to integrate existing tools and methods for ncRNA detection. This framework enables users to construct transparent detection workflows and to combine and compare different methods efficiently. We demonstrate the effectiveness of combining detection methods in case studies with the small genomes of Escherichia coli, Listeria monocytogenes and Streptococcus pyogenes. With the combined method, we gained 10% to 20% precision for sensitivities from 30% to 80%. Further, we investigated Streptococcus pyogenes for novel ncRNAs. Using multiple methods, integrated by our framework, we determined four highly probable candidates. We verified all four candidates experimentally using RT-PCR. Conclusions: We have created an extensible framework for practical, transparent and reproducible combination and comparison of ncRNA detection methods. We have proven the effectiveness of this approach in tests and by guiding experiments to find new ncRNAs. The software is freely available under the GNU General Public License (GPL), version 3 at http://www.sbi.uni-rostock.de/moses along with source code, screen shots, examples and tutorial material.
Affiliation(s)
- Peter Raasch
- Systems Biology and Bioinformatics Group, University of Rostock, Rostock, Germany
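The precision and sensitivity figures quoted in the abstract above depend on how the calls of several detectors are merged. The abstract does not say which combination rule the framework uses, so the sketch below assumes a simple vote threshold purely for illustration. All names (`combine_by_votes`, `precision_sensitivity`, the tool labels and locus identifiers) are hypothetical and do not come from the published software.

```python
from typing import Dict, Set, Tuple


def combine_by_votes(calls: Dict[str, Set[str]], min_votes: int) -> Set[str]:
    """Keep loci reported by at least `min_votes` detectors (assumed rule)."""
    votes: Dict[str, int] = {}
    for loci in calls.values():
        for locus in loci:
            votes[locus] = votes.get(locus, 0) + 1
    return {locus for locus, n in votes.items() if n >= min_votes}


def precision_sensitivity(predicted: Set[str], known: Set[str]) -> Tuple[float, float]:
    """Precision = TP / predicted, sensitivity = TP / known."""
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(known) if known else 0.0
    return precision, sensitivity


# Hypothetical annotated ncRNA loci and per-tool predictions.
known = {"r1", "r2", "r3", "r4"}
calls = {
    "tool_a": {"r1", "r2", "x1", "x2"},
    "tool_b": {"r1", "r2", "r3", "x1"},
    "tool_c": {"r2", "r3", "x3"},
}

combined = combine_by_votes(calls, min_votes=2)
print(precision_sensitivity(calls["tool_a"], known))  # (0.5, 0.5)
print(precision_sensitivity(combined, known))         # (0.75, 0.75)
```

In this toy data, requiring two votes discards the false positives that only one tool reports (x2 and x3), which raises precision over tool_a alone. Raising `min_votes` further would trade sensitivity for precision.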
50