1
|
Eggenhofer F, Höner Zu Siederdissen C. Evolutionary Structure Conservation and Covariance Scores. Methods Mol Biol 2024; 2726:255-284. [PMID: 38780735 DOI: 10.1007/978-1-0716-3519-3_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/25/2024]
Abstract
Effective homology search for non-coding RNAs is frequently not possible via sequence similarity alone. Current methods leverage evolutionary information like structure conservation or covariance scores to identify homologs in organisms that are phylogenetically more distant. In this chapter, we introduce the theoretical background of evolutionary structure conservation and covariance score, and we show hands-on how current methods in the field are applied on example datasets.
Collapse
Affiliation(s)
- Florian Eggenhofer
- Bioinformatics Group, Department of Computer Science University of Freiburg, Freiburg, Germany
| | - Christian Höner Zu Siederdissen
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.
- Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
- Bioinformatics/High-Throughput Analysis, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena, Germany.
| |
Collapse
|
2
|
Yazbeck AM, Stadler PF, Tout K, Fallmann J. Automatic curation of large comparative animal MicroRNA datasets. Bioinformatics 2020; 35:4553-4559. [PMID: 30993337 DOI: 10.1093/bioinformatics/btz271] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2018] [Revised: 03/11/2019] [Accepted: 04/10/2019] [Indexed: 12/22/2022] Open
Abstract
MOTIVATION MicroRNAs form an important class of RNA regulators that has been studied extensively. The miRBase and Rfam database provide rich, frequently updated information on both pre-miRNAs and their mature forms. These data sources, however, rely on individual data submission and thus are neither complete nor consistent in their coverage across different miRNA families. Quantitative studies of miRNA evolution therefore are difficult or impossible on this basis. RESULTS We present here a workflow and a corresponding implementation, MIRfix, that automatically curates miRNA datasets by improving alignments of their precursors, the consistency of the annotation of mature miR and miR* sequence, and the phylogenetic coverage. MIRfix produces alignments that are comparable across families and sets the stage for improved homology search as well as quantitative analyses. AVAILABILITY AND IMPLEMENTATION MIRfix can be downloaded from https://github.com/Bierinformatik/MIRfix. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Ali M Yazbeck
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany.,Doctoral School of Science and Technology, Center for Biotechnology Research, Lebanese University, Hadath Campus, Beirut, Lebanon.,Helmholtz Centre for Environmental Research - UFZ, Young Investigators Group Bioinformatics and Transcriptomics, D-04318 Leipzig, Germany
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany.,German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, University Leipzig, D-04107 Leipzig, Germany.,Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany.,Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria.,Facultad de Ciencias, Universidad National de Colombia, Sede Bogotá, Colombia.,Santa Fe Institute, Santa Fe, NM 87501, USA
| | - Kifah Tout
- Doctoral School of Science and Technology, Center for Biotechnology Research, Lebanese University, Hadath Campus, Beirut, Lebanon
| | - Jörg Fallmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany
| |
Collapse
|
3
|
Miladi M, Sokhoyan E, Houwaart T, Heyne S, Costa F, Grüning B, Backofen R. GraphClust2: Annotation and discovery of structured RNAs with scalable and accessible integrative clustering. Gigascience 2019; 8:giz150. [PMID: 31808801 PMCID: PMC6897289 DOI: 10.1093/gigascience/giz150] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2019] [Revised: 08/23/2019] [Accepted: 11/20/2019] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND RNA plays essential roles in all known forms of life. Clustering RNA sequences with common sequence and structure is an essential step towards studying RNA function. With the advent of high-throughput sequencing techniques, experimental and genomic data are expanding to complement the predictive methods. However, the existing methods do not effectively utilize and cope with the immense amount of data becoming available. RESULTS Hundreds of thousands of non-coding RNAs have been detected; however, their annotation is lagging behind. Here we present GraphClust2, a comprehensive approach for scalable clustering of RNAs based on sequence and structural similarities. GraphClust2 bridges the gap between high-throughput sequencing and structural RNA analysis and provides an integrative solution by incorporating diverse experimental and genomic data in an accessible manner via the Galaxy framework. GraphClust2 can efficiently cluster and annotate large datasets of RNAs and supports structure-probing data. We demonstrate that the annotation performance of clustering functional RNAs can be considerably improved. Furthermore, an off-the-shelf procedure is introduced for identifying locally conserved structure candidates in long RNAs. We suggest the presence and the sparseness of phylogenetically conserved local structures for a collection of long non-coding RNAs. CONCLUSIONS By clustering data from 2 cross-linking immunoprecipitation experiments, we demonstrate the benefits of GraphClust2 for motif discovery under the presence of biological and methodological biases. Finally, we uncover prominent targets of double-stranded RNA binding protein Roquin-1, such as BCOR's 3' untranslated region that contains multiple binding stem-loops that are evolutionary conserved.
Collapse
Affiliation(s)
- Milad Miladi
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Eteri Sokhoyan
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Torsten Houwaart
- Institute of Medical Microbiology and Hospital Hygiene, University of Dusseldorf, Universitaetsstr. 1, 40225 Dusseldorf, Germany
| | - Steffen Heyne
- Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Stuebeweg 51, 79108 Freiburg, Germany
| | - Fabrizio Costa
- Department of Computer Science, University of Exeter, North Park Road, EX4 4QF Exeter, UK
| | - Björn Grüning
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- ZBSA Centre for Biological Systems Analysis, University of Freiburg, Hauptstr. 1, 79104 Freiburg, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- ZBSA Centre for Biological Systems Analysis, University of Freiburg, Hauptstr. 1, 79104 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
| |
Collapse
|
4
|
Grau J, Nettling M, Keilwagen J. DepLogo: visualizing sequence dependencies in R. Bioinformatics 2019; 35:4812-4814. [PMID: 31225867 DOI: 10.1093/bioinformatics/btz507] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2019] [Revised: 05/27/2019] [Accepted: 06/13/2019] [Indexed: 11/13/2022] Open
Abstract
SUMMARY Statistical dependencies are present in a variety of sequence data, but are not discernible from traditional sequence logos. Here, we present the R package DepLogo for visualizing inter-position dependencies in aligned sequence data as dependency logos. Dependency logos make dependency structures, which correspond to regular co-occurrences of symbols at dependent positions, visually perceptible. To this end, sequences are partitioned based on their symbols at highly dependent positions as measured by mutual information, and each partition obtains its own visual representation. We illustrate the utility of the DepLogo package in several use cases generating dependency logos from DNA, RNA and protein sequences. AVAILABILITY AND IMPLEMENTATION The DepLogo R package is available from CRAN and its source code is available at https://github.com/Jstacs/DepLogo. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Jan Grau
- Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
| | - Martin Nettling
- Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
| | - Jens Keilwagen
- Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI), Quedlinburg, Germany
| |
Collapse
|
5
|
Fallmann J, Videm P, Bagnacani A, Batut B, Doyle MA, Klingstrom T, Eggenhofer F, Stadler PF, Backofen R, Grüning B. The RNA workbench 2.0: next generation RNA data analysis. Nucleic Acids Res 2019; 47:W511-W515. [PMID: 31073612 PMCID: PMC6602469 DOI: 10.1093/nar/gkz353] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2019] [Revised: 04/11/2019] [Accepted: 04/29/2019] [Indexed: 12/30/2022] Open
Abstract
RNA has become one of the major research topics in molecular biology. As a central player in key processes regulating gene expression, RNA is in the focus of many efforts to decipher the pathways that govern the transition of genetic information to a fully functional cell. As more and more researchers join this endeavour, there is a rapidly growing demand for comprehensive collections of tools that cover the diverse layers of RNA-related research. However, increasing amounts of data, from diverse types of experiments, addressing different aspects of biological questions need to be consolidated and integrated into a single framework. Only then is it possible to connect findings from e.g. RNA-Seq experiments and methods for e.g. target predictions. To address these needs, we present the RNA Workbench 2.0 , an updated online resource for RNA related analysis. With the RNA Workbench we created a comprehensive set of analysis tools and workflows that enables researchers to analyze their data without the need for sophisticated command-line skills. This update takes the established framework to the next level, providing not only a containerized infrastructure for analysis, but also a ready-to-use platform for hands-on training, analysis, data exploration, and visualization. The new framework is available at https://rna.usegalaxy.eu , and login is free and open to all users. The containerized version can be found at https://github.com/bgruening/galaxy-rna-workbench.
Collapse
Affiliation(s)
- Jörg Fallmann
- Bioinformatics Group, Department of Computer Science; Leipzig University, Härtelstraße 16-18, D-04107 Leipzig
| | - Pavankumar Videm
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg 79110, Germany
| | - Andrea Bagnacani
- Department of Systems Biology and Bioinformatics, Institute of Computer Science, University of Rostock, Ulmenstr. 69, 18057 Rostock, Germany
| | - Bérénice Batut
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg 79110, Germany
| | - Maria A Doyle
- Research Computing Facility, Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria 3010, Australia
| | - Tomas Klingstrom
- SLU-Global Bioinformatics Centre, Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences
| | - Florian Eggenhofer
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg 79110, Germany
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science; Leipzig University, Härtelstraße 16-18, D-04107 Leipzig
- Interdisciplinary Center of Bioinformatics; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig; Competence Center for Scalable Data Services and Solutions; and Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, D-04107 Leipzig
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig Inst. f. Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Colombia Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg 79110, Germany
- Signalling Research Centres BIOSS and CIBSS, Albert-Ludwigs-University Freiburg, Schänzlestr. 18, Freiburg 79104, Germany
| | - Björn Grüning
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg 79110, Germany
- Center for Biological Systems Analysis (ZBSA), University of Freiburg, Habsburgerstr. 49, 79104 Freiburg, Germany
| |
Collapse
|
6
|
Raden M, Ali SM, Alkhnbashi OS, Busch A, Costa F, Davis JA, Eggenhofer F, Gelhausen R, Georg J, Heyne S, Hiller M, Kundu K, Kleinkauf R, Lott SC, Mohamed MM, Mattheis A, Miladi M, Richter AS, Will S, Wolff J, Wright PR, Backofen R. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Res 2018; 46:W25-W29. [PMID: 29788132 PMCID: PMC6030932 DOI: 10.1093/nar/gky329] [Citation(s) in RCA: 106] [Impact Index Per Article: 15.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2018] [Revised: 04/03/2018] [Accepted: 05/18/2018] [Indexed: 12/20/2022] Open
Abstract
The Freiburg RNA tools webserver is a well established online resource for RNA-focused research. It provides a unified user interface and comprehensive result visualization for efficient command line tools. The webserver includes RNA-RNA interaction prediction (IntaRNA, CopraRNA, metaMIR), sRNA homology search (GLASSgo), sequence-structure alignments (LocARNA, MARNA, CARNA, ExpaRNA), CRISPR repeat classification (CRISPRmap), sequence design (antaRNA, INFO-RNA, SECISDesign), structure aberration evaluation of point mutations (RaSE), and RNA/protein-family models visualization (CMV), and other methods. Open education resources offer interactive visualizations of RNA structure and RNA-RNA interaction prediction as well as basic and advanced sequence alignment algorithms. The services are freely available at http://rna.informatik.uni-freiburg.de.
Collapse
Affiliation(s)
- Martin Raden
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Syed M Ali
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Omer S Alkhnbashi
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Anke Busch
- Institute of Molecular Biology (IMB), Ackermannweg 4, 55128 Mainz, Germany
| | - Fabrizio Costa
- Department of Computer Science, University of Exeter, Exeter EX4 4QF, UK
| | - Jason A Davis
- Coreva Scientific, Kaiser-Joseph-Str 198-200, 79098 Freiburg, Germany
| | - Florian Eggenhofer
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rick Gelhausen
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Jens Georg
- Genetics and Experimental Bioinformatics, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
| | - Steffen Heyne
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, 79108 Freiburg, Germany
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - Kousik Kundu
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK
- Department of Human Genetics, The Wellcome Trust Sanger Institute, Hinxton Cambridge CB10 1HH, UK
| | - Robert Kleinkauf
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Steffen C Lott
- Genetics and Experimental Bioinformatics, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
| | - Mostafa M Mohamed
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Alexander Mattheis
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Milad Miladi
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | | | - Sebastian Will
- Theoretical Biochemistry Group, University of Vienna, Währingerstraße 17, 1090 Vienna, Austria
| | - Joachim Wolff
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Patrick R Wright
- Department of Clinical Research, Clinical Trial Unit, University of Basel Hospital, Schanzenstrasse 55, 4031 Basel, Switzerland
| | - Rolf Backofen
- Bioinformatics, Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Centre for Biological Signalling Studies (BIOSS), University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
| |
Collapse
|