1. Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Universidad Nacional de Colombia, Bogotá, Colombia
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, USA
2. Tan XY, Citartan M, Chinni SV, Ahmed SA, Tang TH. Biocomputational Identification of sRNAs in Leptospira interrogans Serovar Lai. Indian J Microbiol 2023; 63:33-41. [PMID: 37188232 PMCID: PMC10172424 DOI: 10.1007/s12088-022-01050-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 12/02/2022] [Indexed: 12/24/2022] Open
Abstract
Regulatory small RNAs (sRNAs) are RNA transcripts that are not translated into proteins but act as functional RNAs. Pathogenic Leptospira cause leptospirosis, an epidemic spirochaetal zoonosis. It is speculated that leptospiral sRNAs are involved in orchestrating pathogenicity. In this study, a biocomputational approach was adopted to identify leptospiral sRNAs. Two sRNA prediction programs, RNAz and nocoRNAc, were employed to screen the reference genome of Leptospira interrogans serovar Lai. Of the 126 predicted sRNAs, 96 are cis-antisense sRNAs, 28 are trans-encoded sRNAs, and 2 partially overlap with protein-coding genes in a sense orientation. To determine whether these candidates are expressed in the pathogen, they were compared with the coverage files generated from our RNA-seq datasets. Seven predicted sRNAs were found to be expressed in mid-log phase, stationary phase, serum stress, temperature stress and iron stress, while 2 sRNAs are expressed in mid-log phase, stationary phase, serum stress, and temperature stress. Their expression was also confirmed experimentally via RT-PCR. These experimentally validated candidates were also subjected to mRNA target prediction using TargetRNA2. Taken together, our study demonstrates that a biocomputational strategy can serve as an alternative or complement to the laborious and expensive deep sequencing methods, not only to uncover putative sRNAs but also to predict their targets in bacteria. This is the first study that integrates a computational approach to predict putative sRNAs in L. interrogans serovar Lai. Supplementary Information: The online version contains supplementary material available at 10.1007/s12088-022-01050-9.
Affiliation(s)
- Xinq Yuan Tan
  - Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, Bertam, 13200 Kepala Batas, Penang, Malaysia
- Marimuthu Citartan
  - Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, Bertam, 13200 Kepala Batas, Penang, Malaysia
- Suresh Venkata Chinni
  - Department of Biotechnology, Faculty of Applied Sciences, AIMST University, 08100 Bedong, Kedah, Malaysia
- Siti Aminah Ahmed
  - Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, Bertam, 13200 Kepala Batas, Penang, Malaysia
- Thean-Hock Tang
  - Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, Bertam, 13200 Kepala Batas, Penang, Malaysia
3. Mahendran G, Jayasinghe OT, Thavakumaran D, Arachchilage GM, Silva GN. Key players in regulatory RNA realm of bacteria. Biochem Biophys Rep 2022; 30:101276. [PMID: 35592614 PMCID: PMC9111926 DOI: 10.1016/j.bbrep.2022.101276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2022] [Revised: 04/30/2022] [Accepted: 05/04/2022] [Indexed: 11/30/2022] Open
Abstract
Precise regulation of gene expression is crucial for living cells to adapt for survival in diverse environmental conditions. Among the common cellular regulatory mechanisms, RNA-based regulators play a key role in all domains of life. The discovery of regulatory RNAs has caused a paradigm shift in molecular biology, as many regulatory functions of RNA have been identified beyond its canonical roles as messenger, ribosomal and transfer RNA. In the complex regulatory RNA network, riboswitches, small RNAs, and RNA thermometers can be identified as some of the key players. Herein, we review the discovery, mechanism, and potential therapeutic use of these classes of regulatory RNAs mainly found in bacteria. Being highly adaptive organisms that inhabit a broad range of ecological niches, bacteria have adopted tight and rapid-responding gene regulation mechanisms. This review aims to highlight how bacteria utilize versatile RNA structures and sequences to build a sophisticated gene regulation network.
- The three major classes of prokaryotic ncRNAs and their characterized mechanisms of operation in gene regulation.
- sRNAs emerging as major players in global gene regulatory networks.
- Riboswitch-mediated gene control mechanisms through on/off switches in response to ligand binding.
- RNA thermosensors for temperature-dependent gene expression.
- Therapeutic importance of ncRNAs and computational approaches involved in the discovery of ncRNAs.
Affiliation(s)
- Gowthami Mahendran
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Chemistry and Biochemistry, University of Notre Dame, IN, 46556, USA
- Oshadhi T. Jayasinghe
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Biochemistry and Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Dhanushika Thavakumaran
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Chemistry and Biochemistry, University of Notre Dame, IN, 46556, USA
- Gayan Mirihana Arachchilage
  - Howard Hughes Medical Institute, Yale University, New Haven, CT, 06520-8103, USA
  - PTC Therapeutics Inc, South Plainfield, NJ, 07080, USA
- Gayathri N. Silva
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Corresponding author.
4. Evolution and Phylogeny of MicroRNAs - Protocols, Pitfalls, and Problems. Methods Mol Biol 2021. [PMID: 34432281 DOI: 10.1007/978-1-0716-1170-8_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
MicroRNAs are important regulators in many eukaryotic lineages. Typical miRNAs have a length of about 22 nt and are processed from precursors that form a characteristic hairpin structure. Once they appear in a genome, miRNAs are among the best-conserved elements in both animal and plant genomes. Functionally, they play an important role, in particular in development. In contrast to protein-coding genes, miRNAs frequently emerge de novo. The genomes of animals and plants harbor hundreds of mutually unrelated families of homologous miRNAs that tend to be persistent throughout evolution. The evolution of their genomic miRNA complement closely correlates with important morphological innovation. In addition, miRNAs have been used as valuable characters in phylogenetic studies. An accurate and comprehensive annotation of miRNAs is required as a basis to understand their impact on phenotypic evolution. Since experimental data on miRNA expression are limited to relatively few species and are subject to unavoidable ascertainment biases, miRNA sequencing must be complemented by homology-based annotation methods. This chapter reviews the state-of-the-art workflows for homology-based miRNA annotation, with an emphasis on their limitations and open problems.
5. Navarro-Martín L, Martyniuk CJ, Mennigen JA. Comparative epigenetics in animal physiology: An emerging frontier. Comparative Biochemistry and Physiology D-Genomics & Proteomics 2020; 36:100745. [PMID: 33126028 DOI: 10.1016/j.cbd.2020.100745] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/11/2020] [Revised: 09/08/2020] [Accepted: 09/13/2020] [Indexed: 12/19/2022]
Abstract
The unprecedented access to annotated genomes now facilitates the investigation of the molecular basis of epigenetic phenomena in phenotypically diverse animals. In this critical review, we describe the roles of molecular epigenetic mechanisms in regulating mitotically and meiotically stable spatiotemporal gene expression, phenomena that provide the molecular foundation for the intra-, inter-, and trans-generational emergence of physiological phenotypes. By focusing principally on emerging comparative epigenetic roles of DNA-level and transcriptome-level epigenetic mark dynamics in the emergence of phenotypes, we highlight the relationship between evolutionary conservation and innovation of specific epigenetic pathways, and their interplay as a priority for future study. This comparative approach is expected to significantly advance our understanding of epigenetic phenomena, as animals show a diverse array of strategies to epigenetically modify physiological responses. Additionally, we review recent technological advances in the field of molecular epigenetics (single-cell epigenomics and transcriptomics and editing of epigenetic marks) in order to (1) investigate environmental and endogenous factor-dependent epigenetic mark dynamics in an integrative manner and (2) functionally test the contribution of specific epigenetic marks for animal phenotypes via genome and transcript-editing tools. Finally, we describe advantages and limitations of emerging animal models, which under the Krogh principle, may be particularly useful in the advancement of comparative epigenomics and its potential translational applications in animal science, ecotoxicology, ecophysiology, climate change science and wildlife conservation, as well as organismal health.
Affiliation(s)
- Laia Navarro-Martín
  - Institute of Environmental Assessment and Water Research, IDAEA-CSIC, Barcelona, Catalunya 08034, Spain
- Christopher J Martyniuk
  - Department of Physiological Sciences and Center for Environmental and Human Toxicology, University of Florida Genetics Institute, Interdisciplinary Program in Biomedical Sciences Neuroscience, College of Veterinary Medicine, University of Florida, Gainesville, FL 32611, USA
- Jan A Mennigen
  - Department of Biology, University of Ottawa, Ottawa, ON K1N6N5, Canada
6. Bayegan AH, Clote P. RNAmountAlign: Efficient software for local, global, semiglobal pairwise and multiple RNA sequence/structure alignment. PLoS One 2020; 15:e0227177. [PMID: 31978147 PMCID: PMC6980424 DOI: 10.1371/journal.pone.0227177] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/10/2018] [Accepted: 12/13/2019] [Indexed: 11/19/2022] Open
Abstract
Alignment of structural RNAs is an important problem with a wide range of applications. Since function is often determined by molecular structure, RNA alignment programs should take into account both sequence and base-pairing information for structural homology identification. This paper describes C++ software, RNAmountAlign, for RNA sequence/structure alignment that runs in O(n^3) time and O(n^2) space for two sequences of length n; moreover, our software returns a p-value (transformable to expect value E) based on Karlin-Altschul statistics for local alignment, as well as parameter fitting for local and global alignment. Using incremental mountain height, a representation of structural information computable in cubic time, RNAmountAlign implements quadratic time pairwise local, global and global/semiglobal (query search) alignment using a weighted combination of sequence and structural similarity. RNAmountAlign is capable of performing progressive multiple alignment as well. Benchmarking of RNAmountAlign against LocARNA, LARA, FOLDALIGN, DYNALIGN, STRAL, MXSCARNA, and MUSCLE shows that RNAmountAlign has reasonably good accuracy and faster run time supporting all alignment types. Additionally, our extension of RNAmountAlign, called RNAmountAlignScan, which scans a target genome sequence to find hits having high sequence and structural similarity to a given query sequence, outperforms RSEARCH and sequence-only query scans and runs faster than FOLDALIGN query scan.
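To make the "mountain height" idea in this abstract concrete, here is a minimal Python sketch. It assumes a single, fixed secondary structure given in dot-bracket notation, which is a simplification: the abstract does not say how RNAmountAlign derives the quantity, and the cubic-time remark suggests the software works from something richer than one structure. The function names are hypothetical and are not taken from the software.

```python
def incremental_mountain_height(dot_bracket: str) -> list[int]:
    """Per-position step: +1 for '(', -1 for ')', 0 for an unpaired '.'."""
    steps = {"(": 1, ")": -1, ".": 0}
    try:
        return [steps[symbol] for symbol in dot_bracket]
    except KeyError as err:
        raise ValueError(f"unexpected symbol in structure: {err}") from None


def mountain_height(dot_bracket: str) -> list[int]:
    """Running sum of the incremental heights (the classic mountain plot)."""
    heights = []
    current = 0
    for step in incremental_mountain_height(dot_bracket):
        current += step
        if current < 0:
            raise ValueError("closing bracket without a matching opening bracket")
        heights.append(current)
    if current != 0:
        raise ValueError("unbalanced structure: unclosed opening bracket")
    return heights


if __name__ == "__main__":
    structure = "((..))"
    print(incremental_mountain_height(structure))  # [1, 1, 0, 0, -1, -1]
    print(mountain_height(structure))              # [1, 2, 2, 2, 1, 0]
```

Because the incremental values form a per-position numeric profile, two structures can be compared position by position in the same way as two sequences, which is what makes a weighted combination of sequence and structural similarity possible in an ordinary quadratic-time alignment.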
Affiliation(s)
- Amir H. Bayegan
  - Biology Department, Boston College, Chestnut Hill, MA, United States of America
- Peter Clote
  - Biology Department, Boston College, Chestnut Hill, MA, United States of America
7. Song X, Li C, Li J, Liu L, Meng L, Ding H, Long W. The long noncoding RNA uc.294 is upregulated in early-onset pre-eclampsia and inhibits proliferation, invasion of trophoblast cells (HTR-8/SVneo). J Cell Physiol 2018; 234:11001-11008. [PMID: 30569493 DOI: 10.1002/jcp.27916] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/06/2018] [Accepted: 10/25/2018] [Indexed: 12/17/2022]
Abstract
Recently, a large number of evolutionarily conserved long noncoding RNAs (lncRNAs) have been reported in human diseases; they are likely to play a role in many biological events, including pre-eclampsia. In our previous research, we screened thousands of lncRNAs for their relationship with early-onset pre-eclampsia. Among these, a lncRNA named uc.294 attracted our attention because it was reported to be specifically expressed at a high level in early-onset pre-eclampsia. This study aims to investigate the function of uc.294 in early-onset pre-eclampsia and the possible mechanism. The uc.294 expression level in early-onset pre-eclampsia and in normal placenta tissues was evaluated by quantitative real-time polymerase chain reaction. To assess the proliferation, invasion, and apoptosis capacity of the trophoblast cells, we performed the Cell Counting Kit-8 assay, transwell assay, and flow cytometry, respectively. Here we report, for the first time, that uc.294 inhibits proliferation and invasion and promotes apoptosis of HTR-8/SVneo trophoblast cells, acting on key aspects of their biological behavior. However, how uc.294 acts to regulate gene functions in early-onset pre-eclampsia needs further exploration.
Affiliation(s)
- Xuejing Song
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
- Chunyan Li
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
  - Department of Clinical Medicine, Fourth Clinical Medicine College, Nanjing Medical University, Nanjing, China
- Jingyun Li
  - Maternal and Child Health Medical Institute, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
- Lan Liu
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
- Li Meng
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
- Hongjuan Ding
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
- Wei Long
  - Department of Obstetrics, Women's Hospital of Nanjing Medical University (Nanjing Maternity and Child Health Care Hospital), Nanjing, China
8. Wright PR, Mann M, Backofen R. Structure and Interaction Prediction in Prokaryotic RNA Biology. Microbiol Spectr 2018; 6:10.1128/microbiolspec.rwr-0001-2017. [PMID: 29676245 PMCID: PMC11633574 DOI: 10.1128/microbiolspec.rwr-0001-2017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/27/2017] [Indexed: 01/01/2023] Open
Abstract
Many years of research in RNA biology have soundly established the importance of RNA-based regulation far beyond most early traditional presumptions. Importantly, the advances in "wet" laboratory techniques have produced unprecedented amounts of data that require efficient and precise computational analysis schemes and algorithms. Hence, many in silico methods that attempt topological and functional classification of novel putative RNA-based regulators are available. In this review, we technically outline thermodynamics-based standard RNA secondary structure and RNA-RNA interaction prediction approaches that have proven valuable to the RNA research community in the past and present. For these, we highlight their usability with a special focus on prokaryotic organisms and also briefly mention recent advances in whole-genome interactomics and how this may influence the field of predictive RNA research.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group
  - Center for Biological Signaling Studies (BIOSS), University of Freiburg, Freiburg, Germany
9
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Ivo L Hofacker
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Peter F Stadler
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA
10. Li D, Xu D, Zou Y, Xu Y, Fu L, Xu X, Liu Y, Zhang X, Zhang J, Ming H, Zheng L. Non-coding RNAs and ovarian diseases (Review). Mol Med Rep 2017; 15:1435-1440. [PMID: 28259997 DOI: 10.3892/mmr.2017.6176] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/27/2015] [Accepted: 10/26/2016] [Indexed: 11/06/2022] Open
Abstract
Non-coding RNAs (ncRNAs) are a diverse family of untranslated transcripts, which serve important roles in numerous biological processes. ncRNAs are emerging as major mediators of gene expression with crucial regulatory functions. Ovarian diseases have a wide variety of clinical pathological types, which have serious impacts on women's health. In this review, current studies on ncRNAs are summarized with respect to ovarian diseases. Understanding of the role of ncRNAs in ovarian diseases is currently limited; further studies on the molecular mechanisms by which abnormal expression of ncRNAs contributes to ovarian diseases will aid in the identification of ncRNAs as novel diagnostic markers and therapeutic targets for ovarian diseases.
Affiliation(s)
- Dandan Li
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Duo Xu
  - Department of Breast Oncology, Tumor Hospital of Jilin Province, Changchun, Jilin 130021, P.R. China
- Yinggang Zou
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Ying Xu
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Lulu Fu
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Xin Xu
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Yongzheng Liu
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Xueying Zhang
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Jingshun Zhang
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Hao Ming
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
- Lianwen Zheng
  - Reproductive Medical Center, Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, Jilin 130062, P.R. China
11. Neuhaus K, Landstorfer R, Simon S, Schober S, Wright PR, Smith C, Backofen R, Wecko R, Keim DA, Scherer S. Differentiation of ncRNAs from small mRNAs in Escherichia coli O157:H7 EDL933 (EHEC) by combined RNAseq and RIBOseq - ryhB encodes the regulatory RNA RyhB and a peptide, RyhP. BMC Genomics 2017; 18:216. [PMID: 28245801 PMCID: PMC5331693 DOI: 10.1186/s12864-017-3586-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 01/16/2016] [Accepted: 02/13/2017] [Indexed: 12/14/2022] Open
Abstract
Background: While NGS allows rapid global detection of transcripts, it remains difficult to distinguish ncRNAs from short mRNAs. To detect potentially translated RNAs, we developed an improved protocol for bacterial ribosomal footprinting (RIBOseq). This allowed distinguishing ncRNA from mRNA in EHEC. A high ratio of ribosomal footprints per transcript (ribosomal coverage value, RCV) is expected to indicate a translated RNA, while a low RCV should point to a non-translated RNA. Results: Based on their low RCV, 150 novel non-translated EHEC transcripts were identified as putative ncRNAs, representing both antisense and intergenic transcripts, 74 of which had expressed homologs in E. coli MG1655. Bioinformatics analysis predicted statistically significant target regulons for 15 of the intergenic transcripts; experimental analysis revealed 4-fold or higher differential expression of 46 novel ncRNAs in different growth media. Out of 329 annotated EHEC ncRNAs, 52 showed an RCV similar to protein-coding genes; of those, 16 had RIBOseq patterns matching annotated genes in other Enterobacteriaceae, and 11 seem to possess a Shine-Dalgarno sequence, suggesting that such ncRNAs may encode small proteins instead of being solely non-coding. To support that the RIBOseq signals reflect translation, we tested the ribosomal-footprint covered ORF of ryhB and found a phenotype for the encoded peptide in iron-limiting conditions. Conclusion: Determination of the RCV is a useful approach for a rapid first-step differentiation between bacterial ncRNAs and small mRNAs. Further, many known ncRNAs may encode proteins as well. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3586-9) contains supplementary material, which is available to authorized users.
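The RCV described in this abstract is a ratio of ribosomal footprints to transcript abundance, which can be illustrated with a small Python sketch. Everything beyond that ratio is an assumption made for illustration: the abstract does not state how reads were normalized or what cutoff separated translated from non-translated RNAs, so the inputs are treated as already-normalized abundances and the threshold is a hypothetical parameter the caller must supply.

```python
def ribosomal_coverage_value(footprint_abundance: float, transcript_abundance: float) -> float:
    """Ratio of RIBOseq footprint signal to RNAseq transcript signal.

    Both inputs are assumed to be normalized the same way (for example,
    reads per kilobase per million); that normalization is an assumption.
    """
    if transcript_abundance <= 0:
        raise ValueError("transcript abundance must be positive to form a ratio")
    if footprint_abundance < 0:
        raise ValueError("footprint abundance cannot be negative")
    return footprint_abundance / transcript_abundance


def classify_by_rcv(rcv: float, threshold: float) -> str:
    """Label a transcript using a caller-supplied, hypothetical RCV cutoff."""
    return "putatively translated" if rcv >= threshold else "putatively non-translated"


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the study.
    rcv = ribosomal_coverage_value(footprint_abundance=50.0, transcript_abundance=100.0)
    print(rcv, classify_by_rcv(rcv, threshold=0.25))
```

Keeping the threshold as an explicit argument, rather than a built-in default, avoids implying that any particular cutoff comes from the paper.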
Affiliation(s)
- Klaus Neuhaus
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL Institute for Food & Health, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Richard Landstorfer
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Svenja Simon
  - Informatik und Informationswissenschaft, Universität Konstanz, D-78457, Konstanz, Germany
- Steffen Schober
  - Institut für Nachrichtentechnik, Universität Ulm, Albert-Einstein-Allee 43, D-89081, Ulm, Germany
- Patrick R Wright
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Cameron Smith
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science and BIOSS Centre for Biological Signaling Studies, Cluster of Excellence, University of Freiburg, D-79110, Freiburg, Germany
- Romy Wecko
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
- Daniel A Keim
  - Informatik und Informationswissenschaft, Universität Konstanz, D-78457, Konstanz, Germany
- Siegfried Scherer
  - Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, D-85354, Freising, Germany
12. Canzler S, Stadler PF, Hertel J. U6 snRNA intron insertion occurred multiple times during fungi evolution. RNA Biol 2016; 13:119-27. [PMID: 26828373 DOI: 10.1080/15476286.2015.1132139] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023] Open
Abstract
U6 small nuclear RNAs are part of the splicing machinery. They exhibit several unique features setting them apart from other snRNAs. Reports of introns in structured non-coding RNAs have been very rare. U6 genes, however, were found to be interrupted by an intron in several Schizosaccharomyces species and in 2 Basidiomycota. We conducted a homology search across 147 currently available fungal genomes and identified the U6 genes in all but 2 of them. A detailed comparison of their sequences and predicted secondary structures showed that intron insertion events in the U6 snRNA were much more common in the fungal lineage than previously thought. Their positional distribution across the entire mature snRNA strongly suggests a large number of independent events. All the intron sequences reported here show canonical splice site and branch site motifs, indicating that they require the spliceosomal pathway for their removal.
Collapse
Affiliation(s)
- Sebastian Canzler
- a Bioinformatics Group , Department of Computer Science,and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18 , D-04107 Leipzig , Germany
| | - Peter F Stadler
- a Bioinformatics Group , Department of Computer Science,and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18 , D-04107 Leipzig , Germany.,b Computational EvoDevo Group , Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18 , D-04107 Leipzig , Germany.,c LIFE - Leipzig Research Center for Civilization Diseases, Universität Leipzig , Germany.,d Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22 , D-04103 Leipzig , Germany.,e Fraunhofer Institut für Zelltherapie und Immunologie - IZI Perlickstraße 1 , D-04103 Leipzig , Germany.,f Department of Theoretical Chemistry , University of Vienna, Währingerstraße 17, A-1090 Wien , Austria.,g Center for non-coding RNA in Technology and Health , University of Copenhagen, Grønnegårdsvej 3 , DK-1870 Frederiksberg C, Denmark.,h Santa Fe Institute; 1399 Hyde Park Rd. ; Santa Fe ; NM 87501 , USA
| | - Jana Hertel
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
- Department of Proteomics, Helmholtz Centre for Environmental Research - UFZ, Permoserstraße 15, 04318 Leipzig, Germany
| |
|
13
|
Barquist L, Burge SW, Gardner PP. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families. Current Protocols in Bioinformatics 2016; 54:12.13.1-12.13.25. [PMID: 27322404 PMCID: PMC5010141 DOI: 10.1002/cpbi.4] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 11/09/2022]
Abstract
Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Lars Barquist
- Institute for Molecular Infection Biology, University of Würzburg, Würzburg, D-97080 Germany
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA United Kingdom; Fax: +44 (0)1223 494919
| | - Sarah W. Burge
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA United Kingdom; Fax: +44 (0)1223 494919
| | - Paul P. Gardner
- School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
- Biomolecular Interaction Centre, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
| |
|
14
|
Lai D, Meyer IM. A comprehensive comparison of general RNA-RNA interaction prediction methods. Nucleic Acids Res 2016; 44:e61. [PMID: 26673718 PMCID: PMC4838349 DOI: 10.1093/nar/gkv1477] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 06/19/2015] [Revised: 12/03/2015] [Accepted: 12/05/2015] [Indexed: 12/15/2022] Open
Abstract
RNA-RNA interactions are fast emerging as a major functional component in many newly discovered non-coding RNAs. Basepairing is believed to be a major contributor to the stability of these intermolecular interactions, much like intramolecular basepairs formed in RNA secondary structure. As such, using algorithms similar to those for predicting RNA secondary structure, computational methods have been recently developed for the prediction of RNA-RNA interactions. We provide the first comprehensive comparison comprising 14 methods that predict general intermolecular basepairs. To evaluate these, we compile an extensive data set of 54 experimentally confirmed fungal snoRNA-rRNA interactions and 102 bacterial sRNA-mRNA interactions. We test the accuracy of all methods, evaluating the effects of tool settings, sequence length, and multiple sequence alignment usage and quality. Our results show that, unlike for RNA secondary structure prediction, the overall best performing tools on this data set are non-comparative energy-based tools that utilize accessibility information and predict short interactions. Furthermore, we find that maintaining high accuracy across biologically different data sets and increasing input lengths remains a huge challenge, with implications for de novo transcriptome-wide searches. Finally, we make our interaction data set publicly available for future development and benchmarking efforts.
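To make the idea of scoring predicted intermolecular basepairs concrete, here is a minimal sketch of one common way to do it: sensitivity and positive predictive value (PPV) over sets of basepairs. The abstract does not say which measures the authors used, so this choice is an assumption, and the function name `score_interaction` and the toy coordinates are hypothetical.

```python
def score_interaction(predicted, reference):
    """Return (sensitivity, ppv) for two collections of (i, j) basepairs.

    i indexes a position in the first RNA and j a position in the second.
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    # Sensitivity: fraction of reference basepairs that were recovered.
    sensitivity = true_pos / len(reference) if reference else 0.0
    # PPV: fraction of predicted basepairs that are in the reference.
    ppv = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the benchmark.
    reference = {(10, 52), (11, 51), (12, 50), (13, 49)}
    predicted = {(11, 51), (12, 50), (13, 49), (14, 48), (15, 47)}
    sens, ppv = score_interaction(predicted, reference)
    print(f"sensitivity={sens:.2f} ppv={ppv:.2f}")
```

Reporting both values matters because a tool that predicts very long interactions can reach high sensitivity at the cost of low PPV, and vice versa for tools that predict short interactions.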
Affiliation(s)
- Daniel Lai
- Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada
| | - Irmtraud M Meyer
- Centre for High-Throughput Biology, Department of Computer Science and Department of Medical Genetics, University of British Columbia, Vancouver V6T 1Z4, Canada
| |
|
15
|
Schierwater B, Holland PWH, Miller DJ, Stadler PF, Wiegmann BM, Wörheide G, Wray GA, DeSalle R. Never Ending Analysis of a Century Old Evolutionary Debate: “Unringing” the Urmetazoon Bell. Front Ecol Evol 2016. [DOI: 10.3389/fevo.2016.00005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022] Open
|
16
|
Abstract
6S RNA is a highly abundant small non-coding RNA widely spread among diverse bacterial groups. By competing with DNA promoters for binding to RNA polymerase (RNAP), the RNA regulates transcription on a global scale. RNAP produces small product RNAs derived from 6S RNA as template, which rearranges the 6S RNA structure leading to dissociation of 6S RNA:RNAP complexes. Although 6S RNA has been experimentally analysed in detail for some species, such as Escherichia coli and Bacillus subtilis, and was computationally predicted in many diverse bacteria, a complete and up-to-date overview of the distribution among all bacteria is missing. In this study we searched with new methods for 6S RNA genes in all currently available bacterial genomes. We ended up with a set of 1,750 6S RNA genes, of which 1,367 are novel and bona fide, distributed among 1,610 bacteria, and had a few tentative candidates among the remaining 510 assembled bacterial genomes accessible. We were able to confirm two tentative candidates by Northern blot analysis. We extended 6S RNA genes of the Flavobacteriia significantly in length compared to the present Rfam entry. We describe multiple homologs of 6S RNAs (including split 6S RNA genes) and performed a detailed synteny analysis.
Affiliation(s)
- Stefanie Wehner
- a Department for Bioinformatics; Faculty of Mathematics and Computer Science ; Friedrich-Schiller-University of Jena , Jena , Germany
| | | | | | | |
|
17
|
Ge P, Zhong C, Zhang S. ProbeAlign: incorporating high-throughput sequencing-based structure probing information into ncRNA homology search. BMC Bioinformatics 2014; 15 Suppl 9:S15. [PMID: 25253206 PMCID: PMC4168714 DOI: 10.1186/1471-2105-15-s9-s15] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022] Open
Abstract
Background: Recent advances in RNA structure probing technologies, including the ones based on high-throughput sequencing, have improved the accuracy of thermodynamic folding with quantitative nucleotide-resolution structural information. Results: In this paper, we present a novel approach, ProbeAlign, to incorporate the reactivities from high-throughput RNA structure probing into ncRNA homology search for functional annotation. To reduce the overhead of structure alignment on large-scale data, the specific pairing patterns in the query sequences are ignored. On the other hand, the partial structural information of the target sequences embedded in probing data is retrieved to guide the alignment. Thus the structure alignment problem is transformed into a sequence alignment problem with additional reactivity information. The benchmark results show that the prediction accuracy of ProbeAlign outperforms filter-based CMsearch with high computational efficiency. The application of ProbeAlign to the FragSeq data, which is based on genome-wide structure probing, has demonstrated its capability to search ncRNAs in a large-scale dataset from high-throughput sequencing. Conclusions: By incorporating high-throughput sequencing-based structure probing information, ProbeAlign can improve the accuracy and efficiency of ncRNA homology search. It is a promising tool for ncRNA functional annotation on genome-wide datasets. Availability: The source code of ProbeAlign is available at http://genome.ucf.edu/ProbeAlign.
|
18
|
Wu Z, Wu C, Shao J, Zhu Z, Wang W, Zhang W, Tang M, Pei N, Fan H, Li J, Yao H, Gu H, Xu X, Lu C. The Streptococcus suis transcriptional landscape reveals adaptation mechanisms in pig blood and cerebrospinal fluid. RNA (New York, N.Y.) 2014; 20:882-898. [PMID: 24759092 PMCID: PMC4024642 DOI: 10.1261/rna.041822.113] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Received: 08/04/2013] [Accepted: 03/11/2014] [Indexed: 06/03/2023]
Abstract
Streptococcus suis (SS) is an important pathogen of pigs, and it is also recognized as a zoonotic agent for humans. SS infection may result in septicemia or meningitis in the host. However, little is known about genes that contribute to the virulence process and survival within host blood or cerebrospinal fluid (CSF). Small RNAs (sRNA) have emerged as key regulators of virulence in several bacteria, but they have not been investigated in SS. Here, using a differential RNA-sequencing approach and RNAs from SS strain P1/7 grown in rich medium, pig blood, or CSF, we present the SS genome-wide map of 793 transcriptional start sites and 370 operons. In addition to identifying 29 sRNAs, we show that five sRNA deletion mutants attenuate SS virulence in a zebrafish infection model. Homology searches revealed that 10 sRNAs were predicted to be present in other pathogenic Streptococcus species. Compared with wild-type strain P1/7, sRNAs rss03, rss05, and rss06 deletion mutants were significantly more sensitive to killing by pig blood. It is possible that rss06 contributes to SS virulence by indirectly activating expression of SSU0308, a virulence gene encoding a zinc-binding lipoprotein. In blood, genes involved in the synthesis of capsular polysaccharide (CPS) and subversion of host defenses were up-regulated. In contrast, in CSF, genes for CPS synthesis were down-regulated. Our study is the first analysis of SS sRNAs involved in virulence and has both improved our understanding of SS pathogenesis and increased the number of sRNAs known to play definitive roles in bacterial virulence.
Affiliation(s)
- Zongfu Wu
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| | | | - Jing Shao
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| | | | - Weixue Wang
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| | | | - Min Tang
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| | - Na Pei
- BGI-Shenzhen, Shenzhen 518083, China
| | - Hongjie Fan
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou 225009, China
| | | | - Huochun Yao
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| | - Hongwei Gu
- Jiangsu Engineering Research Center for microRNA Biology and Biotechnology, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing 210093, China
| | - Xun Xu
- BGI-Shenzhen, Shenzhen 518083, China
| | - Chengping Lu
- College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Key Lab of Animal Bacteriology, Ministry of Agriculture, Nanjing 210095, China
- OIE Reference Laboratory for Swine Streptococcosis, Nanjing 210095, China
| |
|
19
|
Backofen R, Amman F, Costa F, Findeiß S, Richter AS, Stadler PF. Bioinformatics of prokaryotic RNAs. RNA Biol 2014; 11:470-83. [PMID: 24755880 PMCID: PMC4152356 DOI: 10.4161/rna.28647] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/31/2014] [Revised: 03/17/2014] [Accepted: 03/25/2014] [Indexed: 02/02/2023] Open
Abstract
The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the characterization of their non-coding components in particular, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes.
Affiliation(s)
- Rolf Backofen
- Bioinformatics Group; Department of Computer Science; University of Freiburg; Georges-Köhler-Allee 106; D-79110 Freiburg, Germany
- Center for non-coding RNA in Technology and Health; University of Copenhagen; Grønnegårdsvej 3; DK-1870 Frederiksberg C, Denmark
| | - Fabian Amman
- Institute for Theoretical Chemistry; University of Vienna; Währingerstraße 17; A-1090 Wien, Austria
- Bioinformatics Group; Department of Computer Science, and Interdisciplinary Center for Bioinformatics; University of Leipzig; Härtelstraße 16-18; D-04107 Leipzig, Germany
| | - Fabrizio Costa
- Bioinformatics Group; Department of Computer Science; University of Freiburg; Georges-Köhler-Allee 106; D-79110 Freiburg, Germany
| | - Sven Findeiß
- Institute for Theoretical Chemistry; University of Vienna; Währingerstraße 17; A-1090 Wien, Austria
- Bioinformatics and Computational Biology Research Group; University of Vienna; Währingerstraße 29; A-1090 Wien, Austria
| | - Andreas S Richter
- Bioinformatics Group; Department of Computer Science; University of Freiburg; Georges-Köhler-Allee 106; D-79110 Freiburg, Germany
- Max Planck Institute of Immunobiology and Epigenetics; Stübeweg 51; D-79108 Freiburg, Germany
| | - Peter F Stadler
- Center for non-coding RNA in Technology and Health; University of Copenhagen; Grønnegårdsvej 3; DK-1870 Frederiksberg C, Denmark
- Institute for Theoretical Chemistry; University of Vienna; Währingerstraße 17; A-1090 Wien, Austria
- Bioinformatics Group; Department of Computer Science, and Interdisciplinary Center for Bioinformatics; University of Leipzig; Härtelstraße 16-18; D-04107 Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences; Inselstraße 22; D-04103 Leipzig, Germany
- Fraunhofer Institute for Cell Therapy and Immunology – IZI; Perlickstraße 1; D-04103 Leipzig, Germany
- Santa Fe Institute; Santa Fe, NM USA
| |
|
20
|
Wang C, Wei L, Guo M, Zou Q. Computational approaches in detecting non-coding RNA. Curr Genomics 2014; 14:371-7. [PMID: 24396270 PMCID: PMC3861888 DOI: 10.2174/13892029113149990005] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 04/26/2013] [Revised: 07/18/2013] [Accepted: 07/18/2013] [Indexed: 12/21/2022] Open
Abstract
The important role of non-coding RNAs (ncRNAs) in the cell has made their identification a critical issue in biological research. However, traditional approaches such as RT-PCR and Northern blot are costly. With recent progress in bioinformatics and computational prediction technology, the discovery of ncRNAs has become realistically possible. This paper aims to introduce major computational approaches to the identification of ncRNAs, including homology search, de novo prediction, and mining of deep sequencing data. Furthermore, related software tools have been compared and reviewed, along with a discussion of future improvements.
Affiliation(s)
- Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Leyi Wei
- School of Information Science and Technology, Xiamen University, Xiamen 361005, China
| | - Maozu Guo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Quan Zou
- School of Information Science and Technology, Xiamen University, Xiamen 361005, China
| |
|
21
|
Abstract
The computational identification of novel microRNA (miRNA) genes is a challenging task in bioinformatics. Massive amounts of data describing unknown functional RNA transcripts have to be analyzed for putative miRNA candidates with automated computational pipelines. Beyond those miRNAs that meet the classical definition, high-throughput sequencing techniques have revealed additional miRNA-like molecules that are derived by alternative biogenesis pathways. Exhaustive bioinformatics analyses on such data involve statistical issues as well as precise sequence and structure inspection not only of the functional mature part but also of the whole precursor sequence of the putative miRNA. Apart from a considerable number of species-specific miRNAs, the majority of all those genes are conserved at least among closely related organisms. Some miRNAs, however, can be traced back to very early points in the evolution of eukaryotic species. Thus, the investigation of the conservation of newly found miRNA candidates constitutes an important step in the computational annotation of miRNAs. Topics covered in this chapter include a review of the obvious problem of miRNA annotation and family definition, recommended pipelines of computational miRNA annotation or detection, and an overview of current computer tools for the prediction of miRNAs and their limitations. The chapter closes by discussing how those bioinformatic approaches address the problem of faithful miRNA prediction and correct annotation.
Affiliation(s)
- Jana Hertel
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
| | | | | |
|
22
|
Will S, Siebauer MF, Heyne S, Engelhardt J, Stadler PF, Reiche K, Backofen R. LocARNAscan: Incorporating thermodynamic stability in sequence and structure-based RNA homology search. Algorithms Mol Biol 2013; 8:14. [PMID: 23601347 PMCID: PMC3716875 DOI: 10.1186/1748-7188-8-14] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 03/21/2013] [Accepted: 03/28/2013] [Indexed: 12/15/2022] Open
Abstract
Background: The search for distant homologs has become an important issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAs or microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? Results: Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA’s algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query. We observe, furthermore, that the sensitivity is bounded in particular by the limited accuracy of the predicted local structures of the target sequence. Conclusions: Although we demonstrate that a purely structure-based homology search is feasible in principle, it is unlikely to outperform tools such as Infernal in most application scenarios, where a substantial amount of sequence information is typically available. The LocARNAscan approach will profit, however, from high-throughput methods to determine RNA secondary structure. In transcriptome-wide applications, such methods will provide accurate structure annotations on the target side. Availability: Source code of the free software LocARNAscan 1.0 and supplementary data are available at http://www.bioinf.uni-leipzig.de/Software/LocARNAscan.
|
23
|
Maxwell EK, Ryan JF, Schnitzler CE, Browne WE, Baxevanis AD. MicroRNAs and essential components of the microRNA processing machinery are not encoded in the genome of the ctenophore Mnemiopsis leidyi. BMC Genomics 2012; 13:714. [PMID: 23256903 PMCID: PMC3563456 DOI: 10.1186/1471-2164-13-714] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 05/02/2012] [Accepted: 11/30/2012] [Indexed: 01/21/2023] Open
Abstract
Background: MicroRNAs play a vital role in the regulation of gene expression and have been identified in every animal with a sequenced genome examined thus far, except for the placozoan Trichoplax. The genomic repertoires of metazoan microRNAs have become increasingly endorsed as phylogenetic characters and drivers of biological complexity. Results: In this study, we report the first investigation of microRNAs in a species from the phylum Ctenophora. We use short RNA sequencing and the assembled genome of the lobate ctenophore Mnemiopsis leidyi to show that this species appears to lack any recognizable microRNAs, as well as the nuclear proteins Drosha and Pasha, which are critical to canonical microRNA biogenesis. This finding represents the first reported case of a metazoan lacking a Drosha protein. Conclusions: Recent phylogenomic analyses suggest that Mnemiopsis may be the earliest branching metazoan lineage. If this is true, then the origins of canonical microRNA biogenesis and microRNA-mediated gene regulation may postdate the last common metazoan ancestor. Alternatively, canonical microRNA functionality may have been lost independently in the lineages leading to both Mnemiopsis and the placozoan Trichoplax, suggesting that microRNA functionality was not critical until much later in metazoan evolution.
Affiliation(s)
- Evan K Maxwell
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | | | | | | | | |
|
24
|
Mukherjee K, Campos H, Kolaczkowski B. Evolution of animal and plant dicers: early parallel duplications and recurrent adaptation of antiviral RNA binding in plants. Mol Biol Evol 2012. [PMID: 23180579 PMCID: PMC3563972 DOI: 10.1093/molbev/mss263] [Citation(s) in RCA: 105] [Impact Index Per Article: 8.1] [Indexed: 12/25/2022] Open
Abstract
RNA interference (RNAi) is a eukaryotic molecular system that serves two primary functions: 1) gene regulation and 2) protection against selfish elements such as viruses and transposable DNA. Although the biochemistry of RNAi has been detailed in model organisms, very little is known about the broad-scale patterns and forces that have shaped RNAi evolution. Here, we provide a comprehensive evolutionary analysis of the Dicer protein family, which carries out the initial RNA recognition and processing steps in the RNAi pathway. We show that Dicer genes duplicated and diversified independently in early animal and plant evolution, coincident with the origins of multicellularity. We identify a strong signature of long-term protein-coding adaptation that has continually reshaped the RNA-binding pocket of the plant Dicer responsible for antiviral immunity, suggesting an evolutionary arms race with viral factors. We also identify key changes in Dicer domain architecture and sequence leading to specialization in either gene-regulatory or protective functions in animal and plant paralogs. As a whole, these results reveal a dynamic picture in which the evolution of Dicer function has driven elaboration of parallel RNAi functional pathways in animals and plants.
Affiliation(s)
- Krishanu Mukherjee
- Department of Microbiology & Cell Science, University of Florida, FL, USA.
| | | | | |
|
25
|
Dieci G, Conti A, Pagano A, Carnevali D. Identification of RNA polymerase III-transcribed genes in eukaryotic genomes. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2012; 1829:296-305. [PMID: 23041497 DOI: 10.1016/j.bbagrm.2012.09.010] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 08/30/2012] [Revised: 09/20/2012] [Accepted: 09/21/2012] [Indexed: 12/16/2022]
Abstract
The RNA polymerase (Pol) III transcription system is devoted to the production of short, generally abundant noncoding (nc) RNAs in all eukaryotic cells. Previously thought to be restricted to a few housekeeping genes easily detectable in genome sequences, the set of known Pol III-transcribed genes (class III genes) has been expanding in the last ten years, and the issue of their detection, annotation and actual expression has been stimulated and revived by the results of recent high-resolution genome-wide location analyses of the mammalian Pol III machinery, together with those of Pol III-centered computational studies and of ncRNA-focused transcriptomic approaches. In this article, we provide an outline of distinctive features of Pol III-transcribed genes that have allowed and currently allow for their detection in genome sequences, we critically review the currently practiced strategies for the identification of novel class III genes and transcripts, and we discuss emerging themes in Pol III transcription regulation which might orient future transcriptomic studies. This article is part of a Special Issue entitled: Transcription by Odd Pols.
Affiliation(s)
- Giorgio Dieci
- Dipartimento di Bioscienze, Università degli Studi di Parma, Parco Area delle Scienze 23/A, 43124 Parma, Italy.
| | | | | | | |
|
26
|
Perina D, Korolija M, Mikoč A, Roller M, Pleše B, Imešek M, Morrow C, Batel R, Ćetković H. Structural and functional characterization of ribosomal protein gene introns in sponges. PLoS One 2012; 7:e42523. [PMID: 22880015 PMCID: PMC3412847 DOI: 10.1371/journal.pone.0042523] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/20/2011] [Accepted: 07/10/2012] [Indexed: 11/25/2022] Open
Abstract
Ribosomal protein genes (RPGs) are a powerful tool for studying intron evolution. They exist in all three domains of life and are highly conserved. Accumulating genomic data suggest that RPG introns in many organisms abound with non-protein-coding RNAs (ncRNAs). These ancient ncRNAs are small nucleolar RNAs (snoRNAs) essential for ribosome assembly. They are also mobile genetic elements and therefore probably important in the diversification and enrichment of transcriptomes through various mechanisms such as intron/exon gain/loss. snoRNAs in basal metazoans are poorly characterized. We examined a total of 449 RPG introns from four demosponges (Amphimedon queenslandica, Suberites domuncula, Suberites ficus and Suberites pagurorum) and showed that RPG introns from A. queenslandica share positional conservation and some structural similarity with "higher" metazoans. Moreover, our study indicates that mobile element insertions play an important role in the evolution of their size. In the four sponges, 51 snoRNAs were identified. The analysis showed discrepancies between the snoRNA pools of orthologous RPG introns in S. domuncula and A. queenslandica. Furthermore, these two sponges show as much conservation of RPG intron positions between each other as between themselves and human. Sponges from the genus Suberites show consistent conservation of RPG intron positions. However, significant differences in some of the orthologous RPG introns of closely related sponges were observed. This indicates that RPG introns are dynamic even on these shorter evolutionary time scales.
Affiliation(s)
- Drago Perina
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Marina Korolija
- Department of Molecular Medicine, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Andreja Mikoč
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Maša Roller
- Department of Molecular Biology, Faculty of Science University of Zagreb, Zagreb, Croatia
| | - Bruna Pleše
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Mirna Imešek
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Christine Morrow
- School of Biological Sciences, Queen's University, Belfast, United Kingdom
| | - Renato Batel
- Center for Marine Research, Rudjer Boskovic Institute, Rovinj, Croatia
| | - Helena Ćetković
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
|
27
|
Richter AS, Backofen R. Accessibility and conservation: general features of bacterial small RNA-mRNA interactions? RNA Biol 2012; 9:954-65. [PMID: 22767260 PMCID: PMC3495738 DOI: 10.4161/rna.20294] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 01/26/2023]
Abstract
Bacterial small RNAs (sRNAs) are a class of structural RNAs that often regulate mRNA targets via post-transcriptional base pair interactions. We determined features that discriminate functional from non-functional interactions and assessed the influence of these features on genome-wide target predictions. For this purpose, we compiled a set of 71 experimentally verified sRNA–target pairs from Escherichia coli and Salmonella enterica. Furthermore, we collected full-length 5′ untranslated regions by using genome-wide experimentally verified transcription start sites.
Only interaction sites in sRNAs, but not in targets, show significant sequence conservation. In addition to this observation, we found that the base pairing between sRNAs and their targets is not conserved in general across more distantly related species. A closer inspection of RybB and RyhB sRNAs and their targets revealed that the base pairing complementarity is only conserved in a small subset of the targets. In contrast to conservation, accessibility of functional interaction sites is significantly higher in both sRNAs and targets in comparison to non-functional sites. Based on the above observations, we successfully used the following constraints to improve the specificity of genome-wide target predictions: the region of interaction initiation must be located in (1) highly accessible regions in both interaction partners or (2) unstructured conserved sRNA regions derived from reliability profiles of multiple sRNA alignments.
Aligned sequences of homologous sRNAs, functional and non-functional targets, and a supplementary document with supplementary tables, figures and references are available at www.bioinf.uni-freiburg.de/Supplements/srna-interact-feat/.
Affiliation(s)
- Andreas S Richter
- University of Freiburg, Department of Computer Science, Georges-Köhler-Allee 106, Freiburg 79110, Germany
|
28
|
Niehuis O, Hartig G, Grath S, Pohl H, Lehmann J, Tafer H, Donath A, Krauss V, Eisenhardt C, Hertel J, Petersen M, Mayer C, Meusemann K, Peters RS, Stadler PF, Beutel RG, Bornberg-Bauer E, McKenna DD, Misof B. Genomic and morphological evidence converge to resolve the enigma of Strepsiptera. Curr Biol 2012; 22:1309-13. [PMID: 22704986 DOI: 10.1016/j.cub.2012.05.018] [Citation(s) in RCA: 96] [Impact Index Per Article: 7.4] [Received: 04/03/2012] [Revised: 05/04/2012] [Accepted: 05/04/2012] [Indexed: 01/31/2023]
Abstract
The phylogeny of insects, one of the most spectacular radiations of life on earth, has received considerable attention. However, the evolutionary roots of one intriguing group of insects, the twisted-wing parasites (Strepsiptera), remain unclear despite centuries of study and debate. Strepsiptera exhibit exceptional larval developmental features, consistent with a predicted step from direct (hemimetabolous) larval development to complete metamorphosis that could have set the stage for the spectacular radiation of metamorphic (holometabolous) insects. Here we report the sequencing of a Strepsiptera genome and show that the analysis of sequence-based genomic data (comprising more than 18 million nucleotides from nearly 4,500 genes obtained from a total of 13 insect genomes), along with genomic metacharacters, clarifies the phylogenetic origin of Strepsiptera and sheds light on the evolution of holometabolous insect development. Our results provide overwhelming support for Strepsiptera as the closest living relatives of beetles (Coleoptera). They demonstrate that the larval developmental features of Strepsiptera, reminiscent of those of hemimetabolous insects, are the result of convergence. Our analyses solve the long-standing enigma of the evolutionary roots of Strepsiptera and reveal that the holometabolous mode of insect development is more malleable than previously thought.
Affiliation(s)
- Oliver Niehuis
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany.
|
29
|
Bao M, Cervantes Cervantes M, Zhong L, Wang JTL. Searching for non-coding RNAs in genomic sequences using ncRNAscout. GENOMICS PROTEOMICS & BIOINFORMATICS 2012; 10:114-21. [PMID: 22768985 PMCID: PMC5054157 DOI: 10.1016/j.gpb.2012.05.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/21/2011] [Accepted: 12/05/2011] [Indexed: 12/16/2022]
Abstract
Recently non-coding RNA (ncRNA) genes have been found to serve many important functions in the cell such as regulation of gene expression at the transcriptional level. Potentially there are more ncRNA molecules yet to be found and their possible functions are to be revealed. The discovery of ncRNAs is a difficult task because they lack sequence indicators such as the start and stop codons displayed by protein-coding RNAs. Current methods utilize either sequence motifs or structural parameters to detect novel ncRNAs within genomes. Here, we present an ab initio ncRNA finder, named ncRNAscout, by utilizing both sequence motifs and structural parameters. Specifically, our method has three components: (i) a measure of the frequency of a sequence, (ii) a measure of the structural stability of a sequence contained in a t-score, and (iii) a measure of the frequency of certain patterns within a sequence that may indicate the presence of ncRNA. Experimental results show that, given a genome and a set of known ncRNAs, our method is able to accurately identify and locate a significant number of ncRNA sequences in the genome. The ncRNAscout tool is available for downloading at http://bioinformatics.njit.edu/ncRNAscout.
Affiliation(s)
- Michael Bao
- Bioinformatics Center, New Jersey Institute of Technology, Newark, NJ 07102, USA
|
30
|
Abstract
The increase of bodyplan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.
Affiliation(s)
- Jana Hertel
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
|
31
|
King-Fung Wong T, Wing-Yan Cheung B, Lam TW, Yiu SM. Local structural alignment of RNA with affine gap model. BMC Proc 2011; 5 Suppl 2:S2. [PMID: 21554760 PMCID: PMC3090760 DOI: 10.1186/1753-6561-5-s2-s2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background: Predicting new non-coding RNAs (ncRNAs) of a family can be done by aligning the potential candidate with a member of the family with known sequence and secondary structure. Existing tools either consider only sequence similarity or cannot handle local alignment with gaps. Results: In this paper, we consider the problem of finding the optimal local structural alignment between a query RNA sequence (with known secondary structure) and a target sequence (with unknown secondary structure) under the affine gap penalty model. We provide an algorithm to solve the problem. Conclusions: Based on an experiment, we show that there are ncRNA families in which local structural alignment with a gap penalty model identifies real hits more effectively than global alignment or local alignment without a gap penalty model.
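To make the affine gap model concrete, the following is a minimal sketch of local alignment scoring with affine gaps on plain sequences (the classic Gotoh-style recurrences). It is not the structural alignment algorithm of the paper, which additionally uses the query's secondary structure; the function name and all scoring parameters are illustrative assumptions.

```python
def local_affine_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b under an affine gap model.

    Sequence-only illustration (hypothetical parameters): a gap of length k
    costs gap_open + (k - 1) * gap_extend. Secondary structure is ignored.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending with a gap in a (a character of b is skipped).
    # F: best score ending with a gap in b (a character of a is skipped).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Eight matches (16) minus one gap of length two (3 + 1) gives 12.
print(local_affine_score("ACGUACGU", "ACGUGGACGU"))
```

Keeping separate matrices for "ending in a gap" is what lets a long gap be charged once for opening and cheaply for each extension, which is the property the abstract contrasts with gap-free local alignment.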
|
32
|
Conservation and Occurrence of Trans-Encoded sRNAs in the Rhizobiales. Genes (Basel) 2011; 2:925-56. [PMID: 24710299 PMCID: PMC3927594 DOI: 10.3390/genes2040925] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 08/31/2011] [Revised: 10/24/2011] [Accepted: 10/26/2011] [Indexed: 12/13/2022]
Abstract
Post-transcriptional regulation by trans-encoded sRNAs, for example via base-pairing with target mRNAs, is a common feature in bacteria and influences various cell processes, e.g., response to stress factors. Several studies based on computational and RNA-seq approaches identified approximately 180 trans-encoded sRNAs in Sinorhizobium meliloti. The initial point of this report is a set of 52 trans-encoded sRNAs derived from the former studies. Sequence homology combined with structural conservation analyses were applied to elucidate the occurrence and distribution of conserved trans-encoded sRNAs in the order of Rhizobiales. This approach resulted in 39 RNA family models (RFMs) which showed various taxonomic distribution patterns. Whereas the majority of RFMs was restricted to Sinorhizobium species or the Rhizobiaceae, members of a few RFMs were more widely distributed in the Rhizobiales. Access to this data is provided via the RhizoGATE portal [1,2].
|
33
|
Schmidtke C, Findeiss S, Sharma CM, Kuhfuss J, Hoffmann S, Vogel J, Stadler PF, Bonas U. Genome-wide transcriptome analysis of the plant pathogen Xanthomonas identifies sRNAs with putative virulence functions. Nucleic Acids Res 2011; 40:2020-31. [PMID: 22080557 PMCID: PMC3300014 DOI: 10.1093/nar/gkr904] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Indexed: 01/21/2023]
Abstract
The Gram-negative plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria (Xcv) is an important model to elucidate the mechanisms involved in the interaction with the host. To gain insight into the transcriptome of the Xcv strain 85–10, we took a differential RNA sequencing (dRNA-seq) approach. Using a novel method to automatically generate comprehensive transcription start site (TSS) maps we report 1421 putative TSSs in the Xcv genome. Genes in Xcv exhibit a poorly conserved −10 promoter element and no consensus Shine-Dalgarno sequence. Moreover, 14% of all mRNAs are leaderless and 13% of them have unusually long 5′-UTRs. Northern blot analyses confirmed 16 intergenic small RNAs and seven cis-encoded antisense RNAs in Xcv. Expression of eight intergenic transcripts was controlled by HrpG and HrpX, key regulators of the Xcv type III secretion system. More detailed characterization identified sX12 as a small RNA that controls virulence of Xcv by affecting the interaction of the pathogen and its host plants. The transcriptional landscape of Xcv is unexpectedly complex, featuring abundant antisense transcripts, alternative TSSs and clade-specific small RNAs.
Affiliation(s)
- Cornelius Schmidtke
- Department of Genetics, Martin-Luther-Universität Halle-Wittenberg, Institute for Biology, D-06099 Halle, Germany.
|
34
|
Cruz JA, Westhof E. Identification and annotation of noncoding RNAs in Saccharomycotina. C R Biol 2011; 334:671-8. [PMID: 21819949 DOI: 10.1016/j.crvi.2011.05.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/08/2010] [Accepted: 03/23/2011] [Indexed: 11/16/2022]
Abstract
The importance of ncRNAs in biological processes makes their annotation an essential component of any genome-sequencing project. The identification of ncRNAs in genomes requires specific expertise and tools that are distinct from the traditional protein gene annotation tools. Here, we describe the assembly of two automatic annotation pipelines, integrating publicly available tools, for homology and de novo ncRNA search in genomes. We applied both pipelines to 10 Saccharomycotina genomes and were able to find and annotate 693 ncRNA genes, corresponding to 81% of the ncRNAs expected for those genomes assuming the number of ncRNAs in Saccharomyces cerevisiae (86) as a reference. Several new ncRNAs, not yet known in the Saccharomycotina clade, were also detected. The results show the feasibility of automatic search for ncRNAs in full genomes and the utility of such approaches in large multi-genome sequencing and annotation projects.
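The 81% figure can be reproduced with a short calculation, assuming (as the abstract suggests) that the expected total is simply the S. cerevisiae reference count multiplied by the number of genomes. The function name is a hypothetical helper introduced only for this illustration.

```python
def recovery_fraction(found: int, genomes: int, per_genome_reference: int) -> float:
    """Fraction of expected ncRNAs that were annotated.

    Assumes every genome is expected to carry the same number of ncRNAs
    as the reference genome (an assumption read off the abstract).
    """
    expected = genomes * per_genome_reference
    return found / expected


# 693 annotated genes, 10 genomes, 86 ncRNAs in the reference genome.
fraction = recovery_fraction(found=693, genomes=10, per_genome_reference=86)
print(f"expected: {10 * 86}, recovered: {fraction:.1%}")
```

Under this assumption the expected total is 860 genes and the recovered share is about 80.6%, which rounds to the reported 81%.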
Affiliation(s)
- José Almeida Cruz
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, 15 rue René-Descartes, 67084 Strasbourg cedex, France.
|
35
|
Abstract
Rapid improvements in high-throughput experimental technologies make it nowadays possible to study the expression, as well as changes in expression, of whole transcriptomes under different environmental conditions in a detailed view. We describe current approaches to identify genome-wide functional RNA transcripts (experimentally as well as computationally), and focus on computational methods that may be utilized to disclose their function. While genome databases offer a wealth of information about known and putative functions for protein-coding genes, functional information for novel non-coding RNA genes is almost nonexistent. This is mainly explained by the lack of established software tools to efficiently reveal the function and evolutionary origin of non-coding RNA genes. Here, we describe in detail computational approaches one may follow to annotate and classify an RNA transcript.
Affiliation(s)
- Kristin Reiche
- Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany
|
36
|
Seemann SE, Richter AS, Gesell T, Backofen R, Gorodkin J. PETcofold: predicting conserved interactions and structures of two multiple alignments of RNA sequences. Bioinformatics 2010; 27:211-9. [PMID: 21088024 PMCID: PMC3018821 DOI: 10.1093/bioinformatics/btq634] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 01/06/2023]
Abstract
Motivation: Predicting RNA–RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA–RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA–RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences. Results: PETcofold's ability to predict RNA–RNA interactions was evaluated on a carefully curated dataset of 32 bacterial small RNAs and their targets, which was manually extracted from the literature. For evaluation of both RNA–RNA interaction and structure prediction, we were able to extract only a few high-quality examples: one vertebrate small nucleolar RNA and four bacterial small RNAs. For these we show that the prediction can be improved by our comparative approach. Furthermore, PETcofold was evaluated on controlled data with phylogenetically simulated sequences enriched for covariance patterns at the interaction sites. We observed increased performance with increased amounts of covariance. Availability: The program PETcofold is available as source code and can be downloaded from http://rth.dk/resources/petcofold. Contact: gorodkin@rth.dk; backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.
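The notion of compensating base changes can be illustrated with a small check on two alignment columns. This is only a toy sketch of what covariation evidence looks like, not PETcofold's scoring; the function names and the example alignments are assumptions made for illustration.

```python
# Watson-Crick and G-U wobble pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_types(alignment, i, j):
    """Distinct canonical base pairs formed by columns i and j."""
    return {(s[i], s[j]) for s in alignment if (s[i], s[j]) in CANONICAL}


def has_compensatory_change(alignment, i, j):
    """True if two sequences pair at (i, j) with both bases changed.

    A change on only one side (e.g. G-C to G-U) is a consistent mutation,
    not a compensatory one, and does not count here.
    """
    types = pair_types(alignment, i, j)
    return any(p[0] != q[0] and p[1] != q[1] for p in types for q in types)


# Columns 0 and 4 pair as G-C, C-G and A-U: both sides covary.
covarying = ["GACUC", "CACUG", "AACUU"]
# Columns 0 and 2 pair as G-C and G-U: only one side changes.
one_sided = ["GAC", "GAU"]
print(has_compensatory_change(covarying, 0, 4))
print(has_compensatory_change(one_sided, 0, 2))
```

The same check applies whether the two columns come from one alignment (intra-molecular pairs) or from two concatenated alignments (inter-molecular pairs), which is the distinction the abstract draws.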
Affiliation(s)
- Stefan E Seemann
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark
|
37
|
Yusuf D, Marz M, Stadler PF, Hofacker IL. Bcheck: a wrapper tool for detecting RNase P RNA genes. BMC Genomics 2010; 11:432. [PMID: 20626900 PMCID: PMC2996960 DOI: 10.1186/1471-2164-11-432] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 02/19/2010] [Accepted: 07/13/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Effective bioinformatics solutions are needed to tackle challenges posed by industrial-scale genome annotation. We present Bcheck, a wrapper tool which predicts RNase P RNA genes by combining the speed of pattern matching and the sensitivity of covariance models. The core of Bcheck is a library of subfamily-specific descriptor models and covariance models. RESULTS Scanning all microbial genomes in GenBank identifies RNase P RNA genes in 98% of 1024 microbial chromosomal sequences within just 4 hours on a single CPU. Compared to existing annotations found in 387 of the GenBank files, Bcheck predictions have more intact structure and are automatically classified by subfamily membership. For eukaryotic chromosomes Bcheck could identify the known RNase P RNA genes in 84 out of 85 metazoan genomes and 19 out of 21 fungal genomes. Bcheck predicted 37 novel eukaryotic RNase P RNA genes, 32 of which are from fungi. Gene duplication events are observed in at least 20 metazoan organisms. Scanning of meta-genomic data from the Global Ocean Sampling Expedition, comprising over 10 million sample sequences (18 Gigabases), predicted 2909 unique genes, 98% of which fall into ancestral bacteria A type of RNase P RNA and 66% of which have no close homolog to known prokaryotic RNase P RNA. CONCLUSIONS The combination of efficient filtering by means of a descriptor-based search and subsequent construction of a high-quality gene model by means of a covariance model provides an efficient method for the detection of RNase P RNA genes in large-scale sequencing data. Bcheck is implemented as a web server and can also be downloaded for local use from http://rna.tbi.univie.ac.at/bcheck.
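The filter-then-refine idea described above can be sketched generically: a cheap pattern match proposes candidate regions, and a more expensive scoring step is run only on those. The sketch below is not Bcheck's implementation; the regular expression stands in for a descriptor model, and the GC-fraction scorer is a deliberately trivial placeholder for a covariance model. All names, the pattern, and the threshold are hypothetical.

```python
import re
from typing import Callable, List, Tuple


def two_stage_search(
    genome: str,
    pattern: str,
    score: Callable[[str], float],
    threshold: float,
) -> List[Tuple[int, int, float]]:
    """Return (start, end, score) for pattern hits that pass the scorer.

    Stage 1: fast regular-expression scan over the whole sequence.
    Stage 2: the (assumed expensive) scorer runs on candidates only.
    """
    hits = []
    for match in re.finditer(pattern, genome):
        candidate = match.group(0)
        value = score(candidate)
        if value >= threshold:
            hits.append((match.start(), match.end(), value))
    return hits


def gc_fraction(seq: str) -> float:
    """Toy stand-in for a covariance-model score."""
    return sum(base in "GC" for base in seq) / len(seq)


genome = "AAAAGGCGCAAAAGGAUAAAA"
# Two candidates match the pattern; only the GC-rich one passes stage 2.
print(two_stage_search(genome, r"GG[ACGU]{3}", gc_fraction, threshold=0.8))
```

The design point is that the expensive model never sees most of the genome, which is what makes whole-database scans feasible on modest hardware.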
Affiliation(s)
- Dilmurat Yusuf
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
|
38
|
Chikkagoudar S, Livesay DR, Roshan U. PLAST-ncRNA: Partition function Local Alignment Search Tool for non-coding RNA sequences. Nucleic Acids Res 2010; 38:W59-63. [PMID: 20522510 PMCID: PMC2896107 DOI: 10.1093/nar/gkq487] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]
Abstract
Alignment-based programs are valuable tools for finding potential homologs in genome sequences. Previously, it has been shown that partition function posterior probabilities attuned to local alignment achieve a high accuracy in identifying distantly similar non-coding RNA sequences that are hidden in a large genome. Here, we present an online implementation of that alignment algorithm based on such probabilities. Our server takes as input a query RNA sequence and a large genome sequence, and outputs a list of hits that are above a mean posterior probability threshold. The output is presented in a format suited to local alignment. It can also be viewed within the PLAST alignment viewer applet that provides a list of all hits found and highlights regions of high posterior probability within each local alignment. The server is freely available at http://plastrna.njit.edu.
Affiliation(s)
- Satish Chikkagoudar
- Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
|
39
|
Boria I, Gruber AR, Tanzer A, Bernhart SH, Lorenz R, Mueller MM, Hofacker IL, Stadler PF. Nematode sbRNAs: Homologs of Vertebrate Y RNAs. J Mol Evol 2010; 70:346-58. [DOI: 10.1007/s00239-010-9332-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 06/05/2009] [Accepted: 03/01/2010] [Indexed: 01/20/2023]
|
40
|
Complete HOX cluster characterization of the coelacanth provides further evidence for slow evolution of its genome. Proc Natl Acad Sci U S A 2010; 107:3622-7. [PMID: 20139301 DOI: 10.1073/pnas.0914312107] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 01/12/2023]
Abstract
The living coelacanth is a lobe-finned fish that represents an early evolutionary departure from the lineage that led to land vertebrates, and is of extreme interest scientifically. It has changed very little in appearance from fossilized coelacanths of the Cretaceous (150 to 65 million years ago), and is often referred to as a "living fossil." An important general question is whether long-term stasis in morphological evolution is associated with stasis in genome evolution. To this end we have used targeted genome sequencing for acquiring 1,612,752 bp of high quality finished sequence encompassing the four HOX clusters of the Indonesian coelacanth Latimeria menadoensis. Detailed analyses were carried out on genomic structure, gene and repeat contents, conserved noncoding regions, and relative rates of sequence evolution in both coding and noncoding tracts. Our results demonstrate conclusively that the coelacanth HOX clusters are evolving comparatively slowly and that this taxon should serve as a viable outgroup for interpretation of the genomes of tetrapod species.
|
41
|
Abstract
Increasing lines of evidence indicate that small non-coding RNAs including miRNAs, piRNAs, rasiRNAs, 21U endo-siRNAs, and snoRNAs are involved in many critical biological processes. Functional studies of these small RNAs require a simple, sensitive, and reliable method for detecting and quantifying levels of small RNAs. Here, we describe such a method that has been widely used for the validation of cloned small RNAs and also for quantitative analyses of small RNAs in both tissues and cells.
Affiliation(s)
- Seungil Ro
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, NV 89557
| | - Wei Yan
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, NV 89557
|
42
|
Menzel P, Gorodkin J, Stadler PF. The tedious task of finding homologous noncoding RNA genes. RNA (NEW YORK, N.Y.) 2009; 15:2075-82. [PMID: 19861422 PMCID: PMC2779685 DOI: 10.1261/rna.1556009] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 05/28/2023]
Abstract
User-driven in silico RNA homology search is still a nontrivial task. In part, this is the consequence of a limited precision of the computational tools in spite of recent exciting progress in this area, and to a certain extent, computational costs are still problematic in practice. An important, and as we argue here, dominating issue is the dependence on well-curated (secondary) structural alignments of the RNAs. These are often hard to obtain, not so much because of an inherent limitation in the available data, but because they require substantial manual curation, an effort that is rarely acknowledged. Here, we qualitatively describe a realistic scenario for what a "regular user" (i.e., a nonexpert in a particular RNA family) can do in practice, and what kind of results are likely to be achieved. Despite the indisputable advances in computational RNA biology, the conclusion is discouraging: BLAST still works as well as or better than other methods unless extensive expert knowledge on the RNA family is included. However, when well-curated data are available, recent developments yield further improvements in finding remote homologs. Homology search beyond the reach of BLAST hence is not at all a routine task.
Affiliation(s)
- Peter Menzel
- Section for Genetics and Bioinformatics, IBHV, and Center for Applied Bioinformatics, University of Copenhagen, DK-1870 Frederiksberg, Denmark
|
43
|
Copeland CS, Marz M, Rose D, Hertel J, Brindley PJ, Santana CB, Kehr S, Attolini CSO, Stadler PF. Homology-based annotation of non-coding RNAs in the genomes of Schistosoma mansoni and Schistosoma japonicum. BMC Genomics 2009; 10:464. [PMID: 19814823 PMCID: PMC2770079 DOI: 10.1186/1471-2164-10-464] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Received: 05/27/2009] [Accepted: 10/08/2009] [Indexed: 11/27/2022]
Abstract
BACKGROUND Schistosomes are trematode parasites of the phylum Platyhelminthes. They are considered the most important of the human helminth parasites in terms of morbidity and mortality. Draft genome sequences are now available for Schistosoma mansoni and Schistosoma japonicum. Non-coding RNA (ncRNA) plays a crucial role in gene expression regulation, cellular function and defense, homeostasis, and pathogenesis. The genome-wide annotation of ncRNAs is a non-trivial task unless well-annotated genomes of closely related species are already available. RESULTS A homology search for structured ncRNA in the genome of S. mansoni resulted in 23 types of ncRNAs with conserved primary and secondary structure. Among these, we identified rRNA, snRNA, SL RNA, SRP, tRNAs and RNase P, and also possibly MRP and 7SK RNAs. In addition, we confirmed five miRNAs that have recently been reported in S. japonicum and found two additional homologs of known miRNAs. The tRNA complement of S. mansoni is comparable to that of the free-living planarian Schmidtea mediterranea, although for some amino acids differences of more than a factor of two are observed: Leu, Ser, and His are overrepresented, while Cys, Met, and Ile are underrepresented in S. mansoni. On the other hand, the number of tRNAs in the genome of S. japonicum is reduced by more than a factor of four. Both schistosomes have a complete set of minor spliceosomal snRNAs. Several ncRNAs that are expected to exist in the S. mansoni genome were not found, among them the telomerase RNA, vault RNAs, and Y RNAs. CONCLUSION The ncRNA sequences and structures presented here represent the most complete dataset of ncRNA from any lophotrochozoan reported so far. This data set provides an important reference for further analysis of the genomes of schistosomes and indeed eukaryotic genomes at large.
Affiliation(s)
- Claudia S Copeland
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
|
44
|
Mosig A, Zhu L, Stadler PF. Customized strategies for discovering distant ncRNA homologs. BRIEFINGS IN FUNCTIONAL GENOMICS AND PROTEOMICS 2009; 8:451-60. [PMID: 19779009 DOI: 10.1093/bfgp/elp035] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]
Abstract
A large fraction of non-coding RNAs is short and/or poorly conserved in sequence. Most of the longer examples, furthermore, consist of a collection of conserved structural motifs rather than a coherent globally conserved secondary structure. As a consequence, the conceptually simple problem of homology search becomes a complex and technically demanding task. Despite the best efforts of databases such as Rfam, the situation is complicated further by the sparsity of information in many (in particular prokaryotic) RNA families. In this contribution, we review recent efforts to customize sequence-based search tools for ncRNA applications. In particular, semi-global alignments and the development of methods for fragmented pattern search have brought significant practical advances. Current developments in this area focus on the integration of fragmented sequence pattern search with search algorithms for secondary structure patterns. We focus here, in particular, on strategies that can be successful in the 'twilight zone' where generic approaches from blast to infernal start to fail.
Affiliation(s)
- Axel Mosig
- Chair of Bioinformatics, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
|
45
|
Marz M, Donath A, Verstraete N, Nguyen VT, Stadler PF, Bensaude O. Evolution of 7SK RNA and its protein partners in metazoa. Mol Biol Evol 2009; 26:2821-30. [PMID: 19734296 DOI: 10.1093/molbev/msp198] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.0] [Indexed: 12/18/2022]
Abstract
7SK RNA is a key player in the regulation of polymerase II transcription. 7SK RNA was considered as a highly conserved vertebrate innovation. The discovery of poorly conserved homologs in several insects and lophotrochozoans, however, implies a much earlier evolutionary origin. The mechanism of 7SK function requires interaction with the proteins HEXIM and La-related protein 7. Here, we present a comprehensive computational analysis of these two proteins in metazoa, and we extend the collection of 7SK RNAs by several additional candidates. In particular, we describe 7SK homologs in Caenorhabditis species. Furthermore, we derive an improved secondary structure model of 7SK RNA, which shows that the structure is quite well-conserved across animal phyla despite the extreme divergence at sequence level.
Affiliation(s)
- Manja Marz
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
|
46
|
Stadler PF, Chen JJL, Hackermuller J, Hoffmann S, Horn F, Khaitovich P, Kretzschmar AK, Mosig A, Prohaska SJ, Qi X, Schutt K, Ullmann K. Evolution of Vault RNAs. Mol Biol Evol 2009; 26:1975-91. [DOI: 10.1093/molbev/msp112] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Indexed: 12/21/2022]
|