1
Kong E, Polacek N. TRIM21 modulates stability of pro-survival non-coding RNA vtRNA1-1 in human hepatocellular carcinoma cells. PLoS Genet 2025; 21:e1011614. [PMID: 40096176 PMCID: PMC11940608 DOI: 10.1371/journal.pgen.1011614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 03/26/2025] [Accepted: 02/10/2025] [Indexed: 03/19/2025]
Abstract
Recent studies expanded our knowledge of diverse pro-survival functions of short non-coding vault RNAs. One of the human vault RNA paralogs, vtRNA1-1, modulates several intracellular processes, including proliferation, apoptosis, autophagy, and drug resistance in various types of human cancer cells. However, protein interaction partners and mechanisms by which vtRNA1-1 levels are controlled within the cells remained elusive. Here, we describe a regulatory process for vtRNA1-1 stabilization mediated by the newly identified interacting proteins, TRIM21 and TRIM25, in human hepatocellular carcinoma (HCC) cells. Depleting TRIM21 or TRIM25 reduced the stability of vtRNA1-1 both in vivo and in vitro. We also identified the responsible sequence of vtRNA1-1 for the stability regulation by TRIM21 and TRIM25 and revealed another critical factor for vtRNA1-1 stability, an NSUN2-mediated methylation at C69 of vtRNA1-1. Consequently, our findings demonstrated that the TRIM proteins govern the stability of vtRNA1-1 depending on its methylation status in HCC cells. Since vtRNA1-1 is crucial for pro-survival characteristics in HCC cells, insight into vtRNA1-1 protein binding partners and the regulation of its stability can impact the development of new anticancer strategies.
Affiliation(s)
- EunBin Kong
  - Department for Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
- Norbert Polacek
  - Department for Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
2
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Universidad Nacional de Colombia, Bogotá, Colombia
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, USA
3
Velandia-Huerto CA, Fallmann J, Stadler PF. miRNAture-Computational Detection of microRNA Candidates. Genes (Basel) 2021; 12:348. [PMID: 33673400 PMCID: PMC7996739 DOI: 10.3390/genes12030348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Revised: 02/19/2021] [Accepted: 02/20/2021] [Indexed: 12/16/2022]
Abstract
Homology-based annotation of short RNAs, including microRNAs, is a difficult problem because their inherently small size limits the available information. Highly sensitive methods, including parameter-optimized blast, nhmmer, or cmsearch runs designed to increase sensitivity, inevitably lead to large numbers of false positives, which can be detected only by detailed analysis of specific features typical for an RNA family and/or the analysis of conservation patterns in structure-annotated multiple sequence alignments. The miRNAture pipeline implements a workflow specific to animal microRNAs that automates homology search and validation steps. The miRNAture pipeline yields very good results for a large number of "typical" miRBase families. However, it also highlights difficulties with atypical cases, in particular microRNAs deriving from repetitive elements and microRNAs with unusual, branched precursor structures and atypical locations of the mature product, which require specific curation by domain experts.
Affiliation(s)
- Cristian A. Velandia-Huerto
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
- Jörg Fallmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, CO-111321 Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, NM 87501, USA
4
Seal RL, Chen LL, Griffiths-Jones S, Lowe TM, Mathews MB, O'Reilly D, Pierce AJ, Stadler PF, Ulitsky I, Wolin SL, Bruford EA. A guide to naming human non-coding RNA genes. EMBO J 2020; 39:e103777. [PMID: 32090359 PMCID: PMC7073466 DOI: 10.15252/embj.2019103777] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 10/18/2019] [Revised: 01/23/2020] [Accepted: 01/30/2020] [Indexed: 12/15/2022]
Abstract
Research on non-coding RNA (ncRNA) is a rapidly expanding field. Providing an official gene symbol and name to ncRNA genes brings order to otherwise potential chaos as it allows unambiguous communication about each gene. The HUGO Gene Nomenclature Committee (HGNC, www.genenames.org) is the only group with the authority to approve symbols for human genes. The HGNC works with specialist advisors for different classes of ncRNA to ensure that ncRNA nomenclature is accurate and informative, where possible. Here, we review each major class of ncRNA that is currently annotated in the human genome and describe how each class is assigned a standardised nomenclature.
Affiliation(s)
- Ruth L Seal
  - Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ling-Ling Chen
  - State Key Laboratory of Molecular Biology, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Science, Shanghai, China
- Sam Griffiths-Jones
  - School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Todd M Lowe
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Michael B Mathews
  - Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA
- Dawn O'Reilly
  - Computational Biology and Integrative Genomics Lab, MRC/CRUK Oxford Institute and Department of Oncology, University of Oxford, Oxford, UK
- Andrew J Pierce
  - Translational Medicine, Oncology R&D, AstraZeneca, Cambridge, UK
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, USA
- Igor Ulitsky
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
- Sandra L Wolin
  - RNA Biology Laboratory, National Cancer Institute, National Institutes of Health, Frederick, MD, USA
- Elspeth A Bruford
  - Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
5
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a necessarily incomplete overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Ivo L Hofacker
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Peter F Stadler
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA
6
Boivin V, Deschamps-Francoeur G, Scott MS. Protein coding genes as hosts for noncoding RNA expression. Semin Cell Dev Biol 2017; 75:3-12. [PMID: 28811264 DOI: 10.1016/j.semcdb.2017.08.016] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 06/13/2017] [Revised: 08/02/2017] [Accepted: 08/03/2017] [Indexed: 12/17/2022]
Abstract
With the emergence of high-throughput sequence characterization methods and the subsequent improvements in gene annotations, it is becoming increasingly clear that a large proportion of eukaryotic protein-coding genes (as many as 50% in human) serve as host genes for non-coding RNA genes. Amongst the most extensively characterized embedded non-coding RNA genes, small nucleolar RNAs and microRNAs represent abundant families. Encoded individually or clustered, in sense or antisense orientation with respect to their host and independently expressed or dependent on host expression, the genomic characteristics of embedded genes determine their biogenesis and the extent of their relationship with their host gene. Not only can host genes and the embedded genes they harbour be co-regulated and mutually modulate each other, many are functionally coupled playing a role in the same cellular pathways. And while host-non-coding RNA relationships can be highly conserved, mechanisms have been identified, and in particular an association with transposable elements, allowing the appearance of copies of non-coding genes nested in host genes, or the migration of embedded genes from one host gene to another. The study of embedded non-coding genes and their relationship with their host genes increases the complexity of cellular networks and provides important new regulatory links that are essential to properly understand cell function.
Affiliation(s)
- Vincent Boivin
  - Département de biochimie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Gabrielle Deschamps-Francoeur
  - Département de biochimie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
- Michelle S Scott
  - Département de biochimie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada
7
Differential transcription profiles of long non-coding RNAs in primary human brain microvascular endothelial cells in response to meningitic Escherichia coli. Sci Rep 2016; 6:38903. [PMID: 27958323 PMCID: PMC5153642 DOI: 10.1038/srep38903] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 07/01/2016] [Accepted: 11/15/2016] [Indexed: 12/29/2022]
Abstract
Accumulating studies have indicated the influence of long non-coding RNAs (lncRNAs) on various biological processes as well as disease development and progression. However, the lncRNAs involved in bacterial meningitis and their regulatory effects are largely unknown. By RNA-sequencing, the transcriptional profiles of host lncRNAs in primary human brain microvascular endothelial cells (hBMECs) in response to meningitic Escherichia coli were demonstrated. Here, 25,257 lncRNAs were identified, including 24,645 annotated lncRNAs and 612 newly found ones. A total of 895 lncRNAs exhibited significant differences upon infection, among which 382 were upregulated and 513 were downregulated (≥2-fold, p < 0.05). Via bioinformatic analysis, the features of these lncRNAs, their possible functions, and the potential regulatory relationships between lncRNAs and mRNAs were predicted. Moreover, we compared the transcriptional specificity of these differential lncRNAs among hBMECs, human astrocyte cell U251, and human umbilical vein endothelial cells, and demonstrated the novel regulatory effects of proinflammatory cytokines on these differential lncRNAs. To our knowledge, this is the first time the transcriptional profiles of host lncRNAs involved in E. coli-induced meningitis have been reported, which shall provide novel insight into the regulatory mechanisms behind bacterial meningitis involving lncRNAs, and contribute to better prevention and therapy of CNS infection.
8
Kotakis C. Non-coding RNAs' partitioning in the evolution of photosynthetic organisms via energy transduction and redox signaling. RNA Biol 2015; 12:101-4. [PMID: 25826417 DOI: 10.1080/15476286.2015.1017201] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Abstract
"Ars longa, vita brevis" (Hippocrates). Chloroplasts and mitochondria are genetically semi-autonomous organelles inside the plant cell. These constructions formed after endosymbiosis and keep evolving throughout the history of life. Experimental evidence is provided for active non-coding RNAs (ncRNAs) in these prokaryote-like structures, and a possible functional imprinting on cellular electrophysiology by those RNA entities is described. Furthermore, updated knowledge on RNA metabolism of organellar genomes uncovers novel inter-communication bridges with the nucleus. This class of RNA molecules is considered as a unique ontogeny which transforms their biological role as a genetic rheostat into a synchronous biochemical one that can affect the energetic charge and redox homeostasis inside cells. A hypothesis is proposed where such modulation by non-coding RNAs is integrated with genetic signals regulating gene transfer. The implications of this working hypothesis are discussed, with particular reference to the involvement of ncRNAs in the evolution of the organellar and nuclear genomes, since their integrity is functionally coupled with redox signals in photosynthetic organisms.
Affiliation(s)
- Christos Kotakis
  - Agro-environmental cooperative BioNet West Hellas; Gastouni Ileias, Hellas, Greece
9
Perina D, Korolija M, Hadžija MP, Grbeša I, Belužić R, Imešek M, Morrow C, Marjanović MP, Bakran-Petricioli T, Mikoč A, Ćetković H. Functional and Structural Characterization of FAU Gene/Protein from Marine Sponge Suberites domuncula. Mar Drugs 2015. [PMID: 26198235 PMCID: PMC4515611 DOI: 10.3390/md13074179] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Abstract
Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (FAU) gene is down-regulated in human prostate, breast and ovarian cancers. Moreover, its dysregulation is associated with poor prognosis in breast cancer. Sponges (Porifera) are animals without tissues which branched off first from the common ancestor of all metazoans. A large majority of genes implicated in human cancers have their homologues in the sponge genome. Our study suggests that the FAU gene from the sponge Suberites domuncula reflects characteristics of the FAU gene from the metazoan ancestor, which have changed only slightly during the course of animal evolution. We found pro-apoptotic activity of sponge FAU protein. Like its human homologue, sponge FAU increases apoptosis in human HEK293T cells. This indicates that the biological functions of FAU, usually associated with "higher" metazoans, particularly in cancer etiology, possess a biochemical background established early in metazoan evolution. The ancestor of all animals possibly possessed FAU protein with a structure and function similar to evolutionarily more recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis. This provides an opportunity to use pre-bilaterian animals as a simpler model for studying complex interactions in human carcinogenesis.
Affiliation(s)
- Dragutin Perina
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb 10000, Croatia
- Marina Korolija
  - Forensic Science Centre "Ivan Vučetić", Zagreb 10000, Croatia
- Ivana Grbeša
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Robert Belužić
  - Division of Molecular Medicine, Ruđer Bošković Institute, Zagreb 10000, Croatia
- Mirna Imešek
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb 10000, Croatia
- Christine Morrow
  - Queen's University Belfast, Marine Laboratory, Portaferry BT22 1PF, Northern Ireland, UK
- Andreja Mikoč
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb 10000, Croatia
- Helena Ćetković
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb 10000, Croatia
10
Gupta Y, Witte M, Möller S, Ludwig RJ, Restle T, Zillikens D, Ibrahim SM. ptRNApred: computational identification and classification of post-transcriptional RNA. Nucleic Acids Res 2014; 42:e167. [PMID: 25303994 PMCID: PMC4267668 DOI: 10.1093/nar/gku918] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Non-coding RNAs (ncRNAs) are known to play important functional roles in the cell. However, their identification and recognition in genomic sequences remains challenging. In silico methods, such as classification tools, offer a fast and reliable way for such screening and multiple classifiers have already been developed to predict well-defined subfamilies of RNA. So far, however, out of all the ncRNAs, only tRNA, miRNA and snoRNA can be predicted with a satisfying sensitivity and specificity. We here present ptRNApred, a tool to detect and classify subclasses of non-coding RNA that are involved in the regulation of post-transcriptional modifications or DNA replication, which we here call post-transcriptional RNA (ptRNA). It (i) detects RNA sequences coding for post-transcriptional RNA from the genomic sequence with an overall sensitivity of 91% and a specificity of 94% and (ii) predicts ptRNA-subclasses that exist in eukaryotes: snRNA, snoRNA, RNase P, RNase MRP, Y RNA or telomerase RNA. Availability: The ptRNApred software is open for public use on http://www.ptrnapred.org/.
Affiliation(s)
- Yask Gupta
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Mareike Witte
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Steffen Möller
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Ralf J Ludwig
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Tobias Restle
  - Institute for Molecular Medicine, University of Lübeck, 23538 Lübeck, Germany
- Detlef Zillikens
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Saleh M Ibrahim
  - Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
11
de Boer FK, Hogeweg P. Mutation rates and evolution of multiple coding in RNA-based protocells. J Mol Evol 2014; 79:193-203. [PMID: 25280530 PMCID: PMC4247474 DOI: 10.1007/s00239-014-9648-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/23/2013] [Accepted: 09/18/2014] [Indexed: 11/28/2022]
Abstract
RNA has a myriad of biological roles in contemporary life. We use the RNA paradigm for genotype-phenotype mappings to study the evolution of multiple coding as a function of mutation rates. We study three different one-to-many genotype-phenotype mappings which have the potential to encode the information for multiple functions on a single sequence. These three different maps are (i) cofolding, where two sequences can bind and “cofold,” (ii) suboptimal folding, where the alternative foldings within a certain range of the native state of sequences are considered, and (iii) adapter-based folding, in which protocells can evolve adapter-mediated alternative foldings. We study how protocells with a set of sequences can code for a set of predefined functional structures, while avoiding all other structures, which are considered to be misfoldings. Note that such misfolded structures are far more prevalent than functional ones. Our results highlight the flexibility of the RNA sequence to secondary structure mapping and the power of evolution to shape the genotype-phenotype mapping. We show that high fitness can be achieved even at high mutation rates. Mutation rates affect genome size, but differently depending on which folding method is used. We observe that cofolding limits the possibility to avoid misfolded structures and that adapters are always beneficial for fitness, but even more beneficial at low mutation rates. In all cases, the evolution procedure selects for molecules that can form additional structures. Our results indicate that inherent properties of RNA molecules and their interactions allow the evolution of complexity even at high mutation rates.
Affiliation(s)
- Folkert K de Boer
  - Theoretical Biology and Bioinformatics, Universiteit Utrecht, Utrecht, The Netherlands
12
Abstract
De novo discovery of "motifs" capturing the commonalities among related noncoding structured RNAs is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis on an approach based on the CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNAs, including structured RNA elements in untranslated portions of protein-coding genes, are presented.
Affiliation(s)
- Walter L Ruzzo
  - Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
13
Identification and characterisation of non-coding small RNAs in the pathogenic filamentous fungus Trichophyton rubrum. BMC Genomics 2013; 14:931. [PMID: 24377353 PMCID: PMC3890542 DOI: 10.1186/1471-2164-14-931] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/08/2013] [Accepted: 12/20/2013] [Indexed: 01/12/2023]
Abstract
Background: Accumulating evidence demonstrates that non-coding RNAs (ncRNAs) are indispensable components of many organisms and play important roles in cellular events, regulation, and development. Results: Here, we analysed the small non-coding RNA (ncRNA) transcriptome of Trichophyton rubrum by constructing and sequencing a cDNA library from conidia and mycelia. We identified 352 ncRNAs and their corresponding genomic loci. These ncRNA candidates included 198 entirely novel ncRNAs and 154 known ncRNAs classified as snRNAs, snoRNAs and other known ncRNAs. Further bioinformatic analysis detected 96 snoRNAs, including 56 snoRNAs that had been annotated in other organisms and 40 novel snoRNAs. All snoRNAs belonged to two major classes (C/D box snoRNAs and H/ACA snoRNAs), and their potential target sites in rRNAs and snRNAs were predicted. To analyse the evolutionary conservation of the ncRNAs in T. rubrum, we aligned all 352 ncRNAs to the genomes of six dermatophytes and to the NCBI non-redundant nucleotide database (NT). The results showed that most of the identified snRNAs were conserved in dermatophytes. Of the 352 ncRNAs, 102 also had genomic loci in other dermatophytes, and 27 were dermatophyte-specific. Conclusions: Our systematic analysis may provide important clues to the function and evolution of ncRNAs in T. rubrum. These results also provide important information to complement the current annotation of the T. rubrum genome, which primarily comprises protein-coding genes.
14
Lei J, Techa-Angkoon P, Sun Y. Chain-RNA: a comparative ncRNA search tool based on the two-dimensional chain algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:274-285. [PMID: 23929857 DOI: 10.1109/tcbb.2012.137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Noncoding RNA (ncRNA) identification is highly important to modern biology. The state-of-the-art method for ncRNA identification is based on comparative genomics, in which evolutionary conservation of sequences and secondary structures provides important evidence for ncRNA search. For ncRNAs with low sequence conservation but high structural similarity, conventional local alignment tools such as BLAST yield low sensitivity. Thus, there is a need for ncRNA search methods that can incorporate both sequence and structural similarities. We introduce chain-RNA, a pairwise structural alignment tool that can effectively locate cross-species conserved RNA elements with low sequence similarity. In chain-RNA, stem-loop structures are extracted from dot plots generated by an efficient local-folding algorithm. Then, we formulate stem alignment as an extended 2D chain problem and employ existing chain algorithms. Chain-RNA is tested on a data set containing annotated ncRNA homologs and is applied to novel ncRNA search in a transcriptomic data set. The experimental results show that chain-RNA has a better tradeoff between sensitivity and false positive rate in ncRNA prediction than conventional sequence similarity search tools and is more time efficient than structural alignment tools. The source code of chain-RNA can be downloaded at http://sourceforge.net/projects/chain-rna/ or at http://www.cse.msu.edu/~leijikai/chain-rna/.
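For readers unfamiliar with the "2D chain problem" named in this abstract, the following is a minimal sketch of the basic, textbook form of two-dimensional chaining. It is not the chain-RNA implementation, and it does not cover the "extended" variant the authors describe. The anchor representation (position in sequence A, position in sequence B, score) and the function name `best_chain` are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical anchor: (position in sequence A, position in sequence B, score).
# In a stem-alignment setting an anchor could stand for a matched stem pair.
Anchor = Tuple[int, int, float]


def best_chain(anchors: List[Anchor]) -> Tuple[float, List[Anchor]]:
    """Return the highest-scoring chain of anchors that is strictly
    increasing in both coordinates (simple O(n^2) dynamic programme)."""
    if not anchors:
        return 0.0, []
    ordered = sorted(anchors)              # sort by first coordinate
    n = len(ordered)
    score = [a[2] for a in ordered]        # best chain score ending at each anchor
    prev = [-1] * n                        # back-pointers for traceback
    for j in range(n):
        xj, yj, wj = ordered[j]
        for i in range(j):
            xi, yi, _ = ordered[i]
            # Anchor i may precede anchor j only if it lies strictly
            # before j in both sequences.
            if xi < xj and yi < yj and score[i] + wj > score[j]:
                score[j] = score[i] + wj
                prev[j] = i
    end = max(range(n), key=score.__getitem__)
    chain = []
    k = end
    while k != -1:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return score[end], chain


if __name__ == "__main__":
    demo = [(1, 1, 2.0), (2, 5, 1.0), (3, 2, 2.0), (4, 3, 2.0), (6, 6, 1.0)]
    total, chain = best_chain(demo)
    print(total, chain)
```

The quadratic loop is chosen for readability; published chaining algorithms typically reach better running times with range-query data structures, which this sketch does not attempt.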
Affiliation(s)
- Jikai Lei
  - Michigan State University, East Lansing, MI 48824, USA
15
Mono-uridylation of pre-microRNA as a key step in the biogenesis of group II let-7 microRNAs. Cell 2012; 151:521-32. [PMID: 23063654 DOI: 10.1016/j.cell.2012.09.022] [Citation(s) in RCA: 238] [Impact Index Per Article: 18.3] [Received: 03/27/2012] [Revised: 06/26/2012] [Accepted: 08/15/2012] [Indexed: 11/23/2022]
Abstract
RNase III Drosha initiates microRNA (miRNA) maturation by cleaving a primary miRNA transcript and releasing a pre-miRNA with a 2 nt 3' overhang. Dicer recognizes the 2 nt 3' overhang structure to selectively process pre-miRNAs. Here, we find that, unlike prototypic pre-miRNAs (group I), group II pre-miRNAs acquire a shorter (1 nt) 3' overhang from Drosha processing and therefore require a 3'-end mono-uridylation for Dicer processing. The majority of let-7 and miR-105 belong to group II. We identify TUT7/ZCCHC6, TUT4/ZCCHC11, and TUT2/PAPD4/GLD2 as the terminal uridylyl transferases responsible for pre-miRNA mono-uridylation. The TUTs act specifically on dsRNAs with a 1 nt 3' overhang, thereby creating a 2 nt 3' overhang. Depletion of TUTs reduces let-7 levels and disrupts let-7 function. Although the let-7 suppressor, Lin28, induces inhibitory oligo-uridylation in embryonic stem cells, mono-uridylation occurs in somatic cells lacking Lin28 to promote let-7 biogenesis. Our study reveals functional duality of uridylation and introduces TUT7/4/2 as components of the miRNA biogenesis pathway.
|
16
|
Perina D, Korolija M, Mikoč A, Roller M, Pleše B, Imešek M, Morrow C, Batel R, Ćetković H. Structural and functional characterization of ribosomal protein gene introns in sponges. PLoS One 2012; 7:e42523. [PMID: 22880015 PMCID: PMC3412847 DOI: 10.1371/journal.pone.0042523] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/20/2011] [Accepted: 07/10/2012] [Indexed: 11/25/2022] Open
Abstract
Ribosomal protein genes (RPGs) are a powerful tool for studying intron evolution. They exist in all three domains of life and are highly conserved. Accumulating genomic data suggest that RPG introns in many organisms abound with non-protein-coding RNAs (ncRNAs). These ancient ncRNAs are small nucleolar RNAs (snoRNAs) essential for ribosome assembly. They are also mobile genetic elements and therefore probably important in diversification and enrichment of transcriptomes through various mechanisms such as intron/exon gain/loss. snoRNAs in basal metazoans are poorly characterized. We examined 449 RPG introns, in total, from four demosponges: Amphimedon queenslandica, Suberites domuncula, Suberites ficus and Suberites pagurorum and showed that RPG introns from A. queenslandica share positional conservation and some structural similarity with "higher" metazoans. Moreover, our study indicates that mobile element insertions play an important role in the evolution of their size. In four sponges 51 snoRNAs were identified. The analysis showed discrepancies between the snoRNA pools of orthologous RPG introns between S. domuncula and A. queenslandica. Furthermore, these two sponges show as much conservation of RPG intron positions between each other as between themselves and human. Sponges from the Suberites genus show consistency in RPG intron position conservation. However, significant differences in some of the orthologous RPG introns of closely related sponges were observed. This indicates that RPG introns are dynamic even on these shorter evolutionary time scales.
Affiliation(s)
- Drago Perina
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Marina Korolija
- Department of Molecular Medicine, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Andreja Mikoč
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Maša Roller
- Department of Molecular Biology, Faculty of Science University of Zagreb, Zagreb, Croatia
| | - Bruna Pleše
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Mirna Imešek
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Christine Morrow
- School of Biological Sciences, Queen's University, Belfast, United Kingdom
| | - Renato Batel
- Center for Marine Research, Rudjer Boskovic Institute, Rovinj, Croatia
| | - Helena Ćetković
- Department of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
|
17
|
Sun Y, Aljawad O, Lei J, Liu A. Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy. BMC Bioinformatics 2012; 13 Suppl 3:S12. [PMID: 22536896 PMCID: PMC3311100 DOI: 10.1186/1471-2105-13-s3-s12] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND NCRNAs (noncoding RNAs) play important roles in many biological processes. Existing genome-scale ncRNA search tools identify ncRNAs in local sequence alignments generated by conventional sequence comparison methods. However, some types of ncRNA lack strong sequence conservation and tend to be missed or mis-aligned by conventional sequence comparison. RESULTS In this paper, we propose an ncRNA identification framework that is complementary to existing sequence comparison tools. By integrating a filtration step based on Hamming distance and ncRNA alignment programs such as FOLDALIGN or PLAST-ncRNA, the proposed ncRNA search framework can identify ncRNAs that lack strong sequence conservation. In addition, as the ratio of transition and transversion mutation is often used as a discriminative feature for functional ncRNA identification, we incorporate this feature into the filtration step using a coding strategy. We apply Hamming distance seeds to ncRNA search in the intergenic regions of human and mouse genomes and between the Burkholderia cenocepacia J2315 genome and the Ralstonia solanacearum genome. The experimental results demonstrate that a carefully designed Hamming distance seed can achieve better sensitivity in searching for poorly conserved ncRNAs than conventional sequence comparison tools. CONCLUSIONS Hamming distance seeds provide better sensitivity as a filtration strategy for genome-wide ncRNA homology search than the existing seeding strategies used in BLAST-like tools. By combining Hamming distance seeds matching and ncRNA alignment, we are able to find ncRNAs with sequence similarities below 60%.
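A Hamming-distance filtration step of the kind described in this abstract can be sketched as follows. This is an illustrative brute-force version, not the authors' seed design or coding strategy: the function names, the fixed window width, and the mismatch threshold are all assumptions, and transitions are simply counted alongside the distance rather than encoded into the seed.

```python
from typing import Iterator, Tuple

PURINES = {"A", "G"}  # everything else (C, U, T) is treated as a pyrimidine


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def is_transition(x: str, y: str) -> bool:
    """True for a purine<->purine or pyrimidine<->pyrimidine substitution."""
    return x != y and ((x in PURINES) == (y in PURINES))


def candidate_windows(
    query: str, target: str, width: int, max_mismatches: int
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (query_pos, target_pos, distance, transitions) for window pairs
    whose Hamming distance is at most max_mismatches."""
    for i in range(len(query) - width + 1):
        q = query[i:i + width]
        for j in range(len(target) - width + 1):
            t = target[j:j + width]
            d = hamming(q, t)
            if d <= max_mismatches:
                ts = sum(is_transition(x, y) for x, y in zip(q, t))
                yield i, j, d, ts
```

Window pairs that pass the filter would then be handed to a structural aligner such as FOLDALIGN; the transition count gives a downstream step a way to favour hits whose substitutions are mostly transitions.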
Affiliation(s)
- Yanni Sun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Osama Aljawad
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jikai Lei
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Alex Liu
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
|
18
|
Martínez-Gómez P, Sánchez-Pérez R, Rubio M. Clarifying omics concepts, challenges, and opportunities for Prunus breeding in the postgenomic era. OMICS: A Journal of Integrative Biology 2012; 16:268-83. [PMID: 22394278 DOI: 10.1089/omi.2011.0133] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
The recent sequencing of the complete genome of the peach, together with the availability of new high-throughput genome, transcriptome, proteome, and metabolome analysis technologies, offers new possibilities for Prunus breeders in what has been described as the postgenomic era. In this context, new biological challenges and opportunities for the application of these technologies in the development of efficient marker-assisted selection strategies in Prunus breeding include genome resequencing using DNA-Seq, the study of RNA regulation at transcriptional and posttranscriptional levels using tiling microarray and RNA-Seq, protein and metabolite identification and annotation, and standardization of phenotype evaluation. Additional biological opportunities include the high level of synteny among Prunus genomes. Finally, the existence of biases presents another important biological challenge in attaining knowledge from these new high-throughput omics disciplines. On the other hand, from the philosophical point of view, we are facing a revolution in the use of new high-throughput analysis techniques that may mean a scientific paradigm shift in Prunus genetics and genomics theories. The evaluation of scientific progress is another important question in this postgenomic context. Finally, the incommensurability of omics theories in the new high-throughput analysis context presents an additional philosophical challenge.
|
19
|
Abstract
The increase of body plan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.
Affiliation(s)
- Jana Hertel
- Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
|
20
|
Katiyar A, Smita S, Chinnusamy V, Pandey DM, Bansal K. Identification of miRNAs in sorghum by using bioinformatics approach. Plant Signaling & Behavior 2012; 7:246-59. [PMID: 22415044 PMCID: PMC3405690 DOI: 10.4161/psb.18914] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 05/05/2023]
Abstract
MicroRNAs (miRNAs) regulate gene expression mainly by post-transcriptional gene silencing (PTGS) and in some cases by transcriptional gene silencing (TGS). miRNAs play critical roles in developmental processes, nutrient homeostasis, abiotic stress and pathogen responses of plants. In contrast to the large number of miRNAs predicted in the cereal model plant rice, only 148 miRNAs have been predicted in sorghum to date (miRBase release 17). This suggests that miRNA identification in sorghum is far from saturation. Hence, we developed a bioinformatics pipeline using an in-house PERL script and publicly available structure prediction tools to identify miRNAs and their target genes from publicly available Expressed Sequence Tags (EST) and Genomic Survey Sequence (GSS). About 1379 known and unique plant miRNAs from 33 different crops were used to predict new miRNAs in sorghum. We identified 31 new miRNAs belonging to 10 different miRNA families. We predicted 72 potential target genes for 31 miRNAs, and most of these target genes are predicted to be involved in plant growth and development. These newly identified miRNAs add to the growing database of miRNA and lay the foundation for further understanding of miRNA function in sorghum plant development.
Affiliation(s)
- Amit Katiyar
- National Research Centre on Plant Biotechnology; Indian Agricultural Research Institute Campus; New Delhi, India
| | - Shuchi Smita
- National Research Centre on Plant Biotechnology; Indian Agricultural Research Institute Campus; New Delhi, India
| | | | - Dev Mani Pandey
- Department of Biotechnology; Birla Institute of Technology; Mesra; Ranchi; Jharkhand, India
| | - Kailash Bansal
- National Research Centre on Plant Biotechnology; Indian Agricultural Research Institute Campus; New Delhi, India
- Correspondence to: Kailash Bansal
|
21
|
Collins LJ. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology. Front Genet 2011; 2:96. [PMID: 22303390 PMCID: PMC3268645 DOI: 10.3389/fgene.2011.00096] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/05/2011] [Accepted: 12/07/2011] [Indexed: 11/16/2022] Open
Abstract
ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.
Affiliation(s)
- Lesley Joan Collins
- Institute of Fundamental Sciences, Massey University Palmerston North, New Zealand
|
22
|
Xia F, Dou Y, Lei G. FPQRNA: hardware-accelerated QRNA package for noncoding RNA gene detecting on FPGA. J Bioinform Comput Biol 2011. [DOI: 10.1142/s0219720010004902] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Noncoding RNAs (ncRNAs) have important functional roles in biological processes and have become a central research interest in modern molecular biology. However, how to find ncRNAs attracts much attention, since ncRNA gene sequences, unlike protein-coding genes, do not have strong statistical signals. QRNA is a powerful program that is widely used as an efficient analysis tool to detect ncRNA genes. Unfortunately, its O(L³) computing requirements and complicated data dependency greatly limit the usefulness of the QRNA package given the explosion in gene databases. In this paper, we present a fine-grained parallel QRNA prototype system, FPQRNA, for accelerating ncRNA gene detection on an FPGA chip. We propose a systolic-like array architecture with multiple PEs (Processing Elements). We partition the tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load matrices from external memory. The experimental results show a speedup factor of more than 18× over the QRNA 2.0.3c software running on a PC platform with an AMD Phenom 9650 Quad CPU for pairwise sequence alignment with 996 residues, while the power consumption of our FPGA accelerator is only about 30% of that of the general-purpose microprocessor.
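As a rough worked figure from the two numbers quoted in this abstract, and assuming (this is not stated in the abstract) that the 30% power figure is an average over the whole run, the energy used per alignment scales as power times runtime. The symbols E, P and t are introduced here only for illustration.

```latex
\frac{E_{\text{FPGA}}}{E_{\text{CPU}}}
  = \frac{P_{\text{FPGA}}}{P_{\text{CPU}}} \cdot \frac{t_{\text{FPGA}}}{t_{\text{CPU}}}
  \approx 0.30 \times \frac{1}{18} \approx 0.017
```

Under that assumption the accelerator would need on the order of 2% of the CPU's energy for the same 996-residue alignment. Because the reported speedup is "more than 18×", this would be an upper bound on the ratio.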
Affiliation(s)
- FEI XIA
- National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China
| | - YONG DOU
- National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China
| | - GUO-QING LEI
- National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China
|
23
|
Scott MS, Ono M. From snoRNA to miRNA: Dual function regulatory non-coding RNAs. Biochimie 2011; 93:1987-92. [PMID: 21664409 PMCID: PMC3476530 DOI: 10.1016/j.biochi.2011.05.026] [Citation(s) in RCA: 178] [Impact Index Per Article: 12.7] [Received: 03/30/2011] [Accepted: 05/19/2011] [Indexed: 11/03/2022]
Abstract
Small nucleolar RNAs (snoRNAs) are an ancient class of small non-coding RNAs present in all eukaryotes and a subset of archaea that carry out a fundamental role in the modification and processing of ribosomal RNA. In recent years, however, a large proportion of snoRNAs have been found to be further processed into smaller molecules, some of which display different functionality. In parallel, several studies have uncovered extensive similarities between snoRNAs and other types of small non-coding RNAs, and in particular microRNAs. Here, we explore the extent of the relationship between these types of non-coding RNA and the possible underlying evolutionary forces that shaped this subset of the current non-coding RNA landscape.
Affiliation(s)
- Michelle S Scott
- Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, UK.
|
24
|
Abstract
Noncoding RNAs form an indispensable component of the cellular information processing networks, a role that crucially depends on the specificity of their interactions among each other as well as with DNA and protein. Patterns of intramolecular and intermolecular base pairs govern most RNA interactions. Specific base pairs dominate the structure formation of nucleic acids. Only small details distinguish intramolecular secondary structures from those of cofolding molecules. RNA-protein interactions, on the other hand, are also strongly dependent on the RNA structure, since the sequence content of helical regions is largely unreadable, so that sequence specificity is mostly restricted to unpaired loop regions. Conservation of both sequence and structure can thus give indications of the functioning of the diversity of ncRNAs.
Affiliation(s)
- Manja Marz
- Department of Computer Science, University of Leipzig, Leipzig, Germany.
|
25
|
Reinius B, Shi C, Hengshuo L, Sandhu KS, Radomska KJ, Rosen GD, Lu L, Kullander K, Williams RW, Jazin E. Female-biased expression of long non-coding RNAs in domains that escape X-inactivation in mouse. BMC Genomics 2010; 11:614. [PMID: 21047393 PMCID: PMC3091755 DOI: 10.1186/1471-2164-11-614] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 06/08/2010] [Accepted: 11/03/2010] [Indexed: 02/01/2023] Open
Abstract
Background Sexual dimorphism in brain gene expression has been recognized in several animal species. However, the relevant regulatory mechanisms remain poorly understood. To investigate whether sex-biased gene expression in mammalian brain is globally regulated or locally regulated in diverse brain structures, and to study the genomic organisation of brain-expressed sex-biased genes, we performed a large scale gene expression analysis of distinct brain regions in adult male and female mice. Results This study revealed spatial specificity in sex-biased transcription in the mouse brain, and identified 173 sex-biased genes in the striatum; 19 in the neocortex; 12 in the hippocampus and 31 in the eye. Genes located on sex chromosomes were consistently over-represented in all brain regions. Analysis of a subset of genes with sex-bias in more than one tissue revealed Y-encoded male-biased transcripts and X-encoded female-biased transcripts known to escape X-inactivation. In addition, we identified novel coding and non-coding X-linked genes with female-biased expression in multiple tissues. Interestingly, the chromosomal positions of all of the female-biased non-coding genes are in close proximity to protein-coding genes that escape X-inactivation. This defines X-chromosome domains each of which contains a coding and a non-coding female-biased gene. Lack of repressive chromatin marks in non-coding transcribed loci supports the possibility that they escape X-inactivation. Moreover, RNA-DNA combined FISH experiments confirmed the biallelic expression of one such novel domain. Conclusion This study demonstrated that the number of genes with sex-biased expression varies between individual brain regions in mouse. The sex-biased genes identified are localized on many chromosomes. At the same time, sexually dimorphic gene expression that is common to several parts of the brain is mostly restricted to the sex chromosomes. Moreover, the study uncovered multiple female-biased non-coding genes that are non-randomly co-localized on the X-chromosome with protein-coding genes that escape X-inactivation. This raises the possibility that expression of long non-coding RNAs may play a role in modulating gene expression in domains that escape X-inactivation in mouse.
Affiliation(s)
- Björn Reinius
- Department of Evolution and Development, EBC, Uppsala University, Sweden.
|
26
|
Guo L, Lu Z. The fate of miRNA* strand through evolutionary analysis: implication for degradation as merely carrier strand or potential regulatory molecule? PLoS One 2010; 5:e11387. [PMID: 20613982 PMCID: PMC2894941 DOI: 10.1371/journal.pone.0011387] [Citation(s) in RCA: 172] [Impact Index Per Article: 11.5] [Received: 03/15/2010] [Accepted: 06/08/2010] [Indexed: 11/29/2022] Open
Abstract
Background During typical microRNA (miRNA) biogenesis, one strand of a ∼22 nt RNA duplex is preferentially selected for entry into a silencing complex, whereas the other strand, known as the passenger strand or miRNA* strand, is degraded. Recently, some miRNA* sequences were reported as guide miRNAs with abundant expression. Here, we set out to discover the evolutionary implications of the fate of the miRNA* strand by analyzing miRNA/miRNA* sequences across vertebrates. Principal Findings Mature miRNAs based on gene families were well conserved, especially in their seed sequences, across vertebrates, while their passenger strands showed various divergence patterns. The divergence mainly resulted from divergence of different animal species, homologous miRNA genes and multicopy miRNA hairpin precursors. Some miRNA* sequences were phylogenetically conserved in seed and anchor sequences similar to mature miRNAs, while others revealed high levels of nucleotide divergence even though some of their partners were highly conserved. Most of those miRNA precursors that could generate abundant miRNAs from both strands were well conserved in the sequences of miR-#-5p and miR-#-3p, especially in their seed sequences. Conclusions The final fate of the miRNA* strand, either degraded as merely a carrier strand or expressed abundantly as a potential functional guide miRNA, may be destined across evolution. Well-conserved miRNA* strands, particularly those conserved in seed sequences, may afford potential opportunities for contributing to the regulation network. The study will broaden our understanding of potential functional miRNA* species.
Affiliation(s)
- Li Guo
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
| | - Zuhong Lu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing, China
|
27
|
Abstract
Noncoding RNAs (ncRNAs) are increasingly recognized as important functional molecules in the cell. Here we give a short overview of fundamental computational techniques to analyze ncRNAs that can help us better understand their function. Topics covered include prediction of secondary structure from the primary sequence, prediction of consensus structures for homologous sequences, search for homologous sequences in databases using sequence and structure comparisons, annotation of tRNAs, rRNAs, snoRNAs, and microRNAs, de novo prediction of novel ncRNAs, and prediction of RNA/RNA interactions including miRNA target prediction.
|
28
|
Gardner PP. The use of covariance models to annotate RNAs in whole genomes. Briefings in Functional Genomics and Proteomics 2009; 8:444-50. [PMID: 19833700 DOI: 10.1093/bfgp/elp042] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
Abstract
In this review we discuss bioinformatic issues in non-coding RNA analysis, in particular the annotation of genome sequences using covariance models. Some recent innovations for improving the speed and accuracy of covariance models are discussed.
Affiliation(s)
- Paul P Gardner
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
|
29
|
Abstract
Eukaryote gene expression is mediated by a cascade of RNA functions that regulate, process, store, transport, and translate RNA transcripts. The RNA network that promotes this cascade depends on a large cohort of proteins that partner RNAs; thus, the modern RNA world of eukaryotes is really a ribonucleoprotein (RNP) world. Features of this "RNP infrastructure" can be related to the high cytosolic density of macromolecules and the large size of eukaryote cells. Because of the densely packed cytosol or nucleoplasm (with its severe restriction on diffusion of macromolecules), partitioning of the eukaryote cell into functionally specialized compartments is essential for efficiency. This necessitates the association of RNA and protein into large RNP complexes including ribosomes and spliceosomes. This is well illustrated by the ubiquitous spliceosome for which most components are conserved throughout eukaryotes and which interacts with other RNP-based machineries. The complexes involved in gene processing in modern eukaryotes have broad phylogenetic distributions suggesting that the common ancestor of extant eukaryotes had a fully evolved RNP network. Thus, the eukaryote genome may be uniquely informative about the transition from an earlier RNA genome world to the modern DNA genome world.
Affiliation(s)
- Lesley J Collins
- Allan Wilson Center for Molecular Ecology and Evolution, Palmerston North, New Zealand.
|
30
|
Marz M, Kirsten T, Stadler PF. Evolution of spliceosomal snRNA genes in metazoan animals. J Mol Evol 2009; 67:594-607. [PMID: 19030770 DOI: 10.1007/s00239-008-9149-6] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Received: 03/21/2008] [Accepted: 07/14/2008] [Indexed: 11/28/2022]
Abstract
While studies of the evolutionary histories of protein families are commonplace, little is known about noncoding RNAs beyond microRNAs and some snoRNAs. Here we investigate in detail the evolutionary history of the nine spliceosomal snRNA families (U1, U2, U4, U5, U6, U11, U12, U4atac, and U6atac) across the completely or partially sequenced genomes of metazoan animals. Representatives of the five major spliceosomal snRNAs were found in all genomes. None of the minor spliceosomal snRNAs were detected in nematodes or in the shotgun traces of Oikopleura dioica, while in all other animal genomes at most one of them is missing. Although snRNAs are present in multiple copies in most genomes, distinguishable paralogue groups are not stable over long evolutionary times, although they appear independently in several clades. In general, animal snRNA secondary structures are highly conserved, albeit, in particular, U11 and U12 in insects exhibit dramatic variations. An analysis of genomic context of snRNAs reveals that they behave like mobile elements, exhibiting very little syntenic conservation.
Affiliation(s)
- Manuela Marz
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, 04107 Leipzig, Germany.
|
31
|
Soldà G, Makunin IV, Sezerman OU, Corradin A, Corti G, Guffanti A. An Ariadne's thread to the identification and annotation of noncoding RNAs in eukaryotes. Brief Bioinform 2009; 10:475-89. [PMID: 19383843 DOI: 10.1093/bib/bbp022] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Non-protein coding RNAs (ncRNAs) have emerged as a vast and heterogeneous portion of eukaryotic transcriptomes. Several ncRNA families, either short (<200 nucleotides, nt) or long (>200 nt), have been described and implicated in a variety of biological processes, from translation to gene expression regulation and nuclear trafficking. Most probably, other families are still to be discovered. Computational methods for ncRNA research require different approaches from the ones normally used in the prediction of protein-coding genes. Indeed, primary sequence alone is often insufficient to infer ncRNA functionality, whereas secondary structure and local conservation of portions of the transcript could provide useful information for both the prediction and the functional annotation of ncRNAs. Here we present an overview of computational methods and bioinformatics resources currently available for studying ncRNA genes, introducing the common themes as well as the different approaches required for long and short ncRNA identification and annotation.
Affiliation(s)
- Giulia Soldà
- Department of Biology and Genetics for Medical Sciences, University of Milano, 20133 Milan, Italy.
|
32
|
Hertel J, de Jong D, Marz M, Rose D, Tafer H, Tanzer A, Schierwater B, Stadler PF. Non-coding RNA annotation of the genome of Trichoplax adhaerens. Nucleic Acids Res 2009; 37:1602-15. [PMID: 19151082 PMCID: PMC2655684 DOI: 10.1093/nar/gkn1084] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Received: 11/13/2008] [Revised: 12/22/2008] [Accepted: 12/23/2008] [Indexed: 02/06/2023] Open
Abstract
A detailed annotation of non-protein coding RNAs is typically missing in initial releases of newly sequenced genomes. Here we report on a comprehensive ncRNA annotation of the genome of Trichoplax adhaerens, the presumably most basal metazoan whose genome has been published to date. Since BLAST identified only a small fraction of the best-conserved ncRNAs--in particular rRNAs, tRNAs and some snRNAs--we developed a semi-global dynamic programming tool, GotohScan, to increase the sensitivity of the homology search. It successfully identified the full complement of major and minor spliceosomal snRNAs, the genes for RNase P and MRP RNAs, the SRP RNA, as well as several small nucleolar RNAs. We did not find any microRNA candidates homologous to known eumetazoan sequences. Interestingly, most ncRNAs, including the pol-III transcripts, appear as single-copy genes or with very small copy numbers in the Trichoplax genome.
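Semi-global alignment, the general technique named in this abstract, aligns the whole query against any stretch of a longer target, so unaligned target sequence on either side costs nothing. The sketch below illustrates that idea only and is not GotohScan: it uses a linear gap penalty for brevity (Gotoh's algorithm handles affine gaps), and the function name and scoring values are assumptions.

```python
from typing import Tuple


def semiglobal_best_hit(
    query: str, target: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> Tuple[int, int]:
    """Align all of `query` against any substring of `target`.

    Returns (best_score, end), where `end` is the 1-based target position
    at which the best alignment ends.
    """
    n = len(target)
    prev = [0] * (n + 1)  # row 0 is all zeros: leading target residues are free
    for i in range(1, len(query) + 1):
        # Column 0: query residues with no target partner are penalised.
        cur = [prev[0] + gap] + [0] * n
        for j in range(1, n + 1):
            s = match if query[i - 1] == target[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + s,   # align query[i-1] with target[j-1]
                prev[j] + gap,     # query residue against a gap
                cur[j - 1] + gap,  # target residue against a gap
            )
        prev = cur
    # Trailing target residues are free: take the best cell of the last row.
    end = max(range(n + 1), key=prev.__getitem__)
    return prev[end], end
```

Scanning a genome with this function, one query at a time, reports the best-scoring placement of each known ncRNA. Unlike a local alignment, a hit must account for the full length of the query.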
Affiliation(s)
- Jana Hertel, Danielle de Jong, Manja Marz, Dominic Rose, Hakim Tafer, Andrea Tanzer, Bernd Schierwater, Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig; Division of Ecology and Evolution, Institut für Tierökologie und Zellbiologie, Tierärztliche Hochschule Hannover, Bünteweg 17d, D-30559 Hannover, Germany; Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria; Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA; RNomics Group, Fraunhofer Institut für Zelltherapie und Immunologie, Deutscher Platz 5e, D-04103 Leipzig, Germany; and Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
33
Rose D, Jöris J, Hackermüller J, Reiche K, Li Q, Stadler PF. Duplicated RNA genes in teleost fish genomes. J Bioinform Comput Biol 2009; 6:1157-75. [PMID: 19090022 DOI: 10.1142/s0219720008003886] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/01/2007] [Revised: 06/17/2008] [Accepted: 06/18/2008] [Indexed: 12/29/2022]
Abstract
Teleost fishes share a duplication of their entire genomes. We report here on a computational survey of structured non-coding RNAs (ncRNAs) in teleost genomes, focusing on the fate of fish-specific duplicates. As in other metazoan groups, we find evidence of a large number (11,543) of structured RNAs, most of which (~86%) are clade-specific or evolve so fast that their tetrapod homologs cannot be detected. In surprising contrast to protein-coding genes, the fish-specific genome duplication did not lead to a large number of paralogous ncRNAs: only 188 candidates, mostly microRNAs, appear in a larger copy number in teleosts than in tetrapods, suggesting that large-scale gene duplications do not play a major role in the expansion of the vertebrate ncRNA inventory.
Affiliation(s)
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
34
Rohlf T. Critical line in random-threshold networks with inhomogeneous thresholds. Phys Rev E Stat Nonlin Soft Matter Phys 2008; 78:066118. [PMID: 19256916 DOI: 10.1103/physreve.78.066118] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 02/21/2008] [Revised: 11/04/2008] [Indexed: 05/27/2023]
Abstract
We calculate analytically the critical connectivity $K_c$ of random-threshold networks (RTNs) for homogeneous and inhomogeneous thresholds, and confirm the results by numerical simulations. We find a superlinear increase of $K_c$ with the (average) absolute threshold $|h|$, which approaches $K_c(|h|) \approx h^2/(2\ln|h|)$ for large $|h|$, and show that this asymptotic scaling is universal for RTNs with Poissonian distributed connectivity and threshold distributions with a variance that grows slower than $h^2$. Interestingly, we find that inhomogeneous distribution of thresholds leads to increased propagation of perturbations for sparsely connected networks, while for densely connected networks damage is reduced; the crossover point yields a characteristic connectivity $K_d$ that has no counterpart in Boolean networks with transition functions not restricted to threshold-dependent switching. Last, local correlations between node thresholds and in-degree are introduced. Here, numerical simulations show that even weak (anti)correlations can lead to a transition from ordered to chaotic dynamics, and vice versa.
Affiliation(s)
- Thimo Rohlf
- Max-Planck-Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany
35
Mimouni NK, Lyngso RB, Griffiths-Jones S, Hein J. An Analysis of Structural Influences on Selection in RNA Genes. Mol Biol Evol 2008; 26:209-16. [DOI: 10.1093/molbev/msn240] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 01/21/2023] Open
36
Marz M, Mosig A, Stadler BMR, Stadler PF. U7 snRNAs: a computational survey. Genomics Proteomics Bioinformatics 2008; 5:187-95. [PMID: 18267300 PMCID: PMC5054213 DOI: 10.1016/s1672-0229(08)60006-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/17/2022]
Abstract
U7 small nuclear RNA (snRNA) sequences have been described only for a handful of animal species in the past. Here we describe a computational search for functional U7 snRNA genes throughout vertebrates, including the upstream sequence elements characteristic of snRNAs transcribed by polymerase II. Based on the results of this search, we discuss the high variability of U7 snRNAs in both sequence and structure, and report on an attempt to find U7 snRNA sequences in basal deuterostomes and non-drosophilid insect genomes based on a combination of sequence, structure, and promoter features. Due to the extremely short sequence and the high variability in both sequence and structure, no unambiguous candidates were found. These results cast doubt on putative U7 homologs in even more distant organisms that are reported in the most recent release of the Rfam database.
Affiliation(s)
- Manja Marz
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig D-04107, Germany
37
Abstract
We present an easy-to-use webserver that makes it possible to simultaneously use a number of state-of-the-art methods for performing multiple alignment and secondary structure prediction for noncoding RNA sequences. This makes it possible to use the programs without having to download the code and get the programs to run. The results of all the programs are presented on a webpage and can easily be downloaded for further analysis. Additional measures are calculated for each program to make it easier to judge the individual predictions, and a consensus prediction taking all the programs into account is also calculated. This website is free and open to all users and there is no login requirement. The webserver can be found at: http://genome.ku.dk/resources/war.
Affiliation(s)
- Elfar Torarinsson
- Division of Genetics and Bioinformatics, IBHV, Faculty of Life Sciences, University of Copenhagen, Groennegaardsvej 3, DK-1870 Frederiksberg C, Denmark
38
Washietl S, Hofacker IL. Identifying structural noncoding RNAs using RNAz. Curr Protoc Bioinformatics 2008; Chapter 12:Unit 12.7. [PMID: 18428784 DOI: 10.1002/0471250953.bi1207s19] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
The functions of many noncoding RNAs and cis-acting regulatory elements of mRNAs depend on a defined RNA secondary structure. RNAz predicts such functional RNA structures on the basis of thermodynamic stability and evolutionary conservation of homologous sequences. It can be used to efficiently filter multiple alignments for noncoding RNA candidates in genomic screens.
39
Gruber AR, Bernhart SH, Hofacker IL, Washietl S. Strategies for measuring evolutionary conservation of RNA secondary structures. BMC Bioinformatics 2008; 9:122. [PMID: 18302738 PMCID: PMC2335298 DOI: 10.1186/1471-2105-9-122] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Received: 11/26/2007] [Accepted: 02/26/2008] [Indexed: 02/01/2023] Open
Abstract
Background: Evolutionary conservation of RNA secondary structure is a typical feature of many functional non-coding RNAs. Since almost all of the available methods used for prediction and annotation of non-coding RNA genes rely on this evolutionary signature, accurate measures for structural conservation are essential. Results: We systematically assessed the ability of various measures to detect conserved RNA structures in multiple sequence alignments. We tested three existing and eight novel strategies that are based on metrics of folding energies, metrics of single optimal structure predictions, and metrics of structure ensembles. We find that the folding-energy-based SCI score used in the RNAz program and a simple base-pair distance metric are by far the most accurate. The use of more complex metrics, such as tree editing, does not improve performance. A variant of the SCI performed particularly well on highly conserved alignments and is thus a viable alternative when only little evolutionary information is available. Surprisingly, ensemble-based methods that, in principle, could benefit from the additional information contained in sub-optimal structures perform particularly poorly. As a general trend, we observed that methods that include a consensus structure prediction outperformed equivalent methods that only consider pairwise comparisons. Conclusion: Structural conservation can be measured accurately with relatively simple and intuitive metrics. They have the potential to form the basis of future RNA gene finders that face new challenges like finding lineage-specific structures or detecting misaligned sequences.
Affiliation(s)
- Andreas R Gruber
- Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090 Wien, Austria.
40
Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature 2008; 450:219-32. [PMID: 17994088 DOI: 10.1038/nature06340] [Citation(s) in RCA: 462] [Impact Index Per Article: 27.2] [Received: 07/21/2007] [Accepted: 10/04/2007] [Indexed: 12/25/2022]
Abstract
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or 'evolutionary signatures', dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies.
41
Zhou F, Tran T, Xu Y. Nezha, a novel active miniature inverted-repeat transposable element in cyanobacteria. Biochem Biophys Res Commun 2008; 365:790-4. [DOI: 10.1016/j.bbrc.2007.11.038] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 11/07/2007] [Accepted: 11/09/2007] [Indexed: 11/16/2022]
42
Lindgreen S, Gardner PP, Krogh A. MASTR: multiple alignment and structure prediction of non-coding RNAs using simulated annealing. Bioinformatics 2007; 23:3304-11. [PMID: 18006551 DOI: 10.1093/bioinformatics/btm525] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: As more non-coding RNAs are discovered, the importance of methods for RNA analysis increases. Since the structure of ncRNA is intimately tied to the function of the molecule, programs for RNA structure prediction are necessary tools in this growing field of research. Furthermore, it is known that RNA structure is often evolutionarily more conserved than sequence. However, few existing methods are capable of simultaneously considering multiple sequence alignment and structure prediction. RESULT: We present a novel solution to the problem of simultaneous structure prediction and multiple alignment of RNA sequences. Using Markov chain Monte Carlo in a simulated annealing framework, the algorithm MASTR (Multiple Alignment of STructural RNAs) iteratively improves both sequence alignment and structure prediction for a set of RNA sequences. This is done by minimizing a combined cost function that considers sequence conservation, covariation and basepairing probabilities. The results show that the method is very competitive with similar programs available today, both in terms of accuracy and computational efficiency. AVAILABILITY: Source code available from http://mastr.binf.ku.dk/
Affiliation(s)
- Stinus Lindgreen
- Bioinformatics Centre, Department of Molecular Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen N, Denmark.
43
Demongeot J, Moreira A. A possible circular RNA at the origin of life. J Theor Biol 2007; 249:314-24. [PMID: 17825325 DOI: 10.1016/j.jtbi.2007.07.010] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Received: 03/19/2007] [Revised: 07/04/2007] [Accepted: 07/05/2007] [Indexed: 11/24/2022]
Abstract
The increasing volume of sequenced genomes and the recent techniques for performing in vitro molecular evolution have rekindled interest in questions on the origin of life. Nevertheless, a gap continues to exist between the research on prebiotic chemistry and molecule generation, on one hand, and the study of molecular fossils preserved in genomes, on the other. Here we attempt to fill this gap by using some assumptions about the prebiotic scenario (including a strong stereochemical basis for the genetic code) to determine the RNA sequences more likely to appear and subsist. A set of minimal RNA rings is exhaustively determined; a subset of them is then selected through stability arguments, and a particular ring ("AL ring") is finally singled out as the most likely winner of this prebiotic game. The rings happen to have several structural and statistical properties of modern genes: a repeated AUG codon appears spontaneously (and is thus made available for becoming a start signal), the form AUG/STOP emerges, and frequency patterns resemble those of present genes. The whole set of rings was also compared to a database of tRNAs, considering the conserved positions (located in the free parts of the molecule, essentially the loops); the ring that most closely matched tRNA sequences (and matched, in fact, the consensus of tRNA at all the aligned positions) was AL, the same ring independently selected before. The unselected emergence of gene-like features through two simple selection steps and the close similarity between the finally selected ring and tRNA (including some remarkable features of the resulting alignment) suggest a possible link between the prebiotic world and the first biological molecules, which is amenable to experimental testing. Even if our scenario is partially wrong, the unlikely coincidences should provide useful hints for other efforts.
44
Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res 2007; 17:852-64. [PMID: 17568003 PMCID: PMC1891344 DOI: 10.1101/gr.5650707] [Citation(s) in RCA: 136] [Impact Index Per Article: 7.6] [Received: 06/16/2006] [Accepted: 12/12/2006] [Indexed: 12/16/2022]
Abstract
Functional RNA structures play an important role both in noncoding RNA transcripts and in regulatory elements of mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Affiliation(s)
- Stefan Washietl
- Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria.
45
Abstract
MicroRNAs (miRNAs) are important post-transcriptional regulators of their target genes in plants and animals. miRNAs are usually 20-24 nucleotides long. Despite their unusually small sizes, the evolutionary history of miRNA gene families seems to be similar to their protein-coding counterparts. In contrast to the small but abundant miRNA families in the animal genomes, plants have fewer but larger miRNA gene families. Members of plant miRNA gene families are often highly similar, suggesting recent expansion via tandem gene duplication and segmental duplication events. Although many miRNA genes are conserved across plant species, the same gene family varies significantly in size and genomic organization in different species, which may cause dosage effects and spatial and temporal differences in target gene regulations. In this review, we summarize the current progress in understanding the evolution of plant miRNA gene families.
Affiliation(s)
- Aili Li
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences and The National Key Facility for Crop Gene Resources and Genetic Improvement (NFCRI), Beijing 100081, China
46
Gruber AR, Neuböck R, Hofacker IL, Washietl S. The RNAz web server: prediction of thermodynamically stable and evolutionarily conserved RNA structures. Nucleic Acids Res 2007; 35:W335-8. [PMID: 17452347 PMCID: PMC1933143 DOI: 10.1093/nar/gkm222] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Indexed: 11/12/2022] Open
Abstract
Many non-coding RNA genes and cis-acting regulatory elements of mRNAs contain RNA secondary structures that are critical for their function. Such functional RNAs can be predicted on the basis of thermodynamic stability and evolutionary conservation. We present a web server that uses the RNAz algorithm to detect functional RNA structures in multiple alignments of nucleotide sequences. The server provides access to a complete and fully automatic analysis pipeline that supports not only the analysis of single alignments in a variety of formats, but also complex screens of large genomic regions. Results are presented on a website illustrated by various structure representations and can be downloaded for local viewing. The web server is available at: rna.tbi.univie.ac.at/RNAz.
Affiliation(s)
- Stefan Washietl
- *To whom correspondence should be addressed. +43-1-4277-52744, +43-1-4277-52793
47
Backofen R, Bernhart SH, Flamm C, Fried C, Fritzsch G, Hackermüller J, Hertel J, Hofacker IL, Missal K, Mosig A, Prohaska SJ, Rose D, Stadler PF, Tanzer A, Washietl S, Will S. RNAs everywhere: genome-wide annotation of structured RNAs. J Exp Zool B Mol Dev Evol 2007; 308:1-25. [PMID: 17171697 DOI: 10.1002/jez.b.21130] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Indexed: 12/26/2022]
Abstract
Starting with the discovery of microRNAs and the advent of genome-wide transcriptomics, non-protein-coding transcripts have moved from a fringe topic to a central research field in molecular biology. In this contribution we review the state of the art of "computational RNomics", i.e., the bioinformatics approaches to genome-wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies is used to provide non-trivial examples for the discussion of some of the difficulties.
48
Evolution of the vertebrate Y RNA cluster. Theory Biosci 2007; 126:9-14. [PMID: 18087752 DOI: 10.1007/s12064-007-0003-y] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Received: 02/07/2007] [Accepted: 02/21/2007] [Indexed: 10/23/2022]
Abstract
Relatively little is known about the evolutionary histories of most classes of non-protein coding RNAs. Here we consider Y RNAs, a relatively rarely studied group of related pol-III transcripts. A single cluster of functional genes is preserved throughout tetrapod evolution, which however exhibits clade-specific tandem duplications, gene-losses, and rearrangements.
49
A bootstrap based analysis pipeline for efficient classification of phylogenetically related animal miRNAs. BMC Genomics 2007; 8:66. [PMID: 17341314 PMCID: PMC1832191 DOI: 10.1186/1471-2164-8-66] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 10/28/2006] [Accepted: 03/06/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Phylogenetically related miRNAs (miRNA families) convey important information about the function and evolution of miRNAs. Due to the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most of the current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results. RESULTS: In this study, we designed an analysis pipeline (the Phylogeny-Bootstrap-Cluster (PBC) pipeline) to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species, H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent. CONCLUSION: The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimal human involvement and provides measurements of the reliability of the classification results.
50
Pedersen JS, Bejerano G, Siepel A, Rosenbloom K, Lindblad-Toh K, Lander ES, Kent J, Miller W, Haussler D. Identification and classification of conserved RNA secondary structures in the human genome. PLoS Comput Biol 2006; 2:e33. [PMID: 16628248 PMCID: PMC1440920 DOI: 10.1371/journal.pcbi.0020033] [Citation(s) in RCA: 376] [Impact Index Per Article: 19.8] [Received: 09/08/2005] [Accepted: 03/06/2006] [Indexed: 12/28/2022] Open
Abstract
The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3′UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization.
Structurally functional RNA is a versatile component of the cell that comprises both independent molecules and regulatory elements of mRNA transcripts. The many recent discoveries of functional RNAs, most notably miRNAs, suggest that many more are yet to be found. Computational identification of functional RNAs has traditionally been hampered by the lack of strong sequence signals. However, structural conservation over long evolutionary times creates a characteristic substitution pattern, which can be exploited with the advent of comparative genomics.
The authors have devised a method for identification of functional RNA structures based on phylogenetic analysis of multiple alignments. This method has been used to screen the regions of the human genome that are under strong selective constraints. The result is a set of 48,479 candidate RNA structures. For some classes of known functional RNAs, such as miRNAs and histone 3′UTR stem loops, this set includes nearly all deeply conserved members. The initial large candidate set has been partitioned by size, shape, and genomic location and ranked by score to produce specific lists of top candidates for miRNAs, selenocysteine insertion sites, RNA editing hairpins, and RNAs involved in transcript auto regulation.
Affiliation(s)
- Jakob Skou Pedersen
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, USA.