1
Nitsche A, Rose D, Fasold M, Reiche K, Stadler PF. Comparison of splice sites reveals that long noncoding RNAs are evolutionarily well conserved. RNA (NEW YORK, N.Y.) 2015; 21:801-12. [PMID: 25802408 PMCID: PMC4408788 DOI: 10.1261/rna.046342.114] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 05/12/2014] [Accepted: 12/24/2014] [Indexed: 05/03/2023]
Abstract
Large-scale RNA sequencing has revealed a large number of long mRNA-like transcripts (lncRNAs) that do not code for proteins. The evolutionary history of these lncRNAs has been notoriously hard to study systematically because their low level of sequence conservation precludes comprehensive homology-based surveys and makes them nearly impossible to align. An increasing number of special cases, however, has been shown to be at least as old as the vertebrate lineage. Here we use the conservation of splice sites to trace the evolution of lncRNAs. We show that >85% of the human GENCODE lncRNAs were already present at the divergence of placental mammals and many hundreds of these RNAs date back even further. Nevertheless, we observe a fast turnover of intron/exon structures. We conclude that lncRNA genes are evolutionarily ancient components of vertebrate genomes that show an unexpected and unprecedented evolutionary plasticity. We offer a public web service (http://splicemap.bioinf.uni-leipzig.de) that allows users to retrieve sets of orthologous splice sites and to produce overview maps of evolutionarily conserved splice sites for visualization and further analysis. An electronic supplement containing the ncRNA data sets used in this study is available at http://www.bioinf.uni-leipzig.de/publications/supplements/12-001.
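The idea of comparing splice sites across species can be illustrated with a minimal Python sketch. It only checks whether the canonical GT donor dinucleotide occupies the same alignment columns in every species. The function name, the toy alignment, and the restriction to GT donors are assumptions made for this example; the abstract does not describe the authors' actual pipeline, which may score sites quite differently.

```python
def donor_site_conserved(alignment, donor_col):
    """Report, per species, whether a canonical GT donor sits at donor_col.

    alignment: dict mapping species name -> aligned sequence (equal lengths,
               '-' for gaps). Hypothetical input format for this sketch.
    donor_col: 0-based alignment column of the first intronic base of the
               reference donor site.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return {
        species: seq[donor_col:donor_col + 2].upper() == "GT"
        for species, seq in alignment.items()
    }


# Toy alignment (invented for illustration): exon 'CAG' followed by an intron.
toy_alignment = {
    "human": "CAGGTAAGT",
    "mouse": "CAGGTGAGT",
    "chicken": "CAG--AAGT",
}
conserved = donor_site_conserved(toy_alignment, donor_col=3)
```

In this toy case the donor is present in human and mouse and absent in chicken, where the corresponding columns are gaps.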
Affiliation(s)
- Anne Nitsche
- Bioinformatics Group, Department of Computer Science, University of Leipzig, D-04107 Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, University of Freiburg, D-79110 Freiburg, Germany; MML, Munich Leukemia Laboratory GmbH, D-81377 München, Germany
- Mario Fasold
- Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany; ecSeq Bioinformatics, D-04275 Leipzig, Germany
- Kristin Reiche
- Young Investigators Group Bioinformatics and Transcriptomics, Department of Proteomics, Helmholtz Centre for Environmental Research-UFZ, D-04318 Leipzig, Germany; Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology-IZI, D-04103 Leipzig, Germany
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University of Leipzig, D-04107 Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany; Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology-IZI, D-04103 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany; Department of Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria; Center for non-coding RNA in Technology and Health, University of Copenhagen, DK-1870 Frederiksberg C, Denmark; Santa Fe Institute, Santa Fe, New Mexico 87501, USA
2
Kamanu TKK, Radovanovic A, Archer JAC, Bajic VB. Exploration of miRNA families for hypotheses generation. Sci Rep 2013; 3:2940. [PMID: 24126940 PMCID: PMC3796740 DOI: 10.1038/srep02940] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 06/10/2013] [Accepted: 09/25/2013] [Indexed: 12/13/2022]
Abstract
Technological improvements have resulted in increased discovery of new microRNAs (miRNAs) and refinement and enrichment of existing miRNA families. miRNA families are important because they suggest a common sequence or structure configuration in sets of genes that hints at a shared function. Exploratory tools to enhance investigation of the characteristics of miRNA families and the functions of family-specific miRNA genes are lacking. We have developed miRNAVISA, a user-friendly web-based tool that allows customized interrogation and comparison of miRNA families for hypothesis generation, and comparison of the per-species chromosomal distribution of miRNA genes in different families. This study illustrates hypothesis generation using miRNAVISA in seven species. Our results unveil a subclass of miRNAs that may be regulated by genomic imprinting, and also suggest that some miRNA families may be species-specific, as well as chromosome- and/or strand-specific.
Affiliation(s)
- Timothy K K Kamanu
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
3
Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, Missal K, Mosig A, Müller B, Prohaska SJ, Stadler BMR, Stadler PF, Tanzer A, Washietl S, Witwer C. Evolutionary patterns of non-coding RNAs. Theory Biosci 2005; 123:301-69. [PMID: 18202870 DOI: 10.1016/j.thbio.2005.01.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 12/22/2004] [Accepted: 01/24/2005] [Indexed: 01/04/2023]
Abstract
A plethora of new functions of non-coding RNAs (ncRNAs) have been discovered in the past few years. In fact, RNA is emerging as the central player in cellular regulation, taking on active roles in multiple regulatory layers from transcription, RNA maturation, and RNA modification to translational regulation. Nevertheless, very little is known about the evolution of this "Modern RNA World" and its components. In this contribution, we attempt to provide at least a cursory overview of the diversity of ncRNAs and functional RNA motifs in non-translated regions of regular messenger RNAs (mRNAs) with an emphasis on evolutionary questions. This survey is complemented by an in-depth analysis of examples from different classes of RNAs focusing mostly on their evolution in the vertebrate lineage. We present a survey of Y RNA genes in vertebrates and study the molecular evolution of the U7 snRNA, the snoRNAs E1/U17, E2, and E3, the Y RNA family, the let-7 microRNA (miRNA) family, and the mRNA-like evf-1 gene. We furthermore discuss the statistical distribution of miRNAs in metazoans, which suggests an explosive increase in the miRNA repertoire in vertebrates. The analysis of the transcription of ncRNAs suggests that small RNAs in general are genetically mobile in the sense that their association with a host gene (e.g. when transcribed from introns of an mRNA) can change on evolutionary time scales. The let-7 family demonstrates that even the mode of transcription (as intron or as exon) can change among paralogous ncRNAs.
4
Advances in genomics for flatfish aquaculture. GENES AND NUTRITION 2012; 8:5-17. [PMID: 22903900 DOI: 10.1007/s12263-012-0312-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 01/24/2012] [Accepted: 08/02/2012] [Indexed: 10/28/2022]
Abstract
Fish aquaculture is considered to be one of the most sustainable sources of protein for humans. Many different species are cultured worldwide, but among them, marine flatfishes comprise a group of teleosts of high commercial interest because of their highly prized white flesh. However, the aquaculture of these fishes is seriously hampered by the scarce knowledge of their biology. In recent years, various experimental 'omics' approaches have been applied to farmed flatfishes to expand the genomic resources available. These tools are beginning to identify genetic markers associated with traits of commercial interest, and to unravel the molecular basis of different physiological processes. This article summarizes recent advances in flatfish genomics research in Europe. We focus on the next-generation sequencing technologies, which can produce a massive amount of DNA sequencing data, and discuss their potential and applications for de novo genome sequencing and transcriptome analysis. The relevance of these methods in nutrigenomics and foodomics approaches for the production of healthy animals, as well as high-quality, safe products for the consumer, is also briefly discussed.
5
Bao M, Cervantes Cervantes M, Zhong L, Wang JTL. Searching for non-coding RNAs in genomic sequences using ncRNAscout. GENOMICS PROTEOMICS & BIOINFORMATICS 2012; 10:114-21. [PMID: 22768985 PMCID: PMC5054157 DOI: 10.1016/j.gpb.2012.05.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/21/2011] [Accepted: 12/05/2011] [Indexed: 12/16/2022]
Abstract
Recently non-coding RNA (ncRNA) genes have been found to serve many important functions in the cell such as regulation of gene expression at the transcriptional level. Potentially there are more ncRNA molecules yet to be found and their possible functions are to be revealed. The discovery of ncRNAs is a difficult task because they lack sequence indicators such as the start and stop codons displayed by protein-coding RNAs. Current methods utilize either sequence motifs or structural parameters to detect novel ncRNAs within genomes. Here, we present an ab initio ncRNA finder, named ncRNAscout, by utilizing both sequence motifs and structural parameters. Specifically, our method has three components: (i) a measure of the frequency of a sequence, (ii) a measure of the structural stability of a sequence contained in a t-score, and (iii) a measure of the frequency of certain patterns within a sequence that may indicate the presence of ncRNA. Experimental results show that, given a genome and a set of known ncRNAs, our method is able to accurately identify and locate a significant number of ncRNA sequences in the genome. The ncRNAscout tool is available for downloading at http://bioinformatics.njit.edu/ncRNAscout.
Affiliation(s)
- Michael Bao
- Bioinformatics Center, New Jersey Institute of Technology, Newark, NJ 07102, USA
6
Prakash A, Shepard SS, He J, Hart B, Chen M, Amarachintha SP, Mileyeva-Biebesheimer O, Bechtel J, Fedorov A. Evolution of genomic sequence inhomogeneity at mid-range scales. BMC Genomics 2009; 10:513. [PMID: 19891785 PMCID: PMC2779198 DOI: 10.1186/1471-2164-10-513] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/18/2009] [Accepted: 11/05/2009] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Mid-range inhomogeneity or MRI is the significant enrichment of particular nucleotides in genomic sequences extending from 30 up to several thousands of nucleotides. The best-known manifestation of MRI is CpG islands representing CG-rich regions. Recently it was demonstrated that MRI could be observed not only for G+C content but also for all other nucleotide pairings (e.g. A+G and G+T) as well as for individual bases. Various types of MRI regions are 4-20 times enriched in mammalian genomes compared to their occurrences in random models.
RESULTS: This paper explores how different types of mutations change MRI regions. Human, chimpanzee and Macaca mulatta genomes were aligned to study the projected effects of substitutions and indels on human sequence evolution within both MRI regions and control regions of average nucleotide composition. Over 18.8 million fixed point substitutions, 3.9 million SNPs, and indels spanning 6.9 Mb were procured and evaluated in human. They include 1.8 Mb substitutions and 1.9 Mb indels within MRI regions. Ancestral and mutant (derived) alleles for substitutions have been determined. Substitutions were grouped according to their fixation within human populations: fixed substitutions (from the human-chimp-macaca alignment), major SNPs (> 80% mutant allele frequency within humans), medium SNPs (20% - 80% mutant allele frequency), minor SNPs (3% - 20%), and rare SNPs (<3%). Data on short (< 3 bp) and medium-length (3 - 50 bp) insertions and deletions within MRI regions and appropriate control regions were analyzed for the effect of indels on the expansion or diminution of such regions as well as on changing nucleotide composition.
CONCLUSION: MRI regions have comparable levels of de novo mutations to the control genomic sequences with average base composition. De novo substitutions rapidly erode MRI regions, bringing their nucleotide composition toward genome-average levels. However, those substitutions that favor the maintenance of MRI properties have a higher chance to spread through the entire population. Indels have a clear tendency to maintain MRI features yet they have a smaller impact than substitutions. All in all, the observed fixation bias for mutations helps to preserve MRI regions during evolution.
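The SNP frequency classes named in this abstract can be written as a small Python function. This is a sketch under stated assumptions: the function name is hypothetical, and the abstract does not say which class the exact boundary values (80%, 20%, 3%) belong to, so the choices below (80% and 20% count as medium, 3% counts as minor) are assumptions. Fixed substitutions are not handled because the abstract derives them from the human-chimp-macaca alignment rather than from allele frequency.

```python
def classify_snp(mutant_allele_freq_pct):
    """Assign a SNP to a frequency class from its mutant (derived) allele
    frequency, given in percent.

    Boundary handling is an assumption of this sketch; the abstract only
    gives the ranges >80%, 20%-80%, 3%-20% and <3%.
    """
    f = mutant_allele_freq_pct
    if not 0 <= f <= 100:
        raise ValueError("frequency must be a percentage between 0 and 100")
    if f > 80:
        return "major"
    if f >= 20:
        return "medium"
    if f >= 3:
        return "minor"
    return "rare"


# Example frequencies invented for illustration.
classes = [classify_snp(f) for f in (92.5, 45.0, 7.0, 1.2)]
```

Keeping the thresholds in one function makes the boundary convention explicit and easy to change if the original study used a different one.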
Affiliation(s)
- Ashwin Prakash
- Program in Cardiovascular & Metabolic Diseases Track, Biomedical Sciences, Toledo, OH 43614, USA.
7
Tomaru Y, Nakanishi M, Miura H, Kimura Y, Ohkawa H, Ohta Y, Hayashizaki Y, Suzuki M. Identification of an inter-transcription factor regulatory network in human hepatoma cells by Matrix RNAi. Nucleic Acids Res 2009; 37:1049-60. [PMID: 19129217 PMCID: PMC2651797 DOI: 10.1093/nar/gkn1028] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]
Abstract
Transcriptional regulation by transcriptional regulatory factors (TRFs) of their target TRF genes is central to the control of gene expression. To study a static multi-tiered inter-TRF regulatory network in human hepatoma cells, we have applied a Matrix RNAi approach in which siRNA knockdown and quantitative RT-PCR are used in combination on the same set of TRFs to determine their interdependencies. This approach, focusing on several liver-enriched TRF families, each of which consists of structurally homologous members, revealed many significant regulatory relationships. These include cross-talk between hepatocyte nuclear factors (HNFs) and the other TRF groups such as CCAAT/enhancer-binding proteins (CEBPs), retinoic acid receptors (RARs), retinoid receptors (RXRs) and RAR-related orphan receptors (RORs), which play key regulatory functions in human hepatocytes and liver. In addition, various multi-component regulatory motifs, which make up the complex inter-TRF regulatory network, were identified. A large part of the regulatory edges identified by the Matrix RNAi approach could be confirmed by chromatin immunoprecipitation. The resultant significant edges enabled us to depict the inter-TRF transcriptional regulatory network (TRN), which forms an apparent regulatory hierarchy of (FOXA1, RXRA) → TCF1 → (HNF4A, ONECUT1) → (RORC, CEBPA) as the main streamline.
Affiliation(s)
- Yasuhiro Tomaru
- OMICS Sciences Center (OSC), RIKEN Yokohama Institute 1-7-22 Suehiro-Cho, Japan
8
Kavanaugh LA, Dietrich FS. Non-coding RNA prediction and verification in Saccharomyces cerevisiae. PLoS Genet 2009; 5:e1000321. [PMID: 19119416 PMCID: PMC2603021 DOI: 10.1371/journal.pgen.1000321] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 06/25/2008] [Accepted: 12/01/2008] [Indexed: 11/18/2022]
Abstract
Non-coding RNAs (ncRNAs) play an important and varied role in cellular function. A significant amount of research has been devoted to computational prediction of these genes from genomic sequence, but the ability to do so has remained elusive due to a lack of apparent genomic features. In this work, thermodynamic stability of ncRNA structural elements, as summarized in a Z-score, is used to predict ncRNA in the yeast Saccharomyces cerevisiae. This analysis was coupled with comparative genomics to search for ncRNA genes on chromosome six of S. cerevisiae and S. bayanus. Sets of positive and negative control genes were evaluated to determine the efficacy of thermodynamic stability for discriminating ncRNA from background sequence. The effect of window sizes and step sizes on the sensitivity of ncRNA identification was also explored. Non-coding RNA gene candidates, common to both S. cerevisiae and S. bayanus, were verified using northern blot analysis, rapid amplification of cDNA ends (RACE), and publicly available cDNA library data. Four ncRNA transcripts are well supported by experimental data (RUF10, RUF11, RUF12, RUF13), while one additional putative ncRNA transcript is well supported but the data are not entirely conclusive. Six candidates appear to be structural elements in 5′ or 3′ untranslated regions of annotated protein-coding genes. This work shows that thermodynamic stability, coupled with comparative genomics, can be used to predict ncRNA with significant structural elements.
Recent advances in DNA sequencing technology have made it possible to sequence entire genomes. Once a genome is sequenced, it becomes necessary to identify the set of genes and other functional elements within the genome. This is particularly challenging as much of the genomic sequence does not appear to perform any function and is loosely referred to as “junk.” Identifying functional elements among the “junk” is difficult. Experimental methods have been developed for this purpose but they are time-consuming, expensive, and often provide an incomplete picture. Thus, it is important to develop the ability to identify these functional elements using computational methods. Protein-coding genes are relatively easy to identify computationally, but other categories of functional elements present a significantly greater challenge. In this work, we used a computational approach to identify genes that do not encode a protein but rather function as an RNA molecule. We then used experimental methods to verify our predictions and thereby validate the computational method.
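The two ingredients mentioned in this abstract, a sliding window with a given window and step size and a Z-score summarizing thermodynamic stability, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Z-score is taken to be the usual standardization of a native folding energy against energies of shuffled control sequences, and the folding energies themselves are supplied as plain numbers because computing them requires an RNA folding program that is outside the scope of this example.

```python
from statistics import mean, stdev


def sliding_windows(sequence, window, step):
    """Yield (start, subsequence) pairs for fixed-size windows along a sequence."""
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


def stability_z_score(native_energy, shuffled_energies):
    """Standardize a native folding energy against shuffled-sequence controls.

    A strongly negative value means the native sequence folds into a more
    stable structure than expected for its composition (assumed convention).
    """
    mu = mean(shuffled_energies)
    sigma = stdev(shuffled_energies)  # sample standard deviation
    if sigma == 0:
        raise ValueError("shuffled energies have zero variance")
    return (native_energy - mu) / sigma


# Invented numbers for illustration only (energies in kcal/mol).
windows = list(sliding_windows("ACGUACGUAC", window=4, step=3))
z = stability_z_score(-12.0, [-6.0, -8.0, -7.0, -9.0, -5.0])
```

Smaller step sizes give more overlapping windows and finer localization at the cost of more folding computations, which is the trade-off the abstract refers to when it mentions window and step sizes.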
Affiliation(s)
- Laura A. Kavanaugh
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
- Fred S. Dietrich
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
9
Miura H, Tomaru Y, Nakanishi M, Kondo S, Hayashizaki Y, Suzuki M. Identification of DNA regions and a set of transcriptional regulatory factors involved in transcriptional regulation of several human liver-enriched transcription factor genes. Nucleic Acids Res 2008; 37:778-92. [PMID: 19074951 PMCID: PMC2647325 DOI: 10.1093/nar/gkn978] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
Abstract
Mammalian tissue- and/or time-specific transcription is primarily regulated in a combinatorial fashion through interactions between a specific set of transcriptional regulatory factors (TRFs) and their cognate cis-regulatory elements located in the regulatory regions. In exploring the DNA regions and TRFs involved in combinatorial transcriptional regulation, we noted that individual knockdown of a set of human liver-enriched TRFs such as HNF1A, HNF3A, HNF3B, HNF3G and HNF4A resulted in perturbation of the expression of several single TRF genes, such as HNF1A, HNF3G and CEBPA genes. We thus searched the potential binding sites for these five TRFs in the highly conserved genomic regions around these three TRF genes and found several putative combinatorial regulatory regions. Chromatin immunoprecipitation analysis revealed that almost all of the putative regulatory DNA regions were bound by the TRFs as well as two coactivators (CBP and p300). The strong transcription-enhancing activity of the putative combinatorial regulatory region located downstream of the CEBPA gene was confirmed. EMSA demonstrated specific bindings of these HNFs to the target DNA region. Finally, co-transfection reporter assays with various combinations of expression vectors for these HNF genes demonstrated the transcriptional activation of the CEBPA gene in a combinatorial manner by these TRFs.
Affiliation(s)
- Hisashi Miura
- RIKEN Omics Science Center, RIKEN Yokohama Institute 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama, Kanagawa 230-0045, Japan
10
Kaczkowski B, Torarinsson E, Reiche K, Havgaard JH, Stadler PF, Gorodkin J. Structural profiles of human miRNA families from pairwise clustering. Bioinformatics 2008; 25:291-4. [PMID: 19059941 DOI: 10.1093/bioinformatics/btn628] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 12/29/2022]
Abstract
MicroRNAs (miRNAs) are a group of small, approximately 21 nt long, riboregulators inhibiting gene expression at a post-transcriptional level. Their most distinctive structural feature is the foldback hairpin of their precursor pre-miRNAs. Even though each pre-miRNA deposited in miRBase has its secondary structure already predicted, little is known about the patterns of structural conservation among pre-miRNAs. We address this issue by clustering the human pre-miRNA sequences based on pairwise sequence and secondary structure alignment using FOLDALIGN, followed by global multiple alignment of the obtained clusters by WAR. As a result, the common secondary structure was successfully determined for four FOLDALIGN clusters: the RF00027 structural family of the Rfam database and three clusters with previously undescribed consensus structures. AVAILABILITY: http://genome.ku.dk/resources/mirclust
Affiliation(s)
- Bogumił Kaczkowski
- Division of Genetics and Bioinformatics, IBHV, University of Copenhagen, Frederiksberg C, Denmark
11
Abstract
Recent progress in the analysis of the mouse transcriptome has led to unexpected discoveries. The mouse genomic sequences read by RNA polymerase II may be six times more than previously expected for human chromosomes. The transcript-abundant regions (named "transcription forests") occupy more than half of the genomic sequence and are divided by transcript-scarce regions (transcription deserts). Many of the coding mRNAs may have partially overlapping antisense RNAs. There are transcripts bridging several adjacent genes that were previously regarded as distinct ones. The transcription start sites appearing as cap analysis of gene expression (CAGE) tags are mapped on the mouse genomic sequences. Distributions of CAGE tags show that the shapes of mammalian gene promoters can be classified into four major categories. These shapes were conserved between mouse and human. Most genes have exonic transcription start sites, especially in the 3' untranslated region (3' UTR) sequences. The term "RNA continent" has been coined to express this unexpectedly complex and prodigious mouse transcriptome. More than half of the RNA polymerase II transcripts are regarded as noncoding RNAs (ncRNAs). The great variety of ncRNAs in the mammalian transcriptome implies that there are many functional ncRNAs in the cells. In particular, the evolutionarily conserved microRNAs play critical roles in mammalian development and other biological functions. Moreover, many other ncRNAs have also been shown to have biologically significant functions, mainly in the regulation of gene expression. The functional survey of the RNA continent has just started. We will describe the state of the art of the RNA continent and its impact on modern molecular biology, especially on cancer research.
Affiliation(s)
- Jun Yasuda
- Functional RNA Research Program, Frontier Research System, RIKEN Yokohama Institute, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
12
Rukov JL, Irimia M, Mørk S, Lund VK, Vinther J, Arctander P. High qualitative and quantitative conservation of alternative splicing in Caenorhabditis elegans and Caenorhabditis briggsae. Mol Biol Evol 2007; 24:909-17. [PMID: 17272679 DOI: 10.1093/molbev/msm023] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
Alternative splicing (AS) is an important contributor to proteome diversity and is regarded as an explanatory factor for the relatively low number of human genes compared with less complex animals. To assess the evolutionary conservation of AS and its developmental regulation, we have investigated the qualitative and quantitative expression of 21 orthologous alternative splice events through the development of 2 nematode species separated by 85-110 Myr of evolutionary time. We demonstrate that most of these alternative splice events present in Caenorhabditis elegans are conserved in Caenorhabditis briggsae. Moreover, we find that relative isoform expression levels vary significantly during development for 78% of the AS events and that this quantitative variation is highly conserved between the 2 species. Our results suggest that AS is generally tightly regulated through development and that the regulatory mechanisms controlling AS are to a large extent conserved during the evolution of Caenorhabditis. This strong conservation indicates that both major and minor splice forms have important functional roles and that the relative quantities in which they are expressed are crucial. Our results therefore suggest that the quantitative regulation of isoform expression levels is an intrinsic part of most AS events. Moreover, our results indicate that AS contributes little to transcript variation in Caenorhabditis genes and that gene duplication may be the major evolutionary mechanism for the origin of novel transcripts in these 2 species.
Affiliation(s)
- Jakob Lewin Rukov
- Molecular Evolution Group, Department of Molecular Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
13
Takeda JI, Suzuki Y, Nakao M, Kuroda T, Sugano S, Gojobori T, Imanishi T. H-DBAS: alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational. Nucleic Acids Res 2007; 35:D104-9. [PMID: 17130147 PMCID: PMC1716722 DOI: 10.1093/nar/gkl854] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 08/15/2006] [Revised: 10/09/2006] [Accepted: 10/10/2006] [Indexed: 11/13/2022]
Abstract
The Human-transcriptome DataBase for Alternative Splicing (H-DBAS) is a specialized database of alternatively spliced human transcripts. In this database, each of the alternative splicing (AS) variants corresponds to a completely sequenced and carefully annotated human full-length cDNA, one of those collected for the H-Invitational human-transcriptome annotation meeting. H-DBAS contains 38,664 representative alternative splicing variants (RASVs) in 11,744 loci in total. The data are retrievable by various features of AS, which were assigned on the basis of manual annotation, such as AS patterns, the resulting alterations in the encoded amino acids and affected protein motifs, GO terms, predicted subcellular localization signals and transmembrane domains. The database also records recently identified very complex patterns of AS, in which two distinct genes seemed to be bridged, nested or degenerated (multiple CDS): in all three cases, completely unrelated proteins are encoded by a single locus. By using AS Viewer, each AS event can be analyzed in the context of full-length cDNAs, enabling the user's empirical understanding of the relation between an AS event and the consequent alterations in the encoded amino acid sequences together with various kinds of affected protein motifs. H-DBAS is accessible at http://jbirc.jbic.or.jp/h-dbas/.
Affiliation(s)
- Jun-ichi Takeda
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Mitsuteru Nakao
- Computational Biology Research Center, National Institute of Advanced Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
- Tsuyoshi Kuroda
- Maze Corporation, TS Building 1013-20-2 Hatagaya, Shibuya-ku, Tokyo 151-0072, Japan
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Takashi Gojobori
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Center for Information Biology and DDBJ, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Tadashi Imanishi
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Graduate School of Information Science and Technology, Hokkaido University, North 14, West 9, Kita-ku, Sapporo, Hokkaido 060-0814, Japan
14
Gray TA, Wilson A, Fortin PJ, Nicholls RD. The putatively functional Mkrn1-p1 pseudogene is neither expressed nor imprinted, nor does it regulate its source gene in trans. Proc Natl Acad Sci U S A 2006; 103:12039-44. [PMID: 16882727 PMCID: PMC1567693 DOI: 10.1073/pnas.0602216103] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
A recently promoted genome evolution model posits that mammalian pseudogenes can regulate their founding source genes, and it thereby ascribes an important function to "junk DNA." This model arose from analysis of a serendipitous mouse mutant in which a transgene insertion/deletion caused severe polycystic kidney disease and osteogenesis imperfecta with approximately 80% perinatal lethality, when inherited paternally [Hirotsune, S., et al. (2003) Nature 423, 91-96]. The authors concluded that the transgene reduced the expression of a nearby transcribed and imprinted pseudogene, Mkrn1-p1. This reduction in chromosome 5-imprinted Mkrn1-p1 transcripts was proposed to destabilize the cognate chromosome 6 Mkrn1 source gene mRNA, with a partial reduction in one Mkrn1 isoform leading to the imprinted phenotype. Here, we show that 5' Mkrn1-p1 is fully methylated on both alleles, a pattern indicative of silenced chromatin, and that Mkrn1-p1 is not transcribed and therefore cannot stabilize Mkrn1 transcripts in trans. A small, truncated, rodent-specific Mkrn1 transcript explains the product erroneously attributed to Mkrn1-p1. Additionally, Mkrn1 expression is not imprinted, and 5' Mkrn1 is fully unmethylated. Finally, mice in which Mkrn1 has been directly disrupted show none of the phenotypes attributed to a partial reduction of Mkrn1. These data contradict the previous suggestions that Mkrn1-p1 is imprinted, and that either it or its source Mkrn1 gene relates to the original imprinted transgene phenotype. This study invalidates the data upon which the pseudogene trans-regulation model is based and therefore strongly supports the view that mammalian pseudogenes are evolutionary relics.
Affiliation(s)
- Todd A. Gray
- Wadsworth Center, David Axelrod Institute, Albany, NY 12208
- To whom correspondence may be addressed.
- Alison Wilson
- Wadsworth Center, David Axelrod Institute, Albany, NY 12208
- Robert D. Nicholls
- Birth Defects Laboratories, Division of Medical Genetics, Department of Pediatrics, Children's Hospital of Pittsburgh, and
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213
- To whom correspondence may be addressed.
15
Liu J, Gough J, Rost B. Distinguishing protein-coding from non-coding RNAs through support vector machines. PLoS Genet 2006; 2:e29. [PMID: 16683024 PMCID: PMC1449884 DOI: 10.1371/journal.pgen.0020029] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.3] [Received: 08/15/2005] [Accepted: 01/24/2006] [Indexed: 12/18/2022]
Abstract
RIKEN's FANTOM project has revealed many previously unknown coding sequences, as well as an unexpected degree of variation in transcripts resulting from alternative promoter usage and splicing. Ever more transcripts that do not code for proteins have been identified by transcriptome studies, in general. Increasing evidence points to the important cellular roles of such non-coding RNAs (ncRNAs). The distinction of protein-coding RNA transcripts from ncRNA transcripts is therefore an important problem in understanding the transcriptome and carrying out its annotation. Very few in silico methods have specifically addressed this problem. Here, we introduce CONC (for “coding or non-coding”), a novel method based on support vector machines that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, predicted secondary structure content, predicted percentage of exposed residues, compositional entropy, number of homologs from database searches, and alignment entropy. Nucleotide frequencies are also incorporated into the method. Confirmed coding cDNAs for eukaryotic proteins from the Swiss-Prot database constituted the set of true positives, ncRNAs from RNAdb and NONCODE the true negatives. Ten-fold cross-validation suggested that CONC distinguished coding RNAs from ncRNAs at about 97% specificity and 98% sensitivity. Applied to 102,801 mouse cDNAs from the FANTOM3 dataset, our method reliably identified over 14,000 ncRNAs and estimated the total number of ncRNAs to be about 28,000.
There are two types of RNA: messenger RNAs (mRNAs), which are translated into proteins, and non-coding RNAs (ncRNAs), which function as RNA molecules. Besides textbook examples such as tRNAs and rRNAs, non-coding RNAs have been found to carry out very diverse functions, from mRNA splicing and RNA modification to translational regulation. It has been estimated that non-coding RNAs make up the vast majority of transcription output of higher eukaryotes. Discriminating mRNA from ncRNA has become an important biological and computational problem. The authors describe a computational method based on a machine learning algorithm known as a support vector machine (SVM) that classifies transcripts according to features they would have if they were coding for proteins. These features include peptide length, amino acid composition, secondary structure content, and protein alignment information. The method is applied to the dataset from the FANTOM3 large-scale mouse cDNA sequencing project; it identifies over 14,000 ncRNAs in mouse and estimates the total number of ncRNAs in the FANTOM3 data to be about 28,000.
Affiliation(s)
- Jinfeng Liu
- Columbia University Bioinformatics Center, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America.
16
Ginger MR, Shore AN, Contreras A, Rijnkels M, Miller J, Gonzalez-Rimbau MF, Rosen JM. A noncoding RNA is a potential marker of cell fate during mammary gland development. Proc Natl Acad Sci U S A 2006; 103:5781-6. [PMID: 16574773 PMCID: PMC1420634 DOI: 10.1073/pnas.0600745103] [Citation(s) in RCA: 146] [Impact Index Per Article: 7.7] [Received: 04/28/2005] [Indexed: 12/26/2022]
Abstract
PINC is a large, alternatively spliced, developmentally regulated, noncoding RNA expressed in the regressed terminal ductal lobular unit-like structures of the parous mammary gland. Previous studies have shown that this population of cells possesses not only progenitor-like qualities (the ability to proliferate and repopulate a mammary gland) and the ability to survive developmentally programmed cell death but also the inhibition of carcinogen-induced proliferation. Here we report that PINC expression is temporally and spatially regulated in response to developmental stimuli in vivo and that PINC RNA is localized to distinct foci in either the nucleus or the cytoplasm in a cell-cycle-specific manner. Loss-of-function experiments suggest that PINC performs dual roles in cell survival and regulation of cell-cycle progression, suggesting that PINC may contribute to the developmentally mediated changes previously observed in the terminal ductal lobular unit-like structures of the parous gland. This is one of the first reports describing the functional properties of a large, developmentally regulated, mammalian, noncoding RNA.
Affiliation(s)
- Amy N. Shore
- Program in Developmental Biology, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030
- Monique Rijnkels
- U.S. Department of Agriculture/Agricultural Research Services Children’s Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, 1100 Bates Street, Houston, TX 77030
17
Chen FC, Chen CJ, Ho JY, Chuang TJ. Identification and evolutionary analysis of novel exons and alternative splicing events using cross-species EST-to-genome comparisons in human, mouse and rat. BMC Bioinformatics 2006; 7:136. [PMID: 16536879 PMCID: PMC1479377 DOI: 10.1186/1471-2105-7-136] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 12/01/2005] [Accepted: 03/15/2006] [Indexed: 11/22/2022]
Abstract
BACKGROUND Alternative splicing (AS) is important for evolution and major biological functions in complex organisms. However, the extent of AS in mammals other than human and mouse is largely unknown, making it difficult to study AS evolution in mammals and its biomedical implications. RESULTS Here we describe a cross-species EST-to-genome comparison algorithm (ENACE) that can identify novel exons for EST-scanty species and distinguish conserved and lineage-specific exons. The identified exons represent not only novel exons but also evolutionarily meaningful AS events that are not previously annotated. A genome-wide AS analysis in human, mouse and rat using ENACE reveals a total of 758 novel cassette-on exons and 167 novel retained introns that have no EST evidence from the same species. RT-PCR-sequencing experiments validated approximately 50-80% of the tested exons, indicating high presence of exons predicted by ENACE. ENACE is particularly powerful when applied to closely related species. In addition, our analysis shows that the ENACE-identified AS exons tend not to pass the nonsynonymous-to-synonymous substitution ratio test and not to contain protein domain, implying that such exons may be under positive selection or relaxed negative selection. These AS exons may contribute to considerable inter-species functional divergence. Our analysis further indicates that a large number of exons may have been gained or lost during mammalian evolution. Moreover, a functional analysis shows that inter-species divergence of AS events may be substantial in protein carriers and receptor proteins in mammals. These exons may be of interest to studies of AS evolution. The ENACE programs and sequences of the ENACE-identified AS events are available for download. CONCLUSION ENACE can identify potential novel cassette exons and retained introns between closely related species using a comparative approach. It can also provide information regarding lineage- or species-specificity in transcript isoforms, which are important for evolutionary and functional studies.
Affiliation(s)
- Feng-Chi Chen
- Genomics Research Center, Academia Sinica, Academia Road, Nankang, Taipei 11529, Taiwan
- Chuang-Jong Chen
- Genomics Research Center, Academia Sinica, Academia Road, Nankang, Taipei 11529, Taiwan
- Jar-Yi Ho
- Genomics Research Center, Academia Sinica, Academia Road, Nankang, Taipei 11529, Taiwan
- Trees-Juen Chuang
- Genomics Research Center, Academia Sinica, Academia Road, Nankang, Taipei 11529, Taiwan
18
Mendes Soares LM, Valcárcel J. The expanding transcriptome: the genome as the 'Book of Sand'. EMBO J 2006; 25:923-31. [PMID: 16511566 PMCID: PMC1409726 DOI: 10.1038/sj.emboj.7601023] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.3] [Received: 12/08/2005] [Accepted: 01/17/2006] [Indexed: 01/07/2023]
Abstract
The central dogma of molecular biology inspired by classical work in prokaryotic organisms accounts for only part of the genetic agenda of complex eukaryotes. First, post-transcriptional events lead to the generation of multiple mRNAs, proteins and functions from a single primary transcript, revealing regulatory networks distinct in mechanism and biological function from those controlling RNA transcription. Second, a variety of populous families of small RNAs (small nuclear RNAs, small nucleolar RNAs, microRNAs, siRNAs and shRNAs) assemble on ribonucleoprotein complexes and regulate virtually all aspects of the gene expression pathway, with profound biological consequences. Third, high-throughput methods of genomic analysis reveal that non-protein-coding RNAs (ncRNAs) represent a major component of the transcriptome that may perform novel functions in gene regulation and beyond. Post-transcriptional regulation, small RNAs and ncRNAs provide an expanding picture of the transcriptome that enriches our views of what genes are, how they operate, evolve and are regulated.
Affiliation(s)
- Juan Valcárcel
- Centre de Regulació Genòmica, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Gene Regulation Programme, Centre de Regulació Genòmica, Passeig Marítim 37-49, Barcelona 08003, Spain. Tel.: +34 9 3224 0956; Fax: +34 9 3224 0899
19
Abstract
The human genome project has had an impact on both biological research and its political organization; this review focuses primarily on the scientific novelty that has emerged from the project but also touches on its political dimensions. The project has generated both anticipated and novel information; in the later category are the description of the unusual distribution of genes, the prevalence of non-protein-coding genes, and the extraordinary evolutionary conservation of some regions of the genome. The applications of the sequence data are just starting to be felt in basic, rather than therapeutic, biomedical research and in the vibrant human origins and variation debates. The political impact of the project is in the unprecedented extent to which directed funding programs have emerged as drivers of basic research and the organization of the multidisciplinary groups that are needed to utilize the human DNA sequence.
Affiliation(s)
- Peter F R Little
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney 2074, New South Wales, Australia.
20
Prasanth KV, Prasanth SG, Xuan Z, Hearn S, Freier SM, Bennett CF, Zhang MQ, Spector DL. Regulating gene expression through RNA nuclear retention. Cell 2005; 123:249-63. [PMID: 16239143 DOI: 10.1016/j.cell.2005.08.033] [Citation(s) in RCA: 570] [Impact Index Per Article: 28.5] [Received: 02/26/2005] [Revised: 06/08/2005] [Accepted: 08/09/2005] [Indexed: 01/18/2023]
Abstract
Multiple mechanisms have evolved to regulate the eukaryotic genome. We have identified CTN-RNA, a mouse tissue-specific approximately 8 kb nuclear-retained poly(A)+ RNA that regulates the level of its protein-coding partner. CTN-RNA is transcribed from the protein-coding mouse cationic amino acid transporter 2 (mCAT2) gene through alternative promoter and poly(A) site usage. CTN-RNA is diffusely distributed in nuclei and is also localized to paraspeckles. The 3'UTR of CTN-RNA contains elements for adenosine-to-inosine editing, involved in its nuclear retention. Interestingly, knockdown of CTN-RNA also downregulates mCAT2 mRNA. Under stress, CTN-RNA is posttranscriptionally cleaved to produce protein-coding mCAT2 mRNA. Our findings reveal a role of the cell nucleus in harboring RNA molecules that are not immediately needed to produce proteins but whose cytoplasmic presence is rapidly required upon physiologic stress. This mechanism of action highlights an important paradigm for the role of a nuclear-retained stable RNA transcript in regulating gene expression.
MESH Headings
- 3' Untranslated Regions/genetics
- Animals
- Base Sequence
- Cationic Amino Acid Transporter 2/genetics
- Cationic Amino Acid Transporter 2/metabolism
- Cell Fractionation
- Cell Line
- Cell Line, Tumor
- Cell Nucleus/metabolism
- Chromosomes
- Gene Expression Regulation
- Genes, Reporter
- Genome
- Green Fluorescent Proteins/metabolism
- In Situ Hybridization, Fluorescence
- Interferon-gamma/pharmacology
- Lipopolysaccharides/pharmacology
- Mice
- Models, Biological
- Molecular Sequence Data
- NIH 3T3 Cells
- Oligonucleotides, Antisense/pharmacology
- Poly A/genetics
- Precipitin Tests
- Promoter Regions, Genetic
- RNA/genetics
- RNA/metabolism
- RNA Editing
- RNA Processing, Post-Transcriptional
- RNA, Messenger/analysis
- RNA, Small Nuclear/metabolism
- Reverse Transcriptase Polymerase Chain Reaction
- Sequence Analysis, RNA
- Transcription, Genetic
21
Freyhult E, Gardner PP, Moulton V. A comparison of RNA folding measures. BMC Bioinformatics 2005; 6:241. [PMID: 16202126 PMCID: PMC1274297 DOI: 10.1186/1471-2105-6-241] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.5] [Received: 05/03/2005] [Accepted: 10/03/2005] [Indexed: 11/25/2022]
Abstract
Background In the last few decades there has been a great deal of discussion concerning whether or not noncoding RNA sequences (ncRNAs) fold in a more well-defined manner than random sequences. In this paper, we investigate several existing measures for how well an RNA sequence folds, and compare the behaviour of these measures over a large range of Rfam ncRNA families. Such measures can be useful in, for example, identifying novel ncRNAs, and indicating the presence of alternate RNA foldings. Results Our analysis shows that ncRNAs, but not mRNAs, in general have lower minimal free energy (MFE) than random sequences with the same dinucleotide frequency. Moreover, even when the MFE is significant, many ncRNAs appear to not have a unique fold, but rather several alternative folds, at least when folded in silico. Furthermore, we find that the six investigated measures are correlated to varying degrees. Conclusion Due to the correlations between the different measures we find that it is sufficient to use only two of them in RNA folding studies, one to test if the sequence in question has lower energy than a random sequence with the same dinucleotide frequency (the Z-score) and the other to see if the sequence has a unique fold (the average base-pair distance, D).
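The Z-score named in this conclusion can be illustrated with a minimal sketch. It assumes the minimum free energy (MFE) of the native sequence and of a set of dinucleotide-shuffled control sequences has already been computed by some folding program. The function name `mfe_z_score` and the energy values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def mfe_z_score(native_mfe, shuffled_mfes):
    """Z-score of a native MFE against MFEs of shuffled control sequences.

    A negative value means the native sequence folds with lower free
    energy (more stably) than the shuffled background.
    """
    if len(shuffled_mfes) < 2:
        raise ValueError("need at least two shuffled MFEs")
    mu = mean(shuffled_mfes)
    sigma = stdev(shuffled_mfes)  # sample standard deviation
    if sigma == 0:
        raise ValueError("shuffled MFEs have zero variance")
    return (native_mfe - mu) / sigma


# Hypothetical energies in kcal/mol, for illustration only.
background = [-20.0, -22.0, -18.0, -21.0, -19.0]
z = mfe_z_score(-30.0, background)
```

In practice the shuffled set would hold many more sequences than the five used here, since the estimate of the background mean and spread depends on the sample size.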
Affiliation(s)
- Eva Freyhult
- The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden
- Paul P Gardner
- Dept. of Evolutionary Biology, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen Ø, Denmark
- Vincent Moulton
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
22
Cho YS, Iguchi N, Yang J, Handel MA, Hecht NB. Meiotic messenger RNA and noncoding RNA targets of the RNA-binding protein Translin (TSN) in mouse testis. Biol Reprod 2005; 73:840-7. [PMID: 15987823 DOI: 10.1095/biolreprod.105.042788] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/01/2022]
Abstract
In postmeiotic male germ cells, TSN, formerly known as testis brain-RNA binding protein, is found in the cytoplasm and functions as a posttranscriptional regulator of a group of genes transcribed by the transcription factor CREM-tau. In contrast, in pachytene spermatocytes, TSN is found predominantly in nuclei. Tsn-null males show a reduced sperm count and high levels of apoptosis in meiotic cells, suggesting a critical function for TSN during meiosis. To identify meiotic target RNAs that associate in vivo with TSN, we reversibly cross-linked TSN to RNA in testis extracts from 17-day-old and adult mice and immunoprecipitated the complexes with an affinity-purified TSN antibody. Extracts from Tsn-null mice were used as controls. Cloning and sequencing the immunoprecipitated RNAs, we identified four new TSN target mRNAs, encoding diazepam-binding inhibitor-like 5, arylsulfatase A, a tetratricopeptide repeat structure-containing protein, and ring finger protein 139. In contrast to the population of postmeiotic translationally delayed mRNAs that bind TSN, these four mRNAs are initially expressed in pachytene spermatocytes. In addition, anti-TSN also precipitated a nonprotein-coding RNA (ncRNA), which is abundant in nuclei of pachytene spermatocytes and has a putative polyadenylation signal, but no open reading frame. A second similar ncRNA is adjacent to a GGA repeat, a motif frequently associated with recombination hot spots. RNA gel-shift assays confirm that the four new target mRNAs and the ncRNA specifically bind to TSN in testis extracts. These studies have, for the first time, identified both mRNAs and a ncRNA as TSN targets expressed during meiosis.
Affiliation(s)
- Yoon Shin Cho
- Center for Research on Reproduction and Women's Health, University of Pennsylvania School of Medicine, Philadelphia, 19104, USA
23
Abstract
The past four years have seen an explosion in the number of detected RNA transcripts with no apparent protein-coding potential. This has led to speculation that non-protein-coding RNAs (ncRNAs) might be as important as proteins in the regulation of vital cellular functions. However, there has been significantly less progress in actually demonstrating the functions of these transcripts. In this article, we review the results of recent experiments that show that transcription of non-protein-coding RNA is far more widespread than was previously anticipated. Although some ncRNAs act as molecular switches that regulate gene expression, the function of many ncRNAs is unknown. New experimental and computational approaches are emerging that will help determine whether these newly identified transcription products are evidence of important new biochemical pathways or are merely 'junk' RNA generated by the cell as a by-product of its functional activities.
Affiliation(s)
- Alexander Hüttenhofer
- Division of Genomics and RNomics, Innsbruck Medical University-Biocenter, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria.
24
Yan MD, Hong CC, Lai GM, Cheng AL, Lin YW, Chuang SE. Identification and characterization of a novel gene Saf transcribed from the opposite strand of Fas. Hum Mol Genet 2005; 14:1465-74. [PMID: 15829500 DOI: 10.1093/hmg/ddi156] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.7] [Indexed: 11/13/2022]
Abstract
Apoptosis is a morphologically distinct form of cell death involved in many physiological and pathological processes. The regulation of Fas/Apo-1 involved in membrane-mediated apoptosis has also been known to play crucial roles in many systems. More and more naturally occurring antisense RNAs are now known to regulate, at least in part, a growing number of eukaryotic genes. In this report, we describe the findings of a novel RNA transcribed from the opposite strand of the intron 1 of the human Fas gene. Using orientation-specific RT-PCR and northern blot analysis, we show that this transcript is 1.5 kb in length and was expressed in several human tissues and cell lines. This transcript was cloned by 5'- and 3'-RACE (rapid amplification of cDNA ends) and the transcription start site was determined by primer extension. This novel gene was named Saf. To assess the functions of Saf, Jurkat cells transfected with human Saf or control vector were prepared. The stable Saf-transfectant was highly resistant to Fas-mediated but not to TNF-alpha-mediated apoptosis. Although the overall mRNA expression level of Fas was not affected, expression of some novel forms of Fas transcripts was increased in Saf-transfectant, especially the inhibitory soluble forms. These findings collectively suggest that Saf might protect T lymphocytes from Fas-mediated apoptosis by blocking the binding of FasL or its agonistic Fas antibody. Saf might regulate the expression of Fas alternative splice forms through pre-mRNA processing.
Affiliation(s)
- Ming-De Yan
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei, Taiwan, ROC
25
Washietl S, Hofacker IL, Stadler PF. Fast and reliable prediction of noncoding RNAs. Proc Natl Acad Sci U S A 2005; 102:2454-9. [PMID: 15665081 PMCID: PMC548974 DOI: 10.1073/pnas.0409169102] [Citation(s) in RCA: 465] [Impact Index Per Article: 23.3] [Received: 11/02/2004] [Indexed: 01/22/2023]
Abstract
We report an efficient method for detecting functional RNAs. The approach, which combines comparative sequence analysis and structure prediction, already has yielded excellent results for a small number of aligned sequences and is suitable for large-scale genomic screens. It consists of two basic components: (i) a measure for RNA secondary structure conservation based on computing a consensus secondary structure, and (ii) a measure for thermodynamic stability, which, in the spirit of a z score, is normalized with respect to both sequence length and base composition but can be calculated without sampling from shuffled sequences. Functional RNA secondary structures can be identified in multiple sequence alignments with high sensitivity and high specificity. We demonstrate that this approach is not only much more accurate than previous methods but also significantly faster. The method is implemented in the program RNAz, which can be downloaded from www.tbi.univie.ac.at/~wash/RNAz. We screened all alignments of length n ≥ 50 in the Comparative Regulatory Genomics database, which compiles conserved noncoding elements in upstream regions of orthologous genes from human, mouse, rat, Fugu, and zebrafish. We recovered all of the known noncoding RNAs and cis-acting elements with high significance and found compelling evidence for many other conserved RNA secondary structures not described so far to our knowledge.
Affiliation(s)
- Stefan Washietl
- Department of Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria
26
Hackermüller J, Meisner NC, Auer M, Jaritz M, Stadler PF. The effect of RNA secondary structures on RNA-ligand binding and the modifier RNA mechanism: a quantitative model. Gene 2005; 345:3-12. [PMID: 15716109 DOI: 10.1016/j.gene.2004.11.043] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.2] [Received: 08/25/2004] [Revised: 10/13/2004] [Accepted: 11/09/2004] [Indexed: 12/17/2022]
Abstract
RNA-ligand binding often depends crucially on the local RNA secondary structure at the binding site. We develop here a model that quantitatively predicts the effect of RNA secondary structure on effective RNA-ligand binding activities based on equilibrium thermodynamics and the explicit computations of partition functions for the RNA structures. A statistical test for the impact of a particular structural feature on the binding affinities follows directly from this approach. The formalism is extended to describing the effects of hybridizing small "modifier RNAs" to a target RNA molecule outside its ligand binding site. We illustrate the applicability of our approach by quantitatively describing the interaction of the mRNA stabilizing protein HuR with AU-rich elements. We discuss our model and recent experimental findings demonstrating the effectivity of modifier RNAs in vitro in the context of the current research activities in the field of non-coding RNAs. We speculate that modifier RNAs might also exist in nature; if so, they present an additional regulatory layer for fine-tuning gene expression that could evolve rapidly, leaving no obvious traces in the genomic DNA sequences.
Affiliation(s)
- Jörg Hackermüller
- Novartis Institutes for Biomedical Research Vienna, Informatics and Knowledge Management at NIBR, Insilico Sciences, Brunnerstrasse 59, A-1235 Vienna, Austria
27
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2005. [PMCID: PMC2448604 DOI: 10.1002/cfg.419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]