1
Mathur M, Kim CM, Munro SA, Rudina SS, Sawyer EM, Smolke CD. Programmable mutually exclusive alternative splicing for generating RNA and protein diversity. Nat Commun 2019; 10:2673. [PMID: 31209208 PMCID: PMC6572816 DOI: 10.1038/s41467-019-10403-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/18/2018] [Accepted: 05/01/2019] [Indexed: 02/07/2023] Open
Abstract
Alternative splicing performs a central role in expanding genomic coding capacity and proteomic diversity. However, programming of splicing patterns in engineered biological systems remains underused. Synthetic approaches thus far have predominantly focused on controlling expression of a single protein through alternative splicing. Here, we describe a modular and extensible platform for regulating four programmable exons that undergo a mutually exclusive alternative splicing event to generate multiple functionally-distinct proteins. We present an intron framework that enforces the mutual exclusivity of two internal exons and demonstrate a graded series of consensus sequence elements of varying strengths that set the ratio of two mutually exclusive isoforms. We apply this framework to program the DNA-binding domains of modular transcription factors to differentially control downstream gene activation. This splicing platform advances an approach for generating diverse isoforms and can ultimately be applied to program modular proteins and increase coding capacity of synthetic biological systems.
Affiliation(s)
- Melina Mathur
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Cameron M Kim
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Sarah A Munro
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Joint Initiative for Metrology in Biology, Stanford, CA, 94305, USA
- Genome-scale Measurements Group, National Institute of Standards and Technology, Stanford, CA, 94305, USA
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
- Shireen S Rudina
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Eric M Sawyer
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, 94720, USA
- Christina D Smolke
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA, 94158, USA

2
An RNA Switch of a Large Exon of Ninein Is Regulated by the Neural Stem Cell Specific-RNA Binding Protein, Qki5. Int J Mol Sci 2019; 20:ijms20051010. [PMID: 30813567 PMCID: PMC6429586 DOI: 10.3390/ijms20051010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/10/2019] [Revised: 02/20/2019] [Accepted: 02/21/2019] [Indexed: 12/20/2022] Open
Abstract
A set of tissue-specific splicing factors is thought to govern alternative splicing events during the neural progenitor cell (NPC)-to-neuron transition by regulating neuron-specific exons. Here, we propose one such factor, RNA-binding protein Quaking 5 (Qki5), which is specifically expressed in early embryonic neural stem cells. We performed mRNA-seq analysis using mRNAs obtained from developing cerebral cortices of Qk (Quaking) conditional knockout (cKO) mice. As expected, we found a large number of alternative splicing changes between control and conditional knockouts relative to changes in transcript levels. DAVID (The Database for Annotation, Visualization and Integrated Discovery) and Metascape analyses suggested that the affected spliced genes are involved in axon development and microtubule-based processes. Among these, the mRNA coding for the Ninein protein is listed as one of the Qki protein-dependent alternative splicing targets. Interestingly, the affected Ninein exon is very long (2121 nt), encodes a long polypeptide segment, and has been previously defined as a dynamic RNA switch during the NPC-to-neuron transition. Additionally, we validated that the regulation of this large exon is consistent with the Qki5-dependent alternative exon inclusion mode suggested by our previous Qki5 HITS-CLIP (high-throughput sequencing-cross-linking immunoprecipitation) analysis. Taken together, these data suggest that Qki5 is an important factor for alternative splicing in the NPC-to-neuron transition.

3
Ipe J, Collins KS, Hao Y, Gao H, Bhatia P, Gaedigk A, Liu Y, Skaar TC. PASSPORT-seq: A Novel High-Throughput Bioassay to Functionally Test Polymorphisms in Micro-RNA Target Sites. Front Genet 2018; 9:219. [PMID: 29963077 PMCID: PMC6013768 DOI: 10.3389/fgene.2018.00219] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/30/2017] [Accepted: 05/29/2018] [Indexed: 11/18/2022] Open
Abstract
Next-generation sequencing (NGS) studies have identified large numbers of genetic variants that are predicted to alter miRNA–mRNA interactions. We developed a novel high-throughput bioassay, PASSPORT-seq, that can functionally test in parallel 100s of these variants in miRNA binding sites (mirSNPs). The results are highly reproducible across both technical and biological replicates. The utility of the bioassay was demonstrated by testing 100 mirSNPs in HEK293, HepG2, and HeLa cells. The results of several of the variants were validated in all three cell lines using traditional individual luciferase assays. Fifty-five mirSNPs were functional in at least one of three cell lines (FDR ≤ 0.05); 11, 36, and 27 of them were functional in HEK293, HepG2, and HeLa cells, respectively. Only four of the variants were functional in all three cell lines, which demonstrates the cell-type specific effects of mirSNPs and the importance of testing the mirSNPs in multiple cell lines. Using PASSPORT-seq, we functionally tested 111 variants in the 3′ UTR of 17 pharmacogenes that are predicted to alter miRNA regulation. Thirty-three of the variants tested were functional in at least one cell line.
Affiliation(s)
- Joseph Ipe
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Kimberly S Collins
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Department of Pharmacology and Toxicology, Indiana University School of Medicine, Indianapolis, IN, United States
- Yangyang Hao
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, United States
- Hongyu Gao
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, United States
- Puja Bhatia
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Andrea Gaedigk
- Division of Clinical Pharmacology, Toxicology and Therapeutic Innovation, Children's Mercy Kansas City, Kansas City, MO, United States
- Yunlong Liu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, United States
- Todd C Skaar
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States

4
Zhu BH, Xiao J, Xue W, Xu GC, Sun MY, Li JT. P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads. BMC Genomics 2018; 19:175. [PMID: 29499650 PMCID: PMC5834899 DOI: 10.1186/s12864-018-4567-3] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 01/03/2018] [Accepted: 02/22/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Obtaining complete gene structures is one major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions. RESULTS: We present P_RNA_scaffolder, a fast and accurate tool using paired-end RNA-sequencing reads to scaffold genomes. This tool aims to improve the completeness of both protein-coding and non-coding genes. After this tool was applied to scaffolding human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and equivalent to those in a complete genome, especially for long proteins and long circular RNAs. Tested in various species, P_RNA_scaffolder exhibited higher speed and efficiency than the existing state-of-the-art scaffolders. This tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and third-generation single-molecule sequencing assembly. CONCLUSIONS: P_RNA_scaffolder can improve the contiguity of genome assembly and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder .
Affiliation(s)
- Bai-Han Zhu
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- College of Fisheries and Life Science, Shanghai Ocean University, Shanghai, 201306, China
- Jun Xiao
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- College of Fisheries and Life Science, Shanghai Ocean University, Shanghai, 201306, China
- Wei Xue
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- Gui-Cai Xu
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- College of Marine Science, Zhejiang Ocean University, Zhoushan, 316022, China
- Ming-Yuan Sun
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
- College of Fisheries and Life Science, Shanghai Ocean University, Shanghai, 201306, China
- Jiong-Tang Li
- Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China

5
Pai AA, Henriques T, McCue K, Burkholder A, Adelman K, Burge CB. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture. eLife 2017; 6:32537. [PMID: 29280736 PMCID: PMC5762160 DOI: 10.7554/elife.32537] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 10/05/2017] [Accepted: 12/22/2017] [Indexed: 12/28/2022] Open
Abstract
Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning (‘intron definition’) or exon-spanning (‘exon definition’) pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60–70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. We observed unexpectedly low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene level variables associated with splicing rate. Together our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing.
Affiliation(s)
- Athma A Pai
- Departments of Biology and Biological Engineering, Massachusetts Institute of Technology, Cambridge, United States
- Telmo Henriques
- Epigenetics and Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, Research Triangle, United States
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, United States
- Kayla McCue
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, United States
- Adam Burkholder
- Center for Integrative Bioinformatics, National Institute of Environmental Health Sciences, Research Triangle, United States
- Karen Adelman
- Epigenetics and Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, Research Triangle, United States
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, United States
- Christopher B Burge
- Departments of Biology and Biological Engineering, Massachusetts Institute of Technology, Cambridge, United States
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, United States

6
Arias MA, Lubkin A, Chasin LA. Splicing of designer exons informs a biophysical model for exon definition. RNA 2015; 21:213-229. [PMID: 25492963 PMCID: PMC4338349 DOI: 10.1261/rna.048009.114] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 09/06/2014] [Accepted: 10/29/2014] [Indexed: 06/04/2023]
Abstract
Pre-mRNA molecules in humans contain mostly short internal exons flanked by longer introns. To explain the removal of such introns, exon recognition instead of intron recognition has been proposed. We studied this exon definition using designer exons (DEs) made up of three prototype modules of our own design: an exonic splicing enhancer (ESE), an exonic splicing silencer (ESS), and a Reference Sequence (R) predicted to be neither. Each DE was examined as the central exon in a three-exon minigene. DEs made of R modules showed a sharp size dependence, with exons shorter than 14 nt and longer than 174 nt splicing poorly. Changing the strengths of the splice sites improved longer exon splicing but worsened shorter exon splicing, effectively displacing the curve to the right. For the ESE we found, unexpectedly, that its enhancement efficiency was independent of its position within the exon. For the ESS we found a step-wise positional increase in its effects; it was most effective at the 3' end of the exon. To apply these results quantitatively, we developed a biophysical model for exon definition of internal exons undergoing cotranscriptional splicing. This model features commitment to inclusion before the downstream exon is synthesized and competition between skipping and inclusion fates afterward. Collision of both exon ends to form an exon definition complex was incorporated to account for the effect of size; ESE/ESS effects were modeled on the basis of stabilization/destabilization. This model accurately predicted the outcome of independent experiments on more complex DEs that combined ESEs and ESSs.
Affiliation(s)
- Mauricio A Arias
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA
- Ashira Lubkin
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA
- Lawrence A Chasin
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA

7
Soemedi R, Vega H, Belmont JM, Ramachandran S, Fairbrother WG. Genetic variation and RNA binding proteins: tools and techniques to detect functional polymorphisms. Adv Exp Med Biol 2014; 825:227-66. [PMID: 25201108 DOI: 10.1007/978-1-4939-1221-6_7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 02/06/2023]
Abstract
At its most fundamental level, the goal of genetics is to connect genotype to phenotype. This question is asked at a basic level by evaluating the role of genes and pathways in genetic model organisms. Increasingly, this question is being asked in the clinic. Genomes of individuals and populations are being sequenced and compared. The challenge often comes at the stage of analysis. The variant positions are analyzed with the hope of understanding human disease. However, after a genome or exome has been sequenced, the researcher is often deluged with hundreds of potentially relevant variations. Traditionally, amino-acid-changing mutations were considered the tractable class of disease-causing mutations; however, mutations that disrupt noncoding elements are the subject of growing interest. These noncoding changes are a major avenue of disease (e.g., one in three hereditary disease alleles is predicted to affect splicing). Here, we review some current practices of medical genetics, the basic theory behind biochemical binding and functional assays, and then explore technical advances in how variations that alter RNA-protein recognition events are detected and studied. These are advances in scale: high-throughput implementations of traditional biochemical assays that are feasible to perform in any molecular biology laboratory. This chapter utilizes a case study approach to illustrate some methods for analyzing polymorphisms. The first characterizes a functional intronic SNP that deletes a high-affinity PTB site using traditional low-throughput biochemical and functional assays. From here, we demonstrate the utility of high-throughput splicing and spliceosome assembly assays for screening large sets of SNPs and disease alleles for allelic differences in gene expression. Finally, we perform three pilot drug screens with small molecules (G418, tetracycline, and valproic acid) that illustrate how compounds that rescue specific instances of differential pre-mRNA processing can be discovered.
Affiliation(s)
- Rachel Soemedi
- Center for Computational Molecular Biology, Brown University, Providence, RI, USA

8
De Conti L, Baralle M, Buratti E. Exon and intron definition in pre-mRNA splicing. Wiley Interdiscip Rev RNA 2012; 4:49-60. [DOI: 10.1002/wrna.1140] [Citation(s) in RCA: 207] [Impact Index Per Article: 15.9] [Indexed: 01/14/2023]

9
Bolisetty MT, Beemon KL. Splicing of internal large exons is defined by novel cis-acting sequence elements. Nucleic Acids Res 2012; 40:9244-54. [PMID: 22790982 PMCID: PMC3467050 DOI: 10.1093/nar/gks652] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 01/04/2023] Open
Abstract
Human internal exons have an average size of 147 nt, and most are <300 nt. This small size is thought to facilitate exon definition. A small number of large internal exons have been identified and shown to be alternatively spliced. We identified 1115 internal exons >1000 nt in the human genome; these were found in 5% of all protein-coding genes, and most were expressed and translated. Surprisingly, 40% of these were expressed at levels similar to the flanking exons, suggesting they were constitutively spliced. While all of the large exons had strong splice sites, the constitutively spliced large exons had a higher ratio of splicing enhancers/silencers and were more conserved across mammals than the alternatively spliced large exons. We asked if large exons contain specific sequences that promote splicing and identified 38 sequences enriched in the large exons relative to small exons. The consensus sequence is C-rich with a central invariant CA dinucleotide. Mutation of these sequences in a candidate large exon indicated that these are important for recognition of large exons by the splicing machinery. We propose that these sequences are large exon splicing enhancers (LESEs).
Affiliation(s)
- Mohan T Bolisetty
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA

10
Searching for splicing motifs. Adv Exp Med Biol 2008; 623:85-106. [PMID: 18380342 DOI: 10.1007/978-0-387-77374-2_6] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 10/20/2022]
Abstract
Intron removal during pre-mRNA splicing in higher eukaryotes requires the accurate identification of the two splice sites at the ends of the exons, or exon definition. The sequences constituting the splice sites provide insufficient information to distinguish true splice sites from the greater number of false splice sites that populate transcripts. Additional information used for exon recognition resides in a large number of positively or negatively acting elements that lie both within exons and in the adjacent introns. The identification of such sequence motifs has progressed rapidly in recent years, such that extensive lists are now available for exonic splicing enhancers and exonic splicing silencers. These motifs have been identified both by empirical experiments and by computational predictions, the validity of the latter being confirmed by experimental verification. Molecular searches have been carried out either by the selection of sequences that bind to splicing factors, or enhance or silence splicing in vitro or in vivo. Computational methods have focused on sequences of 6 or 8 nucleotides that are over- or under-represented in exons, compared to introns or transcripts that do not undergo splicing. These various methods have sought to provide global definitions of motifs, yet the motifs are distinctive to the method used for identification and display little overlap. Astonishingly, at least three-quarters of a typical mRNA would be comprised of these motifs. A present challenge lies in understanding how the cell integrates this surfeit of information to generate what is usually a binary splicing decision.

11
Bruce SR, Kaetzel CS, Peterson ML. Cryptic intron activation within the large exon of the mouse polymeric immunoglobulin receptor gene: cryptic splice sites correspond to protein domain boundaries. Nucleic Acids Res 1999; 27:3446-54. [PMID: 10446232 PMCID: PMC148586 DOI: 10.1093/nar/27.17.3446] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 01/08/2023] Open
Abstract
The fourth exon of the mouse polymeric immunoglobulin receptor (pIgR) is 654 nt long and, despite being surrounded by large introns, is constitutively spliced into the mRNA. Deletion of an 84 nt sequence from this exon strongly activated both cryptic 5' and 3' splice sites surrounding a 78 nt cryptic intron. The 84 nt deletion is just upstream of the cryptic 3' splice site; the cryptic 3' splice site was likely activated because the deletion created a better 3' splice site. However, the cryptic 5' splice site was also required to activate the cryptic splice reaction; point mutations in either of the cryptic splice sites that decreased their match to the consensus splice site sequence inactivated the cryptic splice reaction. The activation and inactivation of these cryptic splice sites as a pair suggests that they are being co-recognized by the splicing machinery. Interestingly, the large fourth exon of the pIgR gene encodes two immunoglobulin-like extracellular protein domains; the cryptic 3' splice site coincides with the junction between these protein domains. The cryptic 5' splice site is located between protein subdomains where an intron is found in another gene of the immunoglobulin superfamily.
Affiliation(s)
- S R Bruce
- Department of Microbiology, University of Kentucky College of Medicine, Lexington, KY 40536, USA

12
Gardiner K. Saturation identification of coding sequences in genomic DNA. Methods Enzymol 1999; 303:144-61. [PMID: 10349644 DOI: 10.1016/s0076-6879(99)03012-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Affiliation(s)
- K Gardiner
- Eleanor Roosevelt Institute, Denver, Colorado 80206, USA

13
Abstract
We introduce a general probabilistic model of the gene structure of human genomic sequences which incorporates descriptions of the basic transcriptional, translational and splicing signals, as well as length distributions and compositional features of exons, introns and intergenic regions. Distinct sets of model parameters are derived to account for the many substantial differences in gene density and structure observed in distinct C + G compositional regions of the human genome. In addition, new models of the donor and acceptor splice signals are described which capture potentially important dependencies between signal positions. The model is applied to the problem of gene identification in a computer program, GENSCAN, which identifies complete exon/intron structures of genes in genomic DNA. Novel features of the program include the capacity to predict multiple genes in a sequence, to deal with partial as well as complete genes, and to predict consistent sets of genes occurring on either or both DNA strands. GENSCAN is shown to have substantially higher accuracy than existing methods when tested on standardized sets of human and vertebrate genes, with 75 to 80% of exons identified exactly. The program is also capable of indicating fairly accurately the reliability of each predicted exon. Consistently high levels of accuracy are observed for sequences of differing C + G content and for distinct groups of vertebrates.
Affiliation(s)
- C Burge
- Department of Mathematics, Stanford University, CA 94305, USA

14
Affiliation(s)
- S M Berget
- Verna and Marrs McClean Department of Biochemistry, Baylor College of Medicine, Houston, Texas 77030

15
Brady JP, Kantorow M, Sax CM, Donovan DM, Piatigorsky J. Murine transcription factor alpha A-crystallin binding protein I. Complete sequence, gene structure, expression, and functional inhibition via antisense RNA. J Biol Chem 1995; 270:1221-9. [PMID: 7836383 DOI: 10.1074/jbc.270.3.1221] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.3] [Indexed: 01/27/2023] Open
Abstract
alpha A-crystallin binding protein I (alpha A-CRYBP1) is a ubiquitously expressed DNA binding protein that was previously identified by its ability to interact with a functionally important sequence in the mouse alpha A-crystallin gene promoter. Here, we have cloned a single copy gene with 10 exons spanning greater than 70 kb of genomic DNA that encodes alpha A-CRYBP1. The mouse alpha A-CRYBP1 gene specifies a 2,688-amino acid protein with 72% amino acid identity to its human homologue, PRDII-BF1. Both the human and the mouse proteins contain two sets of consensus C2H2 zinc fingers at each end as well as a central nonconsensus zinc finger. The alpha A-CRYBP1 gene produces a 9.5-kb transcript in 11 different tissues as well as a testis-specific, 7.7-kb transcript. alpha A-CRYBP1 cDNA clones were isolated from adult mouse brain and testis as well as from cell lines derived from mouse lens (alpha TN4-1) and muscle (C2C12). A single clone isolated from the muscle C2C12 library contains an additional exon near the 5'-end that would prevent production of a functional protein if the normal translation start site were utilized; however, there is another potential initiation codon located downstream that is in frame with the rest of the coding region. In addition, we identified multiple cDNAs from the testis in which the final intron is still present. Finally, we used an antisense expression construct derived from an alpha A-CRYBP1 cDNA clone to provide the first functional evidence that alpha A-CRYBP1 regulates gene expression. When introduced into the alpha TN4-1 mouse lens cell line, the antisense construct significantly inhibited expression from a heterologous promoter that utilized the alpha A-CRYBP1 binding site as an enhancer.
Affiliation(s)
- J P Brady
- Laboratory of Molecular and Developmental Biology, NEI, National Institutes of Health, Bethesda, Maryland 20892

16
Valentine CR, Heflich RH. Genomic DNA sequencing of mRNA splicing mutants in the hprt gene of Chinese hamster ovary cells. Environ Mol Mutagen 1995; 25:85-96. [PMID: 7698111 DOI: 10.1002/em.2850250202] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 05/21/2023]
Abstract
We have analyzed 41 mRNA-splicing mutants from the hypoxanthine-guanine phosphoribosyltransferase (hprt) gene of Chinese hamster ovary (CHO) cells. Twenty-two of these mutants produced single cDNA PCR products with a partial or complete exon deletion; 19 mutants produced multiple cDNA PCR products, and most of these products contained one or more deleted exons. The affected exons and surrounding introns were amplified from genomic DNA and sequenced in order to identify mutations causing aberrant splicing. We found acceptor site mutations in 10 mutants, exonic mutations in 8 mutants, and no mutations in 5 mutants. Four mutants from solvent controls did not amplify the appropriate exons and were considered genomic deletion mutants. Our previous work [Manjanatha MG et al. (1994): Mutat Res 308:65-75] showed that nonsense mutants in the hprt gene of CHO cells are associated with multiple cDNA PCR products containing deleted exons and a low abundance of hprt mRNA if the mutation is found in an internal exon. The present results are consistent with these associations being facilitated by instability of mRNA after ribosome termination at nonsense codons.
Affiliation(s)
- C R Valentine
- Division of Genetic Toxicology, National Center for Toxicological Research, Jefferson, Arkansas 72079-9502, USA