1
Wu X, Liu T, Ye C, Ye W, Ji G. scAPAtrap: identification and quantification of alternative polyadenylation sites from single-cell RNA-seq data. Brief Bioinform 2020; 22:5952304. [PMID: 33142319 DOI: 10.1093/bib/bbaa273] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/29/2020] [Revised: 09/17/2020] [Accepted: 09/20/2020] [Indexed: 02/06/2023] Open
Abstract
Alternative polyadenylation (APA) generates diverse mRNA isoforms, which contributes to transcriptome diversity and gene expression regulation by affecting mRNA stability, translation and localization in cells. The rapid development of 3' tag-based single-cell RNA-sequencing (scRNA-seq) technologies, such as CEL-seq and 10x Genomics, has led to the emergence of computational methods for identifying APA sites and profiling APA dynamics at single-cell resolution. However, existing methods fail to detect the precise location of poly(A) sites or sites with low read coverage. Moreover, they rely on prior genome annotation and can only detect poly(A) sites located within or near annotated genes. Here we propose a tool called scAPAtrap for detecting poly(A) sites at the whole-genome level in individual cells from 3' tag-based scRNA-seq data. scAPAtrap incorporates peak identification and poly(A) read anchoring, enabling the identification of the precise location of poly(A) sites, even for sites with low read coverage. Moreover, scAPAtrap can identify poly(A) sites without using prior genome annotation, which helps locate novel poly(A) sites in previously overlooked regions and improve genome annotation. We compared scAPAtrap with two recent methods, scAPA and Sierra, using scRNA-seq data from different experimental technologies and species. Results show that scAPAtrap identified poly(A) sites with higher accuracy and sensitivity than competing methods and could be used to explore APA dynamics among cell types or the heterogeneous APA isoform expression in individual cells. scAPAtrap is available at https://github.com/BMILAB/scAPAtrap.
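The idea of poly(A) read anchoring mentioned in the abstract can be illustrated with a minimal Python sketch. This is not scAPAtrap's implementation: the function names `anchor_polya` and `count_sites`, the `min_tail` threshold of 8, and the simplifying assumptions (forward-strand read, ungapped alignment, 0-based start coordinate, no mismatches tolerated in the tail) are all hypothetical choices made for illustration.

```python
from collections import Counter


def anchor_polya(read_seq, align_start, min_tail=8):
    """Return the genomic coordinate of the last base before a trailing
    poly(A) run, or None if the run is shorter than min_tail.

    Assumes a forward-strand read aligned without gaps, with a 0-based
    align_start. Genomic A's directly upstream of the cleavage site are
    absorbed into the tail, so the result is only approximate.
    """
    body = read_seq.rstrip("A")          # read without its trailing A's
    tail_len = len(read_seq) - len(body)
    if tail_len < min_tail or not body:
        return None
    return align_start + len(body) - 1


def count_sites(reads, min_tail=8):
    """Count how many reads support each anchored position.

    reads is an iterable of (sequence, alignment_start) pairs.
    """
    counts = Counter()
    for seq, start in reads:
        site = anchor_polya(seq, start, min_tail)
        if site is not None:
            counts[site] += 1
    return counts
```

In such a scheme, anchored positions would refine the boundary of a coverage peak; a peak with no anchoring reads would still be reported, only with a less precise site.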
Affiliation(s)
- Xiaohui Wu
- Department of Automation in Xiamen University
- Tao Liu
- Department of Automation in Xiamen University
- Congting Ye
- College of the Environment and Ecology in Xiamen University
- Wenbin Ye
- Department of Automation in Xiamen University
- Guoli Ji
- Department of Automation in Xiamen University
2
Sheynkman GM, Tuttle KS, Laval F, Tseng E, Underwood JG, Yu L, Dong D, Smith ML, Sebra R, Willems L, Hao T, Calderwood MA, Hill DE, Vidal M. ORF Capture-Seq as a versatile method for targeted identification of full-length isoforms. Nat Commun 2020; 11:2326. [PMID: 32393825 PMCID: PMC7214433 DOI: 10.1038/s41467-020-16174-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/29/2019] [Accepted: 04/16/2020] [Indexed: 01/02/2023] Open
Abstract
Most human protein-coding genes are expressed as multiple isoforms, which greatly expands the functional repertoire of the encoded proteome. While at least one reliable open reading frame (ORF) model has been assigned for every coding gene, the majority of alternative isoforms remains uncharacterized due to (i) vast differences of overall levels between different isoforms expressed from common genes, and (ii) the difficulty of obtaining full-length transcript sequences. Here, we present ORF Capture-Seq (OCS), a flexible method that addresses both challenges for targeted full-length isoform sequencing applications using collections of cloned ORFs as probes. As a proof-of-concept, we show that an OCS pipeline focused on genes coding for transcription factors increases isoform detection by an order of magnitude when compared to unenriched samples. In short, OCS enables rapid discovery of isoforms from custom-selected genes and will accelerate mapping of the human transcriptome.
Affiliation(s)
- Gloria M Sheynkman
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Katharine S Tuttle
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Biochemistry, Northeastern University, Boston, MA, 02115, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Icahn Institute of Data Science and Genomic Technology, New York, NY, 10029, USA
- Florent Laval
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Laboratory of Molecular Biology, TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, 5030, Belgium; Laboratory of Molecular and Cellular Epigenetics, GIGA-Cancer, University of Liège, 4000, Liège, Belgium
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Da Dong
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Melissa L Smith
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Icahn Institute of Data Science and Genomic Technology, New York, NY, 10029, USA
- Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Icahn Institute of Data Science and Genomic Technology, New York, NY, 10029, USA
- Luc Willems
- Laboratory of Molecular Biology, TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, 5030, Belgium; Laboratory of Molecular and Cellular Epigenetics, GIGA-Cancer, University of Liège, 4000, Liège, Belgium
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
3
Genome-Wide Profiling of Polyadenylation Events in Maize Using High-Throughput Transcriptomic Sequences. G3-GENES GENOMES GENETICS 2019; 9:2749-2760. [PMID: 31239292 PMCID: PMC6686930 DOI: 10.1534/g3.119.400196] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022]
Abstract
Polyadenylation is an essential post-transcriptional modification of eukaryotic transcripts that plays a critical role in transcript stability, localization, transport, and translational efficiency. About 70% of genes in plants contain alternative polyadenylation (APA) sites. Despite the availability of a vast amount of sequencing data, a comprehensive map of polyadenylation events in maize is not yet available. Here, 9.48 billion RNA-Seq reads were analyzed to characterize 95,345 Poly(A) Clusters (PAC) in 23,705 (51%) maize genes. Of these, 76% were APA genes. However, most APA genes (55%) expressed a dominant PAC rather than favoring multiple PACs equally. The lincRNA genes with PACs were significantly longer than the genes without any PAC, and about 48% of the genes had APA sites. Heterogeneity was observed in 52% of the PACs, supporting the imprecise nature of the polyadenylation process. Genomic distribution revealed that the majority of the PACs (78%) were located in the genic regions. Unlike in previous studies, a large number of PACs were observed in the intergenic (n = 21,264), 5′-UTR (735), CDS (2,542), and intronic regions (12,841). The CDS and introns with PACs were longer than those without PACs, whereas intergenic PACs were more often associated with transcripts that lacked annotated 3′-UTRs. Nucleotide composition around PACs demonstrated AT-richness, and the common upstream motif was AAUAAA, which is consistent with other plants. According to this study, only 2,830 genes still maintained the use of the AAUAAA motif. This large-scale dataset provides useful insights into gene expression regulation and could be utilized as evidence to validate the annotation of transcript ends.
4
Abstract
Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.
5
Polyadenylation sites and their characteristics in the genome of channel catfish (Ictalurus punctatus) as revealed by using RNA-Seq data. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY D-GENOMICS & PROTEOMICS 2019; 30:248-255. [PMID: 30952021 DOI: 10.1016/j.cbd.2019.03.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2019] [Revised: 03/24/2019] [Accepted: 03/24/2019] [Indexed: 11/21/2022]
Abstract
Polyadenylation plays important roles in gene expression regulation in eukaryotes, which typically involves cleavage and poly(A) tail addition at the polyadenylation site (PAS) of the pre-mature mRNA. Many eukaryotic genes contain more than one PAS, a phenomenon termed alternative polyadenylation (APA). As a crucial form of post-transcriptional regulation, polyadenylation affects various aspects of RNA metabolism such as mRNA stability, translocation, and translation. However, polyadenylation has rarely been studied in teleosts. Here we conducted polyadenylation analysis in channel catfish, a commercially important aquaculture species around the world. Using RNA-Seq data, we identified 20,320 PASs, which were classified into 14,500 clusters by merging adjacent PASs. Most of the PASs were found in 3' UTRs, followed by intron regions, based on the annotation of the channel catfish reference genome. No apparent difference in PAS distribution was observed between the sense and antisense strands of the channel catfish genome. The sequence analysis of nucleotide composition and motifs around PASs yielded a highly similar profile among various organisms, suggesting the conservation and importance of polyadenylation in evolution. Using APA genes with more than two PASs, gene ontology enrichment revealed genes particularly involved in RNA binding. Reactome pathway analysis showed the enrichment of the innate immune system, especially neutrophil degranulation.
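Merging adjacent PASs into clusters, as described in the abstract, can be sketched as a single pass over sorted coordinates. The abstract does not state the distance cutoff or the exact procedure, so the function name `cluster_pas` and the `max_gap` default of 24 nucleotides below are assumptions for illustration only.

```python
from itertools import groupby


def cluster_pas(sites, max_gap=24):
    """Merge adjacent poly(A) sites into clusters.

    sites: iterable of (chrom, strand, position) tuples.
    max_gap: hypothetical cutoff; two consecutive sites on the same
             chromosome and strand join one cluster when they are at
             most max_gap nucleotides apart.
    Returns a list of (chrom, strand, start, end) tuples.
    """
    clusters = []
    ordered = sorted(sites)  # sorts by chrom, then strand, then position
    for (chrom, strand), group in groupby(ordered, key=lambda s: (s[0], s[1])):
        positions = [s[2] for s in group]
        start = prev = positions[0]
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # gap too large: close the current cluster, open a new one
                clusters.append((chrom, strand, start, prev))
                start = pos
            prev = pos
        clusters.append((chrom, strand, start, prev))
    return clusters
```

Because each site is compared with its immediate predecessor, a long chain of closely spaced sites forms one cluster even when its two ends lie farther apart than `max_gap`; a pipeline that needs bounded cluster widths would have to add a separate width check.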
6
Fu H, Yang D, Su W, Ma L, Shen Y, Ji G, Ye X, Wu X, Li QQ. Genome-wide dynamics of alternative polyadenylation in rice. Genome Res 2016; 26:1753-1760. [PMID: 27733415 PMCID: PMC5131826 DOI: 10.1101/gr.210757.116] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 05/31/2016] [Accepted: 10/06/2016] [Indexed: 12/02/2022]
Abstract
Alternative polyadenylation (APA), in which a transcript uses one of the poly(A) sites to define its 3'-end, is a common regulatory mechanism in eukaryotic gene expression. However, the potential of APA in determining crop agronomic traits remains elusive. This study systematically tallied poly(A) sites of 14 different rice tissues and developmental stages using the poly(A) tag sequencing (PAT-seq) approach. The results indicate significant involvement of APA in developmental and quantitative trait loci (QTL) gene expression. About 48% of all expressed genes use APA to generate transcriptomic and proteomic diversity. Some genes switch APA sites, allowing differentially expressed genes to use alternate 3' UTRs. Interestingly, APA in mature pollen is distinct where differential expression levels of a set of poly(A) factors and different distributions of APA sites are found, indicating a unique mRNA 3'-end formation regulation during gametophyte development. Equally interesting, statistical analyses showed that QTL tends to use APA for regulation of gene expression of many agronomic traits, suggesting a potential important role of APA in rice production. These results provide thus far the most comprehensive and high-resolution resource for advanced analysis of APA in crops and shed light on how APA is associated with trait formation in eukaryotes.
Affiliation(s)
- Haihui Fu
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China, 361102
- Dewei Yang
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China, 350018
- Wenyue Su
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China, 361102
- Liuyin Ma
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China, 361102
- Yingjia Shen
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China, 361102
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian, China, 361005
- Xinfu Ye
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China, 350018
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, Fujian, China, 361005
- Qingshun Q Li
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China, 361102
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China, 350018
- Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, California 91766, USA
7
Kim S, Cho CS, Han K, Lee J. Structural Variation of Alu Element and Human Disease. Genomics Inform 2016; 14:70-77. [PMID: 27729835 PMCID: PMC5056899 DOI: 10.5808/gi.2016.14.3.70] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 06/09/2016] [Revised: 08/09/2016] [Accepted: 08/10/2016] [Indexed: 01/04/2023] Open
Abstract
Transposable elements are one of the major sources of genomic instability, acting through various mechanisms including de novo insertion, insertion-mediated genomic deletion, and recombination-associated genomic deletion. Among them, the Alu element is the most abundant, composing ~10% of the human genome. The element emerged in the primate genome 65 million years ago and has since propagated successfully in the human and non-human primate genomes. The Alu element is a non-autonomous retrotransposon and is therefore retrotransposed using the L1 enzyme machinery. The 'master gene' model has been generally accepted to explain Alu element amplification in primate genomes. According to the model, different subfamilies of Alu elements are created by mutations in the master gene, and most Alu elements are amplified from the hyperactive master genes. Alu elements are frequently involved in genomic rearrangements in the human genome owing to their abundance and the sequence identity among copies. The genomic rearrangements caused by Alu elements could lead to genetic disorders such as hereditary disease, blood disorder, and neurological disorder. In fact, Alu elements are associated with approximately 0.1% of human genetic disorders. The first part of this review discusses mechanisms of Alu amplification and diversity among different Alu subfamilies. The second part discusses the particular role of Alu elements in generating genomic rearrangements as well as human genetic disorders.
Affiliation(s)
- Songmi Kim
- Department of Nanobiomedical Science, Dankook University, Cheonan 31116, Korea; BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Chun-Sung Cho
- Department of Neurosurgery, Dankook University College of Medicine, Cheonan 31116, Korea
- Kyudong Han
- Department of Nanobiomedical Science, Dankook University, Cheonan 31116, Korea; BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Jungnam Lee
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, University of Florida, Gainesville, FL 32610, USA
8
Wu X, Zeng Y, Guan J, Ji G, Huang R, Li QQ. Genome-wide characterization of intergenic polyadenylation sites redefines gene spaces in Arabidopsis thaliana. BMC Genomics 2015; 16:511. [PMID: 26155789 PMCID: PMC4568572 DOI: 10.1186/s12864-015-1691-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/06/2015] [Accepted: 06/05/2015] [Indexed: 12/22/2022] Open
Abstract
Background: Messenger RNA polyadenylation is an essential step for the maturation of most eukaryotic mRNAs. Accurate determination of poly(A) sites helps define the 3’-ends of genes, which is important for genome annotation and gene function research. Genomic studies have revealed the presence of poly(A) sites in intergenic regions, which may be attributed to 3’-UTR extensions and novel transcript units. However, there has been no systematic evaluation of intergenic poly(A) sites in plants. Results: Approximately 16,000 intergenic poly(A) site clusters (IPACs) in Arabidopsis thaliana were discovered and evaluated at the whole genome level. Based on the distributions of distance from IPACs to nearby sense and antisense genes, these IPACs were classified into three categories. About 70% of them were from previously unannotated 3’-UTR extensions to known genes, which would extend 6985 transcripts of the TAIR10 genome annotation beyond their 3’-ends, with a mean extension of 134 nucleotides. In total, 1317 IPACs originated from novel intergenic transcripts, 37 of which were likely to be associated with protein-coding transcripts. A further 2957 IPACs corresponded to antisense transcripts for genes on the reverse strand, which might affect 2265 protein-coding genes and 39 non-protein-coding genes, including long non-coding RNA genes. The remaining IPACs could have originated from transcriptional read-through or gene mis-annotations. Conclusions: The identified IPACs corresponding to novel transcripts, 3’-UTR extensions, and antisense transcription should be incorporated into the current Arabidopsis genome annotation. Comprehensive characterization of IPACs from this study provides insights into alternative polyadenylation and antisense transcription in plants.
Affiliation(s)
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, Fujian, China
- Yong Zeng
- Department of Automation, Xiamen University, Xiamen, Fujian, China
- Jinting Guan
- Department of Automation, Xiamen University, Xiamen, Fujian, China
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian, China; Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, Fujian, China
- Rongting Huang
- Department of Automation, Xiamen University, Xiamen, Fujian, China
- Qingshun Q Li
- Key Laboratory of the Ministry of Education on Coastal Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian, China; Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA, USA; Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China
9
Evaluation of two statistical methods provides insights into the complex patterns of alternative polyadenylation site switching. PLoS One 2015; 10:e0124324. [PMID: 25875641 PMCID: PMC4396989 DOI: 10.1371/journal.pone.0124324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/21/2014] [Accepted: 03/01/2015] [Indexed: 11/19/2022] Open
Abstract
Switching between different alternative polyadenylation (APA) sites plays an important role in the fine tuning of gene expression. New technologies for the execution of 3’-end enriched RNA-seq allow genome-wide detection of the genes that exhibit significant APA site switching between different samples. Here, we show that the independence test gives better results than the linear trend test in detecting APA site-switching events. Further examination suggests that the discrepancy between these two statistical methods arises from complex APA site-switching events that cannot be represented by a simple change of average 3’-UTR length. In theory, the linear trend test is only effective in detecting these simple changes. We classify the switching events into four switching patterns: two simple patterns (3’-UTR shortening and lengthening) and two complex patterns. By comparing the results of the two statistical methods, we show that complex patterns account for 1/4 of all observed switching events that happen between normal and cancerous human breast cell lines. Because simple and complex switching patterns may convey different biological meanings, they merit separate study. We therefore propose to combine both the independence test and the linear trend test in practice. First, the independence test should be used to detect APA site switching; second, the linear trend test should be invoked to identify simple switching events; and third, those complex switching events that pass independence testing but fail linear trend testing can be identified.
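The three-step procedure proposed in the abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the independence test is taken to be Pearson's chi-square test on a 2 x k table of read counts, the linear trend test is taken to be the correlation-based statistic M² = (n − 1)r² with site ranks as scores, and the function names, the 0.05 cutoff and the example counts are all hypothetical. Sites are assumed to be ordered from proximal to distal, and every site needs at least one read in total.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term = total = 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total


def independence_test(a, b):
    """Pearson chi-square test: is site usage independent of the sample?"""
    n_a, n_b = sum(a), sum(b)
    n = n_a + n_b
    stat = 0.0
    for x, y in zip(a, b):
        for obs, row in ((x, n_a), (y, n_b)):
            expected = row * (x + y) / n
            stat += (obs - expected) ** 2 / expected
    return stat, chi2_sf(stat, len(a) - 1)


def linear_trend_test(a, b):
    """Trend statistic M^2 = (n - 1) r^2; r > 0 means b favors distal sites."""
    n_a, n_b = sum(a), sum(b)
    n = n_a + n_b
    mean_u = n_b / n  # sample indicator: 0 for a, 1 for b
    mean_v = sum(j * (x + y) for j, (x, y) in enumerate(zip(a, b))) / n
    cov = var_v = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        cov += (j - mean_v) * ((0 - mean_u) * x + (1 - mean_u) * y)
        var_v += (j - mean_v) ** 2 * (x + y)
    var_u = n_a * mean_u ** 2 + n_b * (1 - mean_u) ** 2
    r = cov / math.sqrt(var_u * var_v)
    stat = (n - 1) * r * r
    return r, chi2_sf(stat, 1)


def classify(a, b, alpha=0.05):
    """Step 1: independence test; step 2: trend test; step 3: the rest."""
    if independence_test(a, b)[1] >= alpha:
        return "none"
    r, p_trend = linear_trend_test(a, b)
    if p_trend < alpha:
        return "lengthening" if r > 0 else "shortening"
    return "complex"
```

With the made-up counts `a = [40, 20, 40]` and `b = [20, 60, 20]`, usage moves from the outer sites to the middle one, so the mean 3'-UTR length is unchanged: the trend test sees nothing, the independence test does, and `classify` returns `"complex"`. The sketch lumps the two complex patterns named in the abstract into one label.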
10
Bioinformatics analysis of alternative polyadenylation in green alga Chlamydomonas reinhardtii using transcriptome sequences from three different sequencing platforms. G3-GENES GENOMES GENETICS 2014; 4:871-83. [PMID: 24626288 PMCID: PMC4025486 DOI: 10.1534/g3.114.010249] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/05/2022]
Abstract
Messenger RNA 3′-end formation is an essential posttranscriptional processing step for most eukaryotic genes. Unlike plants and animals, where AAUAAA and its variants are routinely found as the main poly(A) signal, Chlamydomonas reinhardtii uses UGUAA as the major poly(A) signal. The advance of sequencing technology provides an enormous amount of sequencing data for us to explore the variations of poly(A) signals, alternative polyadenylation (APA), and its relationship with splicing in this algal species. Through genome-wide analysis of poly(A) sites in C. reinhardtii, we identified a large number of poly(A) sites: 21,041 from Sanger expressed sequence tags, 88,184 from 454, and 195,266 from Illumina sequence reads. In comparison with previous collections, more new poly(A) sites are found in coding sequences and in intron and intergenic regions by deep sequencing. Interestingly, G-rich signals are particularly abundant in intron and intergenic regions. The prevalence of different poly(A) signals in coding sequences and 3′-untranslated regions implies potentially different polyadenylation mechanisms. Our data suggest that APA occurs in about 68% of C. reinhardtii genes. Using Gene Ontology analysis, we found that most of the APA genes are involved in RNA regulation and metabolic process, protein synthesis, hydrolase, and ligase activities. Moreover, intronic poly(A) sites are more abundant in constitutively spliced introns than in retained introns, suggesting an interplay between polyadenylation and splicing. Our results support that APA, as in higher eukaryotes, may play significant roles in increasing transcriptome diversity and gene expression regulation in this algal species. Our datasets also provide useful information for accurate annotation of transcript ends in C. reinhardtii.
11
Harrison BJ, Flight RM, Gomes C, Venkat G, Ellis SR, Sankar U, Twiss JL, Rouchka EC, Petruska JC. IB4-binding sensory neurons in the adult rat express a novel 3' UTR-extended isoform of CaMK4 that is associated with its localization to axons. J Comp Neurol 2014; 522:308-36. [PMID: 23817991 PMCID: PMC3855891 DOI: 10.1002/cne.23398] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 12/07/2012] [Revised: 06/13/2013] [Accepted: 06/19/2013] [Indexed: 01/22/2023]
Abstract
Calcium/calmodulin-dependent protein kinase 4 (gene and transcript: CaMK4; protein: CaMKIV) is the nuclear effector of the Ca2+/calmodulin kinase (CaMK) pathway where it coordinates transcriptional responses. However, CaMKIV is present in the cytoplasm and axons of subpopulations of neurons, including some sensory neurons of the dorsal root ganglia (DRG), suggesting an extranuclear role for this protein. We observed that CaMKIV was expressed strongly in the cytoplasm and axons of a subpopulation of small-diameter DRG neurons, most likely cutaneous nociceptors by virtue of their binding the isolectin IB4. In IB4+ spinal nerve axons, 20% of CaMKIV was colocalized with the endocytic marker Rab7 in axons that highly expressed CAM-kinase-kinase (CAMKK), an upstream activator of CaMKIV, suggesting a role for CaMKIV in signaling through signaling endosomes. Using fluorescent in situ hybridization (FISH) with riboprobes, we also observed that small-diameter neurons expressed high levels of a novel 3' untranslated region (UTR) variant of CaMK4 mRNA. Using rapid amplification of cDNA ends (RACE), reverse-transcription polymerase chain reaction (RT-PCR) with gene-specific primers, and cDNA sequencing analyses, we determined that the novel transcript contains an additional 10 kb beyond the annotated gene terminus to a highly conserved alternate polyadenylation site. Quantitative PCR (qPCR) analyses of fluorescence-activated cell-sorted (FACS) DRG neurons confirmed that this 3'-UTR-extended variant was preferentially expressed in IB4-binding neurons. Computational analyses of the 3'-UTR sequence predict that UTR extension introduces consensus sites for RNA-binding proteins (RBPs) including the embryonic lethal abnormal vision (ELAV)/Hu family proteins. We consider the possible implications of axonal CaMKIV in the context of the unique properties of IB4-binding DRG neurons.
Affiliation(s)
- Benjamin J. Harrison
- Anatomical Sciences and Neurobiology, University of Louisville, Louisville, Kentucky, 40202, USA
- Kentucky Spinal Cord Injury Research Center (KSCIRC), University of Louisville, Louisville, Kentucky, 40292, USA
- Robert M. Flight
- Anatomical Sciences and Neurobiology, University of Louisville, Louisville, Kentucky, 40202, USA
- Cynthia Gomes
- Department of Biochemistry and Molecular Biology, University of Louisville School of Medicine, Kentucky, 40202, USA
- Gayathri Venkat
- Anatomical Sciences and Neurobiology, University of Louisville, Louisville, Kentucky, 40202, USA
- Kentucky Spinal Cord Injury Research Center (KSCIRC), University of Louisville, Louisville, Kentucky, 40292, USA
- Steven R Ellis
- Department of Biochemistry and Molecular Biology, University of Louisville School of Medicine, Kentucky, 40202, USA
- Uma Sankar
- James Graham Brown Cancer Center, University of Louisville, Louisville, Kentucky, 40292, USA
- Owensboro Cancer Research Program, University of Louisville, Owensboro, KY 42303, USA
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, Kentucky, 40292, USA
- Jeffery L. Twiss
- Department of Biology, Drexel University, Philadelphia, Pennsylvania, 19104, USA
- Eric C. Rouchka
- Department of Computer Engineering and Computer Science, University of Louisville, Louisville, Kentucky, 40292, USA
- Jeffrey C. Petruska
- Anatomical Sciences and Neurobiology, University of Louisville, Louisville, Kentucky, 40202, USA
- Kentucky Spinal Cord Injury Research Center (KSCIRC), University of Louisville, Louisville, Kentucky, 40292, USA
- Department of Neurological Surgery, University of Louisville, Louisville, Kentucky, 40202, USA
12
An J, Zhu X, Wang H, Jin X. A dynamic interplay between alternative polyadenylation and microRNA regulation: implications for cancer (Review). Int J Oncol 2013; 43:995-1001. [PMID: 23913120 DOI: 10.3892/ijo.2013.2047] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 06/14/2013] [Accepted: 07/18/2013] [Indexed: 12/15/2022] Open
Abstract
Alternative polyadenylation and microRNA regulation are both mechanisms of post-transcriptional regulation of gene expression. Alternative polyadenylation often results in mRNA isoforms with the same coding sequence but different lengths of 3' UTRs, while microRNAs regulate gene expression by binding to specific mRNA 3' UTRs. In this sense, different isoforms of an mRNA may be differentially regulated by microRNAs, sometimes resulting in cellular proliferation and this mechanism is being speculated on as a potential cause for cancer development.
Affiliation(s)
- Jindan An
- Key Laboratory of Cancer Prevention and Treatment of Heilongjiang Province, Mudanjiang Medical University, Mudanjiang, P.R. China
13
Fox-Walsh K, Davis-Turak J, Zhou Y, Li H, Fu XD. A multiplex RNA-seq strategy to profile poly(A+) RNA: application to analysis of transcription response and 3' end formation. Genomics 2011; 98:266-71. [PMID: 21515359 DOI: 10.1016/j.ygeno.2011.04.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Received: 02/25/2011] [Revised: 04/05/2011] [Accepted: 04/08/2011] [Indexed: 11/26/2022]
Abstract
RNA-seq technologies are now replacing microarrays for profiling gene expression. Here we describe a robust RNA-seq strategy for multiplex analysis of RNA samples based on deep sequencing. First, an oligo-dT linked to an adaptor sequence is used to prime cDNA synthesis. Upon solid phase selection, second strand synthesis is initiated using a random primer linked to another adaptor sequence. Finally, the library is released from the beads and amplified using a bar-coded primer together with a common primer. This method, referred to as Multiplex Analysis of PolyA-linked Sequences (MAPS), preserves strand information, permits rapid identification of potentially new polyadenylation sites, and profiles gene expression in a highly cost effective manner. We have applied this technology to determine the transcriptome response to knockdown of the RNA binding protein TLS, and compared the result to current microarray technology, demonstrating the ability of MAPS to robustly detect regulated gene expression.
Affiliation(s)
- Kristi Fox-Walsh
- Department of Cellular and Molecular Medicine, University of California, San Diego, CA 92093-0651, USA
|
14
|
Ozsolak F, Kapranov P, Foissac S, Kim SW, Fishilevich E, Monaghan AP, John B, Milos PM. Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation. Cell 2011; 143:1018-29. [PMID: 21145465 DOI: 10.1016/j.cell.2010.11.020] [Citation(s) in RCA: 311] [Impact Index Per Article: 23.9] [Received: 05/26/2010] [Revised: 09/28/2010] [Accepted: 11/09/2010] [Indexed: 01/12/2023]
Abstract
The emerging discoveries on the link between polyadenylation and disease states underline the need to fully characterize genome-wide polyadenylation states. Here, we report comprehensive maps of global polyadenylation events in human and yeast generated using refinements to the Direct RNA Sequencing technology. This direct approach provides a quantitative view of genome-wide polyadenylation states in a strand-specific manner and requires only attomole RNA quantities. The polyadenylation profiles revealed an abundance of unannotated polyadenylation sites, alternative polyadenylation patterns, and regulatory element-associated poly(A)(+) RNAs. We observed differences in sequence composition surrounding canonical and noncanonical human polyadenylation sites, suggesting novel noncoding RNA-specific polyadenylation mechanisms in humans. Furthermore, we observed the correlation level between sense and antisense transcripts to depend on gene expression levels, supporting the view that overlapping transcription from opposite strands may play a regulatory role. Our data provide a comprehensive view of the polyadenylation state and overlapping transcription.
Affiliation(s)
- Fatih Ozsolak
- Helicos BioSciences Corporation, Cambridge, MA 02139, USA.
|
15
|
Poly(A) signals located near the 5' end of genes are silenced by a general mechanism that prevents premature 3'-end processing. Mol Cell Biol 2010; 31:639-51. [PMID: 21135120 DOI: 10.1128/mcb.00919-10] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/20/2022] Open
Abstract
Poly(A) signals located at the 3' end of eukaryotic genes drive cleavage and polyadenylation at the same end of pre-mRNA. Although these sequences are expected only at the 3' end of genes, we found that strong poly(A) signals are also predicted within the 5' untranslated regions (UTRs) of many Drosophila melanogaster mRNAs. Most of these 5' poly(A) signals have little influence on the processing of the endogenous transcripts, but they are very active when placed at the 3' end of reporter genes. In investigating these unexpected observations, we discovered that both these novel poly(A) signals and standard poly(A) signals become functionally silent when they are positioned close to transcription start sites in either Drosophila or human cells. This indicates that the stage when the poly(A) signal emerges from the polymerase II (Pol II) transcription complex determines whether a putative poly(A) signal is recognized as functional. The data suggest that this mechanism, which probably prevents cryptic poly(A) signals from causing premature transcription termination, depends on low Ser2 phosphorylation of the C-terminal domain of Pol II and inefficient recruitment of processing factors.
|
16
|
Akhtar MN, Bukhari SA, Fazal Z, Qamar R, Shahmuradov IA. POLYAR, a new computer program for prediction of poly(A) sites in human sequences. BMC Genomics 2010; 11:646. [PMID: 21092114 PMCID: PMC3053588 DOI: 10.1186/1471-2164-11-646] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Received: 03/24/2010] [Accepted: 11/19/2010] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND: mRNA polyadenylation is an essential step of pre-mRNA processing in eukaryotes. Accurate prediction of the pre-mRNA 3'-end cleavage/polyadenylation sites is important for defining the gene boundaries and understanding gene expression mechanisms. RESULTS: 28761 human mapped poly(A) sites have been classified into three classes containing different known forms of polyadenylation signal (PAS) or none of them (PAS-strong, PAS-weak and PAS-less, respectively) and a new computer program POLYAR for the prediction of poly(A) sites of each class was developed. In comparison with polya_svm (to date the most accurate computer program for prediction of poly(A) sites) while searching for PAS-strong poly(A) sites in human sequences, POLYAR had a significantly higher prediction sensitivity (80.8% versus 65.7%) and specificity (66.4% versus 51.7%). However, when a similar search was conducted for PAS-weak and PAS-less poly(A) sites, both programs had a very low prediction accuracy, which indicates that our knowledge about factors involved in the determination of the poly(A) sites is not sufficient to identify such polyadenylation regions. CONCLUSIONS: We present a new classification of polyadenylation sites into three classes and a novel computer program POLYAR for prediction of poly(A) sites/regions of each class. In tests, POLYAR shows high accuracy of prediction of the PAS-strong poly(A) sites; its efficiency in searching for PAS-weak and PAS-less poly(A) sites is not very high but is comparable to other available programs. These findings suggest that additional characteristics of such poly(A) sites remain to be elucidated. POLYAR program with a stand-alone version for downloading is available at http://cub.comsats.edu.pk/polyapredict.htm.
Affiliation(s)
- Malik Nadeem Akhtar
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan
|
17
|
Wang P, Yu P, Gao P, Shi T, Ma D. Discovery of novel human transcript variants by analysis of intronic single-block EST with polyadenylation site. BMC Genomics 2009; 10:518. [PMID: 19906316 PMCID: PMC2784480 DOI: 10.1186/1471-2164-10-518] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/18/2009] [Accepted: 11/12/2009] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND: Alternative polyadenylation sites within a gene can lead to alternative transcript variants. Although bioinformatic analysis has been conducted to detect polyadenylation sites using nucleic acid sequences (EST/mRNA) in the public databases, one special type, the single-block EST, has received much less attention. This bias leaves considerable room to discover novel transcript variants. RESULTS: In the present study, we identified novel transcript variants in the human genome by detecting intronic polyadenylation sites. Poly(A/T)-tailed ESTs were obtained from single-block ESTs and clustered into 10,844 groups standing for 5,670 genes. Most sites were not found in other alternative splicing databases. To verify that these sites are from expressed transcripts, we analyzed the supporting EST number of each site, blasted representative ESTs against known mRNA sequences, traced terminal sequences from cDNA clones, and compared with the data of Affymetrix tiling array. These analyses confirmed about 84% (9,118/10,844) of the novel alternative transcripts; in particular, 33% (3,575/10,844) of the transcripts, from 2,704 genes, were classified as high-reliability. Additionally, RT-PCR confirmed 38% (10/26) of predicted novel transcript variants. CONCLUSION: Our results provide evidence for novel transcript variants with intronic poly(A) sites. The expression of these novel variants was confirmed with computational and experimental tools. Our data provide a genome-wide resource for identification of novel human transcript variants with intronic polyadenylation sites, and offer a new view into the mystery of the human transcriptome.
Affiliation(s)
- Pingzhang Wang
- Chinese National Human Genome Center, #3-707 North YongChang Road BDA, Beijing, PR China.
|
18
|
Koscielny G, Texier VL, Gopalakrishnan C, Kumanduri V, Riethoven JJ, Nardone F, Stanley E, Fallsehr C, Hofmann O, Kull M, Harrington E, Boué S, Eyras E, Plass M, Lopez F, Ritchie W, Moucadel V, Ara T, Pospisil H, Herrmann A, Reich JG, Guigó R, Bork P, Doeberitz MVK, Vilo J, Hide W, Apweiler R, Thanaraj TA, Gautheret D. ASTD: The Alternative Splicing and Transcript Diversity database. Genomics 2009; 93:213-20. [DOI: 10.1016/j.ygeno.2008.11.003] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Received: 04/24/2008] [Revised: 11/03/2008] [Accepted: 11/05/2008] [Indexed: 10/21/2022]
|
19
|
Abstract
The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3' ends of transcripts essential for understanding their regulation. Here we show that 22-52% of sequences in commonly used human and murine "full-length" transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine- and 86,410 human-candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3' ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3-4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3' ends in existing collections, enumeration of potential alternative 3' polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
|
20
|
Chen C, Ara T, Gautheret D. Using Alu elements as polyadenylation sites: A case of retroposon exaptation. Mol Biol Evol 2008; 26:327-34. [PMID: 18984903 DOI: 10.1093/molbev/msn249] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Indexed: 12/20/2022] Open
Abstract
Of the 1.1 million Alu retroposons in the human genome, about 10,000 are inserted in the 3' untranslated regions (UTR) of protein-coding genes and 1% of these (107 events) are active as polyadenylation sites (PASs). Strikingly, although Alu's in 3' UTR are indifferently inserted in the forward or reverse direction, 99% of polyadenylation-active Alu sequences are forward oriented. Consensus Alu+ sequences contain sites that can give rise to polyadenylation signals and enhancers through a few point mutations. We found that the strand bias of polyadenylation-active Alu's reflects a radical difference in the fitness of sense and antisense Alu's toward cleavage/polyadenylation activity. In contrast to previous beliefs, Alu inserts do not necessarily represent weak or cryptic PASs; instead, they often constitute the major or the unique PAS in a gene, adding to the growing list of Alu exaptations. Finally, some Alu-borne PASs are intronic and produce truncated transcripts that may impact gene function and/or contribute to gene remodeling.
Affiliation(s)
- Chongjian Chen
- Institut de Génétique et Microbiologie, Université Paris, Orsay, France
|
21
|
Lee TM, Lipovich L. Structural differences of orthologous genes: insights from human-primate comparisons. Genomics 2008; 92:134-43. [PMID: 18606524 DOI: 10.1016/j.ygeno.2008.05.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/26/2007] [Revised: 04/16/2008] [Accepted: 05/02/2008] [Indexed: 01/15/2023]
Abstract
The genomic basis of phenotypic distinctions between humans and nonhuman primates remains insufficiently explained. We hypothesized that interspecies structural differences of orthologous genes can cause such distinctions and searched protein-coding genes conserved between humans and nonhuman primates for species-specific initial and terminal exons. We inferred gene structure differences from genomic locations where portions of primate transcripts aligned with the human genome outside of any human exons. Of 22,466 high-confidence FANTOM3 human transcriptional units, 7424 (33%) had nonhuman primate full-length cDNA support. One hundred eighty-three of the loci contained 68,424 bp of sequence exonic in nonhuman primates but not humans. Fifty-four of 183 included species-specific portions of protein-coding regions. Six genes had evidence of intergenic splicing in a nonhuman primate but not in human. It is imperative that primate transcriptome projects be accelerated on par with genome projects to understand better interspecies gene structure distinctions.
Affiliation(s)
- Tuan Meng Lee
- School of Computer Engineering, Nanyang Technological University, Singapore
|
22
|
Shen Y, Ji G, Haas BJ, Wu X, Zheng J, Reese GJ, Li QQ. Genome level analysis of rice mRNA 3'-end processing signals and alternative polyadenylation. Nucleic Acids Res 2008; 36:3150-61. [PMID: 18411206 PMCID: PMC2396415 DOI: 10.1093/nar/gkn158] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.3] [Received: 12/31/2007] [Revised: 03/18/2008] [Accepted: 03/19/2008] [Indexed: 12/24/2022] Open
Abstract
The position of a poly(A) site of eukaryotic mRNA is determined by sequence signals in pre-mRNA and a group of polyadenylation factors. To reveal rice poly(A) signals at a genome level, we constructed a dataset of 55 742 authenticated poly(A) sites and characterized the poly(A) signals. This resulted in identifying the typical tripartite cis-elements, including FUE, NUE and CE, as previously observed in Arabidopsis. The average size of the 3'-UTR was 289 nucleotides. When mapped to the genome, however, 15% of these poly(A) sites were found to be located in the currently annotated intergenic regions. Moreover, an extensive alternative polyadenylation profile was evident where 50% of the genes analyzed had more than one unique poly(A) site (excluding microheterogeneity sites), and 13% had four or more poly(A) sites. About 4% of the analyzed genes possessed alternative poly(A) sites at their introns, 5'-UTRs, or protein coding regions. The authenticity of these alternative poly(A) sites was partially confirmed using MPSS data. Analysis of nucleotide profile and signal patterns indicated that there may be a different set of poly(A) signals for those poly(A) sites found in the coding regions. Based on the features of rice poly(A) signals, an updated algorithm termed PASS-Rice was designed to predict poly(A) sites.
Affiliation(s)
- Yingjia Shen
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Guoli Ji
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Brian J. Haas
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Xiaohui Wu
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Jianti Zheng
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Greg J. Reese
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
- Qingshun Quinn Li
- Department of Botany, Miami University, Oxford, OH 45056, USA, Department of Automation, Xiamen University, Xiamen, Fujian, China 361005, The Genome Research Institute, Rockville, MD 20850 and IT Research Computing Support Group, Miami University, Oxford, OH 45056, USA
|
23
|
Thomas CP, Andrews JI, Liu KZ. Intronic polyadenylation signal sequences and alternate splicing generate human soluble Flt1 variants and regulate the abundance of soluble Flt1 in the placenta. FASEB J 2007; 21:3885-95. [PMID: 17615362 DOI: 10.1096/fj.07-8809com] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.6] [Indexed: 11/11/2022]
Abstract
The gene FLT1 produces at least two transcripts from a common transcription start site: full-length Flt1 contains 30 exons encoding a membrane-bound VEGF receptor; soluble Flt1 (sFlt1) shares the first 13 exons but utilizes poly(A) signal sequences within intron 13 to create a transcript that lacks downstream exons. To address the mechanisms that regulate human sFlt1, we mapped the 3' end of sFlt1 mRNA and defined the full extent of its 3' untranslated region (UTR). We identified a 3.2 Kb sFlt1 transcript that is cleaved within an alternatively spliced exon downstream of exon 14 and is predicted to encode a C-terminal variant of sFlt1 with an unusual polyserine tail. sFlt1 mRNA cleavage sites within intron 13 were identified in human placenta and in vascular endothelium by ribonuclease protection assay (RPA). A proximal and two distal mRNA cleavage sites were identified by RPA downstream of consensus polyadenylation signals that create variant transcripts with a 3' UTR ranging from 30 bases to approximately 4 Kb. Northern blot analysis and 3' rapid amplification of cDNA ends (RACE) in placenta confirmed the existence of distal intronic sFlt1 cleavage sites that give rise to an sFlt1 transcript of approximately 7 Kb. The identity of the distal signal sequences was then confirmed by mutagenesis of putative signal elements in a polyadenylation reporter assay. We demonstrate the heterogeneity of human sFlt1 that arises from alternate splicing and from alternative polyadenylation directed by strong intronic poly(A) signal sequences, leading to C-terminal variants and to an sFlt1 transcript with a large 3' UTR containing several AU-rich elements and poly(U) regions that may regulate mRNA stability.
Affiliation(s)
- Christie P Thomas
- Department of Internal Medicine and Graduate Program in Molecular and Cellular Biology, E300 GH, University of Iowa Carver College of Medicine, 200 Hawkins Dr., Iowa City, IA 52242-1081 USA.
|
24
|
Moucadel V, Lopez F, Ara T, Benech P, Gautheret D. Beyond the 3' end: experimental validation of extended transcript isoforms. Nucleic Acids Res 2007; 35:1947-57. [PMID: 17339231 PMCID: PMC1874610 DOI: 10.1093/nar/gkm062] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022] Open
Abstract
High-throughput EST and full-length cDNA sequencing have revealed extensive variations at the 3' ends of mammalian transcripts. Whether all of these changes are biologically meaningful has been the subject of controversy, as such results may reflect in part transcription or polyadenylation leakage. We selected here a set of tandem poly(A) sites predicted from EST/cDNA sequence analysis that (i) are conserved between human and mouse, (ii) produce alternative 3' isoforms with unusual size features and (iii) are not documented in current genome databases, and we submitted these sites to experimental validation in mouse tissues. Out of 86 tested poly(A) sites from 44 genes, 84 were individually confirmed using a specially devised RT-PCR strategy. We then focused on validating the exon structure between distant tandem poly(A) sites separated by over 3 kb, and between stop codons and alternative poly(A) sites located at 4.5 kb or more, using a long-distance RT-PCR strategy. In most cases, long transcripts spanning the whole poly(A)-poly(A) or stop-poly(A) distance were detected, confirming that tandem sites were part of the same transcription unit. Given the apparent conservation of these long alternative 3' ends, different regulatory functions can be foreseen, depending on the location where transcription starts.
Affiliation(s)
- Daniel Gautheret
- *To whom correspondence should be addressed. 33 (0)1 69 15 46 32; 33 (0)1 69 15 46 29
|