1
Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk is a deep learning-based algorithm that decodes alternative polyadenylation dynamics from bulk RNA-seq data. CELL REPORTS METHODS 2024; 4:100707. [PMID: 38325383 PMCID: PMC10921021 DOI: 10.1016/j.crmeth.2024.100707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 04/13/2023] [Accepted: 01/11/2024] [Indexed: 02/09/2024]
Abstract
Alternative polyadenylation (APA) is a key post-transcriptional regulatory mechanism; yet, its regulation and impact on human diseases remain understudied. Existing bulk RNA sequencing (RNA-seq)-based APA methods predominantly rely on predefined annotations, severely impacting their ability to decode novel tissue- and disease-specific APA changes. Furthermore, they only account for the most proximal and distal cleavage and polyadenylation sites (C/PASs). Deconvoluting overlapping C/PASs and the inherent noisy 3' UTR coverage in bulk RNA-seq data pose additional challenges. To overcome these limitations, we introduce PolyAMiner-Bulk, an attention-based deep learning algorithm that accurately recapitulates C/PAS sequence grammar, resolves overlapping C/PASs, captures non-proximal-to-distal APA changes, and generates visualizations to illustrate APA dynamics. Evaluation on multiple datasets strongly evinces the performance merit of PolyAMiner-Bulk, accurately identifying more APA changes compared with other methods. With the growing importance of APA and the abundance of bulk RNA-seq data, PolyAMiner-Bulk establishes a robust paradigm of APA analysis.
Affiliation(s)
- Venkata Soumith Jonnakuti
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX 77030, USA; Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
- Eric J Wagner
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Mirjana Maletić-Savatić
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; USDA/ARS Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA.
2
Polenkowski M, Allister AB, Burbano de Lara S, Soltau M, Kendre G, Tran DDH. Mapping alternative polyadenylation in human cells using direct RNA sequencing technology. STAR Protoc 2023; 4:102420. [PMID: 37432858 PMCID: PMC10362186 DOI: 10.1016/j.xpro.2023.102420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/19/2023] [Accepted: 06/08/2023] [Indexed: 07/13/2023]
Abstract
Alternative cleavage and polyadenylation (APA) is a widespread mechanism to generate mRNA isoforms with alternative 3' untranslated regions. Here, we detail a protocol for detecting APA genome wide using direct RNA sequencing technology including computational analysis. We describe steps for RNA sample and library preparation, nanopore sequencing, and data analysis. Experiments and data analysis can be performed over a period of 6-8 days and require molecular biology and bioinformatics skills. For complete details on the use and execution of this protocol, please refer to Polenkowski et al.1.
Affiliation(s)
- Mareike Polenkowski
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30623 Hannover, Germany
- Madleen Soltau
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30623 Hannover, Germany
- Gajanan Kendre
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30623 Hannover, Germany; Department of Life Science, National Institute of Technology, Rourkela, Odisha 769008, India
- Doan Duy Hai Tran
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30623 Hannover, Germany.
3
Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.23.523471. [PMID: 36747700 PMCID: PMC9900750 DOI: 10.1101/2023.01.23.523471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3' untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3'UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that was identified as being involved in scleroderma pathogenesis in an independent study. Lastly, we used PolyAMiner-Bulk to analyze the RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer's Disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data.
Affiliation(s)
- Venkata Soumith Jonnakuti
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric J. Wagner
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Mirjana Maletić-Savatić
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
4
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
5
Comprehensive mapping of alternative polyadenylation site usage and its dynamics at single-cell resolution. Proc Natl Acad Sci U S A 2022; 119:e2113504119. [PMID: 36454750 PMCID: PMC9894249 DOI: 10.1073/pnas.2113504119] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022]
Abstract
Alternative polyadenylation (APA) plays an important role in posttranscriptional gene regulation such as transcript stability and translation efficiency. However, our knowledge about APA dynamics at the single-cell level is largely unexplored. Here, we developed single-cell polyadenylation sequencing, a strand-specific approach for sequencing the 3' end of transcripts, to investigate the landscape of APA at the single-cell level. By analyzing several cell lines, we found many genes using multiple polyA sites in bulk data are prone to use only one polyA site in each single cell. Interestingly, cell cycle genes were significantly enriched in genes with high variation in polyA site usages. Furthermore, the 414 genes showing a polyA site usage switch after cell synchronization enriched cell cycle genes, while the differentially expressed genes after cell synchronization did not enrich cell cycle genes. We further identified 812 genes showing polyA site usage changes between neighboring cell cycles, which were grouped into six clusters, with cell phase-specific functional categories enriched in each cluster. Deletion of one polyA site in MSL1 and SCCPDH results in slower and faster cell cycle progression, respectively, supporting polyA site usage switch played an important role in cell cycle. These results indicate that APA is an important layer for cell cycle regulation.
6
Svoboda M, Frost HR, Bosco G. Internal oligo(dT) priming introduces systematic bias in bulk and single-cell RNA sequencing count data. NAR Genom Bioinform 2022; 4:lqac035. [PMID: 35651651 PMCID: PMC9142200 DOI: 10.1093/nargab/lqac035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/09/2021] [Revised: 03/25/2022] [Accepted: 04/29/2022] [Indexed: 12/28/2022]
Abstract
Significant advances in RNA sequencing have been recently made possible by using oligo(dT) primers for simultaneous mRNA enrichment and reverse transcription priming. The associated increase in efficiency has enabled more economical bulk RNA sequencing methods and the advent of high-throughput single-cell RNA sequencing, already one of the most widely adopted methods in transcriptomics. However, the effects of off-target oligo(dT) priming on gene expression quantification have not been appreciated. In the present study, we describe the extent, the possible causes, and the consequences of internal oligo(dT) priming across multiple public datasets obtained from various bulk and single-cell RNA sequencing platforms. To explore and address this issue, we developed a computational algorithm for RNA counting methods, which identifies the sequencing read alignments that likely resulted from internal oligo(dT) priming and removes them from the data. Directly comparing filtered datasets to those obtained by an alternative method reveals significant improvements in gene expression measurement. Finally, we infer a list of human genes whose expression quantification is most likely to be affected by internal oligo(dT) priming and predict that when measured using these methods, the expression of most genes may be inflated by at least 10% whereby some genes are affected more than others.
Affiliation(s)
- H Robert Frost
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
- Giovanni Bosco
- Correspondence may also be addressed to Giovanni Bosco. Tel: +1 603 650 1210; Fax: +1 603 650 1188;
7
Yang X, Tong Y, Liu G, Yuan J, Yang Y. scAPAatlas: an atlas of alternative polyadenylation across cell types in human and mouse. Nucleic Acids Res 2021; 50:D356-D364. [PMID: 34643729 PMCID: PMC8728290 DOI: 10.1093/nar/gkab917] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/14/2021] [Revised: 09/15/2021] [Accepted: 09/25/2021] [Indexed: 12/22/2022]
Abstract
Alternative polyadenylation (APA) has been widely recognized as a crucial step during the post-transcriptional regulation of eukaryotic genes. Recent studies have demonstrated that APA exerts key regulatory roles in many biological processes and often occurs in a tissue- and cell-type-specific manner. However, to our knowledge, there is no database incorporating information about APA at the cell-type level. Single-cell RNA-seq is a rapidly evolving and powerful tool that enables APA analysis at the cell-type level. Here, we present a comprehensive resource, scAPAatlas (http://www.bioailab.com:3838/scAPAatlas), for exploring APA across different cell types, and interpreting potential biological functions. Based on the curated scRNA-seq data from 24 human and 25 mouse normal tissues, we systematically identified cell-type-specific APA events for different cell types and examined the correlations between APA and gene expression level. We also estimated the crosstalk between cell-type-specific APA events and microRNAs or RNA-binding proteins. A user-friendly web interface has been constructed to support browsing, searching and visualizing multi-layer information of cell-type-specific APA events. Overall, scAPAatlas, incorporating a rich resource for exploration of APA at the cell-type level, will greatly help researchers chart cell type with APA and elucidate the biological functions of APA.
Affiliation(s)
- Xiaoxiao Yang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, Tianjin Key Laboratory of Medical Epigenetics, Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China; Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Yang Tong
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, Tianjin Key Laboratory of Medical Epigenetics, Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China; Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Gerui Liu
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, Tianjin Key Laboratory of Medical Epigenetics, Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China; Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Jiapei Yuan
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin 300020, China
- Yang Yang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, Tianjin Key Laboratory of Medical Epigenetics, Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China; Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
8
Li L, Huang KL, Gao Y, Cui Y, Wang G, Elrod ND, Li Y, Chen YE, Ji P, Peng F, Russell WK, Wagner EJ, Li W. An atlas of alternative polyadenylation quantitative trait loci contributing to complex trait and disease heritability. Nat Genet 2021; 53:994-1005. [PMID: 33986536 DOI: 10.1038/s41588-021-00864-5] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 01/30/2020] [Accepted: 04/05/2021] [Indexed: 12/14/2022]
Abstract
Genome-wide association studies have identified thousands of noncoding variants associated with human traits and diseases. However, the functional interpretation of these variants is a major challenge. Here, we constructed a multi-tissue atlas of human 3'UTR alternative polyadenylation (APA) quantitative trait loci (3'aQTLs), containing approximately 0.4 million common genetic variants associated with the APA of target genes, identified in 46 tissues isolated from 467 individuals (Genotype-Tissue Expression Project). Mechanistically, 3'aQTLs can alter poly(A) motifs, RNA secondary structure and RNA-binding protein-binding sites, leading to thousands of APA changes. Our CRISPR-based experiments indicate that such 3'aQTLs can alter APA regulation. Furthermore, we demonstrate that mapping 3'aQTLs can identify APA regulators, such as La-related protein 4. Finally, 3'aQTLs are colocalized with approximately 16.1% of trait-associated variants and are largely distinct from other QTLs, such as expression QTLs. Together, our findings show that 3'aQTLs contribute substantially to the molecular mechanisms underlying human complex traits and diseases.
Affiliation(s)
- Lei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, USA
- Kai-Lieh Huang
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, USA
- Yipeng Gao
- Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX, USA
- Ya Cui
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, USA
- Gao Wang
- The Gertrude H. Sergievsky Center and Department of Neurology, Columbia University, New York, NY, USA
- Nathan D Elrod
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, USA
- Yumei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, USA
- Yiling Elaine Chen
- Department of Statistics, University of California, Los Angeles, CA, USA
- Ping Ji
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, USA
- Fanglue Peng
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- William K Russell
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, USA
- Eric J Wagner
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, USA.
- Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, USA.
9
Yalamanchili HK, Elrod ND, Jensen MK, Ji P, Lin A, Wagner EJ, Liu Z. A computational pipeline to infer alternative poly-adenylation from 3' sequencing data. Methods Enzymol 2021; 655:185-204. [PMID: 34183121 PMCID: PMC10866047 DOI: 10.1016/bs.mie.2021.04.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/11/2022]
Abstract
An increasing number of investigations have established alternative polyadenylation (APA) as a key mechanism of gene regulation through altering the length of 3' untranslated region (UTR) and generating distinct mRNA termini. Further, appreciation for the significance of APA in disease contexts propelled the development of several 3' sequencing techniques. While these RNA sequencing technologies have advanced APA analysis, the intrinsic limitation of 3' read coverage and lack of appropriate computational tools constrain precise mapping and quantification of polyadenylation sites. Notably, Poly(A)-ClickSeq (PAC-seq) overcomes limiting factors such as poly(A) enrichment and 3' linker ligation steps using click-chemistry. Here we provide an updated PolyA-miner protocol, a computational approach to analyze PAC-seq or other 3'-Seq datasets. As a key practical constraint, we also provide a detailed account on the impact of sequencing depth on the number of detected polyadenylation sites and APA changes. This protocol is also updated to handle unique molecular identifiers used to address PCR duplication potentially observed in PAC-seq.
Affiliation(s)
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, United States; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, United States; USDA/ARS Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX, United States
- Nathan D Elrod
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, United States
- Madeline K Jensen
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, United States
- Ping Ji
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, United States
- Ai Lin
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, United States; Department of Etiology and Carcinogenesis, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eric J Wagner
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, United States
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, United States; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, United States.
10
Jeschke J, Collignon E, Al Wardi C, Krayem M, Bizet M, Jia Y, Garaud S, Wimana Z, Calonne E, Hassabi B, Morandini R, Deplus R, Putmans P, Dube G, Singh NK, Koch A, Shostak K, Rizzotto L, Ross RL, Desmedt C, Bareche Y, Rothé F, Lehmann-Che J, Duterque-Coquillaud M, Leroy X, Menschaert G, Teixeira L, Guo M, Limbach PA, Close P, Chariot A, Leucci E, Ghanem G, Yuan BF, Willard-Gallo K, Sotiriou C, Marine JC, Fuks F. Downregulation of the FTO m6A RNA demethylase promotes EMT-mediated progression of epithelial tumors and sensitivity to Wnt inhibitors. NATURE CANCER 2021; 2:611-628. [PMID: 35121941 PMCID: PMC10734094 DOI: 10.1038/s43018-021-00223-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/12/2019] [Accepted: 05/17/2021] [Indexed: 02/05/2023]
Abstract
Post-transcriptional modifications of RNA constitute an emerging regulatory layer of gene expression. The demethylase fat mass- and obesity-associated protein (FTO), an eraser of N6-methyladenosine (m6A), has been shown to play a role in cancer, but its contribution to tumor progression and the underlying mechanisms remain unclear. Here, we report widespread FTO downregulation in epithelial cancers associated with increased invasion, metastasis and worse clinical outcome. Both in vitro and in vivo, FTO silencing promotes cancer growth, cell motility and invasion. In human-derived tumor xenografts (PDXs), FTO pharmacological inhibition favors tumorigenesis. Mechanistically, we demonstrate that FTO depletion elicits an epithelial-to-mesenchymal transition (EMT) program through increased m6A and altered 3'-end processing of key mRNAs along the Wnt signaling cascade. Accordingly, FTO knockdown acts via EMT to sensitize mouse xenografts to Wnt inhibition. We thus identify FTO as a key regulator, across epithelial cancers, of Wnt-triggered EMT and tumor progression and reveal a therapeutically exploitable vulnerability of FTO-low tumors.
Affiliation(s)
- Jana Jeschke
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Evelyne Collignon
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Clémence Al Wardi
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Mohammad Krayem
- Laboratory of Oncology and Experimental Surgery, Institut Jules Bordet, ULB, Brussels, Belgium
- Martin Bizet
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Yan Jia
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Department of Breast Oncology, Tianjin Medical University Cancer Institute and Hospital, Tianjin, China
- Soizic Garaud
- Molecular Immunology Laboratory, Institut Jules Bordet, ULB, Brussels, Belgium
- Zéna Wimana
- Laboratory of Oncology and Experimental Surgery, Institut Jules Bordet, ULB, Brussels, Belgium
- Department of Nuclear Medicine, Institut Jules Bordet, ULB, Brussels, Belgium
- Emilie Calonne
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Bouchra Hassabi
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Renato Morandini
- Laboratory of Oncology and Experimental Surgery, Institut Jules Bordet, ULB, Brussels, Belgium
- Rachel Deplus
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Pascale Putmans
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Gaurav Dube
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Nitesh Kumar Singh
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium
- Alexander Koch
- Department of Pathology, Maastricht UMC, Maastricht, the Netherlands
- Kateryna Shostak
- Laboratory of Medical Chemistry, GIGA Stem Cells, University of Liège, Liège, Belgium
- Lara Rizzotto
- Trace, LKI Leuven Cancer Institute, KU Leuven, Leuven, Belgium
- Robert L Ross
- Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, Cincinnati, OH, USA
- Christine Desmedt
- Breast Cancer Translational Research Laboratory, Institut Jules Bordet, U-CRC, ULB, Brussels, Belgium
- Yacine Bareche
- Breast Cancer Translational Research Laboratory, Institut Jules Bordet, U-CRC, ULB, Brussels, Belgium
- Françoise Rothé
- Breast Cancer Translational Research Laboratory, Institut Jules Bordet, U-CRC, ULB, Brussels, Belgium
- Jacqueline Lehmann-Che
- Pathophysiology of Breast Cancer Team, Université de Paris, INSERM U976, HIPI, Paris, France
- Breast Disease Unit and Molecular Oncology Unit, AP-HP, Hôpital Saint-Louis, Paris, France
| | - Martine Duterque-Coquillaud
- Université Lille, CNRS, Inserm, CHU Lille, Institut Pasteur de Lille, UMR9020-UMR-S 1277, CANTHER, Lille, France
| | - Xavier Leroy
- Université Lille, CNRS, Inserm, CHU Lille, Institut Pasteur de Lille, UMR9020-UMR-S 1277, CANTHER, Lille, France
- Department of Pathology, CHU Lille, Université Lille, Lille, France
| | - Gerben Menschaert
- Biobix, Laboratory of Bioinformatics and Computational Genomics, Ghent University, Ghent, Belgium
| | - Luis Teixeira
- Pathophysiology of Breast Cancer Team, Université de Paris, INSERM U976, HIPI, Paris, France
- Breast Disease Unit and Molecular Oncology Unit, AP-HP, Hôpital Saint-Louis, Paris, France
| | - Mingzhou Guo
- Department of Gastroenterology & Hepatology, Chinese PLA General Hospital, Beijing, China
| | - Patrick A Limbach
- Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, Cincinnati, OH, USA
| | - Pierre Close
- Laboratory of Cancer Signaling, GIGA Stem Cells, University of Liège, Liège, Belgium
- WELBIO, University of Liège, Liège, Belgium
| | - Alain Chariot
- Laboratory of Medical Chemistry, GIGA Stem Cells, University of Liège, Liège, Belgium
- WELBIO, University of Liège, Liège, Belgium
| | - Eleonora Leucci
- Trace, LKI Leuven Cancer Institute, KU Leuven, Leuven, Belgium
- Laboratory of RNA Cancer Biology, Department of Oncology, LKI, KU Leuven, Leuven, Belgium
| | - Ghanem Ghanem
- Laboratory of Oncology and Experimental Surgery, Institut Jules Bordet, ULB, Brussels, Belgium
| | - Bi-Feng Yuan
- College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, China
| | - Karen Willard-Gallo
- Molecular Immunology Laboratory, Institut Jules Bordet, ULB, Brussels, Belgium
| | - Christos Sotiriou
- Breast Cancer Translational Research Laboratory, Institut Jules Bordet, U-CRC, ULB, Brussels, Belgium
| | - Jean-Christophe Marine
- Laboratory for Molecular Cancer Biology, VIB, KU Leuven, Leuven, Belgium
- Laboratory for Molecular Cancer Biology, Department of Oncology, KU Leuven, Leuven, Belgium
| | - François Fuks
- Laboratory of Cancer Epigenetics, Faculty of Medicine, ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles (ULB), Brussels, Belgium.
- WELBIO, Université Libre de Bruxelles (ULB), Brussels, Belgium.
|
11
|
Kim S, Bai Y, Fan Z, Diergaarde B, Tseng GC, Park HJ. The microRNA target site landscape is a novel molecular feature associating alternative polyadenylation with immune evasion activity in breast cancer. Brief Bioinform 2021; 22:bbaa191. [PMID: 32844230 PMCID: PMC8138879 DOI: 10.1093/bib/bbaa191] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/21/2020] [Revised: 07/10/2020] [Accepted: 07/28/2020] [Indexed: 12/14/2022]
Abstract
Alternative polyadenylation (APA) in breast tumor samples results in the removal/addition of cis-regulatory elements such as microRNA (miRNA) target sites in the 3'-untranslated regions (3'-UTRs) of genes. Although previous computational APA studies focused on a subset of genes strongly affected by APA (APA genes), we identify miRNAs for which widespread APA events collectively increase or decrease the number of target sites [probabilistic inference of microRNA target site modification through APA (PRIMATA-APA)]. Using PRIMATA-APA on The Cancer Genome Atlas (TCGA) breast cancer data, we found that the global APA events change the number of target sites of particular microRNAs [target sites modified miRNA (tamoMiRNA)] enriched for cancer development and treatments. We also found that when knockdown (KD) of NUDT21 in HeLa cells induces a different set of widespread 3'-UTR shortening events than that seen in the TCGA breast cancer data, it changes the target sites of the common tamoMiRNAs. Since the NUDT21 KD experiment previously demonstrated the tumorigenic role of APA events in a miRNA-dependent fashion, this result suggests that APA-initiated tumorigenesis is attributable to the miRNA target site changes, not the APA events themselves. Further, we found that the miRNA target site changes identify tumor cell proliferation and immune cell infiltration into the tumor microenvironment better than the miRNA expression levels or the APA events themselves. Altogether, our computational analyses provide a proof-of-concept demonstration that the miRNA target site information indicates the effect of global APA events, with potential as a predictive biomarker.
Affiliation(s)
- Soyeon Kim
- Department of Pediatrics, University of Pittsburgh Medical Center, and Division of Pulmonary Medicine, Children’s Hospital of Pittsburgh of UPMC
| | - YuLong Bai
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
| | - Zhenjiang Fan
- Department of Computer Science, University of Pittsburgh
| | - Brenda Diergaarde
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
| | - George C Tseng
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh
| | - Hyun Jung Park
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
|
12
|
Kandhari N, Kraupner-Taylor CA, Harrison PF, Powell DR, Beilharz TH. The Detection and Bioinformatic Analysis of Alternative 3' UTR Isoforms as Potential Cancer Biomarkers. Int J Mol Sci 2021; 22:5322. [PMID: 34070203 PMCID: PMC8158509 DOI: 10.3390/ijms22105322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2021] [Revised: 05/06/2021] [Accepted: 05/06/2021] [Indexed: 12/17/2022]
Abstract
Alternative transcript cleavage and polyadenylation is linked to cancer cell transformation, proliferation and outcome. This has led researchers to develop methods to detect and bioinformatically analyse alternative polyadenylation as potential cancer biomarkers. If incorporated into standard prognostic measures such as gene expression and clinical parameters, these could advance cancer prognostic testing and possibly guide therapy. In this review, we focus on the existing methodologies, both experimental and computational, that have been applied to support the use of alternative polyadenylation as cancer biomarkers.
Affiliation(s)
- Nitika Kandhari
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Calvin A. Kraupner-Taylor
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Paul F. Harrison
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
| | - David R. Powell
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
| | - Traude H. Beilharz
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
|
13
|
Andreassi C, Luisier R, Crerar H, Darsinou M, Blokzijl-Franke S, Lenn T, Luscombe NM, Cuda G, Gaspari M, Saiardi A, Riccio A. Cytoplasmic cleavage of IMPA1 3' UTR is necessary for maintaining axon integrity. Cell Rep 2021; 34:108778. [PMID: 33626357 PMCID: PMC7918530 DOI: 10.1016/j.celrep.2021.108778] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 04/23/2020] [Revised: 12/22/2020] [Accepted: 01/29/2021] [Indexed: 12/31/2022]
Abstract
The 3' untranslated regions (3' UTRs) of messenger RNAs (mRNAs) are non-coding sequences involved in many aspects of mRNA metabolism, including intracellular localization and translation. Incorrect processing and delivery of mRNA cause severe developmental defects and have been implicated in many neurological disorders. Here, we use deep sequencing to show that in sympathetic neuron axons, the 3' UTRs of many transcripts undergo cleavage, generating isoforms that express the coding sequence with a short 3' UTR and stable 3' UTR-derived fragments of unknown function. Cleavage of the long 3' UTR of Inositol Monophosphatase 1 (IMPA1) mediated by a protein complex containing the endonuclease argonaute 2 (Ago2) generates a translatable isoform that is necessary for maintaining the integrity of sympathetic neuron axons. Thus, our study provides a mechanism of mRNA metabolism that simultaneously regulates local protein synthesis and generates an additional class of 3' UTR-derived RNAs.
Affiliation(s)
- Catia Andreassi
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | | | - Hamish Crerar
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | - Marousa Darsinou
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | - Sasja Blokzijl-Franke
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | - Tchern Lenn
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | - Nicholas M Luscombe
- Francis Crick Institute, London NW1 1AT, UK; UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Giovanni Cuda
- Research Centre for Advanced Biochemistry and Molecular Biology, Department of Experimental and Clinical Medicine, Magna Graecia University of Catanzaro, Catanzaro 88100, Italy
| | - Marco Gaspari
- Research Centre for Advanced Biochemistry and Molecular Biology, Department of Experimental and Clinical Medicine, Magna Graecia University of Catanzaro, Catanzaro 88100, Italy
| | - Adolfo Saiardi
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK
| | - Antonella Riccio
- MRC Laboratory for Molecular Cell Biology, University College London, London WC1E 6BT, UK.
|
14
|
Jin W, Zhu Q, Yang Y, Yang W, Wang D, Yang J, Niu X, Yu D, Gong J. Animal-APAdb: a comprehensive animal alternative polyadenylation database. Nucleic Acids Res 2021; 49:D47-D54. [PMID: 32986825 PMCID: PMC7779049 DOI: 10.1093/nar/gkaa778] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/01/2020] [Revised: 08/27/2020] [Accepted: 09/08/2020] [Indexed: 12/31/2022]
Abstract
Alternative polyadenylation (APA) is an important post-transcriptional regulatory mechanism that recognizes different polyadenylation signals on transcripts, resulting in transcripts with different lengths of 3′ untranslated regions and thereby influencing a series of biological processes. Recent studies have highlighted the important roles of APA in humans. However, APA profiles in other animals have not been fully characterized, and there is no database that provides comprehensive APA information for animals other than humans. Here, by using the RNA sequencing data collected from public databases, we systematically characterized the APA profiles in 9244 samples of 18 species. In total, across these 18 species, we identified 342 952 APA events with a median of 17 020 per species using the DaPars2 algorithm, and 315 691 APA events with a median of 17 953 per species using the QAPA algorithm. In addition, we predicted the polyadenylation sites (PAS) and motifs near PAS of these species. We further developed Animal-APAdb, a user-friendly database (http://gong_lab.hzau.edu.cn/Animal-APAdb/) for data searching, browsing and downloading. With comprehensive information on APA events in different tissues of different species, Animal-APAdb may greatly facilitate the exploration of animal APA patterns and novel mechanisms, gene expression regulation and APA evolution across tissues and species.
Affiliation(s)
- Weiwei Jin
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Qizhao Zhu
- College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, P.R. China
| | - Yanbo Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Wenqian Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Dongyang Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Jiajun Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Xiaohui Niu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Debing Yu
- College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, P.R. China
| | - Jing Gong
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China; College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, P.R. China
|
15
|
Tu M, Li Y. Profiling Alternative 3' Untranslated Regions in Sorghum using RNA-seq Data. Front Genet 2020; 11:556749. [PMID: 33193635 PMCID: PMC7649775 DOI: 10.3389/fgene.2020.556749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/28/2020] [Accepted: 09/30/2020] [Indexed: 12/18/2022]
Abstract
Sorghum is an important crop widely used for food, feed, and fuel. Transcriptome-wide studies of 3′ untranslated regions (3′UTRs) using regular RNA-seq remain scarce in sorghum, although transcriptomes have been characterized extensively using Illumina short-read sequencing platforms for many sorghum varieties under various conditions or developmental contexts. The 3′UTR is a critical regulatory component of genes, controlling the translation, transport, and stability of messenger RNAs. In the present study, we profiled the alternative 3′UTRs at the transcriptome level in three genetically related but phenotypically contrasting lines of sorghum: Rio, BTx406, and R9188. A total of 1,197 transcripts with alternative 3′UTRs were detected using RNA-seq data. Their categorization identified 612 high-confidence alternative 3′UTRs. Importantly, the high-confidence alternative 3′UTR genes significantly overlapped with the gene sets that are associated with RNA N6-methyladenosine (m6A) modification, suggesting a clear association between alternative 3′UTRs and m6A methylation in sorghum. Moreover, taking advantage of sorghum genetics, we provided evidence of genotype specificity of alternative 3′UTR usage. In summary, our work exemplifies transcriptome-wide profiling of alternative 3′UTRs using regular RNA-seq data in non-model crops and provides insights into alternative 3′UTRs and their genotype specificity.
Affiliation(s)
- Min Tu
- Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
| | - Yin Li
- Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
|
16
|
Yalamanchili HK, Alcott CE, Ji P, Wagner EJ, Zoghbi HY, Liu Z. PolyA-miner: accurate assessment of differential alternative poly-adenylation from 3'Seq data using vector projections and non-negative matrix factorization. Nucleic Acids Res 2020; 48:e69. [PMID: 32463457 PMCID: PMC7337927 DOI: 10.1093/nar/gkaa398] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/09/2019] [Revised: 04/05/2020] [Accepted: 05/04/2020] [Indexed: 12/23/2022]
Abstract
Almost 70% of human genes undergo alternative polyadenylation (APA) and generate mRNA transcripts with varying lengths, typically of the 3′ untranslated regions (UTR). APA plays an important role in development and cellular differentiation, and its dysregulation can cause neuropsychiatric diseases and increase cancer severity. Increasing awareness of APA’s role in human health and disease has propelled the development of several 3′ sequencing (3′Seq) techniques that allow for precise identification of APA sites. However, despite the recent data explosion, there are no robust computational tools that are precisely designed to analyze 3′Seq data. Analytical approaches that have been used to analyze these data predominantly use proximal to distal usage. With about 50% of human genes having more than two APA isoforms, current methods fail to capture the entirety of APA changes and do not account for non-proximal to non-distal changes. Addressing these key challenges, this study demonstrates PolyA-miner, an algorithm to accurately detect and assess differential alternative polyadenylation specifically from 3′Seq data. Genes are abstracted as APA matrices, and differential APA usage is inferred using iterative consensus non-negative matrix factorization (NMF) based clustering. PolyA-miner accounts for all non-proximal to non-distal APA switches using vector projections and reflects precise gene-level 3′UTR changes. It can also effectively identify novel APA sites that are otherwise undetected when using reference-based approaches. Evaluation on multiple datasets (first-generation MicroArray Quality Control (MAQC) brain and Universal Human Reference (UHR) PolyA-seq data, recent glioblastoma cell line NUDT21 knockdown Poly(A)-ClickSeq (PAC-seq) data, and our own mouse hippocampal and human stem cell-derived neuron PAC-seq data) strongly supports the value and protocol-independent applicability of PolyA-miner. Strikingly, in the glioblastoma cell line data, PolyA-miner identified more than twice the number of genes with APA changes than initially reported. With the emerging importance of APA in human development and disease, PolyA-miner can significantly improve data analysis and help decode the underlying APA dynamics.
Affiliation(s)
- Hari Krishna Yalamanchili
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA
| | - Callison E Alcott
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Program in Developmental Biology, Baylor College of Medicine, Houston, TX 77030, USA; Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ping Ji
- Department of Biochemistry & Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA
| | - Eric J Wagner
- Department of Biochemistry & Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA
| | - Huda Y Zoghbi
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Howard Hughes Medical Institute, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Department of Neuroscience, Baylor College of Medicine, Houston, TX 77030, USA
| | - Zhandong Liu
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
|
17
|
Fan Z, Kim S, Bai Y, Diergaarde B, Park HJ. 3'-UTR Shortening Contributes to Subtype-Specific Cancer Growth by Breaking Stable ceRNA Crosstalk of Housekeeping Genes. Front Bioeng Biotechnol 2020; 8:334. [PMID: 32411683 PMCID: PMC7201092 DOI: 10.3389/fbioe.2020.00334] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/31/2019] [Accepted: 03/25/2020] [Indexed: 12/21/2022]
Abstract
Shortening of 3'UTRs (3'US) through alternative polyadenylation is a post-transcriptional mechanism that regulates the expression of hundreds of genes in human cancers. In breast cancer, different subtypes of tumor samples, such as estrogen receptor positive and negative (ER+ and ER-), are characterized by distinct molecular mechanisms, suggesting possible differences in post-transcriptional regulation between the tumor subtypes. In this study, based on the profound tumorigenic role of 3'US interacting with the competing-endogenous RNA (ceRNA) network (3'US-ceRNA effect), we hypothesize that the 3'US-ceRNA effect drives subtype-specific tumor growth. However, we found that the subtypes are available in different sample sizes, which biases the ceRNA network size and prevents a fair comparison of the 3'US-ceRNA effect. Using the normalized Laplacian matrix eigenvalue distribution, we addressed this bias and built tumor ceRNA networks comparable between the subtypes. Based on the comparison, we identified a novel role of housekeeping (HK) genes as stable and strong miRNA sponges (sponge HK genes) that synchronize the ceRNA networks of normal samples (adjacent to ER+ and ER- tumor samples). We further found that distinct 3'US events in the ER- tumor break the stable sponge effect of HK genes in a subtype-specific fashion, especially in association with the aggressive and metastatic phenotypes. Knockdown of NUDT21 further suggested the role of the 3'US-ceRNA effect in repressing HK genes for tumor growth. In this study, we identified the 3'US-ceRNA effect on the sponge HK genes for subtype-specific growth of ER- tumors.
Affiliation(s)
- Zhenjiang Fan
- Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, United States
| | - Soyeon Kim
- Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States; Division of Pulmonary Medicine, Children's Hospital of Pittsburgh UPMC, Pittsburgh, PA, United States
| | - Yulong Bai
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
| | - Brenda Diergaarde
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States; Hillman Cancer Center, University of Pittsburgh Medical Center, Pittsburgh, PA, United States
| | - Hyun Jung Park
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
|
18
|
Guyon C, Jmari N, Padonou F, Li YC, Ucar O, Fujikado N, Coulpier F, Blanchet C, Root DE, Giraud M. Aire-dependent genes undergo Clp1-mediated 3'UTR shortening associated with higher transcript stability in the thymus. eLife 2020; 9:52985. [PMID: 32338592 PMCID: PMC7205469 DOI: 10.7554/elife.52985] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/23/2019] [Accepted: 04/24/2020] [Indexed: 12/23/2022]
Abstract
The ability of the immune system to avoid autoimmune disease relies on tolerization of thymocytes to self-antigens whose expression and presentation by thymic medullary epithelial cells (mTECs) is controlled predominantly by Aire at the transcriptional level and possibly regulated at other unrecognized levels. Aire-sensitive gene expression is influenced by several molecular factors, some of which belong to the 3'end processing complex, suggesting they might impact transcript stability and levels through an effect on 3'UTR shortening. We discovered that Aire-sensitive genes display a pronounced preference for short-3'UTR transcript isoforms in mTECs, a feature preceding Aire's expression and correlated with the preferential selection of proximal polyA sites by the 3'end processing complex. Through an RNAi screen and generation of a lentigenic mouse, we found that one factor, Clp1, promotes 3'UTR shortening associated with higher transcript stability and expression of Aire-sensitive genes, revealing a post-transcriptional level of control of Aire-activated expression in mTECs.
Affiliation(s)
- Clotilde Guyon
- Institut Cochin, INSERM U1016, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Nada Jmari
- Institut Cochin, INSERM U1016, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Francine Padonou
- Institut Cochin, INSERM U1016, Université Paris Descartes, Sorbonne Paris Cité, Paris, France; Université de Nantes, Inserm, Centre de Recherche en Transplantation et Immunologie, UMR 1064, ITUN, F-44000, Nantes, France
| | - Yen-Chin Li
- Institut Cochin, INSERM U1016, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Olga Ucar
- Division of Developmental Immunology, German Cancer Research Center, Heidelberg, Germany
| | - Noriyuki Fujikado
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, United States
| | - Fanny Coulpier
- Ecole Normale Supérieure, PSL Research University, CNRS, INSERM, Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Plateforme Génomique, Paris, France
| | | | - David E Root
- The Broad Institute of MIT and Harvard, Cambridge, United States
| | - Matthieu Giraud
- Institut Cochin, INSERM U1016, Université Paris Descartes, Sorbonne Paris Cité, Paris, France; Université de Nantes, Inserm, Centre de Recherche en Transplantation et Immunologie, UMR 1064, ITUN, F-44000, Nantes, France
|
19
|
Hong W, Ruan H, Zhang Z, Ye Y, Liu Y, Li S, Jing Y, Zhang H, Diao L, Liang H, Han L. APAatlas: decoding alternative polyadenylation across human tissues. Nucleic Acids Res 2020; 48:D34-D39. [PMID: 31586392 PMCID: PMC6943053 DOI: 10.1093/nar/gkz876] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 08/12/2019] [Revised: 09/15/2019] [Accepted: 09/29/2019] [Indexed: 02/06/2023]
Abstract
Alternative polyadenylation (APA) is an RNA-processing mechanism on the 3' terminus that generates distinct isoforms of mRNAs and/or other RNA polymerase II transcripts with different 3'UTR lengths. Widespread APA affects post-transcriptional gene regulation in mRNA translation, stability, and localization, and exhibits strong tissue specificity. However, no existing database provides comprehensive information about APA events in a large number of normal human tissues. Using the RNA-seq data from the Genotype-Tissue Expression project, we systematically identified APA events from 9475 samples across 53 human tissues and examined their associations with multiple traits and gene expression across tissues. We further developed APAatlas, a user-friendly database (https://hanlab.uth.edu/apa/) for searching, browsing and downloading related information. APAatlas will help the biomedical research community elucidate the functions and mechanisms of APA events in human tissues.
Affiliation(s)
- Wei Hong
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Hang Ruan
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Zhao Zhang
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Youqiong Ye
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Yaoming Liu
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Shengli Li
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Ying Jing
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Huiwen Zhang
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Lixia Diao
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Han Liang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Leng Han
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
|
20
|
Zhu S, Ye W, Ye L, Fu H, Ye C, Xiao X, Ji Y, Lin W, Ji G, Wu X. PlantAPAdb: A Comprehensive Database for Alternative Polyadenylation Sites in Plants. PLANT PHYSIOLOGY 2020; 182:228-242. [PMID: 31767692 PMCID: PMC6945835 DOI: 10.1104/pp.19.00943] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/31/2019] [Accepted: 11/18/2019] [Indexed: 05/23/2023]
Abstract
Alternative cleavage and polyadenylation (APA) is increasingly recognized as an important regulatory mechanism in eukaryotic gene expression and is dynamically modulated in a developmental, tissue-specific, or environmentally responsive manner. Given the functional importance of APA and the rapid accumulation of APA sites in plants, a comprehensive and easily accessible APA site database is necessary for improved understanding of APA-mediated gene expression regulation. We present a database called PlantAPAdb that catalogs the most comprehensive APA site data derived from sequences from diverse 3' sequencing protocols and biological samples in plants. Currently, PlantAPAdb contains APA sites in six species: Oryza sativa (japonica and indica), Arabidopsis (Arabidopsis thaliana), Medicago truncatula, Trifolium pratense, Phyllostachys edulis, and Chlamydomonas reinhardtii. APA sites in PlantAPAdb are available for bulk download and can be queried in a Google-like manner. PlantAPAdb provides rich information on the whole-genome APA sites, including genomic locations, heterogeneous cleavage sites, expression levels, and sample information. It also provides comprehensive poly(A) signals for APA sites in different genomic regions according to distinct profiles of cis-elements in plants. In addition, PlantAPAdb contains events of 3' untranslated region shortening/lengthening resulting from APA, which helps to understand the mechanisms underlying systematic changes in 3' untranslated region lengths. Additional information about conservation of APA sites in plants is also available, providing insights into the evolutionary polyadenylation configuration across species. As a user-friendly database, PlantAPAdb is a large and extendable resource for elucidating APA mechanisms, APA conservation, and gene expression regulation.
Affiliation(s)
- Sheng Zhu
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Wenbin Ye
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Lishan Ye
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Xiamen Health and Medical Big Data Center, Xiamen, Fujian 361008, China
- Hongjuan Fu
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Congting Ye
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Xuesong Xiao
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Yuanhaowei Ji
- School of Mathematics, Northwest University, Xi'an, Shaanxi 710127, China
- Weixu Lin
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
21
Leung MKK, Delong A, Frey BJ. Inference of the human polyadenylation code. Bioinformatics 2019; 34:2889-2898. [PMID: 29648582 PMCID: PMC6129302 DOI: 10.1093/bioinformatics/bty211] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 06/27/2017] [Accepted: 04/09/2018] [Indexed: 01/02/2023]
Abstract
Motivation: Processing of transcripts at the 3′-end involves cleavage at a polyadenylation site followed by the addition of a poly(A)-tail. By selecting which site is cleaved, the process of alternative polyadenylation enables genes to produce transcript isoforms with different 3′-ends. To facilitate the identification and treatment of disease-causing mutations that affect polyadenylation and to understand the sequence determinants underlying this regulatory process, a computational model that can accurately predict polyadenylation patterns from genomic features is desirable. Results: Previous works have focused on identifying candidate polyadenylation sites and classifying tissue-specific sites. By training on how multiple sites in genes are competitively selected for polyadenylation from 3′-end sequencing data, we developed a deep learning model that can predict the tissue-specific strength of a polyadenylation site in the 3′ untranslated region of the human genome given only its genomic sequence. We demonstrate the model’s broad utility on multiple tasks, without any application-specific training. The model can be used to predict which polyadenylation site is more likely to be selected in genes with multiple sites. It can be used to scan the 3′ untranslated region to find candidate polyadenylation sites. It can be used to classify the pathogenicity of variants near annotated polyadenylation sites in ClinVar. It can also be used to anticipate the effect of antisense oligonucleotide experiments to redirect polyadenylation. We provide analysis on how different features affect the model’s predictive performance and a method to identify sensitive regions of the genome at single-base resolution that can affect polyadenylation regulation. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Michael K K Leung
- Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada; Deep Genomics, MaRS Centre, Toronto, Canada
- Andrew Delong
- Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada; Deep Genomics, MaRS Centre, Toronto, Canada
- Brendan J Frey
- Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada; Deep Genomics, MaRS Centre, Toronto, Canada; Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
22
Mariella E, Marotta F, Grassi E, Gilotto S, Provero P. The Length of the Expressed 3' UTR Is an Intermediate Molecular Phenotype Linking Genetic Variants to Complex Diseases. Front Genet 2019; 10:714. [PMID: 31475030 PMCID: PMC6707137 DOI: 10.3389/fgene.2019.00714] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/26/2019] [Accepted: 07/05/2019] [Indexed: 11/13/2022]
Abstract
In the last decades, genome-wide association studies (GWAS) have uncovered tens of thousands of associations between common genetic variants and complex diseases. However, these statistical associations can rarely be interpreted functionally and mechanistically. As the majority of the disease-associated variants are located far from coding sequences, even the relevant gene is often unclear. A way to gain insight into the relevant mechanisms is to study the genetic determinants of intermediate molecular phenotypes, such as gene expression and transcript structure. We propose a computational strategy to discover genetic variants affecting the relative expression of alternative 3′ untranslated region (UTR) isoforms, generated through alternative polyadenylation, a widespread posttranscriptional regulatory mechanism known to have relevant functional consequences. When applied to a large dataset in which whole genome and RNA sequencing data are available for 373 European individuals, 2,530 genes with alternative polyadenylation quantitative trait loci (apaQTL) were identified. We analyze and discuss possible mechanisms of action of these variants, and we show that they are significantly enriched in GWAS hits, in particular those concerning immune-related and neurological disorders. Our results point to an important role for genetically determined alternative polyadenylation in affecting predisposition to complex diseases, and suggest new ways to extract functional information from GWAS data.
Affiliation(s)
- Elisa Mariella
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Federico Marotta
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Elena Grassi
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Stefano Gilotto
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Paolo Provero
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy; Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Milan, Italy
23
Wang R, Nambiar R, Zheng D, Tian B. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res 2019; 46:D315-D319. [PMID: 29069441 PMCID: PMC5753232 DOI: 10.1093/nar/gkx1000] [Citation(s) in RCA: 139] [Impact Index Per Article: 27.8] [Received: 09/15/2017] [Accepted: 10/12/2017] [Indexed: 12/11/2022]
Abstract
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which had a limited amount and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data.
Affiliation(s)
- Ruijia Wang
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
- Ram Nambiar
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
- Dinghai Zheng
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
- Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
24
Feng X, Li L, Wagner EJ, Li W. TC3A: The Cancer 3' UTR Atlas. Nucleic Acids Res 2019; 46:D1027-D1030. [PMID: 30053266 PMCID: PMC5753254 DOI: 10.1093/nar/gkx892] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 08/13/2017] [Accepted: 09/22/2017] [Indexed: 11/13/2022]
Abstract
Widespread alternative polyadenylation (APA) occurs during enhanced cellular proliferation and transformation. Recently, we demonstrated that CFIm25-mediated 3′ UTR shortening through APA promotes glioblastoma tumor growth in vitro and in vivo, further underscoring its significance to tumorigenesis. Here, we report The Cancer 3′ UTR Atlas (TC3A), a comprehensive resource of APA usage for 10,537 tumors across 32 cancer types. These APA events represent potentially novel prognostic biomarkers and may uncover novel mechanisms for the regulation of cancer driver genes. TC3A is built on top of the now de facto standard cBioPortal. Therefore, the large community of existing cBioPortal users and clinical researchers will find TC3A familiar and immediately usable. TC3A is currently fully functional and freely available at http://tc3a.org.
Affiliation(s)
- Xin Feng
- Division of Biostatistics, Dan L. Duncan Cancer Center and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Lei Li
- Division of Biostatistics, Dan L. Duncan Cancer Center and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Eric J Wagner
- Department of Biochemistry & Molecular Biology, University of Texas Medical Branch at Galveston, Galveston, TX 77550, USA
- Wei Li
- Division of Biostatistics, Dan L. Duncan Cancer Center and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
25
Zhu S, Wu X, Fu H, Ye C, Chen M, Jiang Z, Ji G. Modeling of Genome-Wide Polyadenylation Signals in Xenopus tropicalis. Front Genet 2019; 10:647. [PMID: 31333724 PMCID: PMC6616101 DOI: 10.3389/fgene.2019.00647] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/04/2019] [Accepted: 06/18/2019] [Indexed: 12/22/2022]
Abstract
Alternative polyadenylation (APA) is an important post-transcriptional modification event to process messenger RNA (mRNA) for transcriptional termination, transport, and translation. In the present study, we characterized poly(A) signals in Xenopus tropicalis using 70,918 highly confident poly(A) sites derived from 16,511 protein-coding genes to understand their roles in the regulation of embryo development and gender difference. We examined potential factors, including the gene length, the number of introns in a gene, and the intron length, that may affect the prevalence of APA. We observed 12 prominent poly(A) signal patterns, which accounted for approximately 92% of total APA sites in Xenopus tropicalis. Among them, three patterns are specific to X. tropicalis, so they are absent in other animals such as humans or mice. We catalogued APA sites based on their genomic regions and developed a bioinformatics pipeline to identify over-represented signal patterns for each class. Then the schema of cis elements for APA sites in each genomic region was proposed. More importantly, APA usage is dramatically dynamic in embryos along five developmental stages and well-coordinated with the maternal-to-zygotic transition event. We used an entropy-based method to identify developmental stage-specific APA sites and identified significant signal patterns around specific sites and constitutive sites. We found that the APA frequency in different genomic regions varies with developmental stages and that those sites located in intron or coding sequence regions contribute most to the dynamics of gene expression during developmental stages. This study deciphers the characteristics and poly(A) signal patterns for both canonical APA sites and non-canonical APA sites across different developmental stages and gender dimorphisms in X. tropicalis, providing new insights into the dynamic regulation of distal and proximal APA.
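The entropy-based screen mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the following assumes plain Shannon entropy over one site's usage counts across the five developmental stages: low entropy means usage is concentrated in few stages, high entropy means the site is used roughly evenly (constitutive). The function names, the example counts, and the 1-bit cutoff are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def usage_entropy(counts: Sequence[float]) -> float:
    """Shannon entropy (in bits) of one poly(A) site's usage across stages."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("site has no supporting reads")
    entropy = 0.0
    for count in counts:
        if count > 0:  # zero-count stages contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def is_stage_specific(counts: Sequence[float], max_entropy: float = 1.0) -> bool:
    """Flag a site as stage-specific when its entropy is at or below a cutoff.

    The 1-bit default is an arbitrary illustrative threshold (assumption).
    """
    return usage_entropy(counts) <= max_entropy


# Hypothetical read counts for two sites across five developmental stages.
specific_site = [0, 0, 50, 0, 0]          # used in a single stage
constitutive_site = [10, 10, 10, 10, 10]  # used evenly in all stages

print(is_stage_specific(specific_site))      # concentrated usage
print(is_stage_specific(constitutive_site))  # even usage
```

With five stages the entropy ranges from 0 bits (one stage only) to log2(5) bits (perfectly even usage), so any cutoff has to be chosen relative to that range.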
Affiliation(s)
- Sheng Zhu
- Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, China
- Hongjuan Fu
- Department of Automation, Xiamen University, Xiamen, China
- Congting Ye
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Moliang Chen
- Department of Automation, Xiamen University, Xiamen, China
- Zhihua Jiang
- Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA, United States
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, China
26
Chen M, Ji G, Fu H, Lin Q, Ye C, Ye W, Su Y, Wu X. A survey on identification and quantification of alternative polyadenylation sites from RNA-seq data. Brief Bioinform 2019; 21:1261-1276. [PMID: 31267126 DOI: 10.1093/bib/bbz068] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 03/10/2019] [Revised: 05/03/2019] [Accepted: 05/14/2019] [Indexed: 12/13/2022]
Abstract
Alternative polyadenylation (APA) has been implicated to play an important role in post-transcriptional regulation by regulating mRNA abundance, stability, localization and translation, which contributes considerably to transcriptome diversity and gene expression regulation. RNA-seq has become a routine approach for transcriptome profiling, generating unprecedented data that could be used to identify and quantify APA site usage. A number of computational approaches for identifying APA sites and/or dynamic APA events from RNA-seq data have emerged in the literature, which provide valuable yet preliminary results that should be refined to yield credible guidelines for the scientific community. In this review, we provided a comprehensive overview of the status of currently available computational approaches. We also conducted objective benchmarking analysis using RNA-seq data sets from different species (human, mouse and Arabidopsis) and simulated data sets to present a systematic evaluation of 11 representative methods. Our benchmarking study showed that the overall performance of all tools investigated is moderate, reflecting that there is still a lot of scope to improve the prediction of APA sites or dynamic APA events from RNA-seq data. Particularly, prediction results from individual tools differ considerably, and only a limited number of predicted APA sites or genes are common among different tools. Accordingly, we attempted to give some advice on how to assess the reliability of the obtained results. We also proposed practical recommendations on the appropriate method applicable to diverse scenarios and discussed implications and future directions relevant to profiling APA from RNA-seq data.
Affiliation(s)
- Moliang Chen
- Department of Automation, Xiamen University, Xiamen 361005, China; Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen 361005, China; Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Hongjuan Fu
- Department of Automation, Xiamen University, Xiamen 361005, China; Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Qianmin Lin
- Xiang'an Hospital of Xiamen University, Xiamen 361005, China
- Congting Ye
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Wenbin Ye
- Department of Automation, Xiamen University, Xiamen 361005, China; Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Yaru Su
- College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen 361005, China; Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
27
Ye C, Long Y, Ji G, Li QQ, Wu X. APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data. Bioinformatics 2019; 34:1841-1849. [PMID: 29360928 DOI: 10.1093/bioinformatics/bty029] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 09/30/2017] [Accepted: 01/17/2018] [Indexed: 12/28/2022]
Abstract
Motivation: Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such unprecedented collection of RNA-seq data by new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites. Results: We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usages between conditions. Extensive comparisons of APAtrap with two other latest methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organisms with an annotated genome. Availability and implementation: Freely available for download at https://apatrap.sourceforge.io. Contact: liqq@xmu.edu.cn or xhuister@xmu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
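As a rough illustration of what a mean-squared-error criterion for locating a poly(A) site can look like, the sketch below fits a two-level step function to per-base 3' UTR coverage and returns the split with the smallest error. This is a generic change-point example under simplifying assumptions (one sample, one candidate site, noise-free toy data); it is not APAtrap's actual implementation, and the function name and coverage values are hypothetical.

```python
from typing import List, Tuple


def best_breakpoint(coverage: List[float]) -> Tuple[int, float]:
    """Return (index, mse) of the split minimizing the two-segment MSE.

    Index i means positions [0, i) form the upstream segment and
    positions [i, n) form the downstream segment; each segment is
    modeled by its own mean coverage.
    """
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions")
    best_index, best_mse = 1, float("inf")
    for i in range(1, n):
        left, right = coverage[:i], coverage[i:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        # Sum of squared residuals around each segment mean.
        sse = sum((x - mean_left) ** 2 for x in left)
        sse += sum((x - mean_right) ** 2 for x in right)
        mse = sse / n
        if mse < best_mse:
            best_index, best_mse = i, mse
    return best_index, best_mse


# Toy coverage: a drop after the fourth position mimics a proximal site.
coverage = [30, 31, 29, 30, 10, 11, 9, 10]
idx, mse = best_breakpoint(coverage)
print(idx, mse)
```

A coverage drop at the returned index is the signal such methods interpret as a candidate proximal poly(A) site; handling several sites per gene would require repeating or generalizing the search, which this sketch does not attempt.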
Affiliation(s)
- Congting Ye
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Yuqi Long
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Qingshun Quinn Li
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China; Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA 91766, USA
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
28
DPAC: A Tool for Differential Poly(A)-Cluster Usage from Poly(A)-Targeted RNAseq Data. G3-GENES GENOMES GENETICS 2019; 9:1825-1830. [PMID: 31023725 PMCID: PMC6553543 DOI: 10.1534/g3.119.400273] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/20/2022]
Abstract
Poly(A)-tail targeted RNAseq approaches, such as 3′READS, PAS-Seq and Poly(A)-ClickSeq, are becoming popular alternatives to random-primed RNAseq to focus sequencing reads just to the 3′ ends of polyadenylated RNAs to identify poly(A)-sites and characterize changes in their usage. Additionally, we and others have demonstrated that these approaches perform similarly to other RNAseq strategies for differential gene expression analysis, while saving on the volume of sequencing data required and providing a simpler library synthesis strategy. Here, we present DPAC (Differential Poly(A)-Clustering), a streamlined pipeline for the preprocessing of poly(A)-tail targeted RNAseq data, mapping of poly(A)-sites, poly(A)-site clustering and annotation, and determination of differential poly(A)-cluster usage using DESeq2. Changes in poly(A)-cluster usage are simultaneously used to report differential gene expression, differential terminal exon usage and alternative polyadenylation (APA).
29
Kim N, Chung W, Eum HH, Lee HO, Park WY. Alternative polyadenylation of single cells delineates cell types and serves as a prognostic marker in early stage breast cancer. PLoS One 2019; 14:e0217196. [PMID: 31100099 PMCID: PMC6524824 DOI: 10.1371/journal.pone.0217196] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/27/2018] [Accepted: 05/08/2019] [Indexed: 12/15/2022]
Abstract
Alternative polyadenylation (APA) in 3’ untranslated regions (3’ UTR) plays an important role in regulating transcript abundance, localization, and interaction with microRNAs. Length-variation of 3’UTRs by APA contributes to efficient proliferation of cancer cells. In this study, we investigated APA in single cancer cells and tumor microenvironment cells to understand the physiological implication of APA in different cell types. We analyzed APA patterns and the expression level of genes from the 515 single-cell RNA sequencing (scRNA-seq) dataset from 11 breast cancer patients. Although the overall 3’UTR length of individual genes was distributed equally in tumor and non-tumor cells, we found a differential pattern of polyadenylation in gene sets between tumor and non-tumor cells. In addition, we found a differential pattern of APA across tumor types using scRNA-seq data from 3 glioblastoma patients and 1 renal cell carcinoma patient. In detail, 1,176 gene sets and 53 genes showed the distinct pattern of 3’UTR shortening and over-expression as signatures for five cell types including B lymphocytes, T lymphocytes, myeloid cells, stromal cells, and breast cancer cells. Functional categories of gene sets for cellular proliferation demonstrated concordant regulation of APA and gene expression specific to cell types. The expression of APA genes in breast cancer was significantly correlated with the clinical outcome of earlier stage breast cancer patients. We identified cell type-specific APA in single cells, which allows the identification of cell types based on 3’UTR length variation in combination with gene expression. Specifically, an immune-specific APA signature in breast cancer could be utilized as a prognostic marker of early stage breast cancer.
Affiliation(s)
- Nayoung Kim
- Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Woosung Chung
- Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Hye Hyeon Eum
- Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Hae-Ock Lee
- Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, South Korea
- Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, South Korea
- GENINUS Inc., Seoul, South Korea
30
Grassi E, Santoro R, Umbach A, Grosso A, Oliviero S, Neri F, Conti L, Ala U, Provero P, DiCunto F, Merlo GR. Choice of Alternative Polyadenylation Sites, Mediated by the RNA-Binding Protein Elavl3, Plays a Role in Differentiation of Inhibitory Neuronal Progenitors. Front Cell Neurosci 2019; 12:518. [PMID: 30687010 PMCID: PMC6338052 DOI: 10.3389/fncel.2018.00518] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 09/18/2018] [Accepted: 12/12/2018] [Indexed: 01/09/2023]
Abstract
Alternative polyadenylation (APA) is a widespread mechanism involving about half of the expressed genes, resulting in varying lengths of the 3′ untranslated region (3′UTR). Variations in length and sequence of the 3′UTR may underlie changes of post-transcriptional processing, localization, miRNA targeting and stability of mRNAs. During embryonic development a large array of mRNAs exhibit APA, with a prevalence of the longer 3′UTR versions in differentiating cells. Little is known about polyA+ site usage during differentiation of mammalian neural progenitors. Here we exploit a model of adherent neural stem (ANS) cells, which homogeneously and efficiently differentiate into GABAergic neurons. RNAseq data shows a global trend towards lengthening of the 3′UTRs during differentiation. Enriched expression of the longer 3′UTR variants of Pes1 and Gng2 was detected in the mouse brain in areas of cortical and subcortical neuronal differentiation, respectively, by two-probes fluorescent in situ hybridization (FISH). Among the coding genes upregulated during differentiation of ANS cells we found Elavl3, a neural-specific RNA-binding protein homologous to Drosophila Elav. In the insect, Elav regulates polyA+ site choice while interacting with paused Pol-II promoters. We tested the role of Elavl3 in ANS cells, by silencing Elavl3 and observed consistent changes in 3′UTR length and delayed neuronal differentiation. These results indicate that choice of the polyA+ site and lengthening of 3′UTRs is a possible additional mechanism of posttranscriptional RNA modification involved in neuronal differentiation.
Affiliation(s)
- Elena Grassi
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
- Roberto Santoro
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
- Alessandro Umbach
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
- Anna Grosso
- Department of Neurosciences, University of Turin, Turin, Italy
- Salvatore Oliviero
- Italian Institute for Genomic Medicine, Turin, Italy; Department of Life Science and System Biology, University of Turin, Turin, Italy
- Francesco Neri
- Italian Institute for Genomic Medicine, Turin, Italy; Department of Life Science and System Biology, University of Turin, Turin, Italy
- Luciano Conti
- Centre for Integrative Biology-CIBIO, University of Trento, Povo, Italy
- Ugo Ala
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
- Paolo Provero
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
- Ferdinando DiCunto
- Department of Molecular Biotechnology, University of Turin, Turin, Italy; Department of Neurosciences, University of Turin, Turin, Italy
- Giorgio R Merlo
- Department of Molecular Biotechnology, University of Turin, Turin, Italy
31
Gruber AJ, Gypas F, Riba A, Schmidt R, Zavolan M. Terminal exon characterization with TECtool reveals an abundance of cell-specific isoforms. Nat Methods 2018; 15:832-836. [PMID: 30202060 PMCID: PMC7611301 DOI: 10.1038/s41592-018-0114-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 11/06/2017] [Accepted: 07/10/2018] [Indexed: 11/23/2022]
Abstract
Sequencing of RNA 3' ends has uncovered numerous sites that do not correspond to the termination sites of known transcripts. Through their 3' untranslated regions, protein-coding RNAs interact with RNA-binding proteins and microRNAs, which regulate many properties, including RNA stability and subcellular localization. We developed the terminal exon characterization (TEC) tool (http://tectool.unibas.ch), which can be used with RNA-sequencing data from any species for which a genome annotation that includes sites of RNA cleavage and polyadenylation is available. We discovered hundreds of previously unknown isoforms and cell-type-specific terminal exons in human cells. Ribosome profiling data revealed that many of these isoforms were translated. By applying TECtool to single-cell sequencing data, we found that the newly identified isoforms were expressed in subpopulations of cells. Thus, TECtool enables the identification of previously unknown isoforms in well-studied cell systems and in rare cell types.
Affiliation(s)
- Andreas J Gruber
- Oxford Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK.
| | - Foivos Gypas
- Computational and Systems Biology, Biozentrum, University of Basel, Basel, Switzerland
| | - Andrea Riba
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
| | - Ralf Schmidt
- Computational and Systems Biology, Biozentrum, University of Basel, Basel, Switzerland
| | - Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel, Basel, Switzerland.
|
32
|
Xu C, Zhang J. Alternative Polyadenylation of Mammalian Transcripts Is Generally Deleterious, Not Adaptive. Cell Syst 2018; 6:734-742.e4. [PMID: 29886108 DOI: 10.1016/j.cels.2018.05.007] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 12/20/2017] [Revised: 03/27/2018] [Accepted: 05/09/2018] [Indexed: 01/07/2023]
Abstract
Alternative polyadenylation (APA) produces from the same gene multiple mature RNAs with varying 3' ends. Although APA is commonly believed to generate beneficial functional diversity and be adaptive, we hypothesize that most genes have one optimal polyadenylation site and that APA is caused largely by deleterious polyadenylation errors. The error hypothesis, but not the adaptive hypothesis, predicts that, as the expression level of a gene increases, its polyadenylation diversity declines, relative use of the major (presumably optimal) polyadenylation site increases, and that of each minor (presumably nonoptimal) site decreases. It further predicts that the number of polyadenylation signals per gene is smaller than the random expectation and that polyadenylation signals for major but not minor sites are under purifying selection. All of these predictions are confirmed in mammals, suggesting that numerous defective RNAs are produced in normal cells, many phenotypic variations at the molecular level are nonadaptive, and cellular life is noisier than is appreciated.
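As a minimal illustration of the quantities named in this abstract, the sketch below computes the relative use of each polyadenylation site in a gene and a diversity score from per-site read counts. The Shannon index and the function names `site_usage` and `shannon_diversity` are assumptions made for illustration; the abstract does not state which diversity measure the authors used.

```python
import math


def site_usage(counts):
    """Return the fraction of reads assigned to each polyadenylation site."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one read is required")
    return [c / total for c in counts]


def shannon_diversity(counts):
    """Shannon index (in bits) of site usage; an assumed diversity measure."""
    return -sum(p * math.log2(p) for p in site_usage(counts) if p > 0)


# Hypothetical read counts for one gene with three sites (major site first).
counts = [90, 5, 5]
major_site_share = max(site_usage(counts))
diversity = shannon_diversity(counts)
```

Under the error hypothesis described above, a highly expressed gene would be expected to show a lower diversity score and a larger share for its major site than a lowly expressed gene.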
Affiliation(s)
- Chuan Xu
- College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China; Department of Ecology and Evolutionary Biology, University of Michigan, 4018 Biological Science Building, 1105 North University Avenue, Ann Arbor, MI 48109, USA
| | - Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, 4018 Biological Science Building, 1105 North University Avenue, Ann Arbor, MI 48109, USA.
|
33
|
Kasowitz SD, Ma J, Anderson SJ, Leu NA, Xu Y, Gregory BD, Schultz RM, Wang PJ. Nuclear m6A reader YTHDC1 regulates alternative polyadenylation and splicing during mouse oocyte development. PLoS Genet 2018; 14:e1007412. [PMID: 29799838 PMCID: PMC5991768 DOI: 10.1371/journal.pgen.1007412] [Citation(s) in RCA: 339] [Impact Index Per Article: 56.5] [Received: 02/22/2018] [Revised: 06/07/2018] [Accepted: 05/14/2018] [Indexed: 12/31/2022]
Abstract
The N6-methyladenosine (m6A) modification is the most prevalent internal RNA modification in eukaryotes. The majority of m6A sites are found in the last exon and 3' UTRs. Here we show that the nuclear m6A reader YTHDC1 is essential for embryo viability and germline development in mouse. Specifically, YTHDC1 is required for spermatogonial development in males and for oocyte growth and maturation in females; Ythdc1-deficient oocytes are blocked at the primary follicle stage. Strikingly, loss of YTHDC1 leads to extensive alternative polyadenylation in oocytes, altering 3' UTR length. Furthermore, YTHDC1 deficiency causes massive alternative splicing defects in oocytes. The majority of splicing defects in mutant oocytes are rescued by introducing wild-type, but not m6A-binding-deficient, YTHDC1. YTHDC1 is associated with the pre-mRNA 3' end processing factors CPSF6, SRSF3, and SRSF7. Thus, YTHDC1 plays a critical role in processing of pre-mRNA transcripts in the oocyte nucleus and may have similar non-redundant roles throughout fetal development.
Affiliation(s)
- Seth D. Kasowitz
- Department of Biomedical Sciences, University of Pennsylvania, Philadelphia, United States of America
| | - Jun Ma
- Department of Biomedical Sciences, University of Pennsylvania, Philadelphia, United States of America
- Department of Biology, University of Pennsylvania, Philadelphia, United States of America
| | - Stephen J. Anderson
- Department of Biology, University of Pennsylvania, Philadelphia, United States of America
| | - N. Adrian Leu
- Department of Biomedical Sciences, University of Pennsylvania, Philadelphia, United States of America
| | - Yang Xu
- Department of Biomedical Sciences, University of Pennsylvania, Philadelphia, United States of America
| | - Brian D. Gregory
- Department of Biology, University of Pennsylvania, Philadelphia, United States of America
| | - Richard M. Schultz
- Department of Biology, University of Pennsylvania, Philadelphia, United States of America
- Department of Anatomy, Physiology and Cell Biology, School of Veterinary Medicine, University of California, Davis, Davis, United States of America
| | - P. Jeremy Wang
- Department of Biomedical Sciences, University of Pennsylvania, Philadelphia, United States of America
|
34
|
The 3'UTR signature defines a highly metastatic subgroup of triple-negative breast cancer. Oncotarget 2018; 7:59834-59844. [PMID: 27494850 PMCID: PMC5312352 DOI: 10.18632/oncotarget.10975] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 02/06/2016] [Accepted: 07/18/2016] [Indexed: 01/13/2023]
Abstract
Triple-negative breast cancer (TNBC) is a highly heterogeneous disease with an aggressive clinical course. Prognostic models are needed to chart potential patient outcomes. To address this, we used alternative 3′UTR patterns to improve postoperative risk stratification. We collected 327 publicly available microarrays and generated the 3′UTR landscape based on expression ratios of alternative 3′UTRs. After initial feature filtering, we built a 17-3′UTR-based classifier using an elastic net model. Time-dependent ROC comparisons and Kaplan–Meier analyses confirmed the outstanding discriminating power of our prognostic model for TNBC patients. In the training cohort, 5-year event-free survival (EFS) was 78.6% (95% CI 71.2–86.0) for the low-risk group and 16.3% (95% CI 2.3–30.4) for the high-risk group (log-rank p<0.0001; hazard ratio [HR] 8.29, 95% CI 4.78–14.4). In the validation set, 5-year EFS was 75.6% (95% CI 68.0–83.2) for the low-risk group and 33.2% (95% CI 17.1–49.3) for the high-risk group (log-rank p<0.0001; HR 3.17, 95% CI 1.66–5.42). In conclusion, the 17-3′UTR-based classifier provides superior prognostic performance for estimating disease recurrence and metastasis in TNBC patients, and it may permit personalized management strategies.
|
35
|
Szkop KJ, Nobeli I. Untranslated Parts of Genes Interpreted: Making Heads or Tails of High-Throughput Transcriptomic Data via Computational Methods: Computational methods to discover and quantify isoforms with alternative untranslated regions. Bioessays 2017; 39. [PMID: 29052251 DOI: 10.1002/bies.201700090] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/28/2017] [Revised: 09/12/2017] [Indexed: 01/07/2023]
Abstract
In this review we highlight the importance of defining the untranslated parts of transcripts, and present a number of computational approaches for the discovery and quantification of alternative transcription start and poly-adenylation events in high-throughput transcriptomic data. The fate of eukaryotic transcripts is closely linked to their untranslated regions, which are determined by the position at which transcription starts and ends at a genomic locus. Although the extent of alternative transcription starts and alternative poly-adenylation sites has been revealed by sequencing methods focused on the ends of transcripts, the application of these methods is not yet widely adopted by the community. We suggest that computational methods applied to standard high-throughput technologies are a useful, albeit less accurate, alternative to the expertise-demanding 5' and 3' sequencing and they are the only option for analysing legacy transcriptomic data. We review these methods here, focusing on technical challenges and arguing for the need to include better normalization of the data and more appropriate statistical models of the expected variation in the signal.
Affiliation(s)
- Krzysztof J Szkop
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
| | - Irene Nobeli
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
|
36
|
Hu W, Li S, Park JY, Boppana S, Ni T, Li M, Zhu J, Tian B, Xie Z, Xiang M. Dynamic landscape of alternative polyadenylation during retinal development. Cell Mol Life Sci 2016; 74:1721-1739. [PMID: 27990575 DOI: 10.1007/s00018-016-2429-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 05/13/2016] [Revised: 11/24/2016] [Accepted: 12/01/2016] [Indexed: 10/20/2022]
Abstract
The development of the central nervous system (CNS) is a complex process that must be exquisitely controlled at multiple levels to ensure the production of appropriate types and quantity of neurons. RNA alternative polyadenylation (APA) contributes to transcriptome diversity and gene regulation, and has recently been shown to be widespread in the CNS. However, the previous studies have been primarily focused on the tissue specificity of APA and developmental APA change of whole model organisms; a systematic survey of APA usage is lacking during CNS development. Here, we conducted global analysis of APA during mouse retinal development, and identified stage-specific polyadenylation (pA) sites that are enriched for genes critical for retinal development and visual perception. Moreover, we demonstrated 3'UTR (untranslated region) lengthening and increased usage of intronic pA sites over development that would result in gaining many different RBP (RNA-binding protein) and miRNA target sites. Furthermore, we showed that a considerable number of polyadenylated lncRNAs are co-expressed with protein-coding genes involved in retinal development and functions. Together, our data indicate that APA is highly and dynamically regulated during retinal development and maturation, suggesting that APA may serve as a crucial mechanism of gene regulation underlying the delicate process of CNS development.
Affiliation(s)
- Wenyan Hu
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, 500040, China
| | - Shengguo Li
- Center for Advanced Biotechnology and Medicine and Department of Pediatrics, Rutgers University-Robert Wood Johnson Medical School, 679 Hoes Lane West, Piscataway, NJ, 08854, USA
| | - Ji Yeon Park
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07101, USA
| | - Sridhar Boppana
- Center for Advanced Biotechnology and Medicine and Department of Pediatrics, Rutgers University-Robert Wood Johnson Medical School, 679 Hoes Lane West, Piscataway, NJ, 08854, USA
| | - Ting Ni
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Miaoxin Li
- Department of Medical Genetics, Center for Genome Research, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
| | - Jun Zhu
- Systems Biology Center, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07101, USA
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, 500040, China.
| | - Mengqing Xiang
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, 500040, China; Center for Advanced Biotechnology and Medicine and Department of Pediatrics, Rutgers University-Robert Wood Johnson Medical School, 679 Hoes Lane West, Piscataway, NJ, 08854, USA
|
37
|
Rangel L, Lospitao E, Ruiz-Sáenz A, Alonso MA, Correas I. Alternative polyadenylation in a family of paralogous EPB41 genes generates protein 4.1 diversity. RNA Biol 2016; 14:236-244. [PMID: 27981895 DOI: 10.1080/15476286.2016.1270003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
Alternative polyadenylation (APA) is a step in mRNA 3'-end processing that contributes to the complexity of the transcriptome by generating isoforms that differ in either their coding sequence or their 3'-untranslated regions (UTRs). The EPB41 genes, EPB41, EPB41L2, EPB41L3 and EPB41L1, encode an impressively complex array of structural adaptor proteins (designated 4.1R, 4.1G, 4.1B and 4.1N, respectively) by using alternative transcriptional promoters and tissue-specific alternative pre-mRNA splicing. The great variety of 4.1 proteins mainly results from 5'-end and internal processing of the EPB41 pre-mRNAs. Thus, 4.1 proteins can vary in their N-terminal extensions but all contain a highly homologous C-terminal domain (CTD). Here we study a new group of EPB41-related mRNAs that originate by APA and lack the exons encoding the CTD characteristic of prototypical 4.1 proteins, thereby encoding a new type of 4.1 protein. For the EPB41 gene, this type of processing was observed in all 11 human tissues analyzed. Comparative genomic analysis of EPB41 indicates that APA is conserved in various mammals. In addition, we show that APA also functions for the EPB41L2, EPB41L3 and EPB41L1 genes, but in a more restricted manner in the case of the latter 2 than it does for the EPB41 and EPB41L2 genes. Our study shows alternative polyadenylation to be an additional mechanism for the generation of 4.1 protein diversity in the already complex EPB41-related genes. Understanding the diversity of EPB41 RNA processing is essential for a full appreciation of the many 4.1 proteins expressed in normal and pathological tissues.
Affiliation(s)
- Laura Rangel
- Departamento de Biología Molecular, Universidad Autónoma de Madrid (UAM), Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC), Nicolás Cabrera, Cantoblanco, Madrid, Spain
| | - Eva Lospitao
- Departamento de Biología Molecular, Universidad Autónoma de Madrid (UAM), Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC), Nicolás Cabrera, Cantoblanco, Madrid, Spain
| | - Ana Ruiz-Sáenz
- Departamento de Biología Molecular, Universidad Autónoma de Madrid (UAM), Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC), Nicolás Cabrera, Cantoblanco, Madrid, Spain
| | - Miguel A Alonso
- Departamento de Biología Molecular, Universidad Autónoma de Madrid (UAM), Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC), Nicolás Cabrera, Cantoblanco, Madrid, Spain
| | - Isabel Correas
- Departamento de Biología Molecular, Universidad Autónoma de Madrid (UAM), Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas (CSIC), Nicolás Cabrera, Cantoblanco, Madrid, Spain
|
38
|
Grassi E, Mariella E, Lembo A, Molineris I, Provero P. Roar: detecting alternative polyadenylation with standard mRNA sequencing libraries. BMC Bioinformatics 2016; 17:423. [PMID: 27756200 PMCID: PMC5069797 DOI: 10.1186/s12859-016-1254-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 03/30/2016] [Accepted: 09/08/2016] [Indexed: 02/04/2023]
Abstract
BACKGROUND: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3' UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences, for example, in regulation by microRNAs and in the cellular localization of transcripts, thus altering their fate. RESULTS: We propose a strategy to identify genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries, making it widely applicable to data originally obtained for classical differential expression analyses. We decided to exploit previously annotated APA sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing it to the results of other methods based on microarrays or on specific RNA-seq libraries, and show that using APA site databases results in higher sensitivity than a de novo site prediction approach. CONCLUSIONS: We implemented the algorithm in a Bioconductor package to facilitate its broad usage in the scientific community. The ability of this approach to detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses makes it useful for investigating whether alternative polyadenylation is relevant in a certain biological process without requiring specific experimental assays.
Affiliation(s)
- Elena Grassi
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy.
| | - Elisa Mariella
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
| | - Antonio Lembo
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
| | - Ivan Molineris
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
| | - Paolo Provero
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 60, Milan, 20132, Italy
|
39
|
Zheng D, Liu X, Tian B. 3'READS+, a sensitive and accurate method for 3' end sequencing of polyadenylated RNA. RNA 2016; 22:1631-1639. [PMID: 27512124 PMCID: PMC5029459 DOI: 10.1261/rna.057075.116] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 05/03/2016] [Accepted: 06/21/2016] [Indexed: 06/06/2023]
Abstract
Sequencing of the 3' end of poly(A)(+) RNA identifies cleavage and polyadenylation sites (pAs) and measures transcript expression. We previously developed a method, 3' region extraction and deep sequencing (3'READS), to address mispriming issues that often plague 3' end sequencing. Here we report a new version, named 3'READS+, which has vastly improved accuracy and sensitivity. Using a special locked nucleic acid oligo to capture poly(A)(+) RNA and to remove the bulk of the poly(A) tail, 3'READS+ generates RNA fragments with an optimal number of terminal A's that balance data quality and detection of genuine pAs. With improved RNA ligation steps for efficiency, the method shows much higher sensitivity (over two orders of magnitude) compared to the previous version. Using 3'READS+, we have uncovered a sizable fraction of previously overlooked pAs located next to or within a stretch of adenylate residues in human genes and more accurately assessed the frequency of alternative cleavage and polyadenylation (APA) in HeLa cells (∼50%). 3'READS+ will be a useful tool to accurately study APA and to analyze gene expression by 3' end counting, especially when the amount of input total RNA is limited.
Affiliation(s)
- Dinghai Zheng
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
| | - Xiaochuan Liu
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
|
40
|
Gruber AJ, Schmidt R, Gruber AR, Martin G, Ghosh S, Belmadani M, Keller W, Zavolan M. A comprehensive analysis of 3' end sequencing data sets reveals novel polyadenylation signals and the repressive role of heterogeneous ribonucleoprotein C on cleavage and polyadenylation. Genome Res 2016; 26:1145-59. [PMID: 27382025 PMCID: PMC4971764 DOI: 10.1101/gr.202432.115] [Citation(s) in RCA: 141] [Impact Index Per Article: 17.6] [Received: 11/25/2015] [Accepted: 05/31/2016] [Indexed: 12/22/2022]
Abstract
Alternative polyadenylation (APA) is a general mechanism of transcript diversification in mammals, which has been recently linked to proliferative states and cancer. Different 3′ untranslated region (3′ UTR) isoforms interact with different RNA-binding proteins (RBPs), which modify the stability, translation, and subcellular localization of the corresponding transcripts. Although the heterogeneity of pre-mRNA 3′ end processing has been established with high-throughput approaches, the mechanisms that underlie systematic changes in 3′ UTR lengths remain to be characterized. Through a uniform analysis of a large number of 3′ end sequencing data sets, we have uncovered 18 signals, six of which are novel, whose positioning with respect to pre-mRNA cleavage sites indicates a role in pre-mRNA 3′ end processing in both mouse and human. With 3′ end sequencing we have demonstrated that the heterogeneous ribonucleoprotein C (HNRNPC), which binds the poly(U) motif whose frequency also peaks in the vicinity of polyadenylation (poly(A)) sites, has a genome-wide effect on poly(A) site usage. HNRNPC-regulated 3′ UTRs are enriched in ELAV-like RBP 1 (ELAVL1) binding sites and include those of the CD47 gene, which participate in the recently discovered mechanism of 3′ UTR–dependent protein localization (UDPL). Our study thus establishes an up-to-date, high-confidence catalog of 3′ end processing sites and poly(A) signals, and it uncovers an important role of HNRNPC in regulating 3′ end processing. It further suggests that U-rich elements mediate interactions with multiple RBPs that regulate different stages in a transcript's life cycle.
Affiliation(s)
- Andreas J Gruber
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Ralf Schmidt
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Andreas R Gruber
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Georges Martin
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Souvik Ghosh
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Manuel Belmadani
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Walter Keller
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
|
41
|
Kainov YA, Aushev VN, Naumenko SA, Tchevkina EM, Bazykin GA. Complex Selection on Human Polyadenylation Signals Revealed by Polymorphism and Divergence Data. Genome Biol Evol 2016; 8:1971-9. [PMID: 27324920 PMCID: PMC4943204 DOI: 10.1093/gbe/evw137] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 06/05/2016] [Indexed: 12/19/2022]
Abstract
Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. More than half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at PAS of "unique" polyadenylation sites (one site per gene) and, among alternative polyadenylation sites, at the distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.
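The consensus hexamer mentioned in this abstract can be located with a simple sequence scan. The following is a minimal sketch, not the authors' pipeline: the helper name `find_pas_hexamers` and the example sequence are hypothetical, and only the canonical AATAAA motif (AAUAAA in RNA) is searched by default.

```python
import re


def find_pas_hexamers(seq, motif="AATAAA"):
    """Return 0-based start positions of every motif occurrence, overlaps included."""
    # Normalize case and convert RNA (U) to DNA (T) so both alphabets work.
    dna = seq.upper().replace("U", "T")
    # A lookahead match has zero width, so overlapping occurrences are reported.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() for m in pattern.finditer(dna)]


# Hypothetical 3' UTR fragment containing one DNA-style and one RNA-style hexamer.
positions = find_pas_hexamers("GGCAATAAAGTTCCAAUAAACC")
print(positions)  # [3, 14]
```

Variant hexamers could be scanned by passing a different `motif` argument; which variants count as functional signals is outside the scope of this sketch.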
Affiliation(s)
- Yaroslav A Kainov
- Centre for Developmental Neurobiology, King's College London, London, United Kingdom; Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia
| | - Vasily N Aushev
- Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia; Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York
| | - Sergey A Naumenko
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Canada
| | - Elena M Tchevkina
- Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia
| | - Georgii A Bazykin
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Russia; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Russia; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia; Pirogov Russian National Research Medical University, Moscow, Russia
|
42
|
Wu X, Zhang Y, Li QQ. PlantAPA: A Portal for Visualization and Analysis of Alternative Polyadenylation in Plants. Front Plant Sci 2016; 7:889. [PMID: 27446120 PMCID: PMC4914594 DOI: 10.3389/fpls.2016.00889] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/16/2016] [Accepted: 06/06/2016] [Indexed: 05/24/2023]
Abstract
Alternative polyadenylation (APA) is an important layer of gene regulation that produces mRNAs that have different 3' ends and/or encode diverse protein isoforms. Up to 70% of annotated genes in plants undergo APA. Increasing numbers of poly(A) sites collected in various plant species demand new methods and tools to access and mine these data. We have created an open-access web service called PlantAPA (http://bmi.xmu.edu.cn/plantapa) to visualize and analyze genome-wide poly(A) sites in plants. PlantAPA provides various interactive and dynamic graphics and seamlessly integrates a genome browser that can profile heterogeneous cleavage sites and quantify expression patterns of poly(A) sites across different conditions. Particularly, through PlantAPA, users can analyze poly(A) sites in extended 3' UTR regions, intergenic regions, and ambiguous regions owing to alternative transcription or RNA processing. In addition, it also provides tools for analyzing poly(A) site selections, 3' UTR lengthening or shortening, non-canonical APA site switching, and differential gene expression between conditions, making it more powerful for the study of APA-mediated gene expression regulation. More importantly, PlantAPA offers a bioinformatics pipeline that allows users to upload their own short reads or ESTs for poly(A) site extraction, enabling users to further explore poly(A) site selection using stored PlantAPA poly(A) sites together with their own poly(A) site datasets. To date, PlantAPA hosts the largest database of APA sites in plants, including Oryza sativa, Arabidopsis thaliana, Medicago truncatula, and Chlamydomonas reinhardtii. As a user-friendly web service, PlantAPA will be a valuable addition to the community of biologists studying APA mechanisms and gene expression regulation in plants.
Affiliation(s)
- Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, China
| | - Yumin Zhang
- Department of Automation, Xiamen University, Xiamen, China
| | - Qingshun Q. Li
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA, USA
|
43
|
Huang Y, Xiong Y, Lin Z, Feng X, Jiang X, Songyang Z, Huang J. Specific Tandem 3'UTR Patterns and Gene Expression Profiles in Mouse Thy1+ Germline Stem Cells. PLoS One 2015; 10:e0145417. [PMID: 26713853 PMCID: PMC4699828 DOI: 10.1371/journal.pone.0145417] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/11/2015] [Accepted: 12/03/2015] [Indexed: 12/14/2022]
Abstract
A recently developed strategy of sequencing alternative polyadenylation (APA) sites (SAPAS) with second-generation sequencing technology can be used to explore complete genome-wide patterns of tandem APA sites and global gene expression profiles. Spermatogonial stem cells (SSCs) maintain long-term reproductive abilities in male mammals. The detailed mechanisms by which SSCs self-renew and generate mature spermatozoa are not clear. To understand the specific alternative polyadenylation pattern and global gene expression profile of male germline stem cells (GSCs, here mainly referring to SSCs), we isolated and purified mouse Thy1+ cells from testis by magnetic-activated cell sorting (MACS) and then used the SAPAS method for analysis, using pluripotent embryonic stem cells (ESCs) and differentiated mouse embryonic fibroblast cells (MEFs) as controls. As a result, we obtained 99,944 poly(A) sites, approximately 40% of which were newly detected in our experiments. These poly(A) sites originated from three mouse cell types and covered 17,499 genes, including 831 long non-coding RNA (lncRNA) genes. We observed that GSCs tend to have shorter 3'UTR lengths while MEFs tend towards longer 3'UTR lengths. We also identified 1337 genes that were highly expressed in GSCs, and these genes were highly consistent with the functional characteristics of GSCs. Our detailed bioinformatics analysis identified APA site-switching events at 3'UTRs and many new specifically expressed genes in GSCs, which we experimentally confirmed. Furthermore, qRT-PCR was performed to validate several events among the 334 genes with a distal-to-proximal poly(A) switch in GSCs. Consistently, an APA reporter assay confirmed the overall 3'UTR shortening in GSCs compared to MEFs. We also analyzed the cis elements around the proximal poly(A) site preferentially used in GSCs and found that C-rich elements may contribute to this regulation. Overall, our results identified the expression level and polyadenylation site profiles, and these data provide new insights into the processes potentially involved in the GSC life cycle and spermatogenesis.
Affiliation(s)
- Yan Huang
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
- Yuanyan Xiong
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
  - SYSU-CMU Shunde International Joint Research Institute, Shunde, China
- Zhuoheng Lin
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
- Xuyang Feng
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
- Xue Jiang
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
- Zhou Songyang
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
- Junjiu Huang
  - Key Laboratory of Reproductive Medicine of Guangdong Province, the First Affiliated Hospital and Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 51000, China
|
44
|
Huang G, Huang S, Wang R, Yan X, Li Y, Feng Y, Wang S, Yang X, Chen L, Li J, You L, Chen S, Luo G, Xu A. Dynamic Regulation of Tandem 3' Untranslated Regions in Zebrafish Spleen Cells during Immune Response. THE JOURNAL OF IMMUNOLOGY 2015; 196:715-25. [PMID: 26673144 DOI: 10.4049/jimmunol.1500847] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 04/10/2015] [Accepted: 11/08/2015] [Indexed: 12/24/2022]
Abstract
Alternative polyadenylation (APA) has been found to be involved in tumorigenesis, development, and cell differentiation, as well as in the activation of several subsets of immune cells in vitro. Whether APA takes place in immune responses in vivo is largely unknown. We profiled the variation in tandem 3' untranslated regions (UTRs) in pathogen-challenged zebrafish and identified hundreds of APA genes, with ∼10% being immune response genes. The detected immune response APA genes were enriched in TLR signaling, apoptosis, and JAK-STAT signaling pathways. A greater number of microRNA target sites and AU-rich elements were found in the extended 3' UTRs than in the common 3' UTRs of these APA genes. Further analysis suggested that microRNA and AU-rich element-mediated posttranscriptional regulation plays an important role in modulating the expression of APA genes. These results indicate that APA is extensively involved in immune responses in vivo, and it may be a potential new paradigm for immune regulation.
Affiliation(s)
- Guangrui Huang
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China; State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China; and Department of Genetics and Genome Sciences, Case Western Reserve University School of Medicine, Cleveland, OH 44106
- Shengfeng Huang
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Ruihua Wang
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Xinyu Yan
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Yuxin Li
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Yuchao Feng
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Shaozhou Wang
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Xia Yang
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Liutao Chen
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Jun Li
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Leiming You
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China; State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Shangwu Chen
  - State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
- Guangbin Luo
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China; Department of Genetics and Genome Sciences, Case Western Reserve University School of Medicine, Cleveland, OH 44106
- Anlong Xu
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China; State Key Laboratory of Biocontrol, Department of Biochemistry, School of Life Sciences, Sun Yat-Sen (Zhongshan) University, Guangzhou, Guangdong 510275, People's Republic of China
|
45
|
Tan YD, Deng J, Neilson JR. RAX2: a genome-wide detection method of condition-associated transcription variation. Nucleic Acids Res 2015; 43:e96. [PMID: 25953852 PMCID: PMC4551904 DOI: 10.1093/nar/gkv411] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/20/2015] [Accepted: 04/17/2015] [Indexed: 11/23/2022] Open
Abstract
Most mammalian genes have mRNA variants due to alternative promoter usage, alternative splicing, and alternative cleavage and polyadenylation. Expression of alternative RNA isoforms has been found to be associated with tumorigenesis, proliferation and differentiation. Detection of condition-associated transcription variation requires association methods. Traditional association methods such as Pearson's chi-square test and Fisher's exact test are single-test methods and do not work on count data with replicates. Although the Cochran-Mantel-Haenszel (CMH) approach can handle replicated count data, our simulations showed that multiple CMH tests still had very low power. To identify condition-associated variation of transcription, we propose here a ranking analysis of chi-squares (RAX2) for large-scale association analysis. RAX2 is a nonparametric method and provides an accurate and conservative estimate of the FDR profile. Simulations demonstrated that RAX2 performs well in finding condition-associated transcription variants. We applied RAX2 to primary T-cell transcriptomic data and identified 1610 (16.3%) tags associated in transcription with immune stimulation at FDR < 0.05. Most of these tags also had differential expression. Analysis of two and three tags within genes revealed that under immune stimulation short RNA isoforms were preferentially used.
Affiliation(s)
- Yuan-De Tan
  - Department of Molecular Physiology and Biophysics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Jixin Deng
  - Department of Molecular Physiology and Biophysics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Joel R Neilson
  - Department of Molecular Physiology and Biophysics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA; Dan L. Duncan Cancer Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
|
46
|
Bahrami-Samani E, Vo DT, de Araujo PR, Vogel C, Smith AD, Penalva LOF, Uren PJ. Computational challenges, tools, and resources for analyzing co- and post-transcriptional events in high throughput. WILEY INTERDISCIPLINARY REVIEWS. RNA 2015; 6:291-310. [PMID: 25515586 PMCID: PMC4397117 DOI: 10.1002/wrna.1274] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/16/2014] [Revised: 10/24/2014] [Accepted: 10/29/2014] [Indexed: 11/10/2022]
Abstract
Co- and post-transcriptional regulation of gene expression is complex and multifaceted, spanning the complete RNA lifecycle from genesis to decay. High-throughput profiling of the constituent events and processes is achieved through a range of technologies that continue to expand and evolve. Fully leveraging the resulting data is nontrivial, and requires the use of computational methods and tools carefully crafted for specific data sources and often intended to probe particular biological processes. Drawing upon databases of information pre-compiled by other researchers can further elevate analyses. Within this review, we describe the major co- and post-transcriptional events in the RNA lifecycle that are amenable to high-throughput profiling. We place specific emphasis on the analysis of the resulting data, in particular the computational tools and resources available, as well as looking toward future challenges that remain to be addressed.
Affiliation(s)
- Emad Bahrami-Samani
  - Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
- Dat T. Vo
  - Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
- Patricia Rosa de Araujo
  - Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
- Christine Vogel
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY
- Andrew D. Smith
  - Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
- Luiz O. F. Penalva
  - Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
- Philip J. Uren
  - Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
|
47
|
Rozhkova AV, Filippenkov IB, Sudarkina OY, Limborska SA, Dergunova LV. Alternative promoters located in SGMS1 gene introns participate in regulation of its expression in human tissues. Mol Biol 2015. [DOI: 10.1134/s002689331501015x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
|
48
|
Oleksiewicz U, Tomczak K, Woropaj J, Markowska M, Stępniak P, Shah PK. Computational characterisation of cancer molecular profiles derived using next generation sequencing. Contemp Oncol (Pozn) 2015; 19:A78-91. [PMID: 25691827 PMCID: PMC4322529 DOI: 10.5114/wo.2014.47137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023] Open
Abstract
Our current understanding of cancer genetics is grounded in the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to the malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated using the sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.
Affiliation(s)
- Urszula Oleksiewicz
  - Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland; These authors contributed equally to this paper
- Katarzyna Tomczak
  - Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland; Postgraduate School of Molecular Medicine, Medical University of Warsaw, Warsaw; These authors contributed equally to this paper
- Jakub Woropaj
  - Poznan University of Economics, Poznań, Poland; These authors contributed equally to this paper
- Parantu K Shah
  - Institute for Applied Cancer Science, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
|
49
|
An improved poly(A) motifs recognition method based on decision level fusion. Comput Biol Chem 2014; 54:49-56. [PMID: 25594576 DOI: 10.1016/j.compbiolchem.2014.12.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/16/2014] [Revised: 11/27/2014] [Accepted: 12/27/2014] [Indexed: 01/07/2023]
Abstract
Polyadenylation is the process of adding a poly(A) tail to mRNA 3' ends. Identification of motifs controlling polyadenylation plays an essential role in improving genome annotation accuracy and better understanding of the mechanisms governing gene regulation. The bioinformatics methods used for poly(A) motif recognition have demonstrated that information extracted from sequences surrounding the candidate motifs can effectively differentiate true motifs from false ones. However, these methods depend on either domain features or string kernels. To date, methods combining information from different sources have not yet been reported. Here, we propose an improved poly(A) motif recognition method that combines different sources based on decision level fusion. First, two novel prediction methods were proposed based on support vector machine (SVM): one method is achieved by using the domain-specific features and the principal component analysis (PCA) method to eliminate the redundancy (PCA-SVM); the other method is based on the Oligo string kernel (Oligo-SVM). Then we proposed a novel machine-learning method for poly(A) motif prediction by marrying four poly(A) motif recognition methods, including two state-of-the-art methods (Random Forest (RF) and HMM-SVM), and two novel proposed methods (PCA-SVM and Oligo-SVM). A decision level information fusion method was employed to combine the decision values of different classifiers by applying the DS evidence theory. We evaluated our method on a comprehensive poly(A) dataset that consists of 14,740 samples on 12 variants of poly(A) motifs and 2750 samples containing none of these motifs. Our method achieved an accuracy of up to 86.13%. Compared with the four classifiers, our evidence theory based method reduces the average error rate by about 30%, 27%, 26% and 16%, respectively. The experimental results suggest that the proposed method is more effective for poly(A) motif recognition.
|
50
|
Guan J, Fu J, Wu M, Chen L, Ji G, Quinn Li Q, Wu X. VAAPA: a web platform for visualization and analysis of alternative polyadenylation. Comput Biol Med 2014; 57:20-5. [PMID: 25506822 DOI: 10.1016/j.compbiomed.2014.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2013] [Revised: 11/10/2014] [Accepted: 11/17/2014] [Indexed: 11/19/2022]
Abstract
Polyadenylation [poly(A)] is an essential process during the maturation of most mRNAs in eukaryotes. Alternative polyadenylation (APA) has been increasingly recognized as an important layer of gene expression regulation in various species. Here, a web platform for visualization and analysis of alternative polyadenylation (VAAPA) was developed. This platform can visualize the distribution of poly(A) sites and poly(A) clusters of a gene or a section of a chromosome. It can also highlight genes with switched APA sites among different conditions. VAAPA is an easy-to-use web-based tool that provides functions for poly(A) site query, data uploading, downloading, and APA site visualization. It was designed in a multi-tier architecture and developed based on Smart GWT (Google Web Toolkit) using Java as the development language. VAAPA will be a valuable addition to the community for the comprehensive study of APA, not only by making the high-quality poly(A) site data more accessible, but also by providing users with numerous valuable functions for poly(A) site analysis and visualization.
Affiliation(s)
- Jinting Guan
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China
- Jingyi Fu
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China
- Mingcheng Wu
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China
- Longteng Chen
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China
- Guoli Ji
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China; Innovation Center for Cell Biology, Xiamen University, Xiamen 361102, Fujian, China
- Qingshun Quinn Li
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, and College of the Environment and Ecology, Xiamen University, Xiamen 361102, Fujian, China; Department of Biology, Miami University, Oxford, OH 45056, USA
- Xiaohui Wu
  - Department of Automation, Xiamen University, Xiamen 361005, Fujian, China
|