1
|
Tian Q, Zou Q, Jia L. Benchmarking of methods that identify alternative polyadenylation events in single-/multiple-polyadenylation site genes. NAR Genom Bioinform 2025; 7:lqaf056. [PMID: 40371010 PMCID: PMC12076406 DOI: 10.1093/nargab/lqaf056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2024] [Revised: 04/23/2025] [Accepted: 05/01/2025] [Indexed: 05/16/2025] Open
Abstract
Alternative polyadenylation (APA) is a widespread post-transcriptional mechanism that diversifies gene expression by generating messenger RNA isoforms with varying 3' untranslated regions. Accurate identification and quantification of transcriptome-wide polyadenylation site (PAS) usage are essential for understanding APA-mediated gene regulation and its biological implications. In this review, we first review the landscape of computational tools developed to identify APA events from RNA sequencing (RNA-seq) data. We then benchmarked five PAS prediction tools and seven APA detection algorithms using five RNA-seq datasets derived from clear cell renal cell carcinoma (ccRCC) and adjacent normal tissues. By evaluating tool performance across genes with either single or multiple PASs, we revealed substantial variation in accuracy, sensitivity, and consistency among the tools. Based on this comparative analysis, we offer practical guidelines for tool selection and propose considerations for improving APA detection accuracy. Additionally, our analysis identified CCNL2 as a candidate gene exhibiting significant APA regulation in ccRCC, highlighting its potential as a disease-associated biomarker.
Collapse
Affiliation(s)
- Qiuxiang Tian
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Quan Zou
- School of Information Technology and Administration, Hunan University of Finance and Economics, Changsha, 410205, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, 324000, China
| | - Linpei Jia
- Department of Nephrology, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Beijing, 100053, China
| |
Collapse
|
2
|
Sun J, Kim JY, Jun S, Park M, de Jong E, Chang JW, Cheng S, Fan D, Chen Y, Griffin TJ, Lee JH, You HJ, Zhang W, Yong J. Dichotomous intronic polyadenylation profiles reveal multifaceted gene functions in the pan-cancer transcriptome. Exp Mol Med 2024; 56:2145-2161. [PMID: 39349823 PMCID: PMC11541570 DOI: 10.1038/s12276-024-01289-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 11/08/2024] Open
Abstract
Alternative cleavage and polyadenylation within introns (intronic APA) generate shorter mRNA isoforms; however, their physiological significance remains elusive. In this study, we developed a comprehensive workflow to analyze intronic APA profiles using the mammalian target of rapamycin (mTOR)-regulated transcriptome as a model system. Our investigation revealed two contrasting effects within the transcriptome in response to fluctuations in cellular mTOR activity: an increase in intronic APA for a subset of genes and a decrease for another subset of genes. The application of this workflow to RNA-seq data from The Cancer Genome Atlas demonstrated that this dichotomous intronic APA pattern is a consistent feature in transcriptomes across both normal tissues and various cancer types. Notably, our analyses of protein length changes resulting from intronic APA events revealed two distinct phenomena in proteome programming: a loss of functional domains due to significant changes in protein length or minimal alterations in C-terminal protein sequences within unstructured regions. Focusing on conserved intronic APA events across 10 different cancer types highlighted the prevalence of the latter cases in cancer transcriptomes, whereas the former cases were relatively enriched in normal tissue transcriptomes. These observations suggest potential, yet distinct, roles for intronic APA events during pathogenic processes and emphasize the abundance of protein isoforms with similar lengths in the cancer proteome. Furthermore, our investigation into the isoform-specific functions of JMJD6 intronic APA events supported the hypothesis that alterations in unstructured C-terminal protein regions lead to functional differences. Collectively, our findings underscore intronic APA events as a discrete molecular signature present in both normal tissues and cancer transcriptomes, highlighting the contribution of APA to the multifaceted functionality of the cancer proteome.
Collapse
Affiliation(s)
- Jiao Sun
- Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
| | - Jin-Young Kim
- Department of Pharmacology, Chosun University School of Medicine, Gwangju, 61452, Republic of Korea
| | - Semo Jun
- Department of Cellular and Molecular Medicine, Chosun University School of Medicine, Gwangju, 61452, Republic of Korea
| | - Meeyeon Park
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
| | - Ebbing de Jong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
- SUNY Upstate Medical University, Syracuse, NY, 13210, USA
| | - Jae-Woong Chang
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
| | - Sze Cheng
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
| | - Deliang Fan
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Yue Chen
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
| | - Timothy J Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA
| | - Jung-Hee Lee
- Department of Cellular and Molecular Medicine, Chosun University School of Medicine, Gwangju, 61452, Republic of Korea
| | - Ho Jin You
- Department of Pharmacology, Chosun University School of Medicine, Gwangju, 61452, Republic of Korea.
| | - Wei Zhang
- Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA.
| | - Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, 55455, USA.
| |
Collapse
|
3
|
Bi X, Ye W, Cheng X, Yang N, Wu X. vizAPA: visualizing dynamics of alternative polyadenylation from bulk and single-cell data. Bioinformatics 2024; 40:btae099. [PMID: 38485700 PMCID: PMC10950478 DOI: 10.1093/bioinformatics/btae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 01/11/2024] [Indexed: 03/21/2024] Open
Abstract
MOTIVATION Alternative polyadenylation (APA) is a widespread post-transcriptional regulatory mechanism across all eukaryotes. With the accumulation of genome-wide APA sites, especially those with single-cell resolution, it is imperative to develop easy-to-use visualization tools to guide APA analysis. RESULTS We developed an R package called vizAPA for visualizing APA dynamics from bulk and single-cell data. vizAPA implements unified data structures for APA data and genome annotations. vizAPA also enables identification of genes with differential APA usage across biological samples and/or cell types. vizAPA provides four unique modules for extensively visualizing APA dynamics across biological samples and at the single-cell level. vizAPA could serve as a plugin in many routine APA analysis pipelines to augment studies for APA dynamics. AVAILABILITY AND IMPLEMENTATION https://github.com/BMILAB/vizAPA.
Collapse
Affiliation(s)
- Xingyu Bi
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Wenbin Ye
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA 92697, United States
| | - Xin Cheng
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Ning Yang
- College of Industrial Design, Pukyong National University, Busan 48513, Korea
| | - Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| |
Collapse
|
4
|
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. RNA (NEW YORK, N.Y.) 2023; 29:1839-1855. [PMID: 37816550 PMCID: PMC10653393 DOI: 10.1261/rna.079849.123] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/12/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Collapse
Affiliation(s)
- Sam Bryce-Smith
- Department of Neuromuscular Diseases, UCL Queen Square Motor Neuron Disease Centre, UCL Queen Square Institute of Neurology, UCL, London WC1N 3BG, United Kingdom
| | - Dominik Burri
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Matthew R Gazzara
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Christina J Herrmann
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Weronika Danecka
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
| | - Christina M Fitzsimmons
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Yuk Kei Wan
- Genome Institute of Singapore, Buona Vista, Singapore 138672
- Yong Loo Lin School of Medicine, National University of Singapore, Kent Ridge, Singapore 119228
| | - Farica Zhuang
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Mervin M Fansler
- Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, New York 10065, USA
- Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, New York 10065, USA
| | - José M Fernández
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
| | - Meritxell Ferret
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
| | - Asier Gonzalez-Uriarte
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
| | - Samuel Haynes
- Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
| | - Chelsea Herdman
- Department of Neurobiology, University of Utah, Salt Lake City, Utah 84132, USA
| | - Alexander Kanitz
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Maria Katsantoni
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Federico Marini
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, 55118 Mainz, Germany
| | - Euan McDonnel
- Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9NL, United Kingdom
| | - Ben Nicolet
- Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, 1066 CX Amsterdam, The Netherlands
- Oncode Institute, 3521 AL Utrecht, The Netherlands
| | - Chi-Lam Poon
- Graduate School of Medical Sciences, Weill Cornell Medicine, New York, New York 10065, USA
| | - Gregor Rot
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
| | - Leonard Schärfen
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut 06520, USA
| | - Pin-Jou Wu
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72076 Tübingen, Germany
| | - Yoseop Yoon
- Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California 92617, USA
| | - Yoseph Barash
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Mihaela Zavolan
- Biozentrum, University of Basel, 4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
Collapse
|
5
|
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Collapse
Affiliation(s)
- Wenbin Ye
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Qiwei Lian
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
| | - Congting Ye
- Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
| | - Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China.
| |
Collapse
|