1. Monzó C, Liu T, Conesa A. Transcriptomics in the era of long-read sequencing. Nat Rev Genet 2025:10.1038/s41576-025-00828-z. [PMID: 40155769 DOI: 10.1038/s41576-025-00828-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/20/2025] [Indexed: 04/01/2025]
Abstract
Transcriptome sequencing revolutionized the analysis of gene expression, providing an unbiased approach to gene detection and quantification that enabled the discovery of novel isoforms, alternative splicing events and fusion transcripts. However, although short-read sequencing technologies have surpassed the limited dynamic range of previous technologies such as microarrays, they have limitations, for example, in resolving full-length transcripts and complex isoforms. Over the past 5 years, long-read sequencing technologies have matured considerably, with improvements in instrumentation and analytical methods, enabling their application to RNA sequencing (RNA-seq). Benchmarking studies are beginning to identify the strengths and limitations of long-read RNA-seq, although there remains a need for comprehensive resources to guide newcomers through the intricacies of this approach. In this Review, we provide a comprehensive overview of the long-read RNA-seq workflow, from library preparation and sequencing challenges to core data processing, downstream analyses and emerging developments. We present an extensive inventory of experimental and analytical methods and discuss current challenges and prospects.
Affiliation(s)
- Carolina Monzó
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Tianyuan Liu
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
2. Song Y, Zhang C, Omenn GS, O’Meara MJ, Welch JD. Predicting the Structural Impact of Human Alternative Splicing. bioRxiv 2023:2023.12.21.572928. [PMID: 38187531 PMCID: PMC10769328 DOI: 10.1101/2023.12.21.572928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction. We identified examples of how alternative splicing induced clear changes in each of these properties. Structural similarity between isoforms largely correlated with degree of sequence identity, but we identified a subset of isoforms with low structural similarity despite high sequence similarity. Exon skipping and alternative last exons tended to increase the surface charge and radius of gyration. Splicing also buried or exposed numerous post-translational modification sites, most notably among the isoforms of BAX. Functional prediction nominated numerous functional differences among isoforms of the same gene, with loss of function compared to the reference predominating. Finally, we used single-cell RNA-seq data from the Tabula Sapiens to determine the cell types in which each structure is expressed. Our work represents an important resource for studying the structure and function of splice isoforms across the cell types of the human body.
Affiliation(s)
- Yuxuan Song
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Matthew J. O’Meara
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Joshua D. Welch
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
3. Dam SH, Olsen LR, Vitting-Seerup K. Expression and splicing mediate distinct biological signals. BMC Biol 2023; 21:220. [PMID: 37858135 PMCID: PMC10588054 DOI: 10.1186/s12915-023-01724-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Accepted: 10/04/2023] [Indexed: 10/21/2023] [Open access]
Abstract
Background: Through alternative splicing, most human genes produce multiple isoforms in a cell-, tissue-, and disease-specific manner. Numerous studies show that alternative splicing is essential for development, diseases, and their treatments. Despite these important examples, the extent and biological relevance of splicing are currently unknown.
Results: To solve this problem, we developed pairedGSEA and used it to profile transcriptional changes in 100 representative RNA-seq datasets. Our systematic analysis demonstrates that changes in splicing, on average, contribute to 48.1% of the biological signal in expression analyses. Gene-set enrichment analysis furthermore indicates that expression and splicing both convey shared and distinct biological signals.
Conclusions: These findings establish alternative splicing as a major regulator of the human condition and suggest that most contemporary RNA-seq studies likely miss out on critical biological insights. We anticipate our results will contribute to the transition from a gene-centric to an isoform-centric research paradigm.
Affiliation(s)
- Søren Helweg Dam
- Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Lars Rønn Olsen
- Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Kristoffer Vitting-Seerup
- Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
4. Biswas D, Shenoy SV, Chetanya C, Lachén-Montes M, Barpanda A, Athithyan AP, Ghosh S, Ausín K, Zelaya MV, Fernández-Irigoyen J, Manna A, Roy S, Talukdar A, Ball GR, Santamaría E, Srivastava S. Deciphering the Interregional and Interhemisphere Proteome of the Human Brain in the Context of the Human Proteome Project. J Proteome Res 2021; 20:5280-5293. [PMID: 34714085 DOI: 10.1021/acs.jproteome.1c00511] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/09/2023]
Abstract
This study, which performs an extensive mass spectrometry-based analysis of 19 brain regions from both left and right hemispheres, presents the first draft of the human brain interhemispheric proteome. This high-resolution proteomics data provides comprehensive coverage of 3300 experimentally measured (nonhypothetical) proteins across multiple regions, allowing the characterization of protein-centric interhemispheric differences and synapse biology, and portrays the regional mapping of specific regions for brain disorder biomarkers. In the context of the Human Proteome Project (HPP), the interhemispheric proteome data reveal specific markers like chimerin 2 (CHN2) in the cerebellar vermis, olfactory marker protein (OMP) in the olfactory bulb, and ankyrin repeat domain 63 (ANKRD63) in basal ganglia, in line with regional brain transcriptomes mapped in the Human Protein Atlas (HPA). In addition, an in silico analysis pipeline was used to predict the structure and function of the uncharacterized uPE1 protein ANKRD63, and parallel reaction monitoring (PRM) was applied to validate its region-specific expression. Finally, we have built the Interhemispheric Brain Proteome Map (IBPM) Portal (www.brainprot.org) to stimulate the scientific community's interest in the brain molecular landscape and accelerate and support research in neuroproteomics. Data are available via ProteomeXchange with identifier PXD019936.
Affiliation(s)
- Deeptarup Biswas
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Sanjyot Vinayak Shenoy
- Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Chetanya Chetanya
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Mercedes Lachén-Montes
- Clinical Neuroproteomics Unit, Proteomics Platform, Proteored-ISCIII, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), Navarra Institute for Health Research (IdiSNA), 31008 Pamplona, Spain
- Abhilash Barpanda
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Susmita Ghosh
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Karina Ausín
- Clinical Neuroproteomics Unit, Proteomics Platform, Proteored-ISCIII, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), Navarra Institute for Health Research (IdiSNA), 31008 Pamplona, Spain
- María Victoria Zelaya
- Clinical Neuroproteomics Unit, Proteomics Platform, Proteored-ISCIII, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), Navarra Institute for Health Research (IdiSNA), 31008 Pamplona, Spain
- Joaquín Fernández-Irigoyen
- Clinical Neuroproteomics Unit, Proteomics Platform, Proteored-ISCIII, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), Navarra Institute for Health Research (IdiSNA), 31008 Pamplona, Spain
- Akash Manna
- Medicine Department, Medical College Hospital Kolkata, 88 College Street, Kolkata 700072, India
- Sudesh Roy
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
- Arunasu Talukdar
- Medicine Department, Medical College Hospital Kolkata, 88 College Street, Kolkata 700072, India
- Graham Roy Ball
- School of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham NG11 8NS, United Kingdom
- Enrique Santamaría
- Clinical Neuroproteomics Unit, Proteomics Platform, Proteored-ISCIII, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), Navarra Institute for Health Research (IdiSNA), 31008 Pamplona, Spain
- Sanjeeva Srivastava
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
5. Chen H, Shaw D, Bu D, Jiang T. FINER: enhancing the prediction of tissue-specific functions of isoforms by refining isoform interaction networks. NAR Genom Bioinform 2021; 3:lqab057. [PMID: 34169280 PMCID: PMC8219044 DOI: 10.1093/nargab/lqab057] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/04/2021] [Revised: 05/18/2021] [Accepted: 06/03/2021] [Indexed: 12/24/2022] [Open access]
Abstract
Annotating the functions of gene products is a mainstay in biology. A variety of databases have been established to record functional knowledge at the gene level. However, functional annotations at the isoform resolution are in great demand in many biological applications. Although critical information in biological processes such as protein-protein interactions (PPIs) is often used to study gene functions, it does not directly help differentiate the functions of isoforms, as the 'proteins' in the existing PPIs generally refer to 'genes'. On the other hand, the prediction of isoform functions and prediction of isoform-isoform interactions, though inherently intertwined, have so far been treated as independent computational problems in the literature. Here, we present FINER, a unified framework to jointly predict isoform functions and refine PPIs from the gene level to the isoform level, enabling both tasks to benefit from each other. Extensive computational experiments on human tissue-specific data demonstrate that FINER is able to gain at least 5.16% in AUC and 15.1% in AUPRC for functional prediction across multiple tissues by refining noisy PPIs, resulting in significant improvement over the state-of-the-art methods. Some in-depth analyses reveal consistency between FINER's predictions and the tissue specificity as well as subcellular localization of isoforms.
Affiliation(s)
- Hao Chen
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Dipan Shaw
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Dongbo Bu
- Key Lab of Intelligent Information Process, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
6. Tseng CC, Wong MC, Liao WT, Chen CJ, Lee SC, Yen JH, Chang SJ. Genetic Variants in Transcription Factor Binding Sites in Humans: Triggered by Natural Selection and Triggers of Diseases. Int J Mol Sci 2021; 22:4187. [PMID: 33919522 PMCID: PMC8073710 DOI: 10.3390/ijms22084187] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/27/2021] [Revised: 04/15/2021] [Accepted: 04/16/2021] [Indexed: 12/14/2022] [Open access]
Abstract
Variants of transcription factor binding sites (TFBSs) constitute an important part of the human genome. Current evidence demonstrates close links between nucleotides within TFBSs and gene expression. There are multiple pathways through which genomic sequences located in TFBSs regulate gene expression, and recent genome-wide association studies have shown the biological significance of TFBS variation in human phenotypes. However, numerous challenges remain in the study of TFBS polymorphisms. This article aims to cover the current state of understanding as regards the genomic features of TFBSs and TFBS variants; the mechanisms through which TFBS variants regulate gene expression; the approaches to studying the effects of nucleotide changes that create or disrupt TFBSs; the challenges faced in studies of TFBS sequence variations; the effects of natural selection on collections of TFBSs; in addition to the insights gained from the study of TFBS alleles related to gout, its associated comorbidities (increased body mass index, chronic kidney disease, diabetes, dyslipidemia, coronary artery disease, ischemic heart disease, hypertension, hyperuricemia, osteoporosis, and prostate cancer), and the treatment responses of patients.
Affiliation(s)
- Chia-Chun Tseng
- Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Division of Rheumatology, Department of Internal Medicine, Kaohsiung Medical University Hospital, Kaohsiung 80756, Taiwan
- Man-Chun Wong
- Department of Biotechnology, College of Life Science, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Wei-Ting Liao
- Department of Biotechnology, College of Life Science, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung 80756, Taiwan
- Correspondence: (W.-T.L.); (S.-J.C.); Tel.: +886-7-3121101 (W.-T.L.); +886-7-5916679 (S.-J.C.); Fax: +886-7-3125339 (W.-T.L.); +886-7-5919264 (S.-J.C.)
- Chung-Jen Chen
- Department of Internal Medicine, Kaohsiung Municipal Ta-Tung Hospital, Kaohsiung 80145, Taiwan
- Su-Chen Lee
- Laboratory Diagnosis of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Jeng-Hsien Yen
- Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Division of Rheumatology, Department of Internal Medicine, Kaohsiung Medical University Hospital, Kaohsiung 80756, Taiwan
- Institute of Biomedical Sciences, National Sun Yat-Sen University, Kaohsiung 80424, Taiwan
- Department of Biological Science and Technology, National Chiao-Tung University, Hsinchu 30010, Taiwan
- Shun-Jen Chang
- Department of Kinesiology, Health and Leisure Studies, National University of Kaohsiung, Kaohsiung 81148, Taiwan