1
Monzó C, Liu T, Conesa A. Transcriptomics in the era of long-read sequencing. Nat Rev Genet 2025:10.1038/s41576-025-00828-z. [PMID: 40155769 DOI: 10.1038/s41576-025-00828-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/20/2025] [Indexed: 04/01/2025]
Abstract
Transcriptome sequencing revolutionized the analysis of gene expression, providing an unbiased approach to gene detection and quantification that enabled the discovery of novel isoforms, alternative splicing events and fusion transcripts. However, although short-read sequencing technologies have surpassed the limited dynamic range of previous technologies such as microarrays, they have limitations, for example, in resolving full-length transcripts and complex isoforms. Over the past 5 years, long-read sequencing technologies have matured considerably, with improvements in instrumentation and analytical methods, enabling their application to RNA sequencing (RNA-seq). Benchmarking studies are beginning to identify the strengths and limitations of long-read RNA-seq, although there remains a need for comprehensive resources to guide newcomers through the intricacies of this approach. In this Review, we provide a comprehensive overview of the long-read RNA-seq workflow, from library preparation and sequencing challenges to core data processing, downstream analyses and emerging developments. We present an extensive inventory of experimental and analytical methods and discuss current challenges and prospects.
Affiliation(s)
- Carolina Monzó
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain.
- Tianyuan Liu
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain
- Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Valencia, Spain.
2
Wang T, Ji Z, Xiao X, Zhu D, Li H, Li X. Identification of reproduction-related genes in the hypothalamus of sheep (Ovis aries) using the nanopore full-length transcriptome sequencing technology. Sci Rep 2024; 14:27884. [PMID: 39537852 PMCID: PMC11561102 DOI: 10.1038/s41598-024-79140-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024]
Abstract
The hypothalamus is the coordination center of the sheep (Ovis aries) endocrine system and plays an important role in the reproductive processes of sheep. However, the specific mechanism by which the hypothalamus affects sheep reproductive performance remains unclear. In this study, hypothalamus tissues of high-reproduction small-tailed Han sheep and low-reproduction Wadi sheep were collected, and full-length transcriptome sequencing by Oxford Nanopore Technologies (ONT) was performed to explore the key functional genes associated with sheep fecundity. Differentially expressed genes (DEGs) were screened using DESeq2 and then analyzed for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment. Approximately 41.75 million clean reads were obtained from the hypothalamus tissues of high- and low-reproduction sheep. After quality control, 32,194,872 high-quality full-length sequences and 2,114 DEGs were obtained, including 1,247 upregulated genes and 867 downregulated genes (adjusted P < 0.05, |log2FC| > 1). Some DEGs were enriched in oocyte meiosis, progesterone-mediated oocyte maturation, the estrogen signaling pathway, the GnRH signaling pathway and other development-related signaling pathways. The constructed protein-protein interaction (PPI) networks identified reproduction-related genes such as GSK3B, PPP2R1B, and PPP2CB. The results of this study enrich and supplement the genomic information available for small-tailed Han sheep and Wadi sheep, expand the understanding of the molecular mechanisms by which the hypothalamus regulates animal reproduction, and provide reference data for further investigations of the mechanism of high reproduction in sheep.
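The screening cutoffs quoted in this abstract (adjusted P below 0.05 and absolute log2 fold change above 1) can be illustrated with a short sketch. The gene names, values, and the `classify_deg` helper are hypothetical and are not taken from the study; this only shows how such thresholds are typically applied to a results table.

```python
def classify_deg(padj, log2fc, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return 'up', 'down', or None for one gene, using the quoted cutoffs."""
    # Genes without an adjusted P value, or above the cutoff, are not called.
    if padj is None or padj >= padj_cutoff:
        return None
    if log2fc > lfc_cutoff:
        return "up"
    if log2fc < -lfc_cutoff:
        return "down"
    return None

# Hypothetical results table: (gene, adjusted P value, log2 fold change).
results = [
    ("geneA", 0.001, 2.3),   # significant, strongly up
    ("geneB", 0.030, -1.8),  # significant, strongly down
    ("geneC", 0.200, 3.0),   # large change but not significant
    ("geneD", 0.010, 0.4),   # significant but small change
]

calls = {gene: classify_deg(padj, lfc) for gene, padj, lfc in results}
n_up = sum(1 for call in calls.values() if call == "up")
n_down = sum(1 for call in calls.values() if call == "down")
print(n_up, n_down)
```

Only genes passing both cutoffs are counted, which is why `geneC` and `geneD` are left uncalled in this toy table.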
Affiliation(s)
- Tong Wang
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China
- Zhibin Ji
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China.
- Xue Xiao
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China
- Dejie Zhu
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China
- Hengyi Li
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China
- Xinyu Li
- Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Shandong Provincial Key laboratory for Livestock Germplasm Innovation and Utilization, College of Animal Science and Technology, Shandong Agricultural University, No. 61 Daizong Road, Taian, 271018, Shandong, People's Republic of China
3
Kabza M, Ritter A, Byrne A, Sereti K, Le D, Stephenson W, Sterne-Weiler T. Accurate long-read transcript discovery and quantification at single-cell, pseudo-bulk and bulk resolution with Isosceles. Nat Commun 2024; 15:7316. [PMID: 39183289 PMCID: PMC11345431 DOI: 10.1038/s41467-024-51584-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 08/07/2024] [Indexed: 08/27/2024]
Abstract
Accurate detection and quantification of mRNA isoforms from nanopore long-read sequencing remains challenged by technical noise, particularly in single cells. To address this, we introduce Isosceles, a computational toolkit that outperforms other methods in isoform detection sensitivity and quantification accuracy across single-cell, pseudo-bulk and bulk resolution levels, as demonstrated using synthetic and biologically-derived datasets. Here we show Isosceles improves the fidelity of single-cell transcriptome quantification at the isoform-level, and enables flexible downstream analysis. As a case study, we apply Isosceles, uncovering coordinated splicing within and between neuronal differentiation lineages. Isosceles is suitable to be applied in diverse biological systems, facilitating studies of cellular heterogeneity across biomedical research applications.
Affiliation(s)
- Michal Kabza
- Roche Informatics, F. Hoffmann-La Roche Ltd, Poznań, Poland
- Alexander Ritter
- Computational Biology & Translation, Genentech Inc., South San Francisco, CA, USA
- Ashley Byrne
- Department of Next Generation Sequencing and Microchemistry, Proteomics and Lipidomics, Genentech Inc., South San Francisco, CA, USA
- Kostianna Sereti
- Department of Discovery Oncology, Genentech Inc., South San Francisco, CA, USA
- Daniel Le
- Department of Next Generation Sequencing and Microchemistry, Proteomics and Lipidomics, Genentech Inc., South San Francisco, CA, USA
- William Stephenson
- Department of Next Generation Sequencing and Microchemistry, Proteomics and Lipidomics, Genentech Inc., South San Francisco, CA, USA
- Timothy Sterne-Weiler
- Computational Biology & Translation, Genentech Inc., South San Francisco, CA, USA.
- Department of Discovery Oncology, Genentech Inc., South San Francisco, CA, USA.
4
Engal E, Sharma A, Aviel U, Taqatqa N, Juster S, Jaffe-Herman S, Bentata M, Geminder O, Gershon A, Lewis R, Kay G, Hecht M, Epsztejn-Litman S, Gotkine M, Mouly V, Eiges R, Salton M, Drier Y. DNMT3B splicing dysregulation mediated by SMCHD1 loss contributes to DUX4 overexpression and FSHD pathogenesis. Sci Adv 2024; 10:eadn7732. [PMID: 38809976 PMCID: PMC11135424 DOI: 10.1126/sciadv.adn7732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 04/25/2024] [Indexed: 05/31/2024]
Abstract
Structural maintenance of chromosomes flexible hinge domain-containing 1 (SMCHD1) is a noncanonical SMC protein and an epigenetic regulator. Mutations in SMCHD1 cause facioscapulohumeral muscular dystrophy (FSHD) through overexpression of DUX4 in muscle cells. Here, we demonstrate that SMCHD1 is a key regulator of alternative splicing in various cell types. We show how SMCHD1 loss causes splicing alterations of DNMT3B, which can lead to hypomethylation and DUX4 overexpression. Analyzing RNA sequencing data from muscle biopsies of patients with FSHD and from Smchd1 knockout cells, we found mis-splicing of hundreds of genes upon SMCHD1 loss. We conducted a high-throughput screen of splicing factors, revealing the involvement of the splicing factor RBM5 in the mis-splicing of DNMT3B. Subsequent RNA immunoprecipitation experiments confirmed that SMCHD1 is required for RBM5 recruitment. Last, we show that mis-splicing of DNMT3B leads to hypomethylation of the D4Z4 region and to DUX4 overexpression. These results suggest that DNMT3B mis-splicing due to SMCHD1 loss plays a major role in FSHD pathogenesis.
Affiliation(s)
- Eden Engal
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Department of Military Medicine and “Tzameret”, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Aveksha Sharma
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Uria Aviel
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Stem Cell Research Laboratory, Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem 9103102, Israel
- Nadeen Taqatqa
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Sarah Juster
- Stem Cell Research Laboratory, Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem 9103102, Israel
- Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Shiri Jaffe-Herman
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Mercedes Bentata
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Ophir Geminder
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Department of Military Medicine and “Tzameret”, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Adi Gershon
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Reyut Lewis
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Gillian Kay
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Merav Hecht
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Silvina Epsztejn-Litman
- Stem Cell Research Laboratory, Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem 9103102, Israel
- Marc Gotkine
- Department of Neurology, Hadassah Medical Organization and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem 9112002, Israel
- Vincent Mouly
- UPMC University Paris 06, Inserm UMRS974, CNRS FRE3617, Center for Research in Myology, Sorbonne University, 75252 Paris, France
- Rachel Eiges
- Stem Cell Research Laboratory, Medical Genetics Institute, Shaare Zedek Medical Center, Jerusalem 9103102, Israel
- Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Maayan Salton
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Yotam Drier
- The Lautenberg Center for Immunology and Cancer Research, The Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
5
Su Y, Yu Z, Jin S, Ai Z, Yuan R, Chen X, Xue Z, Guo Y, Chen D, Liang H, Liu Z, Liu W. Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data. Nat Commun 2024; 15:3972. [PMID: 38730241 PMCID: PMC11087464 DOI: 10.1038/s41467-024-48117-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 04/19/2024] [Indexed: 05/12/2024]
Abstract
The advancement of Long-Read Sequencing (LRS) techniques has significantly increased the length of sequencing to several kilobases, thereby facilitating the identification of alternative splicing events and isoform expression. Recently, numerous computational tools for isoform detection using long-read sequencing data have been developed. Nevertheless, there remains a deficiency in comparative studies that systematically evaluate the performance of these tools, which are implemented with different algorithms, under various simulations that encompass potential influencing factors. In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq data. We evaluated their performance using simulated data, which represented diverse sequencing platforms generated by an in-house simulator, RNA sequins (sequencing spike-ins) data, as well as experimental data. Our findings demonstrate IsoQuant as a highly effective tool for isoform detection with LRS, with Bambu and StringTie2 also exhibiting strong performance. These results offer valuable guidance for future research on alternative splicing analysis and the ongoing improvement of tools for isoform detection using LRS data.
Affiliation(s)
- Yaqi Su
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Zhejian Yu
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Siqian Jin
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Zhipeng Ai
- Division of Human Reproduction and Developmental Genetics, Women's Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310006, Zhejiang, China
- Ruihong Yuan
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Xinyi Chen
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Ziwei Xue
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Yixin Guo
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Di Chen
- Center for Reproductive Medicine of the Second Affiliated Hospital Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
- Centre for Regeneration and Cell Therapy of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Hongqing Liang
- Division of Human Reproduction and Developmental Genetics, Women's Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310006, Zhejiang, China
- Zuozhu Liu
- Zhejiang University-Angel Align Inc. R&D Center for Intelligent Healthcare, Zhejiang University-University of Illinois at Urbana-Champaign Institute (ZJU-UIUC Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China
- Wanlu Liu
- Department of Orthopedic Surgery of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China.
- Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), International Campus, Zhejiang University, Haining, 314400, Zhejiang, China.
- Future Health Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314100, China.
- Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
6
Koster CC, Kleefeldt AA, van den Broek M, Luttik M, Daran JM, Daran-Lapujade P. Long-read direct RNA sequencing of the mitochondrial transcriptome of Saccharomyces cerevisiae reveals condition-dependent intron abundance. Yeast 2024; 41:256-278. [PMID: 37642136 DOI: 10.1002/yea.3893] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/31/2023] [Revised: 07/11/2023] [Accepted: 07/18/2023] [Indexed: 08/31/2023]
Abstract
Mitochondria fulfil many essential roles and have their own genome, which is expressed as polycistronic transcripts that undergo co- or posttranscriptional processing and splicing. Due to the inherent complexity and limited technical accessibility of the mitochondrial transcriptome, fundamental questions regarding mitochondrial gene expression and splicing remain unresolved, even in the model eukaryote Saccharomyces cerevisiae. Long-read sequencing could address these fundamental questions. Therefore, a method for the enrichment of mitochondrial RNA and sequencing using Nanopore technology was developed, enabling the resolution of splicing of polycistronic genes and the quantification of spliced RNA. This method successfully captured the full mitochondrial transcriptome and resolved RNA splicing patterns with single-base resolution. It was applied to explore the transcriptome of S. cerevisiae grown with glucose or ethanol as the sole carbon source, revealing the impact of growth conditions on mitochondrial RNA expression and splicing. This study uncovered a remarkable difference in the turnover of Group II introns between yeast grown in either mostly fermentative or fully respiratory conditions. Whether this accumulation of introns in glucose medium has an impact on mitochondrial functions remains to be explored. Combined with the high tractability of the model yeast S. cerevisiae, the developed method enables monitoring of mitochondrial transcriptome responses in a broad range of relevant contexts, including oxidative stress, apoptosis and mitochondrial diseases.
Affiliation(s)
- Charlotte C Koster
- Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Askar A Kleefeldt
- Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Marcel van den Broek
- Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Marijke Luttik
- Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Jean-Marc Daran
- Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
7
Liu Z, Zhu C, Steinmetz LM, Wei W. Identification and quantification of small exon-containing isoforms in long-read RNA sequencing data. Nucleic Acids Res 2023; 51:e104. [PMID: 37843096 PMCID: PMC10639058 DOI: 10.1093/nar/gkad810] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/09/2023] [Revised: 08/03/2023] [Accepted: 09/20/2023] [Indexed: 10/17/2023]
Abstract
Small exons are pervasive in transcriptomes across organisms, and their quantification in RNA isoforms is crucial for understanding gene functions. Although long-read RNA-seq based on Oxford Nanopore Technologies (ONT) offers the advantage of covering transcripts in full length, its lower base accuracy poses challenges for identifying individual exons, particularly microexons (≤ 30 nucleotides). Here, we systematically assess small-exon quantification in synthetic and human ONT RNA-seq datasets. We demonstrate that reads containing small exons are often not properly aligned, affecting the quantification of relevant transcripts. Thus, we develop a local-realignment method for misaligned exons (MisER), which remaps reads with misaligned exons to the transcript references. Using synthetic and simulated datasets, we demonstrate the high sensitivity and specificity of MisER for the quantification of transcripts containing small exons. Moreover, MisER enabled us to identify small exons with a higher percent spliced-in index (PSI) in neural tissues, particularly neural-regulated microexons, when comparing 14 neural to 16 non-neural tissues in humans. Our work introduces an improved quantification method for long-read RNA-seq and especially facilitates studies using ONT long reads to elucidate the regulation of genes involving small exons.
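The percent spliced-in index (PSI) mentioned in this abstract is conventionally the share of reads that include an exon among all reads informative for that exon. The sketch below applies that standard definition to long reads that either contain or skip a small exon; the function name and read counts are hypothetical, and it is not the MisER implementation.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """PSI (%) = inclusion / (inclusion + exclusion) * 100, or None if no reads."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        # No informative reads: PSI is undefined rather than zero.
        return None
    return 100.0 * inclusion_reads / total

# Hypothetical counts for one microexon in two tissues.
psi_neural = percent_spliced_in(inclusion_reads=45, exclusion_reads=15)
psi_non_neural = percent_spliced_in(inclusion_reads=5, exclusion_reads=45)
print(psi_neural, psi_non_neural)
```

Because PSI depends on reads being assigned to the inclusion or exclusion form, a read whose small exon is misaligned would be counted on the wrong side, which is the kind of bias the abstract describes.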
Affiliation(s)
- Zhen Liu
- Lingang Laboratory, Shanghai, Shanghai 200031, China
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, Shanghai 200031, China
- Chenchen Zhu
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305, USA
- Lars M Steinmetz
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305, USA
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304, USA
- Wu Wei
- Lingang Laboratory, Shanghai, Shanghai 200031, China
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, Shanghai 200031, China
- Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai, Shanghai 200040, China
8
Dong X, Du MRM, Gouil Q, Tian L, Jabbari JS, Bowden R, Baldoni PL, Chen Y, Smyth GK, Amarasinghe SL, Law CW, Ritchie ME. Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures. Nat Methods 2023; 20:1810-1821. [PMID: 37783886 DOI: 10.1038/s41592-023-02026-3] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 07/22/2022] [Accepted: 08/25/2023] [Indexed: 10/04/2023]
Abstract
The lack of benchmark data sets with inbuilt ground-truth makes it challenging to compare the performance of existing long-read isoform detection and differential expression analysis workflows. Here, we present a benchmark experiment using two human lung adenocarcinoma cell lines that were each profiled in triplicate together with synthetic, spliced, spike-in RNAs (sequins). Samples were deeply sequenced on both Illumina short-read and Oxford Nanopore Technologies long-read platforms. Alongside the ground-truth available via the sequins, we created in silico mixture samples to allow performance assessment in the absence of true positives or true negatives. Our results show that StringTie2 and bambu outperformed the other tools among the six isoform detection tools tested; DESeq2, edgeR and limma-voom were best among the five differential transcript expression tools tested; and there was no clear front-runner for differential transcript usage analysis among the five tools compared, which suggests further methods development is needed for this application.
Affiliation(s)
- Xueyi Dong
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia.
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia.
- Mei R M Du
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Quentin Gouil
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Luyi Tian
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Guangzhou National Laboratory, Guangzhou, China
- Jafar S Jabbari
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Rory Bowden
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Pedro L Baldoni
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Yunshun Chen
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Gordon K Smyth
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria, Australia
- Shanika L Amarasinghe
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- The Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria, Australia
- Charity W Law
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Matthew E Ritchie
- The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia.
- Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia.
9
Kainth AS, Haddad GA, Hall JM, Ruthenburg AJ. Merging short and stranded long reads improves transcript assembly. PLoS Comput Biol 2023; 19:e1011576. [PMID: 37883581 PMCID: PMC10629667 DOI: 10.1371/journal.pcbi.1011576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 11/07/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023]
Abstract
Long-read RNA sequencing has arisen as a counterpart to short-read sequencing, with the potential to capture full-length isoforms, albeit at the cost of lower depth. Yet this potential is not fully realized due to inherent limitations of current long-read assembly methods and underdeveloped approaches to integrate short-read data. Here, we critically compare the existing methods and develop a new integrative approach to characterize a particularly challenging pool of low-abundance long noncoding RNA (lncRNA) transcripts from short- and long-read sequencing in two distinct cell lines. Our analysis reveals severe limitations in each of the sequencing platforms. For short-read assemblies, coverage declines at transcript termini resulting in ambiguous ends, and uneven low coverage results in segmentation of a single transcript into multiple transcripts. Conversely, long-read sequencing libraries lack depth and strand-of-origin information in cDNA-based methods, culminating in erroneous assembly and quantitation of transcripts. We also discover a cDNA synthesis artifact in long-read datasets that markedly impacts the identity and quantitation of assembled transcripts. Towards remediating these problems, we develop a computational pipeline to "strand" long-read cDNA libraries that rectifies inaccurate mapping and assembly of long-read transcripts. Leveraging the strengths of each platform and our computational stranding, we also present and benchmark a hybrid assembly approach that drastically increases the sensitivity and accuracy of full-length transcript assembly on the correct strand and improves detection of biological features of the transcriptome. When applied to a challenging set of under-annotated and cell-type variable lncRNA, our method resolves the segmentation problem of short-read sequencing and the depth problem of long-read sequencing, resulting in the assembly of coherent transcripts with precise 5' and 3' ends. 
Our workflow can be applied to existing datasets for superior demarcation of transcript ends and refined isoform structure, which can enable better differential gene expression analyses and molecular manipulations of transcripts.
Affiliation(s)
- Amoldeep S. Kainth
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Gabriela A. Haddad
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Johnathon M. Hall
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Alexander J. Ruthenburg
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois, United States of America
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois, United States of America
10
|
Liu Z, Quinones-Valdez G, Fu T, Huang E, Choudhury M, Reese F, Mortazavi A, Xiao X. L-GIREMI uncovers RNA editing sites in long-read RNA-seq. Genome Biol 2023; 24:171. [PMID: 37474948 PMCID: PMC10360234 DOI: 10.1186/s13059-023-03012-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/22/2022] [Accepted: 07/12/2023] [Indexed: 07/22/2023] Open
Abstract
Although long-read RNA-seq is increasingly applied to characterize full-length transcripts, it can also enable detection of nucleotide variants, such as genetic mutations or RNA editing sites, an application that is significantly under-explored. Here, we present an in-depth study to detect and analyze RNA editing sites in long-read RNA-seq. Our new method, L-GIREMI, effectively handles sequencing errors and read biases. Applied to PacBio RNA-seq data, L-GIREMI affords a high accuracy in RNA editing identification. Additionally, our analysis uncovered novel insights about RNA editing occurrences in single molecules and double-stranded RNA structures. L-GIREMI provides a valuable means to study nucleotide variants in long-read RNA-seq.
Affiliation(s)
- Zhiheng Liu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
| | - Giovanni Quinones-Valdez
- Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
| | - Ting Fu
- Molecular, Cellular, and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, CA, USA
| | - Elaine Huang
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, USA
| | - Mudra Choudhury
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, USA
| | - Fairlie Reese
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Center for Complex Biological Systems, University of California, Irvine, CA, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Center for Complex Biological Systems, University of California, Irvine, CA, USA
| | - Xinshu Xiao
- Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA.
- Molecular, Cellular, and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, CA, USA.
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, USA.
11
|
Carbonell-Sala S, Lagarde J, Nishiyori H, Palumbo E, Arnan C, Takahashi H, Carninci P, Uszczynska-Ratajczak B, Guigó R. CapTrap-Seq: A platform-agnostic and quantitative approach for high-fidelity full-length RNA transcript sequencing. bioRxiv 2023:2023.06.16.543444. [PMID: 37398314 PMCID: PMC10312720 DOI: 10.1101/2023.06.16.543444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/04/2023]
Abstract
Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we developed CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5'capped, full-length transcripts, together with the data processing pipeline LyRic. We benchmarked CapTrap-seq and other popular RNA-seq library preparation protocols in a number of human tissues using both ONT and PacBio sequencing. To assess the accuracy of the transcript models produced, we introduced a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation in RNA spike-in molecules. We found that the vast majority (up to 90%) of transcript models that LyRic derives from CapTrap-seq reads are full-length. This makes it possible to produce highly accurate annotations with minimal human intervention.
12
|
Stokes T, Cen HH, Kapranov P, Gallagher IJ, Pitsillides AA, Volmar C, Kraus WE, Johnson JD, Phillips SM, Wahlestedt C, Timmons JA. Transcriptomics for Clinical and Experimental Biology Research: Hang on a Seq. Adv Genet (Hoboken) 2023; 4:2200024. [PMID: 37288167 PMCID: PMC10242409 DOI: 10.1002/ggn2.202200024] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/02/2022] [Indexed: 06/09/2023]
Abstract
Sequencing the human genome empowers translational medicine, facilitating transcriptome-wide molecular diagnosis, pathway biology, and drug repositioning. Initially, microarrays were used to study the bulk transcriptome, but now short-read RNA sequencing (RNA-seq) predominates. Although RNA-seq is positioned as a superior technology that makes the discovery of novel transcripts routine, most RNA-seq analyses are in fact modeled on the known transcriptome. Limitations of the RNA-seq methodology have emerged, while the design of, and the analysis strategies applied to, arrays have matured. An equitable comparison between these technologies is provided, highlighting advantages that modern arrays hold over RNA-seq. Array protocols more accurately quantify constitutively expressed protein coding genes across tissue replicates, and are more reliable for studying lower expressed genes. Arrays reveal long noncoding RNAs (lncRNA) are neither sparsely nor lower expressed than protein coding genes. Heterogeneous coverage of constitutively expressed genes observed with RNA-seq undermines the validity and reproducibility of pathway analyses. The factors driving these observations, many of which are relevant to long-read or single-cell sequencing, are discussed. As proposed herein, a reappreciation of bulk transcriptomic methods is required, including wider use of the modern high-density array data, to urgently revise existing anatomical RNA reference atlases and assist with more accurate study of lncRNAs.
Affiliation(s)
- Tanner Stokes
- Faculty of Science, McMaster University, Hamilton, L8S 4L8, Canada
| | - Haoning Howard Cen
- Life Sciences Institute, University of British Columbia, Vancouver, V6T 1Z3, Canada
| | | | - Iain J Gallagher
- School of Applied Sciences, Edinburgh Napier University, Edinburgh, EH11 4BN, UK
| | | | | | | | - James D. Johnson
- Life Sciences Institute, University of British Columbia, Vancouver, V6T 1Z3, Canada
| | | | | | - James A. Timmons
- Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- William Harvey Research Institute, Queen Mary University London, London, EC1M 6BQ, UK
- Augur Precision Medicine LTD, Stirling, FK9 5NF, UK
13
|
Lim WF, Rinaldi C. RNA Transcript Diversity in Neuromuscular Research. J Neuromuscul Dis 2023:JND221601. [PMID: 37182892 DOI: 10.3233/jnd-221601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2023]
Abstract
Three decades since the Human Genome Project began, scientists have now identified more than 25,000 protein coding genes in the human genome. The vast majority of the protein coding genes (> 90%) are multi-exonic, with the coding DNA being interrupted by intronic sequences, which are removed from the pre-mRNA transcripts before being translated into proteins, a process called splicing maturation. Variations in this process, i.e. by exon skipping, intron retention, alternative 5' splice site (5'ss), 3' splice site (3'ss), or polyadenylation usage, lead to remarkable transcriptome and proteome diversity in human tissues. Given its critical biological importance, alternative splicing is tightly regulated in a tissue- and developmental stage-specific manner. The central nervous system and skeletal muscle are amongst the tissues with the highest number of differentially expressed alternative exons, revealing a remarkable degree of transcriptome complexity. It is therefore not surprising that splicing mis-regulation is causally associated with a myriad of neuromuscular diseases, including but not limited to amyotrophic lateral sclerosis (ALS), spinal muscular atrophy (SMA), Duchenne muscular dystrophy (DMD), and myotonic dystrophy type 1 and 2 (DM1, DM2). A gene's transcript diversity has since become an integral and important consideration for drug design, development and therapy. In this review, we will discuss transcript diversity in the context of neuromuscular diseases and current approaches to address splicing mis-regulation.
Affiliation(s)
- Wooi Fang Lim
- Department of Paediatrics and Institute of Developmental and Regenerative Medicine, University of Oxford, Oxford, UK
| | - Carlo Rinaldi
- Department of Paediatrics and Institute of Developmental and Regenerative Medicine, University of Oxford, Oxford, UK
- MDUK Oxford Neuromuscular Centre, University of Oxford, Oxford, UK
14
|
Raoufi S, Jafarinejad-Farsangi S, Dehesh T, Hadizadeh M. Investigating unique genes of five molecular subtypes of breast cancer using penalized logistic regression. J Cancer Res Ther 2023; 19:S126-S137. [PMID: 37147992 DOI: 10.4103/jcrt.jcrt_811_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2023]
Abstract
Background: Breast cancer (BC) is the most common cancer and the fifth cause of death in women worldwide. Exploring unique genes for cancers has been interesting. Patients and Methods: This study aimed to explore unique genes of five molecular subtypes of BC in women using penalized logistic regression models. For this purpose, microarray data of five independent GEO data sets were combined. This combination includes genetic information of 324 women with BC and 12 healthy women. Least absolute shrinkage and selection operator (LASSO) logistic regression and adaptive LASSO logistic regression were used to extract unique genes. The biological process of extracted genes was evaluated in an open-source GOnet web application. R software version 3.6.0 with the glmnet package was used for fitting the models. Results: In total, 119 genes were extracted among 15 pairwise comparisons. Seventeen genes (14%) showed overlap between comparative groups. According to GO enrichment analysis, the biological process of extracted genes was enriched in negative and positive regulation biological processes, and molecular function tracking revealed that most genes are involved in kinase and transferring activities. On the other hand, we identified unique genes for each comparative group and the subsequent pathways for them. However, a significant pathway was not identified for genes in normal-like versus ERBB2 and luminal A, basal versus control, and luminal B versus luminal A groups. Conclusion: Most genes selected by LASSO logistic regression and adaptive LASSO logistic regression identified unique genes and related pathways for comparative subgroups of BC, which would be useful to comprehend the molecular differences between subgroups that would be considered for further research and therapeutic approaches in the future.
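To make the selection step in this abstract concrete, the following is a minimal sketch of L1-penalized (LASSO) logistic regression in plain Python, fitted by proximal gradient descent. It is not the authors' code: the study reports using R with the glmnet package, and the function names, step size, iteration count, and toy data below are all assumptions chosen for illustration. Features (genes) whose coefficients remain nonzero after fitting are the ones the penalty "selects".

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(x, t):
    # Proximal operator of the L1 penalty: shrink x toward zero by t.
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Fit logistic regression with an L1 penalty of strength lam.

    X is a list of samples (each a list of feature values), y a list of 0/1
    labels. Returns (weights, intercept). The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        # Gradient of the mean negative log-likelihood.
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += r / n
            for j in range(p):
                grad_w[j] += r * xi[j] / n
        # Gradient step on the smooth part, then soft-thresholding for L1.
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is uninformative.
    X = [[1.0, 0.1], [1.0, -0.1], [-1.0, 0.1], [-1.0, -0.1]]
    y = [1, 1, 0, 0]
    weights, intercept = lasso_logistic(X, y, lam=0.01)
    selected = [j for j, wj in enumerate(weights) if wj != 0.0]
    print("selected feature indices:", selected)
```

Increasing `lam` shrinks more coefficients to exactly zero. The adaptive LASSO mentioned in the abstract additionally weights each coefficient's penalty individually; that variant is not shown here.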
Affiliation(s)
- Sadegh Raoufi
- Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| | | | - Tania Dehesh
- Department of Epidemiology and Biostatistics, School of Public Health, Kerman University of Medical Sciences, Kerman, Iran
| | - Morteza Hadizadeh
- Cardiovascular Research Centre, Institute of Basic and Clinical Physiology Sciences, Kerman University of Medical Sciences, Kerman, Iran
15
|
Hu Y, Gouru A, Wang K. DELongSeq for efficient detection of differential isoform expression from long-read RNA-seq data. NAR Genom Bioinform 2023; 5:lqad019. [PMID: 36879902 PMCID: PMC9985341 DOI: 10.1093/nargab/lqad019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 01/12/2023] [Accepted: 02/16/2023] [Indexed: 03/07/2023] Open
Abstract
Conventional gene expression quantification approaches, such as microarrays or quantitative PCR, have similar variations of estimates for all genes. However, next-generation short-read or long-read sequencing use read counts to estimate expression levels with much wider dynamic ranges. In addition to the accuracy of estimated isoform expression, efficiency, which measures the degree of estimation uncertainty, is also an important factor for downstream analysis. Instead of read count, we present DELongSeq, which employs information matrix of EM algorithm to quantify uncertainty of isoform expression estimates to improve estimation efficiency. DELongSeq uses random-effect regression model for the analysis of DE isoform, in that within-study variation represents variable precision in isoform expression estimation and between-study variation represents variation in isoform expression levels across samples. More importantly, DELongSeq allows 1 case versus 1 control comparison of differential expression, which has specific application scenarios in precision medicine (such as before versus after treatment, or tumor versus stromal tissues). Through extensive simulations and analysis of several RNA-Seq datasets, we show that the uncertainty quantification approach is computationally reliable, and can improve the power of differential expression (DE) analysis of isoforms or genes. In summary, DELongSeq allows for efficient detection of differential isoform/gene expression from long-read RNA-Seq data.
Affiliation(s)
- Yu Hu
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Anagha Gouru
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
16
|
Walter M, Puniamoorthy N. Discovering novel reproductive genes in a non-model fly using de novo GridION transcriptomics. Front Genet 2022; 13:1003771. [PMID: 36568389 PMCID: PMC9768217 DOI: 10.3389/fgene.2022.1003771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/26/2022] [Accepted: 11/16/2022] [Indexed: 12/12/2022] Open
Abstract
Gene discovery has important implications for investigating phenotypic trait evolution, adaptation, and speciation. Male reproductive tissues, such as accessory glands (AGs), are hotspots for recruitment of novel genes that diverge rapidly even among closely related species/populations. These genes synthesize seminal fluid proteins that often affect post-copulatory sexual selection: they can mediate male-male sperm competition, ejaculate-female interactions that modify female remating and even influence reproductive incompatibilities among diverging species/populations. Although de novo transcriptomics has facilitated gene discovery in non-model organisms, reproductive gene discovery is still challenging without a reference database as they are often novel and bear no homology to known proteins. Here, we use reference-free GridION long-read transcriptomics, from Oxford Nanopore Technologies (ONT), to discover novel AG genes and characterize their expression in the widespread dung fly, Sepsis punctum. Despite stark population differences in male reproductive traits (e.g., body size, testes size, and sperm length) as well as female re-mating, the male AG genes and their secretions of S. punctum are still unknown. We implement a de novo ONT transcriptome pipeline incorporating quality-filtering and rigorous error-correction procedures, and we evaluate gene sequence and gene expression results against high-quality Illumina short-read data. We discover highly-expressed reproductive genes in AG transcriptomes of S. punctum consisting of 40 high-quality and high-confidence ONT genes that cross-verify against Illumina genes, among which 26 are novel and specific to S. punctum. Novel genes account for an average of 81% of total gene expression and may be functionally relevant in seminal fluid protein production. For instance, 80% of genes encoding secretory proteins account for 74% total gene expression. 
In addition, median sequence similarities of ONT nucleotide and protein sequences match within-Illumina sequence similarities. Read-count based expression quantification in ONT is congruent with Illumina's Transcript per Million (TPM), both in overall pattern and within functional categories. Rapid genomic innovation followed by recruitment of de novo genes for high expression in S. punctum AG tissue, a pattern observed in other insects, could be a likely mechanism of evolution of these genes. The study also demonstrates the feasibility of adapting ONT transcriptomics for gene discovery in non-model systems.
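The abstract compares read-count quantification from ONT data with Illumina Transcripts per Million (TPM). For readers unfamiliar with the TPM scale, a minimal sketch of the conventional calculation follows: counts are divided by transcript length in kilobases and then rescaled so that all values sum to one million. The function name and the toy inputs are assumptions for illustration, and this is not the pipeline used in the study.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million.

    counts: reads assigned to each transcript.
    lengths_bp: transcript lengths in base pairs, in the same order.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: removes the bias toward longer transcripts.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(rpk)
    # Rescale so the values sum to one million within the sample.
    return [r / total * 1e6 for r in rpk]


if __name__ == "__main__":
    # Toy example: the second transcript has twice the reads but is twice as long.
    print(tpm([10, 20], [1000, 2000]))
```

In long-read data each read ideally represents one transcript molecule, so raw counts are often compared to TPM without a length correction, which is consistent with the read-count versus TPM comparison described in the abstract.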
17
|
You Y, Clark MB, Shim H. NanoSplicer: Accurate identification of splice junctions using Oxford Nanopore sequencing. Bioinformatics 2022; 38:3741-3748. [PMID: 35639973 PMCID: PMC9344838 DOI: 10.1093/bioinformatics/btac359] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/22/2021] [Revised: 04/02/2022] [Accepted: 05/24/2022] [Indexed: 11/30/2022] Open
Abstract
Motivation: Long-read sequencing methods have considerable advantages for characterizing RNA isoforms. Oxford Nanopore sequencing records changes in electrical current when nucleic acid traverses through a pore. However, basecalling of this raw signal (known as a squiggle) is error prone, making it challenging to accurately identify splice junctions. Existing strategies include utilizing matched short-read data and/or annotated splice junctions to correct nanopore reads but add expense or limit junctions to known (incomplete) annotations. Therefore, a method that could accurately identify splice junctions solely from nanopore data would have numerous advantages. Results: We developed ‘NanoSplicer’ to identify splice junctions using raw nanopore signal (squiggles). For each splice junction, the observed squiggle is compared to candidate squiggles representing potential junctions to identify the correct candidate. Measuring squiggle similarity enables us to compute the probability of each candidate junction and find the most likely one. We tested our method using (i) synthetic mRNAs with known splice junctions and (ii) biological mRNAs from a lung-cancer cell-line. The results from both datasets demonstrate NanoSplicer improves splice junction identification, especially when the basecalling error rate near the splice junction is elevated. Availability and implementation: NanoSplicer is available at https://github.com/shimlab/NanoSplicer and archived at https://doi.org/10.5281/zenodo.6403849. Data is available from ENA: ERS7273757 and ERS7273453. Supplementary information: Supplementary data are available at Bioinformatics online.
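The step of turning per-candidate similarity into a probability and picking the most likely junction can be sketched as below. This is a simplified illustration, not NanoSplicer's implementation: it assumes that each candidate junction already has a log-likelihood-style similarity score (how such a score is derived from squiggles is not described in the abstract) and that all candidates are equally likely a priori. The function names and candidate labels are hypothetical.

```python
import math


def candidate_probabilities(log_scores):
    """Normalize per-candidate log scores into probabilities.

    Assumes equal prior probability for every candidate, so the result is a
    softmax over the scores. The maximum is subtracted for numerical stability.
    """
    if not log_scores:
        raise ValueError("at least one candidate is required")
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def best_candidate(names, log_scores):
    """Return (name, probability) of the most probable candidate junction."""
    if len(names) != len(log_scores):
        raise ValueError("names and log_scores must have the same length")
    probs = candidate_probabilities(log_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return names[best], probs[best]


if __name__ == "__main__":
    # Hypothetical candidates near one splice site, with made-up scores.
    names = ["junction_A", "junction_B", "junction_C"]
    scores = [-12.0, -15.5, -20.0]
    print(best_candidate(names, scores))
```

A low winning probability signals that the candidates are hard to distinguish, which is a natural point at which to report the junction as ambiguous instead of forcing a call.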
Affiliation(s)
- Yupei You
- School of Mathematics and Statistics/Melbourne Integrative Genomics, The University of Melbourne, Melbourne, VIC, 3010, Australia
| | - Michael B Clark
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Melbourne, VIC, 3010, Australia
| | - Heejung Shim
- School of Mathematics and Statistics/Melbourne Integrative Genomics, The University of Melbourne, Melbourne, VIC, 3010, Australia
18
|
Ringeling FR, Chakraborty S, Vissers C, Reiman D, Patel AM, Lee KH, Hong A, Park CW, Reska T, Gagneur J, Chang H, Spletter ML, Yoon KJ, Ming GL, Song H, Canzar S. Partitioning RNAs by length improves transcriptome reconstruction from short-read RNA-seq data. Nat Biotechnol 2022; 40:741-750. [PMID: 35013600 PMCID: PMC11332977 DOI: 10.1038/s41587-021-01136-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/09/2020] [Accepted: 10/26/2021] [Indexed: 02/06/2023]
Abstract
The accuracy of methods for assembling transcripts from short-read RNA sequencing data is limited by the lack of long-range information. Here we introduce Ladder-seq, an approach that separates transcripts according to their lengths before sequencing and uses the additional information to improve the quantification and assembly of transcripts. Using simulated data, we show that a kallisto algorithm extended to process Ladder-seq data quantifies transcripts of complex genes with substantially higher accuracy than conventional kallisto. For reference-based assembly, a tailored scheme based on the StringTie2 algorithm reconstructs a single transcript with 30.8% higher precision than its conventional counterpart and is more than 30% more sensitive for complex genes. For de novo assembly, a similar scheme based on the Trinity algorithm correctly assembles 78% more transcripts than conventional Trinity while improving precision by 78%. In experimental data, Ladder-seq reveals 40% more genes harboring isoform switches compared to conventional RNA sequencing and unveils widespread changes in isoform usage upon m6A depletion by Mettl14 knockout.
Affiliation(s)
| | | | - Caroline Vissers
- Department of Biochemistry & Biophysics, University of California, San Francisco, San Francisco, CA, USA
| | - Derek Reiman
- Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, IL, USA
| | - Akshay M Patel
- Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Ki-Heon Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
| | - Ari Hong
- Center for RNA Research, Institute for Basic Science (IBS), Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Chan-Woo Park
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
| | - Tim Reska
- Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
| | - Hyeshik Chang
- Center for RNA Research, Institute for Basic Science (IBS), Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
| | - Maria L Spletter
- Biomedical Center, Department of Physiological Chemistry, Ludwig-Maximilians-Universität München, Martinsried-Planegg, Germany
| | - Ki-Jun Yoon
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
| | - Guo-Li Ming
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, PA, USA
| | - Hongjun Song
- Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, PA, USA
| | - Stefan Canzar
- Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany.
19
|
Grünberger F, Ferreira-Cerca S, Grohmann D. Nanopore sequencing of RNA and cDNA molecules in Escherichia coli. RNA 2022; 28:400-417. [PMID: 34906997 PMCID: PMC8848933 DOI: 10.1261/rna.078937.121] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 08/02/2021] [Accepted: 11/29/2021] [Indexed: 05/09/2023]
Abstract
High-throughput sequencing dramatically changed our view of transcriptome architectures and allowed for ground-breaking discoveries in RNA biology. Recently, sequencing of full-length transcripts based on the single-molecule sequencing platform from Oxford Nanopore Technologies (ONT) was introduced and is widely used to sequence eukaryotic and viral RNAs. However, experimental approaches implementing this technique for prokaryotic transcriptomes remain scarce. Here, we present an experimental and bioinformatic workflow for ONT RNA-seq in the bacterial model organism Escherichia coli, which can be applied to any microorganism. Our study highlights critical steps of library preparation and computational analysis and compares the results to gold standards in the field. Furthermore, we comprehensively evaluate the applicability and advantages of different ONT-based RNA sequencing protocols, including direct RNA, direct cDNA, and PCR-cDNA. We find that (PCR)-cDNA-seq offers improved yield and accuracy compared to direct RNA sequencing. Notably, (PCR)-cDNA-seq is suitable for quantitative measurements and can be readily used for simultaneous and accurate detection of transcript 5' and 3' boundaries, analysis of transcriptional units, and transcriptional heterogeneity. In summary, based on our comprehensive study, we show nanopore RNA-seq to be a ready-to-use tool allowing rapid, cost-effective, and accurate annotation of multiple transcriptomic features. Thereby nanopore RNA-seq holds the potential to become a valuable alternative method for RNA analysis in prokaryotes.
Affiliation(s)
- Felix Grünberger
- Institute of Biochemistry, Genetics and Microbiology, Institute of Microbiology and Archaea Centre, Single-Molecule Biochemistry Lab and Biochemistry Centre Regensburg, University of Regensburg, 93053 Regensburg, Germany
| | - Sébastien Ferreira-Cerca
- Regensburg Center of Biochemistry (RCB), University of Regensburg, 93053 Regensburg, Germany
- Institute for Biochemistry, Genetics and Microbiology, Regensburg Center for Biochemistry, Biochemistry III, University of Regensburg, 93053 Regensburg, Germany
| | - Dina Grohmann
- Institute of Biochemistry, Genetics and Microbiology, Institute of Microbiology and Archaea Centre, Single-Molecule Biochemistry Lab and Biochemistry Centre Regensburg, University of Regensburg, 93053 Regensburg, Germany
- Regensburg Center of Biochemistry (RCB), University of Regensburg, 93053 Regensburg, Germany
20
|
Tian L, Jabbari JS, Thijssen R, Gouil Q, Amarasinghe SL, Voogd O, Kariyawasam H, Du MRM, Schuster J, Wang C, Su S, Dong X, Law CW, Lucattini A, Prawer YDJ, Collar-Fernández C, Chung JD, Naim T, Chan A, Ly CH, Lynch GS, Ryall JG, Anttila CJA, Peng H, Anderson MA, Flensburg C, Majewski I, Roberts AW, Huang DCS, Clark MB, Ritchie ME. Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing. Genome Biol 2021; 22:310. [PMID: 34763716 PMCID: PMC8582192 DOI: 10.1186/s13059-021-02525-6] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Received: 04/30/2021] [Accepted: 10/21/2021] [Indexed: 12/11/2022] Open
Abstract
A modified Chromium 10x droplet-based protocol that subsamples cells for both short-read and long-read (nanopore) sequencing together with a new computational pipeline (FLAMES) is developed to enable isoform discovery, splicing analysis, and mutation detection in single cells. We identify thousands of unannotated isoforms and find conserved functional modules that are enriched for alternative transcript usage in different cell types and species, including ribosome biogenesis and mRNA splicing. Analysis at the transcript level allows data integration with scATAC-seq on individual promoters, improved correlation with protein expression data, and linked mutations known to confer drug resistance to transcriptome heterogeneity.
Affiliation(s)
- Luyi Tian
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia.
| | - Jafar S Jabbari
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, VIC, Australia
| | - Rachel Thijssen
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Quentin Gouil
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Shanika L Amarasinghe
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Oliver Voogd
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Hasaru Kariyawasam
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Mei R M Du
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Jakob Schuster
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Changqing Wang
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Shian Su
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Xueyi Dong
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Charity W Law
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Alexis Lucattini
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, VIC, Australia
| | - Yair David Joseph Prawer
- Centre for Stem Cell Systems, Department of Anatomy and Neuroscience, The University of Melbourne, Parkville, VIC, Australia
| | | | - Jin D Chung
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
| | - Timur Naim
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
| | - Audrey Chan
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
| | - Chi Hai Ly
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
- Present address: Department of Neurology, Stanford University, Stanford, CA, USA
| | - Gordon S Lynch
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
| | - James G Ryall
- Centre for Muscle Research, Department of Physiology, The University of Melbourne, Melbourne, VIC, Australia
- Present address: VOW, North Parramatta, NSW, Australia
| | - Casey J A Anttila
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Hongke Peng
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Mary Ann Anderson
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Clinical Haematology, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Christoffer Flensburg
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Ian Majewski
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Andrew W Roberts
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Clinical Haematology, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
- Centre for Cancer Research, University of Melbourne, Melbourne, VIC, Australia
- Victorian Comprehensive Cancer Centre, Melbourne, VIC, Australia
| | - David C S Huang
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Michael B Clark
- Centre for Stem Cell Systems, Department of Anatomy and Neuroscience, The University of Melbourne, Parkville, VIC, Australia
| | - Matthew E Ritchie
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia.
| |
|
21
|
Li M, Sun W, Wang F, Wu X, Wang J. Asymmetric epigenetic modification and homoeolog expression bias in the establishment and evolution of allopolyploid Brassica napus. The New Phytologist 2021; 232:898-913. [PMID: 34265096 DOI: 10.1111/nph.17621] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/04/2021] [Accepted: 07/09/2021] [Indexed: 05/26/2023]
Abstract
This study explores how allopolyploidization reshapes the biased expression and asymmetric epigenetic modification of homoeologous gene pairs, and examines the regulation types and epigenetic basis of expression bias. We analyzed the gene expression and four epigenetic modifications (DNA methylation, H3K4me3, H3K27me3 and H3K27ac) of 29 976 homoeologous gene pairs in resynthesized, natural allopolyploid Brassica napus and an in silico 'hybrid'. We comprehensively elucidated the biased gene expression, asymmetric epigenetic modifications and the generational transmission characteristics of these homoeologous gene pairs in B. napus. We analyzed cis/trans effects and the epigenetic basis of homoeolog expression bias. There was a significant positive correlation between two active histone modifications and biased gene expression. We revealed that parental legacy was the dominant principle in the remodeling of homoeolog expression bias and asymmetric epigenetic modifications in B. napus, and further clarified that this depends on whether there were differences in the expression/epigenetic modifications of gene pairs in parents/progenitors. The maternal genome was dominant in the homoeolog expression bias of resynthesized B. napus, and this phenomenon was attenuated in natural B. napus. Furthermore, cis rather than trans effects were dominant when epigenetic modifications potentially affected biased expression of gene pairs in B. napus.
Affiliation(s)
- Mengdi Li
- College of Life Sciences, Wuhan University, Wuhan, 430072, China
| | - Weiqi Sun
- College of Life Sciences, Wuhan University, Wuhan, 430072, China
| | - Fan Wang
- College of Life Sciences, Wuhan University, Wuhan, 430072, China
| | - Xiaoming Wu
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, Oil Crops Research Institute of CAAS, Wuhan, 430062, China
| | - Jianbo Wang
- College of Life Sciences, Wuhan University, Wuhan, 430072, China
| |
|
22
|
Krappinger JC, Bonstingl L, Pansy K, Sallinger K, Wreglesworth NI, Grinninger L, Deutsch A, El-Heliebi A, Kroneis T, Mcfarlane RJ, Sensen CW, Feichtinger J. Non-coding Natural Antisense Transcripts: Analysis and Application. J Biotechnol 2021; 340:75-101. [PMID: 34371054 DOI: 10.1016/j.jbiotec.2021.08.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/29/2021] [Revised: 06/30/2021] [Accepted: 08/04/2021] [Indexed: 12/12/2022]
Abstract
Non-coding natural antisense transcripts (ncNATs) are regulatory RNA sequences that are transcribed in the opposite direction to protein-coding or non-coding transcripts. These transcripts are implicated in a broad variety of biological and pathological processes, including tumorigenesis and oncogenic progression. With this complex field still in its infancy, annotations, expression profiling and functional characterisations of ncNATs are far less comprehensive than those for protein-coding genes, pointing out substantial gaps in the analysis and characterisation of these regulatory transcripts. In this review, we discuss ncNATs from an analysis perspective, in particular regarding the use of high-throughput sequencing strategies, such as RNA-sequencing, and summarize the unique challenges of investigating the antisense transcriptome. Finally, we elaborate on their potential as biomarkers and future targets for treatment, focusing on cancer.
Affiliation(s)
- Julian C Krappinger
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Christian Doppler Laboratory for innovative Pichia pastoris host and vector systems, Division of Cell Biology, Histology and Embryology, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria
| | - Lilli Bonstingl
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Center for Biomarker Research in Medicine, Stiftingtalstraße 5, 8010 Graz, Austria
| | - Katrin Pansy
- Division of Haematology, Medical University of Graz, Stiftingtalstrasse 24, 8010 Graz, Austria
| | - Katja Sallinger
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Center for Biomarker Research in Medicine, Stiftingtalstraße 5, 8010 Graz, Austria
| | - Nick I Wreglesworth
- North West Cancer Research Institute, School of Medical Sciences, Bangor University, LL57 2UW Bangor, United Kingdom
| | - Lukas Grinninger
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Austrian Biotech University of Applied Sciences, Konrad Lorenz-Straße 10, 3430 Tulln an der Donau, Austria
| | - Alexander Deutsch
- Division of Haematology, Medical University of Graz, Stiftingtalstrasse 24, 8010 Graz, Austria; BioTechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Amin El-Heliebi
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Center for Biomarker Research in Medicine, Stiftingtalstraße 5, 8010 Graz, Austria
| | - Thomas Kroneis
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Center for Biomarker Research in Medicine, Stiftingtalstraße 5, 8010 Graz, Austria
| | - Ramsay J Mcfarlane
- North West Cancer Research Institute, School of Medical Sciences, Bangor University, LL57 2UW Bangor, United Kingdom
| | - Christoph W Sensen
- BioTechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria; Institute of Computational Biotechnology, Graz University of Technology, Petersgasse 14/V, 8010 Graz, Austria; HCEMM Kft., Római blvd. 21, 6723 Szeged, Hungary
| | - Julia Feichtinger
- Division of Cell Biology, Histology and Embryology, Gottfried Schatz Research Center for Cell Signalling, Metabolism and Aging, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; Christian Doppler Laboratory for innovative Pichia pastoris host and vector systems, Division of Cell Biology, Histology and Embryology, Medical University of Graz, Neue Stiftingtalstraße 6/II, 8010 Graz, Austria; BioTechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria.
| |
|
23
|
De Paoli-Iseppi R, Gleeson J, Clark MB. Isoform Age - Splice Isoform Profiling Using Long-Read Technologies. Front Mol Biosci 2021; 8:711733. [PMID: 34409069 PMCID: PMC8364947 DOI: 10.3389/fmolb.2021.711733] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 05/19/2021] [Accepted: 07/19/2021] [Indexed: 01/12/2023] Open
Abstract
Alternative splicing (AS) of RNA is a key mechanism that results in the expression of multiple transcript isoforms from single genes and leads to an increase in the complexity of both the transcriptome and proteome. Regulation of AS is critical for the correct functioning of many biological pathways, while disruption of AS can be directly pathogenic in diseases such as cancer or cause risk for complex disorders. Current short-read sequencing technologies achieve high read depth but are limited in their ability to resolve complex isoforms. In this review we examine how long-read sequencing (LRS) technologies can address this challenge by covering the entire RNA sequence in a single read and thereby distinguish isoform changes that could impact RNA regulation or protein function. Coupling LRS with technologies such as single cell sequencing, targeted sequencing and spatial transcriptomics is producing a rapidly expanding suite of technological approaches to profile alternative splicing at the isoform level with unprecedented detail. In addition, integrating LRS with genotype now allows the impact of genetic variation on isoform expression to be determined. Recent results demonstrate the potential of these techniques to elucidate the landscape of splicing, including in tissues such as the brain where AS is particularly prevalent. Finally, we also discuss how AS can impact protein function, potentially leading to novel therapeutic targets for a range of diseases.
Affiliation(s)
| | | | - Michael B. Clark
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, Australia
| |
|
24
|
Chen K, Wei X, Pariyani R, Kortesniemi M, Zhang Y, Yang B. 1H NMR Metabolomics and Full-Length RNA-Seq Reveal Effects of Acylated and Nonacylated Anthocyanins on Hepatic Metabolites and Gene Expression in Zucker Diabetic Fatty Rats. Journal of Agricultural and Food Chemistry 2021; 69:4423-4437. [PMID: 33835816 PMCID: PMC8154569 DOI: 10.1021/acs.jafc.1c00130] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/09/2021] [Revised: 03/25/2021] [Accepted: 03/26/2021] [Indexed: 06/01/2023]
Abstract
Anthocyanins have been reported to possess antidiabetic effects. Recent studies indicate acylated anthocyanins have better stability and antioxidative activity compared to their nonacylated counterparts. This study compared the effects of nonacylated and acylated anthocyanins on hepatic gene expression and metabolic profile in diabetic rats, using full-length transcriptomics and 1H NMR metabolomics. Zucker diabetic fatty (ZDF) rats were fed with nonacylated anthocyanin extract from bilberries (NAAB) or acylated anthocyanin extract from purple potatoes (AAPP) at daily doses of 25 and 50 mg/kg body weight for 8 weeks. Both anthocyanin extracts restored the levels of multiple metabolites (glucose, lactate, alanine, and pyruvate) and expression of genes (G6pac, Pck1, Pklr, and Gck) involved in glycolysis and gluconeogenesis. AAPP decreased the hepatic glutamine level. NAAB regulated the expression of Mgat4a, Gstm6, and Lpl, whereas AAPP modified the expression of Mgat4a, Jun, Fos, and Egr1. This study indicated different effects of AAPP and NAAB on the hepatic transcriptomic and metabolic profiles of diabetic rats.
Affiliation(s)
- Kang Chen
- Food Chemistry and Food Development, Department of Life Technologies, University of Turku, FI-20014 Turun yliopisto, Finland
| | - Xuetao Wei
- Beijing Key Laboratory of Toxicological Research and Risk Assessment for Food Safety, Department of Toxicology, School of Public Health, Peking University, Beijing 100191, China
| | - Raghunath Pariyani
- Food Chemistry and Food Development, Department of Life Technologies, University of Turku, FI-20014 Turun yliopisto, Finland
| | - Maaria Kortesniemi
- Food Chemistry and Food Development, Department of Life Technologies, University of Turku, FI-20014 Turun yliopisto, Finland
| | - Yumei Zhang
- Department of Nutrition and Food Hygiene, School of Public Health, Peking University, Beijing 100191, China
| | - Baoru Yang
- Food Chemistry and Food Development, Department of Life Technologies, University of Turku, FI-20014 Turun yliopisto, Finland
| |
|