1
Stemerdink M, Riepe T, Zomer N, Salz R, Kwint M, Oostrik J, Timmermans R, Ferrari B, Ferrari S, Dueñas Rey A, Delanote E, de Bruijn SE, Kremer H, Roosing S, Coppieters F, Hoischen A, Cremers FPM, 't Hoen PAC, van Wijk E, de Vrieze E. Deciphering the largest disease-associated transcript isoforms in the human neural retina with advanced long-read sequencing approaches. Genome Res 2025; 35:725-739. [PMID: 40037841 DOI: 10.1101/gr.280060.124] [Received: 09/24/2024] [Accepted: 02/11/2025] [Indexed: 03/06/2025]
Abstract
Sequencing technologies have long limited the comprehensive investigation of large transcripts associated with inherited retinal diseases (IRDs) like Usher syndrome, which involves 11 associated genes with transcripts up to 19.6 kb. To address this, we used PacBio long-read mRNA isoform sequencing (Iso-Seq) following standard library preparation and an optimized workflow to enrich for long transcripts in the human neural retina. While our workflow achieved sequencing of transcripts up to 15 kb, this was insufficient for Usher syndrome-associated genes USH2A and ADGRV1, with transcripts of 18.9 kb and 19.6 kb, respectively. To overcome this, we employed the Samplix Xdrop System for indirect target enrichment of cDNA, a technique typically used for genomic DNA capture. This method facilitated the successful capture and sequencing of ADGRV1 transcripts as well as full-length 18.9 kb USH2A transcripts. By combining algorithmic analysis with detailed manual curation of sequenced reads, we identified novel isoforms characterized by an alternative 5' transcription start site, the inclusion of previously unannotated exons, or alternative splicing events across the 11 Usher syndrome-associated genes. These findings have significant implications for genetic diagnostics and therapeutic development. The analysis applied here on Usher syndrome-associated transcripts exemplifies a valuable approach that can be extended to explore the transcriptomic complexity of other IRD-associated genes in the complete transcriptome data set generated within this study. Additionally, we demonstrate the adaptability of the Samplix Xdrop System for capturing cDNA, and the optimized methodologies described can be expanded to facilitate the enrichment of large transcripts from various tissues of interest.
Affiliation(s)
- Merel Stemerdink
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Tabea Riepe
  - Department of Medical BioSciences, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Nick Zomer
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Renee Salz
  - Department of Medical BioSciences, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Michael Kwint
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Jaap Oostrik
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Raoul Timmermans
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Barbara Ferrari
  - Fondazione Banca degli Occhi del Veneto, Zelarino, Venice 30174, Italy
- Stefano Ferrari
  - Fondazione Banca degli Occhi del Veneto, Zelarino, Venice 30174, Italy
- Alfredo Dueñas Rey
  - Center for Medical Genetics, Ghent University Hospital, Ghent 9000, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9000, Belgium
- Emma Delanote
  - Center for Medical Genetics, Ghent University Hospital, Ghent 9000, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9000, Belgium
- Suzanne E de Bruijn
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Hannie Kremer
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Susanne Roosing
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Frauke Coppieters
  - Center for Medical Genetics, Ghent University Hospital, Ghent 9000, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9000, Belgium
  - Department of Pharmaceutics, Ghent University, Ghent 9000, Belgium
- Alexander Hoischen
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
  - Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Frans P M Cremers
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Peter A C 't Hoen
  - Department of Medical BioSciences, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Erwin van Wijk
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
- Erik de Vrieze
  - Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
2
Acedo-Terrades A, Perera-Bel J, Nonell L. The importance of data transformation in RNA-Seq preprocessing for bladder cancer subtyping. BMC Res Notes 2025; 18:61. [PMID: 39930545 PMCID: PMC11812149 DOI: 10.1186/s13104-025-07138-x] [Received: 10/29/2024] [Accepted: 02/04/2025] [Indexed: 02/13/2025]
Abstract
OBJECTIVE: RNA-Seq provides an accurate quantification of gene expression levels and is widely used for molecular subtype classification in cancer, with special importance in prognosis. However, the reliability and validity of these analyses can be significantly influenced by how data are processed. In this study we evaluate how RNA-Seq preprocessing methods influence molecular subtype classification in bladder cancer. By benchmarking various aligners, quantifiers and methods of normalization and transformation, we stress the importance of preprocessing choices for accurate and consistent subtype classification.
RESULTS: Our findings highlight that log-transformation plays a crucial role in centroid-based classifiers such as consensusMIBC and TCGAclas, while distribution-free algorithms like LundTax offer robustness to preprocessing variations. Non-log-transformed data resulted in low classification rates and poor agreement with reference classifications in the consensusMIBC and TCGAclas classifiers. Additionally, LundTax consistently demonstrated better separation among subtypes than consensusMIBC and TCGAclas, regardless of preprocessing method. Nonetheless, the study is limited by the lack of a true reference for objective assessment of the accuracy of the assigned subtypes. Hence, future work will be necessary to determine the robustness and scalability of the obtained results.
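Illustrative note (not taken from the paper): the sensitivity of centroid-based classification to log-transformation can be sketched with a toy nearest-centroid rule based on Pearson correlation. The subtype labels, centroid values, and function names below are hypothetical, and the sketch does not reproduce consensusMIBC or TCGAclas; it only shows that the same sample correlates less strongly with a log-scale centroid when it is left on the raw count scale.

```python
import numpy as np

def log_transform(expr, pseudocount=1.0):
    """Return log2(expr + pseudocount) for non-negative expression values."""
    return np.log2(np.asarray(expr, dtype=float) + pseudocount)

def nearest_centroid(sample, centroids):
    """Return (label, r) for the centroid with the highest Pearson correlation.

    sample: 1-D sequence of per-gene values.
    centroids: dict mapping a subtype label to a 1-D sequence of the same length.
    """
    best_label, best_r = None, -np.inf
    for label, centroid in centroids.items():
        r = np.corrcoef(sample, centroid)[0, 1]
        if r > best_r:
            best_label, best_r = label, r
    return best_label, float(best_r)

# Hypothetical centroids, assumed to be defined on a log2 scale.
centroids = {"A": [8.0, 2.0, 6.0, 1.0], "B": [2.0, 8.0, 1.0, 6.0]}

# Hypothetical raw counts for one sample whose log2(x + 1) profile matches "A".
raw_sample = [255.0, 3.0, 63.0, 1.0]

label_log, r_log = nearest_centroid(log_transform(raw_sample), centroids)
label_raw, r_raw = nearest_centroid(raw_sample, centroids)
print(label_log, round(r_log, 3), label_raw, round(r_raw, 3))
```

In this toy case the label is unchanged, but the correlation drops on the raw scale; a classifier that also enforces a minimum correlation could therefore leave more samples unclassified when the input is not log-transformed.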
Affiliation(s)
- Lara Nonell
  - Bioinformatics Unit, Vall d'Hebron Institute of Oncology, Barcelona, Spain
3
Brooks TG, Lahens NF, Mrčela A, Yang J, Purohit S, Naik A, Ricciotti E, Sengupta S, Choi PS, Grant GR. Sources of non-uniform coverage in short-read RNA-Seq data. bioRxiv 2025:2025.01.30.634337. [PMID: 39975309 PMCID: PMC11838458 DOI: 10.1101/2025.01.30.634337] [Indexed: 02/21/2025]
Abstract
The origin of several normal cellular functions and related abnormalities can be traced back to RNA splicing. As such, RNA splicing is currently the focus of a vast array of studies. To quantify the transcriptome, short-read RNA-Seq remains the standard assay. The primary technical artifact of RNA-Seq library prep, which severely interferes with analysis, is extreme non-uniformity in coverage across transcripts. This non-uniformity is present in both bulk and single-cell RNA-Seq and is observed even when the sample contains only full-length transcripts. This issue dramatically affects the accuracy of isoform-level quantification of multi-isoform genes. Understanding the sources of this non-uniformity is critical to developing improved protocols and analysis methods. Here, we explore eight potential sources of non-uniformity. We demonstrate that it cannot be explained by one factor alone. We performed targeted experiments to investigate the effect of fragment length, PCR ramp rate, and ribosomal depletion. We assessed existing data sets with varying sample quality, PCR cycle number, reverse transcriptase, and technical or biological replicates. We found evidence that interference of reverse transcription by secondary structure is unlikely to be the major contributing factor, that rRNA pull-down methods do not cause non-uniformity, that PCR ramp rate does not substantially impact non-uniformity, and that shorter fragments do not reduce non-uniformity. All these findings contradict prior publications or recommendations.
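Illustrative note (not taken from the preprint): one simple way to put a number on coverage non-uniformity along a transcript is the coefficient of variation (standard deviation divided by mean) of per-base read depth. Whether the authors use this particular metric is not stated in the abstract, so the function name and the depth values below are assumptions made for the sketch.

```python
import numpy as np

def coverage_cv(depths):
    """Coefficient of variation of per-base depth along one transcript.

    Returns 0.0 for perfectly uniform coverage and grows as coverage becomes
    more uneven. Returns NaN when the transcript has no coverage at all.
    """
    depths = np.asarray(depths, dtype=float)
    mean = depths.mean()
    if mean == 0:
        return float("nan")
    # Population standard deviation (ddof=0) divided by the mean depth.
    return float(depths.std() / mean)

# Hypothetical per-base depths for two transcripts with the same mean depth.
uniform = coverage_cv([10, 10, 10, 10])
uneven = coverage_cv([0, 20, 0, 20])
print(uniform, uneven)
```

Both toy transcripts have a mean depth of 10, so a depth-only summary would not distinguish them, whereas the coefficient of variation does.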
Affiliation(s)
- Thomas G Brooks
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas F Lahens
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Antonijo Mrčela
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Jianing Yang
  - Chronobiology and Sleep Institute, University of Pennsylvania, Philadelphia, PA, USA
- Souparna Purohit
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Amruta Naik
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Emanuela Ricciotti
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Shaon Sengupta
  - Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Pediatrics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Peter S Choi
  - Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
  - Division of Cancer Pathobiology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Gregory R Grant
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
4
Rodrigues ABM, Passetti F, Guimarães ACR. Complementary Strategies to Identify Differentially Expressed Genes in the Choroid Plexus of Patients with Progressive Multiple Sclerosis. Neuroinformatics 2025; 23:10. [PMID: 39836313 DOI: 10.1007/s12021-024-09713-2] [Accepted: 12/28/2024] [Indexed: 01/22/2025]
Abstract
Multiple sclerosis (MS) is a neurological disease causing myelin and axon damage through inflammatory and autoimmune processes. Despite affecting millions worldwide, understanding of its genetic pathways remains limited. The choroid plexus (ChP) has been studied in neurodegenerative processes and diseases like MS due to its dysregulation, yet its role in MS pathophysiology remains unclear. Our work re-evaluates the ChP transcriptome in progressive MS patients and compares gene expression profiles using diverse methodological strategies. We used RNA-Seq data from post-mortem brain tissue of patients and healthy controls (GEO: GSE137619). After evaluation and quality control of these data, transcripts were mapped and quantified against the Homo sapiens reference transcriptome (GRCh38/hg38) using three strategies to identify differentially expressed genes in progressive MS patients. Functional analysis of genes revealed their involvement in immune processes, cell adhesion and migration, hormonal actions, amino acid transport, chemokines, metals, and signaling pathways. Our findings can offer valuable insights for progressive MS therapies, suggesting that specific genes influence immune cell recruitment and potential ChP microenvironment changes. Combining complementary approaches maximizes literature coverage, facilitating a deeper understanding of the biological context in progressive MS.
Affiliation(s)
- Fabio Passetti
  - Instituto Carlos Chagas - Fiocruz/Paraná, Curitiba, PR, Brazil
- Ana Carolina Ramos Guimarães
  - Laboratory for Applied Genomics and Bioinnovations, Instituto Oswaldo Cruz - Fiocruz, Rio de Janeiro, RJ, Brazil
5
Adachi Y, Terakura S, Osaki M, Okuno Y, Sato Y, Sagou K, Takeuchi Y, Yokota H, Imai K, Steinberger P, Leitner J, Hanajiri R, Murata M, Kiyoi H. Cullin-5 deficiency promotes chimeric antigen receptor T cell effector functions potentially via the modulation of JAK/STAT signaling pathway. Nat Commun 2024; 15:10376. [PMID: 39658572 PMCID: PMC11631977 DOI: 10.1038/s41467-024-54794-x] [Received: 03/23/2023] [Accepted: 11/21/2024] [Indexed: 12/12/2024]
Abstract
Chimeric antigen receptor (CAR) T cell therapy is a promising treatment for cancer, but factors that enhance the efficacy of CAR T cells remain elusive. Here we perform a genome-wide CRISPR screen to probe genes that regulate the proliferation and survival of CAR T cells following repetitive antigen stimulations. We find that genetic ablation of CUL5, encoding a core element of the multi-protein E3 ubiquitin-protein ligase complex, cullin-RING ligase 5, enhances human CD19 CAR T cell expansion potential and effector functions, potentially via the Janus kinase/signal transducers and activators of transcription (JAK/STAT) pathway. In this regard, CUL5 knockout CD19 CAR T cells show sustained STAT3 and STAT5 phosphorylation, as well as delayed phosphorylation and degradation of JAK1 and JAK3. In vivo, shRNA-mediated knockdown of CUL5 enhances CD19 CAR T treatment outcomes in tumor-bearing mice. Our findings thus imply that targeting CUL5 in the ubiquitin system may enhance CAR T cell effector functions and thereby improve immunotherapy efficacy.
Affiliation(s)
- Yoshitaka Adachi
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Seitaro Terakura
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Masahide Osaki
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Yusuke Okuno
  - Department of Virology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Yoshitaka Sato
  - Department of Virology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Ken Sagou
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
  - Department of Virology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Yuki Takeuchi
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Hirofumi Yokota
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Kanae Imai
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Peter Steinberger
  - Division for Immune Receptors and T Cell Activation, Institute of Immunology, Medical University of Vienna, Vienna, Austria
- Judith Leitner
  - Division for Immune Receptors and T Cell Activation, Institute of Immunology, Medical University of Vienna, Vienna, Austria
- Ryo Hanajiri
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Makoto Murata
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Hitoshi Kiyoi
  - Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, Nagoya, Japan
6
Kuehl M, Wong MN, Wanner N, Bonn S, Puelles VG. Gene count estimation with pytximport enables reproducible analysis of bulk RNA sequencing data in Python. Bioinformatics 2024; 40:btae700. [PMID: 39565903 PMCID: PMC11629965 DOI: 10.1093/bioinformatics/btae700] [Received: 07/25/2024] [Revised: 10/08/2024] [Accepted: 11/18/2024] [Indexed: 11/22/2024]
Abstract
SUMMARY: Transcript quantification tools efficiently map bulk RNA sequencing (RNA-seq) reads to reference transcriptomes. However, their output consists of transcript count estimates that are subject to multiple biases and cannot be readily used with existing differential gene expression analysis tools in Python. Here we present pytximport, a Python implementation of the tximport R package that supports a variety of input formats, different modes of bias correction, inferential replicates, gene-level summarization of transcript counts, transcript-level exports, transcript-to-gene mapping generation, and optional filtering of transcripts by biotype. pytximport is part of the scverse ecosystem of open-source Python software packages for omics analyses and includes both a Python and a command-line interface. With pytximport, we propose a bulk RNA-seq analysis workflow based on Bioconda and scverse ecosystem packages, ensuring reproducible analyses through Snakemake rules. We apply this pipeline to a publicly available RNA-seq dataset, demonstrating how pytximport enables the creation of Python-centric workflows capable of providing insights into transcriptomic alterations.
AVAILABILITY AND IMPLEMENTATION: pytximport is licensed under the GNU General Public License version 3. The source code is available at https://github.com/complextissue/pytximport and via Zenodo with DOI: 10.5281/zenodo.13907917. A related Snakemake workflow is available through GitHub at https://github.com/complextissue/snakemake-bulk-rna-seq-workflow and Zenodo with DOI: 10.5281/zenodo.12713811. Documentation and a vignette for new users are available at: https://pytximport.readthedocs.io.
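Illustrative note (not taken from the paper): the core idea of gene-level summarization is to add up transcript-level count estimates that map to the same gene. The sketch below uses only the Python standard library and deliberately does not call pytximport; the function name, the transcript and gene identifiers, and the choice to skip unmapped transcripts are all assumptions made for illustration, and it omits the bias correction the abstract mentions.

```python
from collections import defaultdict

def summarize_to_gene(transcript_counts, tx2gene):
    """Sum transcript-level count estimates into gene-level totals.

    transcript_counts: dict mapping transcript ID -> estimated count.
    tx2gene: dict mapping transcript ID -> gene ID.
    Transcripts missing from tx2gene are skipped in this sketch.
    """
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            continue  # no mapping available for this transcript
        gene_counts[gene_id] += count
    return dict(gene_counts)

# Hypothetical transcript-level estimates for a single sample.
transcript_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0, "tx4": 2.0}
# Hypothetical transcript-to-gene mapping; "tx4" is intentionally unmapped.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_gene(transcript_counts, tx2gene)
print(gene_counts)
```

For real analyses, the package documentation linked in the abstract is the authoritative description of the supported inputs and options.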
Affiliation(s)
- Malte Kuehl
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, Midtjylland, 8200, Denmark
  - Department of Pathology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 69, Aarhus N, Midtjylland, 8200, Denmark
  - Institute of Medical Systems Biology, University Medical Center Hamburg-Eppendorf, Falkenried 94, Hamburg, Hamburg, 20251, Germany
  - Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
- Milagros N Wong
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, Midtjylland, 8200, Denmark
  - Department of Pathology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 69, Aarhus N, Midtjylland, 8200, Denmark
  - III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
  - Hamburg Center for Kidney Health, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
- Nicola Wanner
  - III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
  - Hamburg Center for Kidney Health, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
- Stefan Bonn
  - Institute of Medical Systems Biology, University Medical Center Hamburg-Eppendorf, Falkenried 94, Hamburg, Hamburg, 20251, Germany
  - Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
- Victor G Puelles
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, Midtjylland, 8200, Denmark
  - Department of Pathology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 69, Aarhus N, Midtjylland, 8200, Denmark
  - III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
  - Hamburg Center for Kidney Health, University Medical Center Hamburg-Eppendorf, Martinistraße 52, Hamburg, Hamburg, 20246, Germany
7
Feng N, Mandal A, Jambhale A, Narnur P, Chen G, Akula N, Kramer R, Kolachana B, Xu Q, McMahon FJ, Lipska BK, Auluck PK, Marenco S. Schizophrenia risk-associated SNPs affect expression of microRNA 137 host gene: a postmortem study. Hum Mol Genet 2024; 33:1939-1947. [PMID: 39239979 DOI: 10.1093/hmg/ddae130] [Received: 03/14/2024] [Revised: 08/23/2024] [Accepted: 08/28/2024] [Indexed: 09/07/2024]
Abstract
Common variants in the microRNA 137 host gene MIR137HG and its adjacent gene DPYD have been associated with schizophrenia risk, and the latest Psychiatric Genomics Consortium (PGC) Genome-Wide Association Study on schizophrenia has confirmed and extended these findings. To elucidate the association of schizophrenia risk-associated SNPs in this genomic region, we examined the expression of both mature and immature transcripts of the miR-137 host gene (MIR137HG) in the dorsolateral prefrontal cortex (DLPFC) and subgenual anterior cingulate cortex (sgACC) of postmortem brain samples of donors with schizophrenia and psychiatrically-unaffected controls using qPCR and RNA-Seq approaches. No differential expression of miR-137, MIR137HG, or its transcripts was observed. Two schizophrenia risk-associated SNPs identified in the PGC study, rs11165917 (DLPFC: P = 2.0e-16; sgACC: P = 6.4e-10) and rs4274102 (DLPFC: P = 0.036; sgACC: P = 0.002), were associated with expression of the MIR137HG long non-coding RNA transcript MIR137HG-203 (ENST00000602672.2) in individuals of European ancestry. Carriers of the minor (risk) allele of rs11165917 had significantly lower expression of MIR137HG-203 compared with those carrying the major allele. However, we were unable to validate this result by short-read sequencing of RNA extracted from DLPFC or sgACC tissue. This finding suggests that immature transcripts of MIR137HG may contribute to genetic risk for schizophrenia.
Affiliation(s)
- Ningping Feng
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Ajeet Mandal
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Ananya Jambhale
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Pranav Narnur
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Gang Chen
  - Scientific and Statistical Computing Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, bldg 10, room 1D73, Bethesda, MD 20892, United States
- Nirmala Akula
  - Human Genetics Branch, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 35 Convent Dr. Bldg. 35, RM 1A202, MSC 3719, Bethesda, MD 20892, United States
- Robin Kramer
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Bhaskar Kolachana
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Qing Xu
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Francis J McMahon
  - Human Genetics Branch, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 35 Convent Dr. Bldg. 35, RM 1A202, MSC 3719, Bethesda, MD 20892, United States
- Barbara K Lipska
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Pavan K Auluck
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
- Stefano Marenco
  - Human Brain Collection Core, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, 10 Center Drive, Bldg 10, room 4N218, Bethesda, MD 20892, United States
8
Riepe TV, Stemerdink M, Salz R, Rey AD, de Bruijn SE, Boonen E, Tomkiewicz TZ, Kwint M, Gloerich J, Wessels HJCT, Delanote E, De Baere E, van Nieuwerburgh F, De Keulenaer S, Ferrari B, Ferrari S, Coppieters F, Cremers FPM, van Wyk E, Roosing S, de Vrieze E, 't Hoen PAC. A proteogenomic atlas of the human neural retina. Front Genet 2024; 15:1451024. [PMID: 39371417 PMCID: PMC11450717 DOI: 10.3389/fgene.2024.1451024] [Received: 06/18/2024] [Accepted: 08/30/2024] [Indexed: 10/08/2024]
Abstract
The human neural retina is a complex tissue with abundant alternative splicing and more than 10% of genetic variants linked to inherited retinal diseases (IRDs) alter splicing. Traditional short-read RNA-sequencing methods have been used for understanding retina-specific splicing but have limitations in detailing transcript isoforms. To address this, we generated a proteogenomic atlas that combines PacBio long-read RNA-sequencing data with mass spectrometry and whole genome sequencing data of three healthy human neural retina samples. We identified nearly 60,000 transcript isoforms, of which approximately one-third are novel. Additionally, ten novel peptides confirmed novel transcript isoforms. For instance, we identified a novel IMPDH1 isoform with a novel combination of known exons that is supported by peptide evidence. Our research underscores the potential of in-depth tissue-specific transcriptomic analysis to enhance our grasp of tissue-specific alternative splicing. The data underlying the proteogenomic atlas are available via EGA with identifier EGAD50000000101, via ProteomeXchange with identifier PXD045187, and accessible through the UCSC genome browser.
Affiliation(s)
- Tabea V. Riepe
- Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, Netherlands
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Maastricht University Medical Center+, Maastricht, Netherlands
| | - Merel Stemerdink
- Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
| | - Renee Salz
- Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Alfredo Dueñas Rey
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Suzanne E. de Bruijn
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
| | - Erica Boonen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Maastricht University Medical Center+, Maastricht, Netherlands
| | - Tomasz Z. Tomkiewicz
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Maastricht University Medical Center+, Maastricht, Netherlands
| | - Michael Kwint
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| | - Jolein Gloerich
- Department of Human Genetics, Translational Metabolic Laboratory, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
| | - Hans J. C. T. Wessels
- Department of Human Genetics, Translational Metabolic Laboratory, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
| | - Emma Delanote
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Elfride De Baere
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | | | - Sarah De Keulenaer
- NXTGNT, Faculty of Pharmaceutical Sciences, Ghent University, Ghent, Belgium
| | | | | | - Frauke Coppieters
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Department of Pharmaceutics, Ghent University, Ghent, Belgium
| | - Frans P. M. Cremers
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Maastricht University Medical Center+, Maastricht, Netherlands
| | - Erwin van Wyk
- Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
| | - Susanne Roosing
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Academic Alliance Genetics, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Maastricht University Medical Center+, Maastricht, Netherlands
- Erik de Vrieze
- Department of Otorhinolaryngology, Radboud University Medical Center, Nijmegen, Gelderland, Netherlands
- Peter A. C. 't Hoen
- Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, Netherlands
9
|
Abood A, Mesner LD, Jeffery ED, Murali M, Lehe MD, Saquing J, Farber CR, Sheynkman GM. Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease. Am J Hum Genet 2024; 111:1914-1931. [PMID: 39079539 PMCID: PMC11393689 DOI: 10.1016/j.ajhg.2024.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 07/01/2024] [Accepted: 07/02/2024] [Indexed: 08/07/2024] Open
Abstract
A major fraction of loci identified by genome-wide association studies (GWASs) mediate alternative splicing, but mechanistic interpretation is hindered by the technical limitations of short-read RNA sequencing (RNA-seq), which cannot directly link splicing events to full-length protein isoforms. Long-read RNA-seq represents a powerful tool to characterize transcript isoforms and, more recently, to infer protein isoform existence. Here, we present an approach that integrates information from GWASs, splicing quantitative trait loci (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode. We demonstrate the utility of our approach using bone mineral density (BMD) GWAS data. We identified 1,863 sQTLs from the Genotype-Tissue Expression (GTEx) project in 732 protein-coding genes that colocalized with BMD associations (H4PP ≥ 0.75). We generated PacBio Iso-Seq data (N = ∼22 million full-length reads) on human osteoblasts, identifying 68,326 protein-coding isoforms, of which 17,375 (25%) were unannotated. By casting the sQTLs onto protein isoforms, we connected 809 sQTLs to 2,029 protein isoforms from 441 genes expressed in osteoblasts. Overall, we found that 74 sQTLs influenced isoforms likely impacted by nonsense-mediated decay and that 190 potentially resulted in the expression of unannotated protein isoforms. Finally, we functionally validated colocalizing sQTLs in TPM2: siRNA-mediated knockdown in osteoblasts revealed two TPM2 isoforms with opposing effects on mineralization, whereas knockdown of the entire gene had no effect. Our approach should generalize across diverse clinical traits and provide insights into protein isoform activities modulated by GWAS loci.
Affiliation(s)
- Abdullah Abood
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Larry D Mesner
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Erin D Jeffery
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Mayank Murali
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Micah D Lehe
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Jamie Saquing
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Charles R Farber
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA.
- Gloria M Sheynkman
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA; UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA, USA.
10
|
Hou Y, Li Q, Zhou H, Kafle S, Li W, Tan L, Liang J, Meng L, Xin H. SMRT sequencing of a full-length transcriptome reveals cold induced alternative splicing in Vitis amurensis root. Plant Physiol Biochem 2024; 213:108863. [PMID: 38917739 DOI: 10.1016/j.plaphy.2024.108863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Revised: 05/31/2024] [Accepted: 06/19/2024] [Indexed: 06/27/2024]
Abstract
Alternative splicing (AS) enhances diversity at the transcriptional and protein levels and is widely involved in plant responses to biotic and abiotic stresses. V. amurensis is an extremely cold-tolerant wild grape variety; however, AS in Amur grape at low temperatures remains poorly understood. In this study, we analyzed full-length transcriptome and RNA-seq data at 0, 2, and 24 h after cold stress in V. amurensis roots. Following quality control and correction, 221,170 high-quality full-length non-concatemer (FLNC) reads were identified. A total of 16,181 loci and 30,733 isoforms were identified. These included 22,868 novel isoforms from annotated genes and 2815 isoforms from 2389 novel genes. Among the novel isoforms, 673 long non-coding RNAs (lncRNAs) and 18,164 isoforms with an open reading frame (ORF) region were found. A total of 2958 genes produced 8797 AS events, of which 189 genes were involved in the low-temperature response. Twelve transcription factors showed AS during cold treatment, and VaMYB108 was selected for initial exploration. Two transcripts of VaMYB108, Chr05.63.1 (VaMYB108short) and Chr05.63.2 (VaMYB108normal), display up-regulated expression after cold treatment in Amur grape roots and are both localized in the nucleus. Only VaMYB108normal exhibits transcriptional activation activity. Overexpression of either VaMYB108short or VaMYB108normal in grape roots leads to increased expression of the other transcript, and both increase the chilling resistance of Amur grape roots. The results improve and supplement the genome annotations and provide insights for further investigation into AS mechanisms during cold stress in V. amurensis.
Affiliation(s)
- Yujun Hou
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Qingyun Li
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Huimin Zhou
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Subash Kafle
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Wenjuan Li
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Lisha Tan
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
- Ju Liang
- Turpan Institute of Agricultural Sciences, Xinjiang Academy of Agricultural Sciences, Xinjiang, 830091, China
- Lin Meng
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
- Haiping Xin
- State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China; Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China.
11
|
Haj Abdullah Alieh L, Cardoso de Toledo B, Hadarovich A, Toth-Petroczy A, Calegari F. Characterization of alternative splicing during mammalian brain development reveals the extent of isoform diversity and potential effects on protein structural changes. Biol Open 2024; 13:bio061721. [PMID: 39387301 PMCID: PMC11554263 DOI: 10.1242/bio.061721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Accepted: 09/09/2024] [Indexed: 10/15/2024] Open
Abstract
Regulation of gene expression is critical for fate commitment of stem and progenitor cells during tissue formation. In the context of mammalian brain development, a plethora of studies have described how changes in the expression of individual genes characterize cell types across ontogeny and phylogeny. However, little attention has been paid to the fact that different transcripts can arise from any given gene through alternative splicing (AS). Although AS is considered a key mechanism expanding transcriptome diversity during evolution, assessing its full impact on isoform diversity and protein function has been notoriously difficult. Here, we use a validated reporter mouse line to isolate neural stem cells, neurogenic progenitors and neurons during corticogenesis and combine short- and long-read sequencing to reconstruct the full transcriptome diversity characterizing neurogenic commitment. Extending available transcriptional profiles of the mammalian brain by nearly 50,000 new isoforms, we found that neurogenic commitment is characterized by a progressive increase in exon inclusion resulting in the profound remodeling of the transcriptional profile of specific cortical cell types. Most importantly, we computationally infer the biological significance of AS on protein structure by using AlphaFold2, revealing how radical protein conformational changes can arise from subtle changes in isoform sequence. Together, our study reveals that AS has a greater potential to impact protein diversity and function than previously thought, independently from changes in gene expression.
Affiliation(s)
- Anna Hadarovich
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- Agnes Toth-Petroczy
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- Cluster of Excellence Physics of Life, TU Dresden, 01062 Dresden, Germany
- Federico Calegari
- CRTD-Center for Regenerative Therapies Dresden, School of Medicine, TU Dresden, Germany
12
|
Brooks TG, Lahens NF, Mrčela A, Sarantopoulou D, Nayak S, Naik A, Sengupta S, Choi PS, Grant GR. BEERS2: RNA-Seq simulation through high fidelity in silico modeling. Brief Bioinform 2024; 25:bbae164. [PMID: 38605641 PMCID: PMC11009461 DOI: 10.1093/bib/bbae164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 01/26/2024] [Accepted: 03/26/2024] [Indexed: 04/13/2024] Open
Abstract
Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length messenger RNA transcripts with polyA tails) from either customizable input or from CAMPAREE simulated RNA samples. It produces realistic reads of these transcripts as FASTQ, SAM or BAM formats with the SAM or BAM formats containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in polymerase chain reaction (PCR) amplification, barcode read errors and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.
Affiliation(s)
- Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Dimitra Sarantopoulou
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Soumyashant Nayak
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: Statistics and Mathematics Unit, Indian Statistical Institute, Bengaluru, Karnataka, India
- Amruta Naik
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Shaon Sengupta
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Peter S Choi
- Division of Cancer Pathobiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
13
|
Sun G, DeFelice MM, Gillies TE, Ahn-Horst TA, Andrews CJ, Krummenacker M, Karp PD, Morrison JH, Covert MW. Cross-evaluation of E. coli's operon structures via a whole-cell model suggests alternative cellular benefits for low- versus high-expressing operons. Cell Syst 2024; 15:227-245.e7. [PMID: 38417437 PMCID: PMC10957310 DOI: 10.1016/j.cels.2024.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 09/12/2023] [Accepted: 02/08/2024] [Indexed: 03/01/2024]
Abstract
Many bacteria use operons to coregulate genes, but it remains unclear how operons benefit bacteria. We integrated E. coli's 788 polycistronic operons and 1,231 transcription units into an existing whole-cell model and found inconsistencies between the proposed operon structures and the RNA-seq read counts that the model was parameterized from. We resolved these inconsistencies through iterative, model-guided corrections to both datasets, including the correction of RNA-seq counts of short genes that were misreported as zero by existing alignment algorithms. The resulting model suggested two main modes by which operons benefit bacteria. For 86% of low-expression operons, adding operons increased the co-expression probabilities of their constituent proteins, whereas for 92% of high-expression operons, adding operons resulted in more stable expression ratios between the proteins. These simulations underscored the need for further experimental work on how operons reduce noise and synchronize both the expression timing and the quantity of constituent genes. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Gwanggyu Sun
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Mialy M DeFelice
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Taryn E Gillies
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Travis A Ahn-Horst
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Cecelia J Andrews
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA
- Jerry H Morrison
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Markus W Covert
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA.
14
|
Perelo LW, Gabernet G, Straub D, Nahnsen S. How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis. NAR Genom Bioinform 2024; 6:lqae020. [PMID: 38456178 PMCID: PMC10919883 DOI: 10.1093/nargab/lqae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/07/2024] [Accepted: 02/12/2024] [Indexed: 03/09/2024] Open
Abstract
Data analysis tools are continuously changed and improved over time. To test how these changes influence the comparability between analyses, the outputs of different workflow options of the nf-core/rnaseq pipeline were compared. Five different pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins of the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used for comparative analyses of the different tool and version settings of the pipeline. An overlap of 85% for differential gene classification between pipelines could be shown. Genes interpreted with a bias were mostly those present at lower concentration. The number of isoforms and exons per gene were also determinants. Previous pipeline versions using featureCounts showed a higher sensitivity to detect one-isoform genes like ERCC. To ensure data comparability in long-term analysis series, it is advisable either to stay with the pipeline version the series was initialized with or to run both versions during a transition period so that the target genes are addressed the same way.
Affiliation(s)
- Louisa Wessels Perelo
- Quantitative Biology Center (QBiC), University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- Gisela Gabernet
- Quantitative Biology Center (QBiC), University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- Daniel Straub
- Quantitative Biology Center (QBiC), University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- Sven Nahnsen
- Quantitative Biology Center (QBiC), University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- M3 Research Center, Faculty of Medicine, University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- Department of Computer Science, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
- Cluster of Excellence iFIT (EXC 2180), Image-Guided and Functionally Instructed Tumor Therapies, University of Tübingen, Otfried-Müller-Str. 37, 72076 Tübingen, Baden-Württemberg, 72076, Germany
15
|
Hao W, Yang W, Yang Y, Cheng T, Wei T, Tang L, Qian N, Yang Y, Li X, Jiang H, Wang M. Identification of lncRNA-miRNA-mRNA Networks in the Lenticular Nucleus Region of the Brain Contributes to Hepatolenticular Degeneration Pathogenesis and Therapy. Mol Neurobiol 2024; 61:1673-1686. [PMID: 37759104 PMCID: PMC10896925 DOI: 10.1007/s12035-023-03631-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/31/2023] [Indexed: 09/29/2023]
Abstract
Long non-coding RNAs (lncRNAs) are a recently discovered group of non-coding RNAs that play a crucial role in the regulation of various human diseases; their role in nervous system diseases in particular has garnered significant attention. However, there is limited knowledge on the identification and function of lncRNAs in hepatolenticular degeneration (HLD). The objective of this study was to identify novel lncRNAs and determine their involvement in the networks associated with HLD. We combined analysis of RNA sequencing (RNA-seq) data, reverse transcription-quantitative polymerase chain reaction (RT-qPCR), and computational biology to identify novel lncRNAs and explore their potential mechanisms in HLD. We identified 212 differentially expressed lncRNAs, with 98 upregulated and 114 downregulated. Additionally, 32 differentially expressed mRNAs were found, with 15 upregulated and 17 downregulated. We obtained a total of 1131 pairs of co-expressed lncRNAs and mRNAs by Pearson correlation test and prediction and annotation of the lncRNA-targeted miRNA-mRNA network. The differential lncRNAs identified in this study were found to be involved in various biological functions and signaling pathways. These include translational initiation, motor learning, locomotory behavior, dioxygenase activity, integral component of postsynaptic membrane, neuroactive ligand-receptor interaction, nuclear factor-kappa B (NF-κB) signaling pathway, cholinergic synapse, sphingolipid signaling pathway, and Parkinson's disease signaling pathway, as revealed by the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. Six lncRNAs, XR_001782921.1 (P < 0.01), XR_001780581.1 (P < 0.01), ENSMUST00000207119 (P < 0.01), XR_865512.2 (P < 0.01), TCONS_00005916 (P < 0.01), and TCONS_00020683 (P < 0.01), showed significant differences in expression levels between the model group and normal group by RT-qPCR. Among these, four lncRNAs (TCONS_00020683, XR_865512.2, XR_001780581.1, and ENSMUST00000207119) displayed a high degree of conservation. By constructing the lncRNA-miRNA-mRNA network, this study provides a unique perspective on the pathogenesis and therapy of HLD and a foundation for future exploration in this field.
Affiliation(s)
- Wenjie Hao
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China
- Key Laboratory of Xin'an Medicine of the Ministry of Education, Anhui University of Chinese Medicine, Hefei, China
- Wenming Yang
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China.
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China.
- Key Laboratory of Xin'an Medicine of the Ministry of Education, Anhui University of Chinese Medicine, Hefei, China.
- Yue Yang
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Ting Cheng
- Department of Graduate, Guangzhou University of Chinese Medicine, Guangzhou, China
- Taohua Wei
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China
- Lulu Tang
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China
- Nannan Qian
- Key Laboratory of Xin'an Medicine of the Ministry of Education, Anhui University of Chinese Medicine, Hefei, China
- Yulong Yang
- Key Laboratory of Xin'an Medicine of the Ministry of Education, Anhui University of Chinese Medicine, Hefei, China
- Xiang Li
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Hailin Jiang
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China
- Meixia Wang
- Department of Neurology, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China
- Center for Xin'an Medicine and Modernization of Traditional Chinese Medicine of IHM, Anhui University of Chinese Medicine, Hefei, China
16
|
Lienhard M, van den Beucken T, Timmermann B, Hochradel M, Börno S, Caiment F, Vingron M, Herwig R. IsoTools: a flexible workflow for long-read transcriptome sequencing analysis. Bioinformatics 2023; 39:btad364. [PMID: 37267159 PMCID: PMC10287928 DOI: 10.1093/bioinformatics/btad364] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/16/2023] [Revised: 04/28/2023] [Accepted: 06/01/2023] [Indexed: 06/04/2023] Open
Abstract
MOTIVATION: Long-read transcriptome sequencing (LRTS) has the potential to enhance our understanding of alternative splicing, and the complexity of this process requires versatile computational tools that can accommodate the various stages of the workflow with maximum flexibility. RESULTS: We introduce IsoTools, a Python-based LRTS analysis framework that offers a wide range of functionality for transcriptome reconstruction and quantification of transcripts. Furthermore, we integrate a graph-based method for identifying alternative splicing events and a statistical approach based on the beta-binomial distribution for detecting differential events. To demonstrate the effectiveness of our methods, we applied IsoTools to PacBio LRTS data of human hepatocytes treated with the histone deacetylase inhibitor valproic acid. Our results indicate that LRTS can provide valuable insights into alternative splicing, particularly in terms of complex and differential splicing patterns, in comparison to short-read RNA-seq. AVAILABILITY AND IMPLEMENTATION: IsoTools is available on GitHub and PyPI, and its documentation, including tutorials, CLI, and API references, can be found at https://isotools.readthedocs.io/.
Affiliation(s)
- Matthias Lienhard
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Twan van den Beucken
- Department of Toxicogenomics, Maastricht University, Maastricht 6229ER, The Netherlands
- Bernd Timmermann
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Myriam Hochradel
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Stefan Börno
- Sequencing Core Unit, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Florian Caiment
- Department of Toxicogenomics, Maastricht University, Maastricht 6229ER, The Netherlands
- Martin Vingron
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Ralf Herwig
- Department of Computational Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
17
|
Oreper D, Klaeger S, Jhunjhunwala S, Delamarre L. The peptide woods are lovely, dark and deep: Hunting for novel cancer antigens. Semin Immunol 2023; 67:101758. [PMID: 37027981 DOI: 10.1016/j.smim.2023.101758] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/08/2023]
Abstract
Harnessing the patient's immune system to control a tumor is a proven avenue for cancer therapy. T cell therapies as well as therapeutic vaccines, which target specific antigens of interest, are being explored as treatments in conjunction with immune checkpoint blockade. For these therapies, selecting the best suited antigens is crucial. Most of the focus has thus far been on neoantigens that arise from tumor-specific somatic mutations. Although there is clear evidence that T-cell responses against mutated neoantigens are protective, the large majority of these mutations are not immunogenic. In addition, most somatic mutations are unique to each individual patient and their targeting requires the development of individualized approaches. Therefore, novel antigen types are needed to broaden the scope of such treatments. We review high throughput approaches for discovering novel tumor antigens and some of the key challenges associated with their detection, and discuss considerations when selecting tumor antigens to target in the clinic.
Affiliation(s)
- Daniel Oreper
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
- Susan Klaeger
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
18
|
Brooks TG, Lahens NF, Mrčela A, Sarantopoulou D, Nayak S, Naik A, Sengupta S, Choi PS, Grant GR. BEERS2: RNA-Seq simulation through high fidelity in silico modeling. bioRxiv 2023:2023.04.21.537847. [PMID: 37162982 PMCID: PMC10168222 DOI: 10.1101/2023.04.21.537847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking, and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length mRNA transcripts with polyA tails) from either customizable input or from CAMPAREE simulated RNA samples. It produces realistic reads of these transcripts as FASTQ, SAM, or BAM formats with the SAM or BAM formats containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in PCR amplification, barcode read errors, and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.
Affiliation(s)
- Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Dimitra Sarantopoulou
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Soumyashant Nayak
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: Statistics and Mathematics Unit, Indian Statistical Institute, Bengaluru, Karnataka, India
- Amruta Naik
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Shaon Sengupta
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Peter S Choi
- Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA

19
Abood A, Mesner LD, Jeffery ED, Murali M, Lehe M, Saquing J, Farber CR, Sheynkman GM. Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease. bioRxiv 2023:2023.03.17.531557. [PMID: 36993769 PMCID: PMC10055087 DOI: 10.1101/2023.03.17.531557] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 03/31/2023]
Abstract
A major fraction of loci identified by genome-wide association studies (GWASs) lead to alterations in alternative splicing, but interpretation of how such alterations impact proteins is hindered by the technical limitations of short-read RNA-seq, which cannot directly link splicing events to full-length transcript or protein isoforms. Long-read RNA-seq represents a powerful tool to define and quantify transcript isoforms and, more recently, to infer protein isoform existence. Here we present a novel approach that integrates information from GWAS, splicing QTL (sQTL), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode. We demonstrate the utility of our approach using bone mineral density (BMD) GWAS data. We identified 1,863 sQTLs from the Genotype-Tissue Expression (GTEx) project in 732 protein-coding genes which colocalized with BMD associations (H4 PP ≥ 0.75). We generated deep coverage PacBio long-read RNA-seq data (N = ∼22 million full-length reads) on human osteoblasts, identifying 68,326 protein-coding isoforms, of which 17,375 (25%) were novel. By casting the colocalized sQTLs directly onto protein isoforms, we connected 809 sQTLs to 2,029 protein isoforms from 441 genes expressed in osteoblasts. Using these data, we created one of the first proteome-scale resources defining full-length isoforms impacted by colocalized sQTLs. Overall, we found that 74 sQTLs influenced isoforms likely impacted by nonsense-mediated decay (NMD) and 190 that potentially resulted in the expression of new protein isoforms. Finally, we identified colocalizing sQTLs in TPM2 for splice junctions between two mutually exclusive exons, and two different transcript termination sites, making it impossible to interpret without long-read RNA-seq data. siRNA-mediated knockdown in osteoblasts showed two TPM2 isoforms with opposing effects on mineralization. We expect our approach to be widely generalizable across diverse clinical traits and accelerate system-scale analyses of protein isoform activities modulated by GWAS loci.

20
Poretti M, Praz CR, Sotiropoulos AG, Wicker T. A survey of lineage-specific genes in Triticeae reveals de novo gene evolution from genomic raw material. Plant Direct 2023; 7:e484. [PMID: 36937792 PMCID: PMC10020141 DOI: 10.1002/pld3.484] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/27/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 06/18/2023]
Abstract
Diploid plant genomes typically contain ~35,000 genes, almost all belonging to highly conserved gene families. Only a small fraction are lineage-specific, meaning they are found in only one or a few closely related species. Little is known about how genes arise de novo in plant genomes and how often this occurs; however, they are believed to be important for plant diversification and adaptation. We developed a pipeline to identify lineage-specific genes in Triticeae, using newly available genome assemblies of wheat, barley, and rye. Applying a set of stringent criteria, we identified 5942 candidate Triticeae-specific genes (TSGs), of which 2337 were validated as protein-coding genes in wheat. Differential gene expression analyses revealed that stress-induced wheat TSGs are strongly enriched in putative secreted proteins. Some were previously described to be involved in Triticeae non-host resistance and cold response. Additionally, we show that 1079 TSGs have sequence homology to transposable elements (TEs), ~68% of them deriving from regulatory non-coding regions of Gypsy retrotransposons. Most importantly, we demonstrate that these TSGs are enriched in transmembrane domains and are among the most highly expressed wheat genes overall. To summarize, we conclude that de novo gene formation is relatively rare and that Triticeae probably possess ~779 lineage-specific genes per haploid genome. TSGs, which respond to pathogen and environmental stresses, may be interesting candidates for future targeted resistance breeding in Triticeae. Finally, we propose that non-coding regions of TEs might provide important genetic raw material for the functional innovation of TM domains and the evolution of novel secreted proteins.
Affiliation(s)
- Manuel Poretti
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Coraline R. Praz
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM)–Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Madrid, Spain
- Thomas Wicker
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland

21
The Botrytis cinerea Gene Expression Browser. J Fungi (Basel) 2023; 9:jof9010084. [PMID: 36675905 PMCID: PMC9861337 DOI: 10.3390/jof9010084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/29/2022] [Accepted: 12/30/2022] [Indexed: 01/07/2023]
Abstract
For comprehensive gene expression analyses of the phytopathogenic fungus Botrytis cinerea, which infects a number of plant taxa and is a cause of substantial agricultural losses worldwide, we developed BEB, a web-based B. cinerea gene Expression Browser. This computationally inexpensive web-based application and its associated database contain manually curated RNA-Seq data for B. cinerea. BEB enables expression analyses of genes of interest under different culture conditions by providing publication-ready heatmaps depicting transcript levels, without requiring advanced computational skills. BEB also provides details of each experiment and user-defined gene expression clustering and visualization options. If needed, tables of gene expression values can be downloaded for further exploration, including, for instance, the determination of differentially expressed genes. The BEB implementation is based on open-source computational technologies that can be deployed for other organisms. In this case, the new implementation will be limited only by the number of transcriptomic experiments that are incorporated into the platform. To demonstrate the usability and value of BEB, we analyzed gene expression patterns across different conditions, with a focus on secondary metabolite gene clusters, chromosome-wide gene expression, previously described virulence factors, and reference genes, providing the first comprehensive expression overview of these groups of genes in this relevant fungal phytopathogen. We expect this tool to be broadly useful in B. cinerea research, providing a basis for comparative transcriptomics and candidate gene identification for functional assays.

22
Srikakulam N, Sridevi G, Pandi G. High-quality reference transcriptome construction improves RNA-seq quantification in Oryza sativa indica. Front Genet 2022; 13:995072. [PMID: 36246658 PMCID: PMC9558114 DOI: 10.3389/fgene.2022.995072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 09/02/2022] [Indexed: 11/13/2022]
Abstract
The Reference Transcriptomic Dataset (RTD) is an accurate and comprehensive collection of transcripts originating from a given organism. It holds the key to precise transcript quantification and downstream analysis of differential expression and regulation. Currently, transcriptome annotations for most crop plants are far from complete. For example, Oryza sativa indica (O. sativa indica) is reported to have 40,759 transcripts in the Ensembl database, with no alternative transcript isoforms or alternative splicing (AS) events. To generate a high-quality RTD, we conducted RNA sequencing of rice leaf samples collected at various time points during Rhizoctonia solani infection. The obtained reads were analyzed using a recently developed computational analysis pipeline to assemble an RTD with increased transcript and AS diversity for O. sativa indica (IndicaRTD). After stringent quality filtering, the newly constructed transcriptome annotation comprised 122,968 non-redundant transcripts from 53,695 genes. This study identified many novel transcripts compared to Ensembl deposited data that are important for regulating molecular and physiological processes in the plant system. The assembled IndicaRTD should allow fast quantification of transcript and gene expression with high precision.
Affiliation(s)
- Nagesh Srikakulam
- Laboratory of RNA Biology and Epigenomics, Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
- *Correspondence: Nagesh Srikakulam; Gopal Pandi
- Ganapathi Sridevi
- Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
- Gopal Pandi
- Laboratory of RNA Biology and Epigenomics, Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
- *Correspondence: Nagesh Srikakulam; Gopal Pandi

23
Sun J, Li L, Hu J, Gao Y, Song J, Zhang X, Hu H. Time-course RNA-Seq profiling reveals isoform-level gene expression dynamics of the cGAS-STING pathway. Comput Struct Biotechnol J 2022; 20:6490-6500. [PMCID: PMC9686058 DOI: 10.1016/j.csbj.2022.11.044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 11/21/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022]
Abstract
The cGAS-STING pathway, orchestrating complicated transcriptome-wide immune responses, is essential for host antiviral defense but can also drive immunopathology in severe COVID-19. Here, we performed time-course RNA-Seq experiments to dissect the transcriptome expression dynamics at the gene-isoform level after cGAS-STING pathway activation. The in-depth time-course transcriptome after cGAS-STING pathway activation within 12 h enabled quantification of 48,685 gene isoforms. By employing regression models, we obtained 13,232 gene isoforms with expression patterns significantly associated with the process of cGAS-STING pathway activation, which were named activation-associated isoforms. The combination of hierarchical and k-means clustering algorithms revealed four major expression patterns of activation-associated isoforms, including two clusters with increased expression patterns enriched in cell cycle, autophagy, antiviral innate-immune functions, and the COVID-19 coronavirus disease pathway, and two clusters showing decreased expression patterns that were mainly involved in ncRNA metabolism, the translation process, and mRNA processing. Importantly, by merging four clusters of activation-associated isoforms, we identified three types of genes that underwent isoform usage alteration during cGAS-STING pathway activation. We further found that genes exhibiting protein-coding and non-protein-coding gene isoform usage alteration were strongly enriched for factors involved in innate immunity and RNA splicing. Notably, overexpression of an enriched splicing factor, EFTUD2, shifted the transcriptome towards the cGAS-STING pathway-activated status and promoted protein-coding isoform abundance of several key regulators of the cGAS-STING pathway. Taken together, our results revealed the isoform-level gene expression dynamics of the cGAS-STING pathway and uncovered novel roles of splicing factors in regulating cGAS-STING pathway-mediated immune responses.

24
Booeshaghi AS, Yao Z, van Velthoven C, Smith K, Tasic B, Zeng H, Pachter L. Isoform cell-type specificity in the mouse primary motor cortex. Nature 2021; 598:195-199. [PMID: 34616073 PMCID: PMC8494650 DOI: 10.1038/s41586-021-03969-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 03/09/2020] [Accepted: 08/27/2021] [Indexed: 12/17/2022]
Abstract
Full-length SMART-seq single-cell RNA sequencing can be used to measure gene expression at isoform resolution, making possible the identification of specific isoform markers for different cell types. Used in conjunction with spatial RNA capture and gene-tagging methods, this enables the inference of spatially resolved isoform expression for different cell types. Here, in a comprehensive analysis of 6,160 mouse primary motor cortex cells assayed with SMART-seq, 280,327 cells assayed with MERFISH and 94,162 cells assayed with 10x Genomics sequencing, we find examples of isoform specificity in cell types, including isoform shifts between cell types that are masked in gene-level analysis, as well as examples of transcriptional regulation. Additionally, we show that isoform specificity helps to refine cell types, and that a multi-platform analysis of single-cell transcriptomic data leveraging multiple measurements provides a comprehensive atlas of transcription in the mouse primary motor cortex that improves on the possibilities offered by any single technology.
Affiliation(s)
- A Sina Booeshaghi
- Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, USA
- Zizhen Yao
- Allen Institute for Brain Science, Seattle, WA, USA
- Hongkui Zeng
- Allen Institute for Brain Science, Seattle, WA, USA
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA