1
Kim GD, Shin SI, Jung SW, An H, Choi SY, Eun M, Jun CD, Lee S, Park J. Cell Type- and Age-Specific Expression of lncRNAs across Kidney Cell Types. J Am Soc Nephrol 2024:00001751-990000000-00292. [PMID: 38621182 DOI: 10.1681/asn.0000000000000354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 04/08/2024] [Indexed: 04/17/2024] Open
Abstract
Key Points
We constructed a single-cell long noncoding RNA atlas of various tissues, including normal and aged kidneys. We identified age- and cell type–specific expression changes of long noncoding RNAs in kidney cells.
Background
Accumulated evidence demonstrates that long noncoding RNAs (lncRNAs) regulate cell differentiation and homeostasis, influencing kidney aging and disease. Despite their versatility, the function of lncRNA remains poorly understood because of the lack of a reference map of lncRNA transcriptome in various cell types.
Methods
In this study, we used a targeted single-cell RNA sequencing method to enrich and characterize lncRNAs in individual cells. We applied this method to various mouse tissues, including normal and aged kidneys.
Results
Through tissue-specific clustering analysis, we identified cell type–specific lncRNAs that showed a high correlation with known cell-type marker genes. Furthermore, we constructed gene regulatory networks to explore the functional roles of differentially expressed lncRNAs in each cell type. In the kidney, we observed dynamic expression changes of lncRNAs during aging, with specific changes in glomerular cells. These cell type– and age-specific expression patterns of lncRNAs suggest that lncRNAs may have a potential role in regulating cellular processes, such as immune response and energy metabolism, during kidney aging.
Conclusions
Our study sheds light on the comprehensive landscape of lncRNA expression and function and provides a valuable resource for future analysis of lncRNAs (https://gist-fgl.github.io/sc-lncrna-atlas/).
Affiliation(s)
- Gyeong Dae Kim
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- So-I Shin
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Su Woong Jung
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Division of Nephrology, Department of Internal Medicine, Kyung Hee University, Seoul, Republic of Korea
- Hyunsu An
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Sin Young Choi
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Minho Eun
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Chang-Duk Jun
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
- Sangho Lee
- Division of Nephrology, Department of Internal Medicine, Kyung Hee University, Seoul, Republic of Korea
- Jihwan Park
- School of Life Sciences, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
2
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. bioRxiv 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
- Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard College, Cambridge MA, USA
- Sabrina Phanor
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
- Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
3
Lucas CJ, Sheridan RM, Reynoso GV, Davenport BJ, McCarthy MK, Martin A, Hesselberth JR, Hickman HD, Tamburini BA, Morrison TE. Chikungunya virus infection disrupts lymph node lymphatic endothelial cell composition and function via MARCO. JCI Insight 2024; 9:e176537. [PMID: 38194268 PMCID: PMC11143926 DOI: 10.1172/jci.insight.176537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 01/05/2024] [Indexed: 01/10/2024] Open
Abstract
Infection with chikungunya virus (CHIKV) causes disruption of draining lymph node (dLN) organization, including paracortical relocalization of B cells, loss of the B cell-T cell border, and lymphocyte depletion that is associated with infiltration of the LN with inflammatory myeloid cells. Here, we found that, during the first 24 hours of infection, CHIKV RNA accumulated in MARCO-expressing lymphatic endothelial cells (LECs) in both the floor and medullary LN sinuses. The accumulation of viral RNA in the LN was associated with a switch to an antiviral and inflammatory gene expression program across LN stromal cells, and this inflammatory response - including recruitment of myeloid cells to the LN - was accelerated by CHIKV-MARCO interactions. As CHIKV infection progressed, both floor and medullary LECs diminished in number, suggesting further functional impairment of the LN by infection. Consistent with this idea, antigen acquisition by LECs, a key function of LN LECs during infection and immunization, was reduced during pathogenic CHIKV infection.
Affiliation(s)
- Cormac J. Lucas
- Department of Immunology & Microbiology and
- RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado, USA
- Ryan M. Sheridan
- RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado, USA
- Glennys V. Reynoso
- Viral Immunity & Pathogenesis Unit, Laboratory of Clinical Immunology & Microbiology, National Institutes of Allergy & Infectious Disease, NIH, Bethesda, Maryland, USA
- Aspen Martin
- Department of Biochemistry & Molecular Genetics and
- Jay R. Hesselberth
- RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado, USA
- Department of Biochemistry & Molecular Genetics and
- Heather D. Hickman
- Viral Immunity & Pathogenesis Unit, Laboratory of Clinical Immunology & Microbiology, National Institutes of Allergy & Infectious Disease, NIH, Bethesda, Maryland, USA
- Beth A.J. Tamburini
- Department of Immunology & Microbiology and
- Division of Gastroenterology and Hepatology, Department of Medicine, University of Colorado School of Medicine, Aurora, Colorado, USA
4
Lucas CJ, Sheridan RM, Reynoso GV, Davenport BJ, McCarthy MK, Martin A, Hesselberth JR, Hickman HD, Tamburini BAJ, Morrison TE. Chikungunya virus infection disrupts lymph node lymphatic endothelial cell composition and function via MARCO. bioRxiv 2023:2023.10.12.561615. [PMID: 37873393 PMCID: PMC10592756 DOI: 10.1101/2023.10.12.561615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Infection with chikungunya virus (CHIKV) causes disruption of draining lymph node (dLN) organization, including paracortical relocalization of B cells, loss of the B cell-T cell border, and lymphocyte depletion that is associated with infiltration of the LN with inflammatory myeloid cells. Here, we find that during the first 24 h of infection, CHIKV RNA accumulates in MARCO-expressing lymphatic endothelial cells (LECs) in both the floor and medullary LN sinuses. The accumulation of viral RNA in the LN was associated with a switch to an antiviral and inflammatory gene expression program across LN stromal cells, and this inflammatory response, including recruitment of myeloid cells to the LN, was accelerated by CHIKV-MARCO interactions. As CHIKV infection progressed, both floor and medullary LECs diminished in number, suggesting further functional impairment of the LN by infection. Consistent with this idea, we find that antigen acquisition by LECs, a key function of LN LECs during infection and immunization, was reduced during pathogenic CHIKV infection.
5
Zhang Z, Bae B, Cuddleston WH, Miura P. Coordination of alternative splicing and alternative polyadenylation revealed by targeted long read sequencing. Nat Commun 2023; 14:5506. [PMID: 37679364 PMCID: PMC10484994 DOI: 10.1038/s41467-023-41207-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/09/2021] [Accepted: 08/25/2023] [Indexed: 09/09/2023] Open
Abstract
Nervous system development is associated with extensive regulation of alternative splicing (AS) and alternative polyadenylation (APA). AS and APA have been extensively studied in isolation, but little is known about how these processes are coordinated. Here, the coordination of cassette exon (CE) splicing and APA in Drosophila was investigated using a targeted long-read sequencing approach we call Pull-a-Long-Seq (PL-Seq). This cost-effective method uses cDNA pulldown and Nanopore sequencing combined with an analysis pipeline to quantify inclusion of alternative exons in connection with alternative 3' ends. Using PL-Seq, we identified genes that exhibit significant differences in CE splicing depending on connectivity to short versus long 3'UTRs. Genomic long 3'UTR deletion was found to alter upstream CE splicing in short 3'UTR isoforms and ELAV loss differentially affected CE splicing depending on connectivity to alternative 3'UTRs. This work highlights the importance of considering connectivity to alternative 3'UTRs when monitoring AS events.
Affiliation(s)
- Zhiping Zhang
- Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Bongmin Bae
- Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Pedro Miura
- Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA.
- Department of Biology, University of Nevada, Reno, Reno, NV, USA.
- Institute for System Genomics, University of Connecticut, Storrs, CT, USA.
6
Wang F, Xu Y, Wang R, Zhang B, Smith N, Notaro A, Gaerlan S, Kutschera E, Kadash-Edmondson KE, Xing Y, Lin L. TEQUILA-seq: a versatile and low-cost method for targeted long-read RNA sequencing. Nat Commun 2023; 14:4760. [PMID: 37553321 PMCID: PMC10409798 DOI: 10.1038/s41467-023-40083-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Accepted: 07/11/2023] [Indexed: 08/10/2023] Open
Abstract
Long-read RNA sequencing (RNA-seq) is a powerful technology for transcriptome analysis, but the relatively low throughput of current long-read sequencing platforms limits transcript coverage. One strategy for overcoming this bottleneck is targeted long-read RNA-seq for preselected gene panels. We present TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq utilizing isothermally linear-amplified capture probes. When performed on the Oxford nanopore platform with multiple gene panels of varying sizes, TEQUILA-seq consistently and substantially enriches transcript coverage while preserving transcript quantification. We profile full-length transcript isoforms of 468 actionable cancer genes across 40 representative breast cancer cell lines. We identify transcript isoforms enriched in specific subtypes and discover novel transcript isoforms in extensively studied cancer genes such as TP53. Among cancer genes, tumor suppressor genes (TSGs) are significantly enriched for aberrant transcript isoforms targeted for degradation via mRNA nonsense-mediated decay, revealing a common RNA-associated mechanism for TSG inactivation. TEQUILA-seq reduces the per-reaction cost of targeted capture by 2-3 orders of magnitude, as compared to a standard commercial solution. TEQUILA-seq can be broadly used for targeted sequencing of full-length transcripts in diverse biomedical research settings.
Affiliation(s)
- Feng Wang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Yang Xu
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
- Robert Wang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
- Beatrice Zhang
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Noah Smith
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Amber Notaro
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Samantha Gaerlan
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Eric Kutschera
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Kathryn E Kadash-Edmondson
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Yi Xing
- Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Lan Lin
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
7
Xu D, Tang L, Kapranov P. Complexities of mammalian transcriptome revealed by targeted RNA enrichment techniques. Trends Genet 2023; 39:320-333. [PMID: 36681580 DOI: 10.1016/j.tig.2022.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 01/21/2023]
Abstract
Studies using highly sensitive targeted RNA enrichment methods have shown that a large portion of the human transcriptome remains to be discovered and that most of the genome is transcribed in a complex, interleaved fashion characterized by a complex web of transcripts emanating from protein coding and noncoding loci. These results resonate with those from single-cell transcriptome profiling endeavors that reveal the existence of multiple novel, cell type-specific transcripts and clearly demonstrate that our understanding of the complexities of the human transcriptome is far from being complete. Here, we review the current status of the targeted RNA enrichment techniques, their application to the discovery of novel cell type-specific transcripts, and their impact on our understanding of the human genome and transcriptome.
Affiliation(s)
- Dongyang Xu
- Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China
- Lu Tang
- Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China
- Philipp Kapranov
- Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China.
8
Zafarullah M, Li J, Tseng E, Tassone F. Structure and Alternative Splicing of the Antisense FMR1 (ASFMR1) Gene. Mol Neurobiol 2023; 60:2051-2061. [PMID: 36598648 PMCID: PMC10461537 DOI: 10.1007/s12035-022-03176-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 12/10/2022] [Indexed: 01/05/2023]
Abstract
Fragile X-associated tremor/ataxia syndrome (FXTAS) is a neurodegenerative disorder caused by an expansion of 55-200 CGG repeats (premutation) in the 5'-UTR of the FMR1 gene. Bidirectional transcription at the FMR1 locus has been demonstrated, and specific alternative splicing of the Antisense FMR1 (ASFMR1) gene has been proposed to contribute to the pathogenesis of FXTAS. The structure of the ASFMR1 gene is still uncharacterized, and it is currently unknown how many isoforms of the gene are expressed, and at what level, in premutation carriers (PM) and whether they contribute to the premutation pathology. In this study, we characterized the ASFMR1 gene structure and transcriptional landscape using PacBio SMRT sequencing with target enrichment (IDT customized probe panel). We identified 45 ASFMR1 isoforms ranging in size from 523 bp to 6 kb, spanning approximately 59 kb of genomic DNA. Multiplexing and sequencing of six human brain samples from PM and normal controls (HC) were carried out on the PacBio Sequel platform. We validated the presence of these isoforms by qRT-PCR and Sanger sequencing and characterized the acceptor and donor splice site consensus sequences. Consistent with previous studies conducted in other tissue types, we found high expression of the ASFMR1 isoform Iso131bp in brain samples of PM compared with HC, while no differences in expression levels were observed for the newly identified isoforms IsoAS1 and IsoAS2. We investigated the splicing regulatory protein Sam68 but did not observe a role for it in the alternative splicing of the ASFMR1 gene. Our study provides useful insight into the structure and transcriptional landscape of the ASFMR1 gene, the expression patterns of the newly identified isoforms, and their potential role in premutation pathology.
Affiliation(s)
- Marwa Zafarullah
- Department of Biochemistry and Molecular Medicine, University of California Davis, School of Medicine, Sacramento, CA, 95817, USA
- Jie Li
- Bioinformatics Core, Genome Center, University of California Davis, Davis, CA, 95616, USA
- Flora Tassone
- Department of Biochemistry and Molecular Medicine, University of California Davis, School of Medicine, Sacramento, CA, 95817, USA.
- MIND Institute, University of California Davis Medical Center, Sacramento, CA, 95817, USA.
9
Zhang Z, Bae B, Cuddleston WH, Miura P. Coordination of Alternative Splicing and Alternative Polyadenylation revealed by Targeted Long-Read Sequencing. bioRxiv 2023:2023.03.23.533999. [PMID: 36993601 PMCID: PMC10055423 DOI: 10.1101/2023.03.23.533999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Nervous system development is associated with extensive regulation of alternative splicing (AS) and alternative polyadenylation (APA). AS and APA have been extensively studied in isolation, but little is known about how these processes are coordinated. Here, the coordination of cassette exon (CE) splicing and APA in Drosophila was investigated using a targeted long-read sequencing approach we call Pull-a-Long-Seq (PL-Seq). This cost-effective method uses cDNA pulldown and Nanopore sequencing combined with an analysis pipeline to resolve the connectivity of alternative exons to alternative 3' ends. Using PL-Seq, we identified genes that exhibit significant differences in CE splicing depending on connectivity to short versus long 3'UTRs. Genomic long 3'UTR deletion was found to alter upstream CE splicing in short 3'UTR isoforms and ELAV loss differentially affected CE splicing depending on connectivity to alternative 3'UTRs. This work highlights the importance of considering connectivity to alternative 3'UTRs when monitoring AS events.
Affiliation(s)
- Zhiping Zhang
- Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
- Bongmin Bae
- Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Pedro Miura
- Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
10
Castaldi PJ, Abood A, Farber CR, Sheynkman GM. Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease. Hum Mol Genet 2022; 31:R123-R136. [PMID: 35960994 PMCID: PMC9585682 DOI: 10.1093/hmg/ddac196] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/07/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 02/04/2023] Open
Abstract
Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identification of the corresponding isoform or protein products associated with disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Solutions to these issues may be found through integration of newly emerging long-read sequencing technologies. Long-read sequencing offers the capability to sequence full-length mRNA transcripts and, in some cases, to link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms, the linkage of RNA isoforms to protein-level functions and comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to be part of the human disease genetics toolkit to discover and treat protein isoforms causing rare and complex diseases.
Affiliation(s)
- Peter J Castaldi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Division of General Medicine and Primary Care, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Abdullah Abood
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Charles R Farber
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Public Health Sciences, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
| | - Gloria M Sheynkman
- Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22903, USA
- UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA 22903, USA
| |
11
de Crécy-lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022; 2022:6663924. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]
Abstract
Over the last 25 years, biology has entered the genomic era and is becoming a science of ‘big data’. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3–4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Affiliation(s)
- Valérie de Crécy-lagard
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Cecilia Arighi
  - Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
- Jill Babor
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Ian Blaby
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Crysten Blaby-Haas
  - Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
- Alan J Bridge
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
- Stephen K Burley
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Stacey Cleveland
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Lucy J Colwell
  - Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- Ana Conesa
  - Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
- Christian Dallago
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
- Antoine Danchin
  - School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
- Anita de Waard
  - Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
- Adam Deutschbauer
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Raquel Dias
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Yousong Ding
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
- Gang Fang
  - NYU-Shanghai, Shanghai 200120, China
- Iddo Friedberg
  - Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
- John Gerlt
  - Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Joshua Goldford
  - Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Mark Gorelik
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Benjamin M Gyori
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Christopher Henry
  - Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Geoffrey Hutinet
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Marshall Jaroch
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Peter D Karp
  - Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
- Zhiyong Lu
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
- Aron Marchler-Bauer
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
- Maria-Jesus Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Claire McWhite
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
- Gaurav D Moghe
  - Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Paul Monaghan
  - Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
- Anne Morgat
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
- Christopher J Mungall
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Darren A Natale
  - Georgetown University Medical Center, Washington, DC 20007, USA
- William C Nelson
  - Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA
- Seán O’Donoghue
  - School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia
- Christine Orengo
  - Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
- Colbie Reed
  - Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
- Dmitri Rodionov
  - Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
- Irina A Rodionova
  - Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA
- Jeffrey D Rudolf
  - Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
- Lana Saleh
  - New England Biolabs, Ipswich, MA 01938, USA
- Gloria Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Francoise Thibaud-Nissen
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
- Paul D Thomas
  - Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Peter Uetz
  - Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- David Vallenet
  - LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France
- Erica Watson Carter
  - Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
- Valerie Wood
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Elisha M Wood-Charlson
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jin Xu
  - Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
12
Miller RM, Jordan BT, Mehlferber MM, Jeffery ED, Chatzipantsiou C, Kaur S, Millikin RJ, Dai Y, Tiberi S, Castaldi PJ, Shortreed MR, Luckey CJ, Conesa A, Smith LM, Deslattes Mays A, Sheynkman GM. Enhanced protein isoform characterization through long-read proteogenomics. Genome Biol 2022; 23:69. [PMID: 35241129 PMCID: PMC8892804 DOI: 10.1186/s13059-022-02624-y] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 07/13/2021] [Accepted: 02/02/2022] [Indexed: 02/04/2023] Open
Abstract
Background
The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms.
Results
We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.
Conclusions
Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
Affiliation(s)
- Rachel M. Miller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Ben T. Jordan
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Madison M. Mehlferber
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
- Erin D. Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Simi Kaur
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Robert J. Millikin
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Yunxiang Dai
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Simone Tiberi
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Peter J. Castaldi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of General Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, USA
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Chance John Luckey
  - Department of Pathology, University of Virginia, Charlottesville, VA, USA
- Ana Conesa
  - Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Anne Deslattes Mays
  - Office of Data Science and Sharing, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD, USA
- Gloria M. Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
  - UVA Cancer Center, University of Virginia, Charlottesville, VA, USA
13
Veiga DFT, Nesta A, Zhao Y, Mays AD, Huynh R, Rossi R, Wu TC, Palucka K, Anczukow O, Beck CR, Banchereau J. A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer. Sci Adv 2022; 8:eabg6711. [PMID: 35044822 PMCID: PMC8769553 DOI: 10.1126/sciadv.abg6711] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 05/11/2023]
Abstract
Tumors display widespread transcriptome alterations, but the full repertoire of isoform-level alternative splicing in cancer is unknown. We developed a long-read (LR) RNA sequencing and analytical platform that identifies and annotates full-length isoforms and infers tumor-specific splicing events. Application of this platform to breast cancer samples identifies thousands of previously unannotated isoforms; ~30% affect protein coding exons and are predicted to alter protein localization and function. We performed extensive cross-validation with -omics datasets to support transcription and translation of novel isoforms. We identified 3059 breast tumor–specific splicing events, including 35 that are significantly associated with patient survival. Of these, 21 are absent from GENCODE and 10 are enriched in specific breast cancer subtypes. Together, our results demonstrate the complexity, cancer subtype specificity, and clinical relevance of previously unidentified isoforms and splicing events in breast cancer that are only annotatable by LR-seq and provide a rich resource of immuno-oncology therapeutic targets.
Affiliation(s)
- Diogo F. T. Veiga
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Alex Nesta
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
- Yuqi Zhao
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Richie Huynh
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Robert Rossi
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Te-Chia Wu
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Karolina Palucka
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Olga Anczukow
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
  - Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT 06030, USA
  - Corresponding author
- Christine R. Beck
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
  - Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT 06030, USA
  - Corresponding author
- Jacques Banchereau
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - Corresponding author
14
Kikuchi Y, Tokita S, Hirama T, Kochin V, Nakatsugawa M, Shinkawa T, Hirohashi Y, Tsukahara T, Hata F, Takemasa I, Sato N, Kanaseki T, Torigoe T. CD8+ T-cell Immune Surveillance against a Tumor Antigen Encoded by the Oncogenic Long Noncoding RNA PVT1. Cancer Immunol Res 2021; 9:1342-1353. [PMID: 34433589 DOI: 10.1158/2326-6066.cir-20-0964] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/22/2020] [Revised: 04/11/2021] [Accepted: 08/23/2021] [Indexed: 11/16/2022]
Abstract
CD8+ T cells recognize peptides displayed by HLA class I molecules on cell surfaces, monitoring pathologic conditions such as cancer. Advances in proteogenomic analysis of HLA ligandomes have demonstrated that cells present a subset of cryptic peptides derived from noncoding regions of the genome; however, the roles of cryptic HLA ligands in tumor immunity remain unknown. In the current study, we comprehensively and quantitatively investigated the HLA class I ligandome of a set of human colorectal cancer and matched normal tissues, showing that cryptic translation products accounted for approximately 5% of the HLA class I ligandome. We also found that a peptide encoded by the long noncoding RNA (lncRNA) PVT1 was predominantly enriched in multiple colorectal cancer tissues. The PVT1 gene is located downstream of the MYC gene in the genome and is aberrantly overexpressed across a variety of cancers, reflecting its oncogenic property. The PVT1 peptide was recognized by patient CD8+ tumor-infiltrating lymphocytes, as well as peripheral blood mononuclear cells, suggesting the presence of patient immune surveillance. Our findings show that peptides can be translated from lncRNAs and presented by HLA class I and that cancer patient T cells are capable of sensing aberrations in noncoding regions of the genome.
Affiliation(s)
- Yasuhiro Kikuchi
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
- Serina Tokita
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
  - Sapporo Dohto Hospital, Sapporo, Japan
- Tomomi Hirama
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
  - Sapporo Dohto Hospital, Sapporo, Japan
- Vitaly Kochin
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
  - Department of Immunology, Nagoya University, Nagoya, Japan
- Munehide Nakatsugawa
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
  - Department of Pathology, Tokyo Medical University Hachioji Medical Center, Hachioji, Tokyo, Japan
- Tomoyo Shinkawa
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
- Ichiro Takemasa
  - Department of Surgery, Surgical Oncology and Science, Sapporo Medical University, Sapporo, Japan
- Noriyuki Sato
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
  - Sapporo Dohto Hospital, Sapporo, Japan
- Takayuki Kanaseki
  - Department of Pathology, Sapporo Medical University, Sapporo, Japan
15
Tedersoo L, Albertsen M, Anslan S, Callahan B. Perspectives and Benefits of High-Throughput Long-Read Sequencing in Microbial Ecology. Appl Environ Microbiol 2021; 87:e0062621. [PMID: 34132589 PMCID: PMC8357291 DOI: 10.1128/aem.00626-21] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 12/11/2022] Open
Abstract
Short-read, high-throughput sequencing (HTS) methods have yielded numerous important insights into microbial ecology and function. Yet, in many instances short-read HTS techniques are suboptimal, for example, by providing insufficient phylogenetic resolution or low integrity of assembled genomes. Single-molecule and synthetic long-read (SLR) HTS methods have successfully ameliorated these limitations. In addition, nanopore sequencing has generated a number of unique analysis opportunities, such as rapid molecular diagnostics and direct RNA sequencing, and both Pacific Biosciences (PacBio) and nanopore sequencing support detection of epigenetic modifications. Although initially suffering from relatively low sequence quality, recent advances have greatly improved the accuracy of long-read sequencing technologies. In spite of great technological progress in recent years, the long-read HTS methods (PacBio and nanopore sequencing) are still relatively costly, require large amounts of high-quality starting material, and commonly need specific solutions in various analysis steps. Despite these challenges, long-read sequencing technologies offer high-quality, cutting-edge alternatives for testing hypotheses about microbiome structure and functioning as well as assembly of eukaryote genomes from complex environmental DNA samples.
Affiliation(s)
- Leho Tedersoo
  - Mycology and Microbiology Center, University of Tartu, Tartu, Estonia
- Mads Albertsen
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Sten Anslan
  - Mycology and Microbiology Center, University of Tartu, Tartu, Estonia
  - Braunschweig University of Technology, Zoological Institute, Braunschweig, Germany
- Benjamin Callahan
  - Department of Population Health and Pathobiology, College of Veterinary Medicine and Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
16
De Paoli-Iseppi R, Gleeson J, Clark MB. Isoform Age - Splice Isoform Profiling Using Long-Read Technologies. Front Mol Biosci 2021; 8:711733. [PMID: 34409069 PMCID: PMC8364947 DOI: 10.3389/fmolb.2021.711733] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/19/2021] [Accepted: 07/19/2021] [Indexed: 01/12/2023] Open
Abstract
Alternative splicing (AS) of RNA is a key mechanism that results in the expression of multiple transcript isoforms from single genes and leads to an increase in the complexity of both the transcriptome and proteome. Regulation of AS is critical for the correct functioning of many biological pathways, while disruption of AS can be directly pathogenic in diseases such as cancer or cause risk for complex disorders. Current short-read sequencing technologies achieve high read depth but are limited in their ability to resolve complex isoforms. In this review we examine how long-read sequencing (LRS) technologies can address this challenge by covering the entire RNA sequence in a single read and thereby distinguish isoform changes that could impact RNA regulation or protein function. Coupling LRS with technologies such as single cell sequencing, targeted sequencing and spatial transcriptomics is producing a rapidly expanding suite of technological approaches to profile alternative splicing at the isoform level with unprecedented detail. In addition, integrating LRS with genotype now allows the impact of genetic variation on isoform expression to be determined. Recent results demonstrate the potential of these techniques to elucidate the landscape of splicing, including in tissues such as the brain where AS is particularly prevalent. Finally, we also discuss how AS can impact protein function, potentially leading to novel therapeutic targets for a range of diseases.
Affiliation(s)
- Michael B. Clark
  - Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, Australia
17
Chung M, Bruno VM, Rasko DA, Cuomo CA, Muñoz JF, Livny J, Shetty AC, Mahurkar A, Dunning Hotopp JC. Best practices on the differential expression analysis of multi-species RNA-seq. Genome Biol 2021; 22:121. [PMID: 33926528 PMCID: PMC8082843 DOI: 10.1186/s13059-021-02337-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 10/21/2020] [Accepted: 04/01/2021] [Indexed: 02/07/2023] Open
Abstract
Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.
Affiliation(s)
- Matthew Chung
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Vincent M Bruno
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- David A Rasko
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Christina A Cuomo
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA 02142, USA
- José F Muñoz
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA 02142, USA
- Jonathan Livny
  - Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA 02142, USA
- Amol C Shetty
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Anup Mahurkar
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Julie C Dunning Hotopp
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
  - Greenebaum Cancer Center, University of Maryland, Baltimore, MD 21201, USA
18
Tung KF, Pan CY, Chen CH, Lin WC. Top-ranked expressed gene transcripts of human protein-coding genes investigated with GTEx dataset. Sci Rep 2020; 10:16245. [PMID: 33004865 PMCID: PMC7530651 DOI: 10.1038/s41598-020-73081-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/17/2020] [Accepted: 09/07/2020] [Indexed: 12/13/2022] Open
Abstract
With considerable accumulation of RNA-Seq transcriptome data, we have extended our understanding of protein-coding gene transcript compositions. However, alternatively compounded patterns of human protein-coding gene transcripts would complicate gene expression data processing and interpretation. It is essential to exhaustively interrogate complex mRNA isoforms of protein-coding genes with a unified data resource. In order to investigate representative mRNA transcript isoforms to be utilized as transcriptome analysis references, we utilized GTEx data to establish a top-ranked transcript isoform expression data resource for human protein-coding genes. Distinctive tissue-specific expression profiles and modulations could be observed for individual top-ranked transcripts of protein-coding genes. Protein-coding transcripts or genes do occupy a much higher expression fraction in transcriptome data. In addition, top-ranked transcripts are the dominantly expressed ones in various normal tissues. Intriguingly, some of the top-ranked transcripts are noncoding splicing isoforms, which implies diverse gene regulation mechanisms. Comprehensive investigation of the tissue expression patterns of top-ranked transcript isoforms is crucial. Thus, we established a web tool to examine top-ranked transcript isoforms in various human normal tissue types, which provides concise transcript information and easy-to-use graphical user interfaces. Investigation of top-ranked transcript isoforms would contribute to understanding of the functional significance of distinctive alternatively spliced transcript isoforms.
Affiliation(s)
- Kuo-Feng Tung
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan, ROC
- Chao-Yu Pan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan, ROC
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan, ROC
- Chao-Hsin Chen
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan, ROC
- Wen-Chang Lin
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan, ROC
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan, ROC