1
Korchak J, Jeffery ED, Bandyopadhyay S, Jordan BT, Lehe MD, Watts EF, Fenix A, Wilhelm M, Sheynkman GM. IS-PRM-Based Peptide Targeting Informed by Long-Read Sequencing for Alternative Proteome Detection. Journal of the American Society for Mass Spectrometry 2024; 35:2614-2630. [PMID: 39012054 PMCID: PMC11544703 DOI: 10.1021/jasms.4c00119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 05/24/2024] [Accepted: 06/25/2024] [Indexed: 07/17/2024]
Abstract
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), has demonstrated multiplexed, sensitive detection of predefined peptides of interest. Such approaches have not yet been used to confirm the existence of novel peptides. Here, we present a targeted proteogenomic approach that leverages sample-matched long-read RNA sequencing (lrRNA-seq) data to predict potential protein isoforms with prior transcript evidence. Predicted tryptic isoform-specific peptides, which are specific to individual gene product isoforms, serve as "triggers" and "targets" in the IS-PRM method, Tomahto. Using the model human stem cell line WTC11, lrRNA-seq data were generated and used to inform the generation of synthetic standards for 192 isoform-specific peptides (114 isoforms from 55 genes). These synthetic "trigger" peptides were labeled with super heavy tandem mass tags (TMT) and spiked into TMT-labeled WTC11 tryptic digest, predicted to contain corresponding endogenous "target" peptides. Compared to DDA mode, Tomahto increased detectability of isoforms by 3.6-fold, resulting in the identification of five previously unannotated isoforms. Our method detected protein isoform expression for 43 out of 55 genes corresponding to 54 resolved isoforms. This lrRNA-seq-informed Tomahto targeted approach is a new modality for generating protein-level evidence of alternative isoforms, a critical first step in designing functional studies and eventually clinical assays.
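The notion of a tryptic isoform-specific peptide in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, the minimum peptide length of 6, and the simplified cleavage rule (cut after K or R unless followed by P, no missed cleavages) are all assumptions made for illustration, and uniqueness is checked only among the isoforms supplied rather than against a full proteome.

```python
from collections import defaultdict


def tryptic_digest(protein, min_len=6):
    """Simplified in-silico trypsin digest (assumed rule): cleave after K or R
    unless the next residue is P; no missed cleavages are considered."""
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            pep = protein[start:i + 1]
            if len(pep) >= min_len:  # drop peptides too short to be useful
                peptides.add(pep)
            start = i + 1
    return peptides


def isoform_specific_peptides(isoforms):
    """Map each isoform name to the peptides found in that isoform only."""
    owners = defaultdict(set)
    for name, seq in isoforms.items():
        for pep in tryptic_digest(seq):
            owners[pep].add(name)
    specific = defaultdict(set)
    for pep, names in owners.items():
        if len(names) == 1:  # peptide maps to exactly one isoform
            specific[next(iter(names))].add(pep)
    return dict(specific)


# Hypothetical toy isoforms of one gene that differ only in their C-terminus.
toy_isoforms = {
    "iso1": "MAAGTLLKSEQWERTYPLKDDFGHILMNR",
    "iso2": "MAAGTLLKSEQWERTYPLKAACCVVWWYR",
}
specific = isoform_specific_peptides(toy_isoforms)
for name in sorted(specific):
    print(name, sorted(specific[name]))
```

In this toy case the shared N-terminal peptides are discarded and only the peptide covering each alternative C-terminus remains as a candidate target.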
Affiliation(s)
- Jennifer A. Korchak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Erin D. Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Saikat Bandyopadhyay
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia 22903, United States
- Ben T. Jordan
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
- Micah D. Lehe
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Emily F. Watts
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Aidan Fenix
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany
- Gloria M. Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia 22903, United States
  - UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, Virginia 22903, United States
2
Pandi B, Brenman S, Black A, Ng DCM, Lau E, Lam MPY. Tissue Usage Preference and Intrinsically Disordered Region Remodeling of Alternative Splicing Derived Proteoforms in the Heart. J Proteome Res 2024; 23:3161-3173. [PMID: 38456420 PMCID: PMC11296937 DOI: 10.1021/acs.jproteome.3c00789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 02/08/2024] [Accepted: 02/27/2024] [Indexed: 03/09/2024]
Abstract
A computational analysis of mass spectrometry data was performed to uncover alternative splicing derived protein variants across chambers of the human heart. Evidence for 216 non-canonical isoforms was apparent in the atrium and the ventricle, including 52 isoforms not documented on SwissProt and recovered using an RNA sequencing derived database. Among non-canonical isoforms, 29 show signs of regulation based on statistically significant preferences in tissue usage, including a ventricular enriched protein isoform of tensin-1 (TNS1) and an atrium-enriched PDZ and LIM Domain 3 (PDLIM3) isoform 2 (PDLIM3-2/ALP-H). Examined variant regions that differ between alternative and canonical isoforms are highly enriched with intrinsically disordered regions. Moreover, over two-thirds of such regions are predicted to function in protein binding and RNA binding. The analysis here lends further credence to the notion that alternative splicing diversifies the proteome by rewiring intrinsically disordered regions, which are increasingly recognized to play important roles in the generation of biological function from protein sequences.
Affiliation(s)
- Boomathi Pandi
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
- Stella Brenman
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
- Alexander Black
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
- Dominic C. M. Ng
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
- Edward Lau
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
- Maggie P. Y. Lam
  - Department of Medicine/Division of Cardiology, Department of Biochemistry & Molecular Genetics, and Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, Colorado 80045, United States
3
Gallo CM, Kistler SA, Natrakul A, Labadorf AT, Beffert U, Ho A. APOER2 splicing repertoire in Alzheimer's disease: Insights from long-read RNA sequencing. PLoS Genet 2024; 20:e1011348. [PMID: 39038048 PMCID: PMC11293713 DOI: 10.1371/journal.pgen.1011348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 08/01/2024] [Accepted: 06/21/2024] [Indexed: 07/24/2024] [Open Access]
Abstract
Disrupted alternative splicing plays a determinative role in neurological diseases, either as a direct cause or as a driver in disease susceptibility. Transcriptomic profiling of aged human postmortem brain samples has uncovered hundreds of aberrant mRNA splicing events in Alzheimer's disease (AD) brains, associating dysregulated RNA splicing with disease. We previously identified a complex array of alternative splicing combinations across apolipoprotein E receptor 2 (APOER2), a transmembrane receptor that interacts with both the neuroprotective ligand Reelin and the AD-associated risk factor, APOE. Many of the human APOER2 isoforms, predominantly featuring cassette splicing events within functionally important domains, are critical for the receptor's function and ligand interaction. However, a comprehensive repertoire and the functional implications of APOER2 isoforms under both physiological and AD conditions are not fully understood. Here, we present an in-depth analysis of the splicing landscape of human APOER2 isoforms in normal and AD states. Using single-molecule, long-read sequencing, we profiled the entire APOER2 transcript from the parietal cortex and hippocampus of Braak stage IV AD brain tissues along with age-matched controls and investigated several functional properties of APOER2 isoforms. Our findings reveal diverse patterns of cassette exon skipping for APOER2 isoforms, with some showing region-specific expression and others unique to AD-affected brains. Notably, exon 15 of APOER2, which encodes the glycosylation domain, showed less inclusion in AD compared to control in the parietal cortex of females with an APOE ɛ3/ɛ3 genotype. Also, some of these APOER2 isoforms demonstrated changes in cell surface expression, APOE-mediated receptor processing, and synaptic number. These variations are likely critical in inducing synaptic alterations and may contribute to the neuronal dysfunction underlying AD pathogenesis.
Affiliation(s)
- Christina M. Gallo
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
  - Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts, United States of America
- Sabrina A. Kistler
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
  - Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts, United States of America
- Anna Natrakul
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
- Adam T. Labadorf
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts, United States of America
- Uwe Beffert
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
- Angela Ho
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
  - Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts, United States of America
4
Korchak JA, Jeffery ED, Bandyopadhyay S, Jordan BT, Lehe M, Watts EF, Fenix A, Wilhelm M, Sheynkman GM. IS-PRM-based peptide targeting informed by long-read sequencing for alternative proteome detection. bioRxiv 2024:2024.04.01.587549. [PMID: 38617311 PMCID: PMC11014528 DOI: 10.1101/2024.04.01.587549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), has demonstrated multiplexed, sensitive detection of pre-defined peptides of interest. Such approaches have not yet been used to confirm the existence of novel peptides. Here, we present a targeted proteogenomic approach that leverages sample-matched long-read RNA sequencing (LR RNAseq) data to predict potential protein isoforms with prior transcript evidence. Predicted tryptic isoform-specific peptides, which are specific to individual gene product isoforms, serve as "triggers" and "targets" in the IS-PRM method, Tomahto. Using the model human stem cell line WTC11, LR RNAseq data were generated and used to inform the generation of synthetic standards for 192 isoform-specific peptides (114 isoforms from 55 genes). These synthetic "trigger" peptides were labeled with super heavy tandem mass tags (TMT) and spiked into TMT-labeled WTC11 tryptic digest, predicted to contain corresponding endogenous "target" peptides. Compared to DDA mode, Tomahto increased detectability of isoforms by 3.6-fold, resulting in the identification of five previously unannotated isoforms. Our method detected protein isoform expression for 43 out of 55 genes corresponding to 54 resolved isoforms. This LR RNAseq-informed Tomahto targeted approach, called LRP-IS-PRM, is a new modality for generating protein-level evidence of alternative isoforms, a critical first step in designing functional studies and eventually clinical assays.
Affiliation(s)
- Jennifer A. Korchak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Erin D. Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Saikat Bandyopadhyay
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Ben T. Jordan
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Micah Lehe
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Emily F. Watts
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Aidan Fenix
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany
- Gloria M. Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
  - UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA, USA
5
Pandi B, Brenman S, Black A, Ng DCM, Lau E, Lam MPY. Tissue Usage Preference and Intrinsically Disordered Region Remodeling of Alternative Splicing Derived Proteoforms in the Heart. bioRxiv 2023:2023.10.08.561375. [PMID: 37873130 PMCID: PMC10592692 DOI: 10.1101/2023.10.08.561375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
A computational analysis of mass spectrometry data was performed to uncover alternative splicing derived protein variants across chambers of the human heart. Evidence for 216 non-canonical isoforms was apparent in the atrium and the ventricle, including 52 isoforms not documented on SwissProt and recovered using an RNA sequencing derived database. Among non-canonical isoforms, 29 show signs of regulation based on statistically significant preferences in tissue usage, including a ventricular enriched protein isoform of tensin-1 (TNS1) and an atrium-enriched PDZ and LIM Domain 3 (PDLIM3) isoform 2 (PDLIM3-2/ALP-H). Examined variant regions that differ between alternative and canonical isoforms are highly enriched in intrinsically disordered regions, and over two-thirds of such regions are predicted to function in protein binding and/or RNA binding. The analysis here lends further credence to the notion that alternative splicing diversifies the proteome by rewiring intrinsically disordered regions, which are increasingly recognized to play important roles in the generation of biological function from protein sequences.
Affiliation(s)
- Boomathi Pandi
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Stella Brenman
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Alexander Black
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Dominic C. M. Ng
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Edward Lau
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
  - Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, CO 80045, USA
- Maggie P. Y. Lam
  - Department of Medicine/Division of Cardiology, University of Colorado School of Medicine, Aurora, CO 80045, USA
  - Department of Biochemistry & Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
  - Consortium for Fibrosis Research and Translation (CFReT), University of Colorado School of Medicine, Aurora, CO 80045, USA
6
Liberti A, Pollastro C, Pinto G, Illiano A, Marino R, Amoresano A, Spagnuolo A, Sordino P. Transcriptional and proteomic analysis of the innate immune response to microbial stimuli in a model invertebrate chordate. Front Immunol 2023; 14:1217077. [PMID: 37600818 PMCID: PMC10433773 DOI: 10.3389/fimmu.2023.1217077] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Accepted: 07/13/2023] [Indexed: 08/22/2023] [Open Access]
Abstract
Inflammatory response triggered by innate immunity can act to protect against microorganisms that behave as pathogens, with the aim of restoring the homeostatic state between host and beneficial microbes. As a filter-feeder organism, the ascidian Ciona robusta is continuously exposed to external microbes that may be harmful under some conditions. In this work, we used transcriptional and proteomic approaches to investigate the inflammatory response induced by stimuli of bacterial (lipopolysaccharide, LPS, and diacylated lipopeptide, Pam2CSK4) and fungal (zymosan) origin in Ciona juveniles at stage 4 of metamorphosis. We focused on receptors, co-interactors, transcription factors and cytokines belonging to the TLR and Dectin-1 pathways and on immune factors identified by a homology approach (i.e., immunoglobulin (Ig) or C-type lectin domain-containing molecules). While LPS did not induce a significant response in juvenile ascidians, Pam2CSK4 and zymosan exposure triggered the activation of specific inflammatory mechanisms. In particular, Pam2CSK4-induced inflammation was characterized by modulation of TLR and Dectin-1 pathway molecules, including receptors, transcription factors, and cytokines, while immune response to zymosan primarily involved C-type lectin receptors, co-interactors, Ig-containing molecules, and cytokines. A targeted proteomic analysis confirmed the transcriptional data and also highlighted a temporal delay between transcriptional induction and protein level changes. Finally, a protein-protein interaction network of Ciona immune molecules was rendered to provide a wide visualization and analysis platform of innate immunity. The in vivo inflammatory model described here reveals interconnections of innate immune pathways in specific responses to selected microbial stimuli. It also represents the starting point for studying ontogeny and regulation of inflammatory disorders in different physiological conditions.
Affiliation(s)
- Assunta Liberti
  - Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
- Carla Pollastro
  - Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
- Gabriella Pinto
  - Department of Chemical Sciences, University of Naples Federico II, Naples, Italy
  - Istituto Nazionale Biostrutture e Biosistemi-Consorzio Interuniversitario, Rome, Italy
- Anna Illiano
  - Department of Chemical Sciences, University of Naples Federico II, Naples, Italy
  - Istituto Nazionale Biostrutture e Biosistemi-Consorzio Interuniversitario, Rome, Italy
- Rita Marino
  - Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
- Angela Amoresano
  - Department of Chemical Sciences, University of Naples Federico II, Naples, Italy
  - Istituto Nazionale Biostrutture e Biosistemi-Consorzio Interuniversitario, Rome, Italy
- Antonietta Spagnuolo
  - Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
- Paolo Sordino
  - Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Sicily Marine Centre, Messina, Italy
7
Mehlferber MM, Kuyumcu-Martinez M, Miller CL, Sheynkman GM. Transcription factors and splice factors - interconnected regulators of stem cell differentiation. Current Stem Cell Reports 2023; 9:31-41. [PMID: 38939410 PMCID: PMC11210451 DOI: 10.1007/s40778-023-00227-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 06/12/2023] [Indexed: 06/29/2024]
Abstract
Purpose of review: The underlying molecular mechanisms that direct stem cell differentiation into fully functional, mature cells remain an area of ongoing investigation. Cell state is the product of the combinatorial effect of individual factors operating within a coordinated regulatory network. Here, we discuss the contribution of both gene regulatory and splicing regulatory networks in defining stem cell fate during differentiation and the critical role of protein isoforms in this process. Recent findings: We review recent experimental and computational approaches that characterize gene regulatory networks, splice regulatory networks, and the resulting transcriptome and proteome they mediate during differentiation. Such approaches include long-read RNA sequencing, which has demonstrated high-resolution profiling of mRNA isoforms, and Cas13-based CRISPR, which could make possible high-throughput isoform screening. Collectively, these developments enable systems-level profiling of factors contributing to cell state. Summary: Overall, gene and splice regulatory networks are important in defining cell state. The emerging high-throughput systems-level approaches will characterize the gene regulatory network components necessary in driving stem cell differentiation.
Affiliation(s)
- Madison M Mehlferber
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22903
- Muge Kuyumcu-Martinez
  - Department of Molecular Physiology and Biological Physics, University of Virginia, School of Medicine, Fontaine Medical Office Building 1, 415 Ray C. Hunt Dr, Charlottesville, VA 22903
- Clint L Miller
  - Department of Public Health Sciences, Department of Biochemistry and Molecular Genetics, and Department of Biomedical Engineering, University of Virginia, Multistory Building, West Complex, 1335 Lee St, Charlottesville, VA 22908, PO Box 800717, Charlottesville, Virginia 22908
- Gloria M Sheynkman
  - Department of Molecular Physiology and Biological Physics, Center for Public Health Genomics, UVA Comprehensive Cancer Center, Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22903
8
Manuel JM, Guilloy N, Khatir I, Roucou X, Laurent B. Re-evaluating the impact of alternative RNA splicing on proteomic diversity. Front Genet 2023; 14:1089053. [PMID: 36845399 PMCID: PMC9947481 DOI: 10.3389/fgene.2023.1089053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] [Open Access]
Abstract
Alternative splicing (AS) constitutes a mechanism by which protein-coding genes and long non-coding RNA (lncRNA) genes produce more than a single mature transcript. From plants to humans, AS is a powerful process that increases transcriptome complexity. Importantly, splice variants produced by AS can potentially encode distinct protein isoforms, which can lose or gain specific domains and, hence, differ in their functional properties. Advances in proteomics have shown that the proteome is indeed diverse due to the presence of numerous protein isoforms. Over the past decades, with the help of advanced high-throughput technologies, numerous alternatively spliced transcripts have been identified. However, the low detection rate of protein isoforms in proteomic studies has raised questions about whether AS contributes to proteomic diversity and how many AS events are really functional. We propose here to assess and discuss the impact of AS on proteomic complexity in the light of technological progress, updated genome annotation, and current scientific knowledge.
Affiliation(s)
- Jeru Manoj Manuel
  - Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
  - Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Noé Guilloy
  - Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Inès Khatir
  - Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
  - Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Xavier Roucou
  - Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada
  - Quebec Network for Research on Protein Function Structure and Engineering, PROTEO, Québec, QC, Canada
- Benoit Laurent
  - Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
  - Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
  - Correspondence: Benoit Laurent
9
Reixachs‐Solé M, Eyras E. Uncovering the impacts of alternative splicing on the proteome with current omics techniques. Wiley Interdisciplinary Reviews. RNA 2022; 13:e1707. [PMID: 34979593 PMCID: PMC9542554 DOI: 10.1002/wrna.1707] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 09/26/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/15/2022]
Abstract
The high-throughput sequencing of cellular RNAs has underscored a broad effect of isoform diversification through alternative splicing on the transcriptome. Moreover, the differential production of transcript isoforms from gene loci has been recognized as a critical mechanism in cell differentiation, organismal development, and disease. Yet, the extent of the impact of alternative splicing on protein production and cellular function remains a matter of debate. Multiple experimental and computational approaches have been developed in recent years to address this question. These studies have unveiled how molecular changes at different steps in the RNA processing pathway can lead to differences in protein production and have functional effects. New and emerging experimental technologies open exciting new opportunities to develop new methods to fully establish the connection between messenger RNA expression and protein production and to further investigate how RNA variation impacts the proteome and cell function. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing; Translation > Regulation; RNA Evolution and Genomics > Computational Analyses of RNA.
Affiliation(s)
- Marina Reixachs‐Solé
  - The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
  - EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
- Eduardo Eyras
  - The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
  - EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
  - Catalan Institution for Research and Advanced Studies, Barcelona, Spain
  - Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
10
Tay AP, Hamey JJ, Martyn GE, Wilson LOW, Wilkins MR. Identification of Protein Isoforms Using Reference Databases Built from Long and Short Read RNA-Sequencing. J Proteome Res 2022; 21:1628-1639. [PMID: 35612954 DOI: 10.1021/acs.jproteome.1c00968] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Alternative splicing can lead to distinct protein isoforms. These can have different functions in specific cells and tissues or in different developmental stages. In this study, we explored whether transcripts assembled from long read, nanopore-based, direct RNA-sequencing (RNA-seq) could improve the identification of protein isoforms in human K562 cells. By comparing with Illumina-based short read RNA-seq, we showed that a large proportion of Ensembl transcripts (5949/14,326) and genes expressing alternatively spliced transcripts (486/2981) identified with long direct reads were missed by short paired-end reads. By co-analyzing proteomic and transcriptomic data, we also showed that some peptides (826/35,976), proteins (262/3215), and protein isoforms arising from distinct transcript variants (574/1212) identified with isoform-specific peptides via custom long-read-based databases were missed in Illumina-derived databases. Finally, we generated unequivocal peptide evidence for a set of protein isoforms and showed that long read, direct RNA-seq allows the discovery of novel protein isoforms not already in reference databases or custom databases built from short read RNA-seq data. Our analysis highlights the benefits of long read RNA-seq data in the generation of reference databases to increase tandem mass spectrometry (MS/MS) identification of protein isoforms.
Affiliation(s)
- Aidan P Tay
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia; Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia; Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Joshua J Hamey
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Gabriella E Martyn
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Laurence O W Wilson
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia; Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Marc R Wilkins
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
| |
|
11
|
Miller RM, Jordan BT, Mehlferber MM, Jeffery ED, Chatzipantsiou C, Kaur S, Millikin RJ, Dai Y, Tiberi S, Castaldi PJ, Shortreed MR, Luckey CJ, Conesa A, Smith LM, Deslattes Mays A, Sheynkman GM. Enhanced protein isoform characterization through long-read proteogenomics. Genome Biol 2022; 23:69. [PMID: 35241129 PMCID: PMC8892804 DOI: 10.1186/s13059-022-02624-y] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 07/13/2021] [Accepted: 02/02/2022] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms. RESULTS We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. CONCLUSIONS Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
Affiliation(s)
- Rachel M Miller
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Ben T Jordan
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | - Madison M Mehlferber
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
| | - Erin D Jeffery
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | | | - Simi Kaur
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Robert J Millikin
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Yunxiang Dai
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Simone Tiberi
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of General Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Chance John Luckey
- Department of Pathology, University of Virginia, Charlottesville, VA, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
- Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
| | - Lloyd M Smith
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Anne Deslattes Mays
- Office of Data Science and Sharing, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD, USA
| | - Gloria M Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA.
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA.
- UVA Cancer Center, University of Virginia, Charlottesville, VA, USA.
| |
|
12
|
Mehlferber MM, Jeffery ED, Saquing J, Jordan BT, Sheynkman L, Murali M, Genet G, Acharya BR, Hirschi KK, Sheynkman GM. Characterization of protein isoform diversity in human umbilical vein endothelial cells via long-read proteogenomics. RNA Biol 2022; 19:1228-1243. [PMID: 36457147 PMCID: PMC9721438 DOI: 10.1080/15476286.2022.2141938] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/04/2022] [Accepted: 10/26/2022] [Indexed: 12/04/2022] Open
Abstract
Endothelial cells (ECs) comprise the lumenal lining of all blood vessels and are critical for the functioning of the cardiovascular system. Their phenotypes can be modulated by alternative splicing of RNA to produce distinct protein isoforms. To characterize the RNA and protein isoform landscape within ECs, we applied a long read proteogenomics approach to analyse human umbilical vein endothelial cells (HUVECs). Transcripts delineated from PacBio sequencing serve as the basis for a sample-specific protein database used for downstream mass-spectrometry (MS) analysis to infer protein isoform expression. We detected 53,863 transcript isoforms from 10,426 genes, with 22,195 of those transcripts being novel. Furthermore, the predominant isoform in HUVECs does not correspond with the accepted "reference isoform" 25% of the time, with vascular pathway-related genes among this group. We found 2,597 protein isoforms supported through unique peptides, with an additional 2,280 isoforms nominated upon incorporation of long-read transcript evidence. We characterized a novel alternative acceptor for endothelial-related gene CDH5, suggesting potential changes in its associated signalling pathways. Finally, we identified novel protein isoforms arising from a diversity of RNA splicing mechanisms supported by uniquely mapped novel peptides. Our results represent a high-resolution atlas of known and novel isoforms of potential relevance to endothelial phenotypes and function.
Affiliation(s)
- Madison M. Mehlferber
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Erin D. Jeffery
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Jamie Saquing
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Ben T. Jordan
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Leon Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Mayank Murali
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
| | - Gael Genet
- Department of Cell Biology, University of Virginia School of Medicine, Charlottesville, VA, USA
| | - Bipul R. Acharya
- Department of Cell Biology, University of Virginia School of Medicine, Charlottesville, VA, USA
- Cardiovascular Research Center, University of Virginia, Charlottesville, VA, USA
- Wellcome Centre for Cell-Matrix Research, Faculty of Biology, Medicine and Health, the University of Manchester, UK
| | - Karen K. Hirschi
- Department of Cell Biology, University of Virginia School of Medicine, Charlottesville, VA, USA
- Cardiovascular Research Center, University of Virginia, Charlottesville, VA, USA
| | - Gloria M. Sheynkman
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, Virginia, USA
| |
|
13
|
Guerra-Almeida D, Tschoeke DA, da-Fonseca RN. Understanding small ORF diversity through a comprehensive transcription feature classification. DNA Res 2021; 28:6317669. [PMID: 34240112 PMCID: PMC8435553 DOI: 10.1093/dnares/dsab007] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 12/22/2020] [Indexed: 11/13/2022] Open
Abstract
Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences smaller than 100 codons that have historically been considered junk DNA by gene prediction software and in annotation screening; however, the advent of next-generation sequencing has contributed to the deeper investigation of junk DNA regions and their transcription products, resulting in the emergence of smORFs as a new focus of interest in systems biology. Several smORF peptides were recently reported in noncanonical mRNAs as new players in numerous biological contexts; however, their relevance is still overlooked in coding potential analysis. Hence, this review proposes a smORF classification based on transcriptional features, discussing the most promising approaches to investigate smORFs based on their different characteristics. First, smORFs were divided into nonexpressed (intergenic) and expressed (genic) smORFs. Second, genic smORFs were classified as smORFs located in noncoding RNAs (ncRNAs) or canonical mRNAs. Finally, smORFs in ncRNAs were further subdivided into sequences located in small or long RNAs, whereas smORFs located in canonical mRNAs were subdivided into several specific classes depending on their localization along the gene. We hope that this review provides new insights into large-scale annotations and reinforces the role of smORFs as essential components of a hidden coding DNA world.
Affiliation(s)
- Diego Guerra-Almeida
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Diogo Antonio Tschoeke
- Alberto Luiz Coimbra Institute of Graduate Studies and Engineering Research (COPPE), Biomedical Engineering Program, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Rodrigo Nunes-da-Fonseca
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; National Institute of Science and Technology in Molecular Entomology, Rio de Janeiro, Brazil
| |
|
14
|
Wang W, Chen Y, Zhao J, Chen L, Song W, Li L, Lin GN. Alternatively Splicing Interactomes Identify Novel Isoform-Specific Partners for NSD2. Front Cell Dev Biol 2021; 9:612019. [PMID: 33718354 PMCID: PMC7947288 DOI: 10.3389/fcell.2021.612019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2020] [Accepted: 02/05/2021] [Indexed: 11/13/2022] Open
Abstract
Nuclear receptor SET domain protein (NSD2) plays a fundamental role in the pathogenesis of Wolf-Hirschhorn Syndrome (WHS) and is overexpressed in multiple human myelomas, but its protein-protein interaction (PPI) patterns, particularly at the isoform/exon levels, are poorly understood. We explored the subcellular localizations of four representative NSD2 transcripts with immunofluorescence microscopy. Next, we used label-free quantification to perform immunoprecipitation mass spectrometry (IP-MS) analyses of the transcripts. Using the interaction partners for each transcript detected in the IP-MS results, we identified 890 isoform-specific PPI partners (83% are novel). These PPI networks were further divided into four categories of the exon-specific interactome. In these exon-specific PPI partners, two genes, RPL10 and HSPA8, were successfully confirmed by co-immunoprecipitation and Western blotting. RPL10 primarily interacted with Isoforms 1, 3, and 5, and HSPA8 interacted with all four isoforms, respectively. Using our extended NSD2 protein interactions, we constructed an isoform-level PPI landscape for NSD2 to serve as reference interactome data for NSD2 spliceosome-level studies. Furthermore, the RNA splicing processes supported by these isoform partners shed light on the diverse roles NSD2 plays in WHS and myeloma development. We also validated the interactions using Western blotting, RPL10, and the three NSD2 (Isoform 1, 3, and 5). Our results expand gene-level NSD2 PPI networks and provide a basis for the treatment of NSD2-related developmental diseases.
Affiliation(s)
- Weidi Wang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| | - Yucan Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Jingjing Zhao
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Liang Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Weichen Song
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Li Li
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Guan Ning Lin
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
| |
|
15
|
Montero-Calle A, Barderas R. Analysis of Protein-Protein Interactions by Protein Microarrays. Methods Mol Biol 2021; 2344:81-97. [PMID: 34115353 DOI: 10.1007/978-1-0716-1562-1_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2023]
Abstract
The analysis of the proteome and the interactome would be useful for a better understanding of the pathophysiology of several disorders, allowing the identification of potential specific markers for early diagnosis and prognosis, as well as potential targets of intervention. Among different proteomic approaches, high-density protein microarrays have become an interesting tool for the screening of protein-protein interactions and the interactome definition of disease-associated dysregulated proteins. This information might contribute to the identification of altered signaling pathways and protein functions involved in the pathogenesis of a disease. Remarkably, protein microarrays have been already satisfactorily employed for the study of protein-protein interactions in cancer, allergy, or neurodegenerative diseases. Here, we describe the utilization of recombinant protein microarrays for the identification of protein-protein interactions to help in the definition of disease-specific dysregulated interactomes.
Affiliation(s)
- Ana Montero-Calle
- Chronic Disease Programme, UFIEC, Instituto de Salud Carlos III, Madrid, Spain
| | - Rodrigo Barderas
- Chronic Disease Programme, UFIEC, Instituto de Salud Carlos III, Madrid, Spain.
| |
|
16
|
Jensen P, Patel B, Smith S, Sabnis R, Kaboord B. Improved Immunoprecipitation to Mass Spectrometry Method for the Enrichment of Low-Abundant Protein Targets. Methods Mol Biol 2021; 2261:229-246. [PMID: 33420993 DOI: 10.1007/978-1-0716-1186-9_14] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
Immunoprecipitation (IP) is commonly used upstream of mass spectrometry (MS) as an enrichment tool for low-abundant protein targets. However, several aspects of the classical IP procedure such as nonspecific protein binding to the isolation matrix, detergents or high salt concentrations in wash and elution buffers, and antibody chain contamination in elution fractions render it incompatible with downstream mass spectrometry analysis. Here, we discuss an improved IP-MS workflow that is designed to minimize sample prep time and these contaminants. The method employs biotinylated antibodies to the targets of interest and streptavidin magnetic beads that exhibit low background binding. In addition, alterations in the elution protocol and subsequent MS sample prep were made to reduce time and antibody leaching in the eluent, minimizing potential ion suppression effects and thereby maximizing detection of multiple target antigens and interacting proteins.
Affiliation(s)
| | | | | | - Renuka Sabnis
- Nisarga Biotech Pvt. Ltd., Satara, Maharashtra, India
| | | |
|
17
|
Zhao L, Ge C, Zhang Z, Hu H, Zhang Y, Zhao W, Li R, Zeng B, Song X, Li G. FAM136A immunoreactivity is associated with nodal involvement and survival in lung adenocarcinoma in a Chinese case series. Bioengineered 2020; 11:261-271. [PMID: 32098576 PMCID: PMC7051133 DOI: 10.1080/21655979.2020.1735611] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/24/2019] [Revised: 02/19/2020] [Accepted: 02/20/2020] [Indexed: 12/26/2022] Open
Abstract
Lung cancer patients with lymph node metastasis usually had short overall survival and developed distant metastases at an early stage. However, some of these people did have more prolonged survival. The underlying reason is still unclear. In this study, we found a novel molecule, family with sequence similarity 136, member A gene (FAM136A). First, we performed immunohistochemistry for FAM136A in 177 lung carcinoma tissues. Second, we carried out in vitro studies by using A549 and PC-9. We detected FAM136A immunoreactivity in 79 out of 177 (44.6%) lung carcinoma tissues, and the FAM136A status was significantly associated with tumor T stage, lymph node metastasis, and the Tumor-Node-Metastasis (TNM) staging system in these cases. Importantly, it was significantly associated with the overall survival of the patients with lymph node metastasis, especially FAM136A positive patients, who had worse outcomes. Subsequent in vitro experiments revealed that the proliferation activity and migration property decreased in both A549 and PC-9 lung carcinoma cells transfected with siRNA-FAM136A, and apoptosis was reduced. Meanwhile, the expression of CDK4 and CDK6 decreased. FAM136A status would be a potent, worse prognostic factor in lung cancer patients with lymph node metastasis. It would play a vital role in the proliferation, apoptosis, and migration properties of A549 and PC-9. In the future, we will focus on the uncovered signal mechanism between FAM136A and lung cancer.
Affiliation(s)
- Liufang Zhao
- First Department of Head and Neck Surgery, The Third Affiliated Hospital of Kunming Medical University, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Chunlei Ge
- Department of Cancer Biotherapy Center, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Zhiwei Zhang
- Department of Cancer Biotherapy Center, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Hongyan Hu
- Department of Pathology and Histotechnology, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Yi Zhang
- Department of Gynecology, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Wentao Zhao
- Department of Medical Oncology, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Ruilei Li
- Department of Cancer Biotherapy Center, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Baozhen Zeng
- Department of Pathology and Histotechnology, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Xin Song
- Department of Pathology and Histotechnology, Tumor Hospital of Yunnan Province, Kunming, Yunnan, 650118, P.R. China
| | - Gaofeng Li
- Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Kunming, Yunnan, P.R. China
| |
|
18
|
Asselin-Mullen P, Chauvin A, Dubois ML, Drissi R, Lévesque D, Boisvert FM. Protein interaction network of alternatively spliced NudCD1 isoforms. Sci Rep 2017; 7:12987. [PMID: 29021621 PMCID: PMC5636827 DOI: 10.1038/s41598-017-13441-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/10/2017] [Accepted: 09/25/2017] [Indexed: 12/16/2022] Open
Abstract
NudCD1, also known as CML66 or OVA66, is a protein initially identified as overexpressed in patients with chronic myelogenous leukemia. The mRNA of NudCD1 is expressed in heart and testis of normal tissues, and is overexpressed in several cancers. Previous studies have shown that the expression level of the protein correlates with tumoral phenotype, possibly interacting upstream of the Insulin Growth Factor - 1 Receptor (IGF-1R). The gene encoding the NudCD1 protein consists of 12 exons that can be alternatively spliced, leading to the expression of three different isoforms. These isoforms possess a common region of 492 amino acids in their C-terminus region and have an isoform-specific N-terminus. To determine the distinct function of each isoform, we have localised the isoforms within the cells using immunofluorescence microscopy and used a quantitative proteomics approach (SILAC) to identify specific protein interaction partners for each isoform. Localization studies showed a different subcellular distribution for the different isoforms, with the first isoform being nuclear, while the other two isoforms have distinct cytoplasmic and nuclear location. We found that the different NudCD1 isoforms have unique interacting partners, with the first isoform binding to a putative RNA helicase named DHX15 involved in mRNA splicing.
Affiliation(s)
- Patrick Asselin-Mullen
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada
| | - Anaïs Chauvin
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada
| | - Marie-Line Dubois
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada
| | - Romain Drissi
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada
| | - Dominique Lévesque
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada
| | - François-Michel Boisvert
- Department of Anatomy and Cell Biology, Université de Sherbrooke, 3201 Jean-Mignault, Sherbrooke, Québec, J1E 4K8, Canada.
| |
|
19
|
Liu Y, Gonzàlez-Porta M, Santos S, Brazma A, Marioni JC, Aebersold R, Venkitaraman AR, Wickramasinghe VO. Impact of Alternative Splicing on the Human Proteome. Cell Rep 2017; 20:1229-1241. [PMID: 28768205 PMCID: PMC5554779 DOI: 10.1016/j.celrep.2017.07.025] [Citation(s) in RCA: 124] [Impact Index Per Article: 15.5] [Received: 04/21/2017] [Revised: 06/02/2017] [Accepted: 07/12/2017] [Indexed: 02/02/2023] Open
Abstract
Alternative splicing is a critical determinant of genome complexity and, by implication, is assumed to engender proteomic diversity. This notion has not been experimentally tested in a targeted, quantitative manner. Here, we have developed an integrative approach to ask whether perturbations in mRNA splicing patterns alter the composition of the proteome. We integrate RNA sequencing (RNA-seq) (to comprehensively report intron retention, differential transcript usage, and gene expression) with a data-independent acquisition (DIA) method, SWATH-MS (sequential window acquisition of all theoretical spectra-mass spectrometry), to capture an unbiased, quantitative snapshot of the impact of constitutive and alternative splicing events on the proteome. Whereas intron retention is accompanied by decreased protein abundance, alterations in differential transcript usage and gene expression alter protein abundance proportionate to transcript levels. Our findings illustrate how RNA splicing links isoform expression in the human transcriptome with proteomic diversity and provides a foundation for studying perturbations associated with human diseases.
Affiliation(s)
- Yansheng Liu
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Mar Gonzàlez-Porta
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Sergio Santos
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Alvis Brazma
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - John C Marioni
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
| | - Ashok R Venkitaraman
- The Medical Research Council Cancer Unit, University of Cambridge, Cambridge CB2 0XZ, UK.
| | - Vihandha O Wickramasinghe
- The Medical Research Council Cancer Unit, University of Cambridge, Cambridge CB2 0XZ, UK; RNA Biology and Cancer Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC 3000, Australia.
| |
|
20
|
Sheynkman GM, Shortreed MR, Cesnik AJ, Smith LM. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation. ANNUAL REVIEW OF ANALYTICAL CHEMISTRY (PALO ALTO, CALIF.) 2016; 9:521-45. [PMID: 27049631 PMCID: PMC4991544 DOI: 10.1146/annurev-anchem-071015-041722] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Indexed: 05/09/2023]
Abstract
Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
Affiliation(s)
- Gloria M Sheynkman
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Michael R Shortreed
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Anthony J Cesnik
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Lloyd M Smith
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
- Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin 53706
| |
|
21
|
Trevisiol S, Ayoub D, Lesur A, Ancheva L, Gallien S, Domon B. The use of proteases complementary to trypsin to probe isoforms and modifications. Proteomics 2016; 16:715-28. [DOI: 10.1002/pmic.201500379] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 09/17/2015] [Revised: 11/06/2015] [Accepted: 12/08/2015] [Indexed: 12/15/2022]
Affiliation(s)
- Stéphane Trevisiol
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| | - Daniel Ayoub
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| | - Antoine Lesur
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| | - Lina Ancheva
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| | - Sébastien Gallien
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| | - Bruno Domon
- Luxembourg Clinical Proteomics Center (LCP); Luxembourg Institute of Health; Strassen Luxembourg
| |
|
22
|
Kliuchnikova A, Kuznetsova K, Moshkovskii S. ADAR-mediated messenger RNA editing: analysis at the proteome level. Biomed Khim 2016; 62:510-519. [DOI: 10.18097/pbmc20166205510] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
Abstract
Post-transcriptional RNA editing by RNA-specific adenosine deaminases (ADAR) was discovered more than two decades ago. It provides an additional layer of regulation of the animal and human transcriptome. In most cases, it occurs in nervous tissue, where adenosine is converted to inosine at particular sites in RNA. In messenger RNA, inosine is read as guanosine during translation, leading to amino acid substitutions. Those substitutions have been shown to substantially affect the function of proteins, e.g., subunits of the glutamate receptor. Nevertheless, most studies of RNA editing, even those dealing with coding RNA, rely on analysis of nucleic acids. In this review, we propose the use of shotgun proteomics based on high-resolution liquid chromatography and mass spectrometry to investigate the effects of RNA editing at the protein level. Recently developed methods of big data processing allow the results of various omics techniques to be combined, an approach referred to as proteogenomics. The proposed proteogenomic approach will enable direct qualitative and quantitative analysis of edited protein sequences at the scale of the whole proteome.
Affiliation(s)
- S.A. Moshkovskii
  - Institute of Biomedical Chemistry, Moscow, Russia; Pirogov Russian National Research Medical University, Moscow, Russia
23
Hao Y, Colak R, Teyra J, Corbi-Verge C, Ignatchenko A, Hahne H, Wilhelm M, Kuster B, Braun P, Kaida D, Kislinger T, Kim PM. Semi-supervised Learning Predicts Approximately One Third of the Alternative Splicing Isoforms as Functional Proteins. Cell Rep 2015; 12:183-9. [PMID: 26146086 DOI: 10.1016/j.celrep.2015.06.031] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/27/2014] [Revised: 02/18/2015] [Accepted: 06/09/2015] [Indexed: 12/30/2022] [Open Access]
Abstract
Alternative splicing acts on transcripts from almost all human multi-exon genes. Notwithstanding its ubiquity, fundamental ramifications of splicing on protein expression remain unresolved. The number and identity of spliced transcripts that form stably folded proteins remain the sources of considerable debate, due largely to low coverage of experimental methods and the resulting absence of negative data. We circumvent this issue by developing a semi-supervised learning algorithm, positive unlabeled learning for splicing elucidation (PULSE; http://www.kimlab.org/software/pulse), which uses 48 features spanning various categories. We validated its accuracy on sets of bona fide protein isoforms and directly on mass spectrometry (MS) spectra for an overall AU-ROC of 0.85. We predict that around 32% of "exon skipping" alternative splicing events produce stable proteins, suggesting that the process engenders a significant number of previously uncharacterized proteins. We also provide insights into the distribution of positive isoforms in various functional classes and into the structural effects of alternative splicing.
Affiliation(s)
- Yanqi Hao
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 1AS, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 3G4, Canada
- Recep Colak
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 1AS, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 3G4, Canada
- Joan Teyra
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 1AS, Canada
- Carles Corbi-Verge
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 1AS, Canada
- Alexander Ignatchenko
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Hannes Hahne
  - Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany
- Mathias Wilhelm
  - Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany
- Bernhard Kuster
  - Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany; German Cancer Consortium (DKTK), Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Center for Integrated Protein Science Munich, Munich, Germany; Bavarian Biomolecular Mass Spectrometry Center, Technische Universität München, Freising, Germany
- Pascal Braun
  - Lehrstuhl fuer Systembiologie der Pflanzen, TU Muenchen, Munich, Germany
- Daisuke Kaida
  - Frontier Research Core for Life Sciences, University of Toyama, Toyama 930-8555, Japan
- Thomas Kislinger
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON M5T 2M9, Canada
- Philip M Kim
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 1AS, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 3G4, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1AS, Canada
24
Tay AP, Pang CNI, Twine NA, Hart-Smith G, Harkness L, Kassem M, Wilkins MR. Proteomic Validation of Transcript Isoforms, Including Those Assembled from RNA-Seq Data. J Proteome Res 2015; 14:3541-54. [PMID: 25961807 DOI: 10.1021/pr5011394] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022]
Abstract
Human proteome analysis now requires an understanding of protein isoforms. We recently published the PG Nexus pipeline, which facilitates high confidence validation of exons and splice junctions by integrating genomics and proteomics data. Here we comprehensively explore how RNA-seq transcriptomics data, and proteomic analysis of the same sample, can identify protein isoforms. RNA-seq data from human mesenchymal (hMSC) stem cells were analyzed with our new TranscriptCoder tool to generate a database of protein isoform sequences. MS/MS data from matching hMSC samples were then matched against the TranscriptCoder-derived database, along with Ensembl and the neXtProt database. Querying the TranscriptCoder-derived or Ensembl database could unambiguously identify ∼450 protein isoforms, with isoform-specific proteotypic peptides, including candidate hMSC-specific isoforms for the genes DPYSL2 and FXR1. Where isoform-specific peptides did not exist, groups of nonisoform-specific proteotypic peptides could specifically identify many isoforms. In both the above cases, isoforms will be detectable with targeted MS/MS assays. Unfortunately, our analysis also revealed that some isoforms will be difficult to identify unambiguously as they do not have peptides that are sufficiently distinguishing. We covisualize mRNA isoforms and peptides in a genome browser to illustrate the above situations. Mass spectrometry data is available via ProteomeXchange (PXD001449).
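The idea of an isoform-specific peptide described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the TranscriptCoder or PG Nexus implementation: it assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), and the function names, the `min_len` cutoff, and the toy sequences are hypothetical.

```python
import re
from collections import Counter


def tryptic_peptides(sequence, min_len=6):
    """In-silico trypsin digest (assumed rule: cut after K/R, not before P)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop short fragments, which are rarely informative in MS/MS searches.
    return {p for p in pieces if len(p) >= min_len}


def isoform_specific_peptides(isoforms, min_len=6):
    """Map each isoform name to the peptides found in that isoform only."""
    digests = {name: tryptic_peptides(seq, min_len) for name, seq in isoforms.items()}
    # Count in how many isoforms each peptide occurs.
    counts = Counter(p for peptides in digests.values() for p in peptides)
    return {
        name: {p for p in peptides if counts[p] == 1}
        for name, peptides in digests.items()
    }


# Hypothetical toy isoforms that differ only in their middle segment.
toy_isoforms = {
    "iso1": "MAAAAAKCCCCCCRDDDDDDK",
    "iso2": "MAAAAAKEEEEEERDDDDDDK",
}
specific = isoform_specific_peptides(toy_isoforms)
```

In this toy case only the middle peptide of each isoform is unique; the shared first and last peptides cannot distinguish the two. A real workflow would also need to handle missed cleavages and the isobaric residues I and L, which this sketch ignores.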
Affiliation(s)
- Aidan P Tay
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia; School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Chi Nam Ignatius Pang
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia; School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Natalie A Twine
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia; School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Gene Hart-Smith
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia; School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Linda Harkness
  - Endocrine Research Laboratory (KMEB), Department of Endocrinology and Metabolism, Odense University Hospital & University of Southern Denmark, Odense 5230, Denmark
- Moustapha Kassem
  - Endocrine Research Laboratory (KMEB), Department of Endocrinology and Metabolism, Odense University Hospital & University of Southern Denmark, Odense 5230, Denmark
- Marc R Wilkins
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia; School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
25
Abstract
The high degree of protein sequence similarity in the MUPs (major urinary proteins) poses considerable challenges for their individual differentiation, analysis and quantification. In the present review, we discuss MS approaches for MUP quantification, at either the protein or the peptide level. In particular, we describe an approach to multiplexed quantification based on the design and synthesis of novel proteins (QconCATs) that are concatamers of quantification standards, providing a simple route to the generation of a set of stable-isotope-labelled peptide standards. The MUPs pose a particular challenge to QconCAT design, because of their sequence similarity and the limited number of peptides that can be used to construct the standards. Such difficulties can be overcome by careful attention to the analytical workflow.
26
Jorrín-Novo JV, Pascual J, Sánchez-Lucas R, Romero-Rodríguez MC, Rodríguez-Ortega MJ, Lenz C, Valledor L. Fourteen years of plant proteomics reflected in Proteomics: moving from model species and 2DE-based approaches to orphan species and gel-free platforms. Proteomics 2015; 15:1089-112. [PMID: 25487722 DOI: 10.1002/pmic.201400349] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 07/26/2014] [Revised: 10/23/2014] [Accepted: 12/04/2014] [Indexed: 12/21/2022]
Abstract
In this article, the topic of plant proteomics is reviewed based on related papers published in the journal Proteomics since publication of the first issue in 2001. In total, around 300 original papers and 41 reviews published in Proteomics between 2000 and 2014 have been surveyed. Our main objective for this review is to help bridge the gap between plant biologists and proteomics technologists, two often very separate groups. Over the past years a number of reviews on plant proteomics have been published. To avoid repetition we have focused on more recent literature published after 2010, and have chosen to rather make continuous reference to older publications. The use of the latest proteomics techniques and their integration with other approaches in the "systems biology" direction are discussed in more detail. Finally, we comment on the recent history, state of the art, and future directions of plant proteomics, using publications in Proteomics to illustrate the progress in the field. The review is organized into two major blocks, the first devoted to providing an overview of experimental systems (plants, plant organs, biological processes) and the second to the methodology.
Affiliation(s)
- Jesus V Jorrín-Novo
  - Agroforestry and Plant Biochemistry and Proteomics Research Group, Department of Biochemistry and Molecular Biology, University of Cordoba-CeiA3, Cordoba, Spain
27
Kroll JE, de Souza SJ, de Souza GA. Identification of rare alternative splicing events in MS/MS data reveals a significant fraction of alternative translation initiation sites. PeerJ 2014; 2:e673. [PMID: 25405079 PMCID: PMC4232841 DOI: 10.7717/peerj.673] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/01/2014] [Accepted: 10/30/2014] [Indexed: 01/08/2023] [Open Access]
Abstract
Integration of transcriptome data is a crucial step for the identification of rare protein variants in mass-spectrometry (MS) data with important consequences for all branches of biotechnology research. Here, we used Splooce, a database of splicing variants recently developed by us, to search MS data derived from a variety of human tumor cell lines. More than 800 new protein variants were identified whose corresponding MS spectra were specific to protein entries from Splooce. Although the types of splicing variants (exon skipping, alternative splice sites and intron retention) were found at the same frequency as in the transcriptome, we observed a large variety of modifications at the protein level induced by alternative splicing events. Surprisingly, we found that 40% of all protein modifications induced by alternative splicing led to the use of alternative translation initiation sites. Other modifications include frameshifts in the open reading frame and inclusion or deletion of peptide sequences. To make the dataset generated here available to the community in a more effective form, the Splooce portal (http://www.bioinformatics-brazil.org/splooce) was modified to report the alternative splicing events supported by MS data.
Affiliation(s)
- José E Kroll
  - Institute of Bioinformatics and Biotechnology, Natal, Brazil; Brain Institute, UFRN, Natal, Brazil
- Gustavo A de Souza
  - Department of Immunology and Centre for Immune Regulation, Oslo University Hospital HF Rikshospitalet, University of Oslo, Oslo, Norway
28
Tavares R, de Miranda Scherer N, Pauletti BA, Araújo E, Folador EL, Espindola G, Ferreira CG, Paes Leme AF, de Oliveira PSL, Passetti F. SpliceProt: A protein sequence repository of predicted human splice variants. Proteomics 2014; 14:181-5. [DOI: 10.1002/pmic.201300078] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/22/2013] [Revised: 10/03/2013] [Accepted: 11/06/2013] [Indexed: 12/22/2022]
Affiliation(s)
- Raphael Tavares
  - Bioinformatics Unit; Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
- Nicole de Miranda Scherer
  - Bioinformatics Unit; Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
- Bianca Alves Pauletti
  - Laboratório de Espectrometria de Massas; Laboratório Nacional de Biociências (LNBio); CNPEM; Campinas, Brazil
- Elói Araújo
  - Faculdade de Computação; Universidade Federal de Mato Grosso do Sul; Campo Grande, Brazil
- Edson Luiz Folador
  - Bioinformatics Unit; Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
- Gabriel Espindola
  - Bioinformatics Unit; Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
- Carlos Gil Ferreira
  - Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
- Adriana Franco Paes Leme
  - Laboratório de Espectrometria de Massas; Laboratório Nacional de Biociências (LNBio); CNPEM; Campinas, Brazil
- Fabio Passetti
  - Bioinformatics Unit; Clinical Research Coordination; Instituto Nacional de Câncer (INCA); Rio de Janeiro, Brazil
29
Horvatovich P, Franke L, Bischoff R. Proteomic studies related to genetic determinants of variability in protein concentrations. J Proteome Res 2013; 13:5-14. [PMID: 24237071 DOI: 10.1021/pr400765y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
Genetic variation has multiple effects on the proteome. It may influence the expression level of proteins, modify their sequences through single nucleotide polymorphisms, the occurrence of allelic variants, or alternative splicing (ASP) events. This perspective paper summarizes the major effects of genetic variability on protein expression and isoforms and provides an overview of proteomics techniques and methods that allow studying the effects of genetic variability at different levels of the proteome. The paper provides an overview of recent quantitative trait loci studies performed to explore the effect of genetic variation on protein expression (pQTL). Finally it gives a perspective view on advances in proteomics technology and the role of the Chromosome-Centric Human Proteome Project (C-HPP) by creating large-scale resources that may facilitate performing more comprehensive pQTL experiments in the future.
Affiliation(s)
- Péter Horvatovich
  - Analytical Biochemistry, Department of Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands
30
Pang CNI, Tay AP, Aya C, Twine NA, Harkness L, Hart-Smith G, Chia SZ, Chen Z, Deshpande NP, Kaakoush NO, Mitchell HM, Kassem M, Wilkins MR. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing. J Proteome Res 2013; 13:84-98. [PMID: 24152167 DOI: 10.1021/pr400820p] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/19/2022]
Abstract
Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.
Affiliation(s)
- Chi Nam Ignatius Pang
  - Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia
31
Sheynkman GM, Shortreed MR, Frey BL, Smith LM. Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq. Mol Cell Proteomics 2013; 12:2341-53. [PMID: 23629695 DOI: 10.1074/mcp.o113.028142] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.7] [Indexed: 01/22/2023] [Open Access]
Abstract
Human proteomic databases required for MS peptide identification are frequently updated and carefully curated, yet are still incomplete because it has been challenging to acquire every protein sequence from the diverse assemblage of proteoforms expressed in every tissue and cell type. In particular, alternative splicing has been shown to be a major source of this cell-specific proteomic variation. Many new alternative splice forms have been detected at the transcript level using next generation sequencing methods, especially RNA-Seq, but it is not known how many of these transcripts are being translated. Leveraging the unprecedented capabilities of next generation sequencing methods, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Eighty million paired-end Illumina reads and ∼500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice junction sequences from the RNA data, translate these sequences into the analogous polypeptide sequence, and create a customized splice junction database for MS searching. Based on the RefSeq gene models, we detected 136,123 annotated and 144,818 unannotated transcript junctions. Of those, 24,834 unannotated junctions passed various quality filters (e.g. minimum read depth) and these entries were translated into 33,589 polypeptide sequences and used for database searching. We discovered 57 splice junction peptides not present in the Uniprot-Trembl proteomic database comprising an array of different splicing events, including skipped exons, alternative donors and acceptors, and noncanonical transcriptional start sites. To our knowledge this is the first example of using sample-specific RNA-Seq data to create a splice-junction database and discover new peptides resulting from alternative splicing.
Affiliation(s)
- Gloria M Sheynkman
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, Wisconsin 53706, USA
32
Zgoda VG, Kopylov AT, Tikhonova OV, Moisa AA, Pyndyk NV, Farafonova TE, Novikova SE, Lisitsa AV, Ponomarenko EA, Poverennaya EV, Radko SP, Khmeleva SA, Kurbatov LK, Filimonov AD, Bogolyubova NA, Ilgisonis EV, Chernobrovkin AL, Ivanov AS, Medvedev AE, Mezentsev YV, Moshkovskii SA, Naryzhny SN, Ilina EN, Kostrjukova ES, Alexeev DG, Tyakht AV, Govorun VM, Archakov AI. Chromosome 18 transcriptome profiling and targeted proteome mapping in depleted plasma, liver tissue and HepG2 cells. J Proteome Res 2012; 12:123-34. [PMID: 23256950 DOI: 10.1021/pr300821n] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Indexed: 12/14/2022]
Abstract
The final goal of the Russian part of the Chromosome-centric Human Proteome Project (C-HPP) was established as the analysis of the chromosome 18 (Chr 18) protein complement in plasma, liver tissue and HepG2 cells with the sensitivity of 10^-18 M. Using SRM, we have recently targeted 277 Chr 18 proteins in plasma, liver, and HepG2 cells. On the basis of the results of the survey, the SRM assays were drafted for 250 proteins: 41 proteins were found only in the liver tissue, 82 proteins were specifically detected in depleted plasma, and 127 proteins were mapped in both samples. The targeted analysis of HepG2 cells was carried out for 49 proteins; 41 of them were successfully registered using ordinary SRM and 5 additional proteins were registered using a combination of irreversible binding of proteins on CN-Br Sepharose 4B with SRM. Transcriptome profiling of HepG2 cells performed by RNAseq and RT-PCR has shown a significant correlation (r = 0.78) for 42 gene transcripts. A pilot affinity-based interactome analysis was performed for cytochrome b5 using analytical and preparative optical biosensor fishing followed by MS analysis of the fished proteins. All of the data on the proteome complement of the Chr 18 have been integrated into our gene-centric knowledgebase (www.kb18.ru).
Affiliation(s)
- Victor G Zgoda
  - Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, Russia
33
Casado-Vela J, Lacal JC, Elortza F. Protein chimerism: Novel source of protein diversity in humans adds complexity to bottom-up proteomics. Proteomics 2012; 13:5-11. [DOI: 10.1002/pmic.201200371] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 08/16/2012] [Revised: 10/04/2012] [Accepted: 10/29/2012] [Indexed: 12/20/2022]
Affiliation(s)
- Juan Casado-Vela
  - Centro Nacional de Biotecnología. Lab 115. Dpt. Biología Molecular y Celular; Spanish National Research Council (CSIC); 28049 Madrid, Spain
- Juan Carlos Lacal
  - Translational Oncology Unit; Instituto de Investigaciones Biomédicas ‘Alberto Sols’; Spanish National Research Council (CSIC-UAM); Madrid, Spain
- Felix Elortza
  - Proteomics Platform; CIC bioGUNE; CIBERehd, ProteoRed-ISCIII; Technology Park of Bizkaia; Derio, Spain
34
Blakeley P, Overton IM, Hubbard SJ. Addressing statistical biases in nucleotide-derived protein databases for proteogenomic search strategies. J Proteome Res 2012; 11:5221-34. [PMID: 23025403 PMCID: PMC3703792 DOI: 10.1021/pr300411q] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Indexed: 11/29/2022]
Abstract
Proteogenomics has the potential to advance genome annotation through high quality peptide identifications derived from mass spectrometry experiments, which demonstrate a given gene or isoform is expressed and translated at the protein level. This can advance our understanding of genome function, discovering novel genes and gene structure that have not yet been identified or validated. Because of the high-throughput shotgun nature of most proteomics experiments, it is essential to carefully control for false positives and prevent any potential misannotation. A number of statistical procedures to deal with this are in wide use in proteomics, calculating false discovery rate (FDR) and posterior error probability (PEP) values for groups and individual peptide spectrum matches (PSMs). These methods control for multiple testing and exploit decoy databases to estimate statistical significance. Here, we show that database choice has a major effect on these confidence estimates leading to significant differences in the number of PSMs reported. We note that standard target:decoy approaches using six-frame translations of nucleotide sequences, such as assembled transcriptome data, apparently underestimate the confidence assigned to the PSMs. The source of this error stems from the inflated and unusual nature of the six-frame database, where for every target sequence there exist five "incorrect" targets that are unlikely to code for protein. The attendant FDR and PEP estimates lead to fewer accepted PSMs at fixed thresholds, and we show that this effect is a product of the database and statistical modeling and not the search engine. A variety of approaches to limit database size and remove noncoding target sequences are examined and discussed in terms of the altered statistical estimates generated and PSMs reported. These results are of importance to groups carrying out proteogenomics, aiming to maximize the validation and discovery of gene structure in sequenced genomes, while still controlling for false positives.
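The target:decoy estimate that this abstract discusses can be illustrated with a minimal Python sketch. It assumes the common estimator FDR ≈ (decoy PSMs) / (target PSMs) among all PSMs at or above a score cutoff; the function name and the toy scores are hypothetical and are not taken from the paper.

```python
def targets_accepted_at_fdr(psms, max_fdr=0.01):
    """Count target PSMs accepted at the given FDR threshold.

    psms: iterable of (score, is_decoy) pairs; a higher score is a better match.
    Assumed estimator: FDR = decoys / targets among PSMs at or above the cutoff.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest cutoff whose estimated FDR is still acceptable.
        if targets and decoys / targets <= max_fdr:
            accepted = targets
    return accepted


# Hypothetical toy search results: (score, is_decoy).
small_db = [(10, False), (9, False), (8, False), (7, True), (6, False)]
# Same target PSMs, but one extra high-scoring decoy, as might arise
# from a much larger search space.
large_db = small_db + [(9.5, True)]

n_small = targets_accepted_at_fdr(small_db, max_fdr=0.25)
n_large = targets_accepted_at_fdr(large_db, max_fdr=0.25)
```

With the same target matches, the additional decoy hit lowers the number of targets accepted at a fixed threshold, which is the qualitative effect the abstract attributes to inflated six-frame databases.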
Affiliation(s)
- Paul Blakeley
  - Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
35
Fei SS, Wilmarth PA, Hitzemann RJ, McWeeney SK, Belknap JK, David LL. Protein database and quantitative analysis considerations when integrating genetics and proteomics to compare mouse strains. J Proteome Res 2011; 10:2905-12. [PMID: 21553863 PMCID: PMC3128464 DOI: 10.1021/pr200133p] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/22/2022]
Abstract
Decades of genetics research comparing mouse strains has identified many regions of the genome associated with quantitative traits. Microarrays have been used to identify which genes in those regions are differentially expressed and are therefore potentially causal; however, genetic variants that affect probe hybridization lead to many false conclusions. Here we used spectral counting to compare brain striata between two mouse strains. Using strain-specific protein databases, we concluded that proteomics was more robust to sequence differences than microarrays; however, some proteins were still significantly affected. To generate strain-specific databases, we used a complete database that contained all of the putative genetic isoforms for each protein. While the increased proteome coverage in the databases led to a 6.8% gain in peptide assignments compared to a nonredundant database, it also necessitated the development of a strategy for grouping similar proteins due to a large number of shared peptides. Of the 4563 identified proteins (2.1% FDR), there were 1807 quantifiable proteins/groups that exceeded minimum count cutoffs. With four pooled biological replicates per strain, we used quantile normalization, ComBat (a package that adjusts for batch effects), and edgeR (a package for differential expression analysis of count data) to identify 101 differentially expressed proteins/groups, 84 of which had a coding region within one of the genomic regions of interest identified by the Portland Alcohol Research Center.
Affiliation(s)
- Suzanne S Fei
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, Oregon 97239, USA
36
Leoni G, Le Pera L, Ferrè F, Raimondo D, Tramontano A. Coding potential of the products of alternative splicing in human. Genome Biol 2011; 12:R9. [PMID: 21251333 PMCID: PMC3091307 DOI: 10.1186/gb-2011-12-1-r9] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 07/31/2010] [Revised: 12/17/2010] [Accepted: 01/20/2011] [Indexed: 12/22/2022] [Open Access]
Abstract
Background: Analysis of the human genome has revealed that as much as an order of magnitude more of the genomic sequence is transcribed than accounted for by the predicted and characterized genes. A number of these transcripts are alternatively spliced forms of known protein coding genes; however, it is becoming clear that many of them do not necessarily correspond to a functional protein. Results: In this study we analyze alternative splicing isoforms of human gene products that are unambiguously identified by mass spectrometry and compare their properties with those of isoforms of the same genes for which no peptide was found in publicly available mass spectrometry datasets. We analyze them in detail for the presence of uninterrupted functional domains, active sites as well as the plausibility of their predicted structure. We report how well each of these strategies and their combination can correctly identify translated isoforms and derive a lower limit for their specificity, that is, their ability to correctly identify non-translated products. Conclusions: The most effective strategy for correctly identifying translated products relies on the conservation of active sites, but it can only be applied to a small fraction of isoforms, while a reasonably high coverage, sensitivity and specificity can be achieved by analyzing the presence of non-truncated functional domains. Combining the latter with an assessment of the plausibility of the modeled structure of the isoform increases both coverage and specificity with a moderate cost in terms of sensitivity.
Affiliation(s)
- Guido Leoni
  - Dipartimento di Scienze Biochimiche, Sapienza Università di Roma, P.le A. Moro, 5 - 00185 Rome, Italy
37
Casado-Vela J, Cebrián A, Gómez del Pulgar MT, Sánchez-López E, Vilaseca M, Menchén L, Diema C, Sellés-Marchart S, Martínez-Esteso MJ, Yubero N, Bru-Martínez R, Lacal JC. Lights and shadows of proteomic technologies for the study of protein species including isoforms, splicing variants and protein post-translational modifications. Proteomics 2011; 11:590-603. [DOI: 10.1002/pmic.201000287] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 04/29/2010] [Revised: 08/13/2010] [Accepted: 08/23/2010] [Indexed: 01/12/2023]
38
Zhang C. Proteomic Studies on the Development of the Central Nervous System and Beyond. Neurochem Res 2010; 35:1487-500. [DOI: 10.1007/s11064-010-0218-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Accepted: 06/11/2010] [Indexed: 11/27/2022]