1
Agrawal A, Saghatelian A. Identification of microproteins with transactivation activity by polyalanine motif selection. RSC Chem Biol 2025; 6:800-808. [PMID: 40083654 PMCID: PMC11898273 DOI: 10.1039/d4cb00277f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2024] [Accepted: 02/26/2025] [Indexed: 03/16/2025] Open
Abstract
Microproteins are an emerging class of proteins that are encoded by small open reading frames (smORFs) less than or equal to 100 amino acids. The functions of several microproteins have been illuminated through phenotypic screening or protein-protein interaction studies, but thousands of microproteins remain uncharacterized. The functional characterization of microproteins is challenging due to a lack of sequence homology. Here, we demonstrate a strategy to enrich microproteins that contain specific motifs as a means to more rapidly characterize microproteins. Specifically, we used the fact that polyalanine motifs are associated with nuclear proteins to select 58 candidate microproteins to screen for transactivation function. We identified three microproteins with transactivation activity when tested as GAL4-fusions in a cell-based luciferase assay. The results support the continued use of the motif selection strategy for the discovery of microprotein function.
Affiliation(s)
- Archita Agrawal
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies La Jolla CA USA
- Alan Saghatelian
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies La Jolla CA USA
2
Smykal V, Tobita H, Dolezel D. Evolution of circadian clock and light-input pathway genes in Hemiptera. Insect Biochemistry and Molecular Biology 2025; 180:104298. [PMID: 40058530 DOI: 10.1016/j.ibmb.2025.104298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 03/04/2025] [Accepted: 03/05/2025] [Indexed: 03/17/2025]
Abstract
Circadian clocks are timekeeping mechanisms that help organisms anticipate periodic alterations of day and night. These clocks are widespread, and in the case of animals, they rely on genetically related components. At the molecular level, the animal circadian clock consists of several interconnected transcription-translation feedback loops. Although the clock setup is generally conserved, some important differences exist even among various insect groups. Therefore, we decided to identify in silico all major clock components and closely related genes in Hemiptera. Our analyses indicate several lineage-specific alterations of the clock setup in Hemiptera, derived from gene losses observed in the complete gene set identified in the outgroup, Thysanoptera, which thus presents the insect lineage with a complete clock setup. Nilaparvata and Fulgoroidea, in general, lost the (6-4)-photolyase, while all Hemiptera lost FBXL3, and several lineage-specific losses of dCRY and jetlag were identified. Importantly, we identified non-canonical splicing variants of period and m-cry genes, which might provide another regulatory mechanism for clock functioning. Lastly, we performed a detailed reconstruction of Hemiptera's light input pathway genetic repertoire and explored the horizontal gene transfer of cryptochrome-DASH from plant to Bemisia. Altogether, this inventory reveals important trends in clock gene evolution and provides a reference for clock research in Hemiptera, including several lineages of important pest species.
Affiliation(s)
- Vlastimil Smykal
- Biology Centre of the Czech Academy of Sciences, České Budějovice, 37005, Czech Republic.
- Hisashi Tobita
- Biology Centre of the Czech Academy of Sciences, České Budějovice, 37005, Czech Republic
- David Dolezel
- Biology Centre of the Czech Academy of Sciences, České Budějovice, 37005, Czech Republic.
3
Vasylieva V, Arefiev I, Bourassa F, Trifiro FA, Brunet MA. Proteomics Can Rise to the Challenge of Pseudogenes' Coding Nature. J Proteome Res 2024; 23:5233-5249. [PMID: 39486438 PMCID: PMC11629383 DOI: 10.1021/acs.jproteome.4c00116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 09/18/2024] [Accepted: 10/18/2024] [Indexed: 11/04/2024]
Abstract
Throughout the past decade, technological advances in genomics and transcriptomics have revealed pervasive translation throughout mammalian genomes. These putative proteins are usually excluded from proteomics analyses, as they are absent from common protein repositories. A sizable portion of these noncanonical proteins is translated from pseudogenes. Pseudogenes are commonly termed defective copies of coding genes unable to produce proteins. Here, we suggest that proteomics can help in their annotation. First, we define important terms and review specific examples underlining the caveats in pseudogene annotation and their coding potential. Then, we discuss the challenges inherent to pseudogenes that have thus far made it difficult to identify them with confidence in omics data. Finally, we identify recent developments in experimental procedures, instrumentation, and computational methods in proteomics that put the field in a unique position to solve the pseudogene annotation conundrum.
Affiliation(s)
- Valeriia Vasylieva
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Ihor Arefiev
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Francis Bourassa
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Félix-Antoine Trifiro
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Marie A. Brunet
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
4
Liu QR, Zhu M, Salekin F, McCoy BM, Kennedy V, Tian J, Mazucanti CH, Chia CW, Egan JM. An Insulin Upstream Open Reading Frame (INSU) Is Present in Skeletal Muscle Satellite Cells: Changes with Age. Cells 2024; 13:1903. [PMID: 39594651 PMCID: PMC11592829 DOI: 10.3390/cells13221903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 11/06/2024] [Accepted: 11/12/2024] [Indexed: 11/28/2024] Open
Abstract
Insulin resistance, stem cell dysfunction, and muscle fiber dystrophy are all age-related events in skeletal muscle (SKM). However, age-related changes in insulin isoforms and insulin receptors in myogenic progenitor satellite cells have not been studied. Since SKM is an extra-pancreatic tissue that does not express mature insulin, we investigated the levels of insulin receptors (INSRs) and a novel human insulin upstream open reading frame (INSU) at the mRNA, protein, and anatomical levels in SKM biopsy samples from Baltimore Longitudinal Study of Aging (BLSA) participants aged 27-89 years. Using RT-qPCR and the MS-based selected reaction monitoring (SRM) assay, we found that INSR and INSU mRNA and protein levels were positively correlated with the age of human SKM biopsies. We applied RNAscope fluorescence in situ hybridization (FISH) and immunofluorescence (IF) to SKM cryosections and found that INSR and INSU were co-localized with PAX7-labeled satellite cells, with enhanced expression in SKM sections from an 89-year-old participant compared with those from a 27-year-old participant. We hypothesized that the SKM aging process might induce compensatory upregulation of INSR and re-expression of INSU, which might be beneficial in early embryogenesis and have deleterious effects on proliferative and myogenic satellite cells with advanced age.
Affiliation(s)
- Qing-Rong Liu
- Intramural Research Program, National Institute on Aging, National Institutes of Health, 251 Bayview Blvd, Baltimore, MD 21224, USA; (M.Z.); (B.M.M.); (J.T.); (C.H.M.); (C.W.C.); (J.M.E.)
5
Salz R, Vorsteveld EE, van der Made CI, Kersten S, Stemerdink M, Riepe TV, Hsieh TH, Mhlanga M, Netea MG, Volders PJ, Hoischen A, ’t Hoen PA. Multi-omic profiling of pathogen-stimulated primary immune cells. iScience 2024; 27:110471. [PMID: 39091463 PMCID: PMC11293528 DOI: 10.1016/j.isci.2024.110471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/23/2024] [Accepted: 07/04/2024] [Indexed: 08/04/2024] Open
Abstract
We performed long-read transcriptome and proteome profiling of pathogen-stimulated peripheral blood mononuclear cells (PBMCs) from healthy donors to discover new transcript and protein isoforms expressed during immune responses to diverse pathogens. Long-read transcriptome profiling reveals novel sequences and isoform switching induced upon pathogen stimulation, including transcripts that are difficult to detect using traditional short-read sequencing. Widespread loss of intron retention occurs as a common result of all pathogen stimulations. We highlight novel transcripts of NFKB1 and CASP1 that may indicate novel immunological mechanisms. RNA expression differences did not result in differences in the amounts of secreted proteins. Clustering analysis of secreted proteins revealed a correlation between chemokine (receptor) expression on the RNA and protein levels in C. albicans- and poly(I:C)-stimulated PBMCs. Isoform aware long-read sequencing of pathogen-stimulated immune cells highlights the potential of these methods to identify novel transcripts, revealing a more complex transcriptome landscape than previously appreciated.
Affiliation(s)
- Renee Salz
- Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Emil E. Vorsteveld
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Caspar I. van der Made
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Simone Kersten
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Merel Stemerdink
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Tabea V. Riepe
- Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Tsung-han Hsieh
- Department of Cell Biology, Radboud University, 6500 HB Nijmegen, the Netherlands
- Musa Mhlanga
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Cell Biology, Radboud University, 6500 HB Nijmegen, the Netherlands
- Mihai G. Netea
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Pieter-Jan Volders
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Laboratory of Molecular Diagnostics, Department of Clinical Biology, Jessa Hospital, 3500 Hasselt, Belgium
- Alexander Hoischen
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Peter A.C. ’t Hoen
- Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
6
Rocha AL, Pai V, Perkins G, Chang T, Ma J, De Souza EV, Chu Q, Vaughan JM, Diedrich JK, Ellisman MH, Saghatelian A. An Inner Mitochondrial Membrane Microprotein from the SLC35A4 Upstream ORF Regulates Cellular Metabolism. J Mol Biol 2024; 436:168559. [PMID: 38580077 PMCID: PMC11292582 DOI: 10.1016/j.jmb.2024.168559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 03/29/2024] [Accepted: 03/31/2024] [Indexed: 04/07/2024]
Abstract
Upstream open reading frames (uORFs) are cis-acting elements that can dynamically regulate the translation of downstream ORFs by suppressing downstream translation under basal conditions and, in some cases, increasing downstream translation under stress conditions. Computational and empirical methods have identified uORFs in the 5'-UTRs of approximately half of all mouse and human transcripts, making uORFs one of the largest regulatory elements known. Because the prevailing dogma was that eukaryotic mRNAs produce a single functional protein, the peptides and small proteins, or microproteins, encoded by uORFs were rarely studied. We hypothesized that a uORF in the SLC35A4 mRNA is producing a functional microprotein (SLC35A4-MP) because of its conserved amino acid sequence. Through a series of biochemical and cellular experiments, we find that the 103-amino acid SLC35A4-MP is a single-pass transmembrane inner mitochondrial membrane (IMM) microprotein. The IMM contains the protein machinery crucial for cellular respiration and ATP generation, and loss of function studies with SLC35A4-MP significantly diminish maximal cellular respiration, indicating a vital role for this microprotein in cellular metabolism. The findings add SLC35A4-MP to the growing list of functional microproteins and, more generally, indicate that uORFs that encode conserved microproteins are an untapped reservoir of functional microproteins.
Affiliation(s)
- Andréa L Rocha
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Victor Pai
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Guy Perkins
- National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, Department of Neurosciences, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Tina Chang
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Jiao Ma
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Eduardo V De Souza
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Qian Chu
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Joan M Vaughan
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Jolene K Diedrich
- Mass Spectrometry Core for Proteomics and Metabolomics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, USA
- Mark H Ellisman
- National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, Department of Neurosciences, School of Medicine, University of California San Diego, La Jolla, CA, USA.
- Alan Saghatelian
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA.
7
Qin Z, Yang J, Zhang K, Gao X, Ran Q, Xu Y, Wang Z, Lou D, Huang C, Zellmer L, Meng G, Chen N, Ma H, Wang Z, Liao DJ. Updating mRNA variants of the human RSK4 gene and their expression in different stressed situations. Heliyon 2024; 10:e27475. [PMID: 38560189 PMCID: PMC10980951 DOI: 10.1016/j.heliyon.2024.e27475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 02/11/2024] [Accepted: 02/29/2024] [Indexed: 04/04/2024] Open
Abstract
We determined the RNA spectrum of the human RSK4 (hRSK4) gene (also called RPS6KA6) and identified 29 novel mRNA variants derived from alternative splicing, which, plus the NCBI-documented ones and the five we reported previously, totaled 50 hRSK4 RNAs that, by our bioinformatics analyses, encode 35 hRSK4 protein isoforms of 35-762 amino acids. Many of the mRNAs are bicistronic or tricistronic for hRSK4. The NCBI-normalized NM_014496.5 and the protein it encodes are designated herein as the Wt-1 mRNA and protein, respectively, whereas the NM_001330512.1 and the long protein it encodes are designated as the Wt-2 mRNA and protein, respectively. Many of the mRNA variants responded differently to different situations of stress, including serum starvation, a febrile temperature, and treatment with ethanol or ethanol-extracted clove buds (an herbal medicine), whereas the same stressed situation often caused quite different alterations among different mRNA variants in different cell lines. Mosifloxacin, an antibiotic and also a functional inhibitor of hRSK4, could inhibit the expression of certain hRSK4 mRNA variants. The hRSK4 gene likely uses alternative splicing as a handy tool to adapt to different stressed situations, and the mRNA and protein multiplicities may partly explain the incongruous literature on its expression and comports.
Affiliation(s)
- Zhenwei Qin
- Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
- Jianglin Yang
- Center for Clinical Laboratories, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Rd, Guiyang, 550004, Guizhou Province, China
- Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang, 550004, Guizhou Province, China
- Keyin Zhang
- Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Xia Gao
- Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Qianchuan Ran
- Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
- Yuanhong Xu
- Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
- Zhi Wang
- Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Didong Lou
- Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
- Chunhua Huang
- Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
- Lucas Zellmer
- Department of Medicine, Hennepin County Medical Center, 730 South 8th St., Minneapolis, MN, 55415, USA
- Guangxue Meng
- Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Na Chen
- Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Hong Ma
- Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
- Zhe Wang
- State Key Laboratory of Cancer Biology, Department of Pathology, Xijing Hospital, Air Force Medical University, 169 Changle West Road, Xi'an, 710032, China
- Dezhong Joshua Liao
- Center for Clinical Laboratories, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Rd, Guiyang, 550004, Guizhou Province, China
- Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang, 550004, Guizhou Province, China
8
Mohsen JJ, Martel AA, Slavoff SA. Microproteins-Discovery, structure, and function. Proteomics 2023; 23:e2100211. [PMID: 37603371 PMCID: PMC10841188 DOI: 10.1002/pmic.202100211] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/04/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/22/2023]
Abstract
Advances in proteogenomic technologies have revealed hundreds to thousands of translated small open reading frames (sORFs) that encode microproteins in genomes across evolutionary space. While many microproteins have now been shown to play critical roles in biology and human disease, a majority of recently identified microproteins have little or no experimental evidence regarding their functionality. Computational tools have some limitations for analysis of short, poorly conserved microprotein sequences, so additional approaches are needed to determine the role of each member of this recently discovered polypeptide class. A currently underexplored avenue in the study of microproteins is structure prediction and determination, which delivers a depth of functional information. In this review, we provide a brief overview of microprotein discovery methods, then examine examples of microprotein structures (and, conversely, intrinsic disorder) that have been experimentally determined using crystallography, cryo-electron microscopy, and NMR, which provide insight into their molecular functions and mechanisms. Additionally, we discuss examples of predicted microprotein structures that have provided insight or context regarding their function. Analysis of microprotein structure at the angstrom level, and confirmation of predicted structures, therefore, has potential to identify translated microproteins that are of biological importance and to provide molecular mechanism for their in vivo roles.
Affiliation(s)
- Jessica J. Mohsen
- Department of Chemistry, Yale University, New Haven, CT, USA
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Alina A. Martel
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Sarah A. Slavoff
- Department of Chemistry, Yale University, New Haven, CT, USA
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
9
Harold C. All these screens that we've done: how functional genetic screens have informed our understanding of ribosome biogenesis. Biosci Rep 2023; 43:BSR20230631. [PMID: 37335083 PMCID: PMC10329186 DOI: 10.1042/bsr20230631] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/09/2023] [Revised: 06/08/2023] [Accepted: 06/19/2023] [Indexed: 06/21/2023] Open
Abstract
Ribosome biogenesis is the complex and essential process that ultimately leads to the synthesis of cellular proteins. Understanding each step of this essential process is imperative to increase our understanding of basic biology, but also more critically, to provide novel therapeutic avenues for genetic and developmental diseases such as ribosomopathies and cancers which can arise when this process is impaired. In recent years, significant advances in technology have made it possible to identify and characterize novel human regulators of ribosome biogenesis via high-content, high-throughput screens. Additionally, screening platforms have been used to discover novel therapeutics for cancer. These screens have uncovered a wealth of knowledge regarding novel proteins involved in human ribosome biogenesis, from the regulation of the transcription of the ribosomal RNA to global protein synthesis. Specifically, comparing the discovered proteins in these screens showed interesting connections between large ribosomal subunit (LSU) maturation factors and earlier steps in ribosome biogenesis, as well as overall nucleolar integrity. In this review, a discussion of the current standing of screens for human ribosome biogenesis factors through the lens of comparing the datasets and discussing the biological implications of the areas of overlap will be combined with a look toward other technologies and how they can be adapted to discover more factors involved in ribosome synthesis, and answer other outstanding questions in the field.
Affiliation(s)
- Cecelia M. Harold
- Department of Genetics, Yale School of Medicine, New Haven, CT, U.S.A
10
Chen Y, Cao X, Loh KH, Slavoff SA. Chemical labeling and proteomics for characterization of unannotated small and alternative open reading frame-encoded polypeptides. Biochem Soc Trans 2023; 51:1071-1082. [PMID: 37171061 PMCID: PMC10317152 DOI: 10.1042/bst20221074] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/03/2023] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 05/13/2023]
Abstract
Thousands of unannotated small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been revealed in mammalian genomes. While hundreds of mammalian smORF- and alt-ORF-encoded proteins (SEPs and alt-proteins, respectively) affect cell proliferation, the overwhelming majority of smORFs and alt-ORFs remain uncharacterized at the molecular level. Complicating the task of identifying the biological roles of smORFs and alt-ORFs, the SEPs and alt-proteins that they encode exhibit limited sequence homology to protein domains of known function. Experimental techniques for the functionalization of these gene classes are therefore required. Approaches combining chemical labeling and quantitative proteomics have greatly advanced our ability to identify and characterize functional SEPs and alt-proteins in high throughput. In this review, we briefly describe the principles of proteomic discovery of SEPs and alt-proteins, then summarize how these technologies interface with chemical labeling for identification of SEPs and alt-proteins with specific properties, as well as in defining the interactome of SEPs and alt-proteins.
Affiliation(s)
- Yanran Chen
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Ken H. Loh
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Sarah A. Slavoff
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, U.S.A
11
Leblanc S, Brunet MA, Jacques JF, Lekehal AM, Duclos A, Tremblay A, Bruggeman-Gascon A, Samandi S, Brunelle M, Cohen AA, Scott MS, Roucou X. Newfound Coding Potential of Transcripts Unveils Missing Members of Human Protein Communities. Genomics, Proteomics & Bioinformatics 2023; 21:515-534. [PMID: 36183975 PMCID: PMC10787177 DOI: 10.1016/j.gpb.2022.09.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/01/2021] [Revised: 08/10/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Recent proteogenomic approaches have led to the discovery that regions of the transcriptome previously annotated as non-coding regions [i.e., untranslated regions (UTRs), open reading frames overlapping annotated coding sequences in a different reading frame, and non-coding RNAs] frequently encode proteins, termed alternative proteins (altProts). This suggests that previously identified protein-protein interaction (PPI) networks are partially incomplete because altProts are not present in conventional protein databases. Here, we used the proteogenomic resource OpenProt and a combined spectrum- and peptide-centric analysis for the re-analysis of a high-throughput human network proteomics dataset, thereby revealing the presence of 261 altProts in the network. We found 19 genes encoding both an annotated (reference) and an alternative protein interacting with each other. Of the 117 altProts encoded by pseudogenes, 38 are direct interactors of reference proteins encoded by their respective parental genes. Finally, we experimentally validate several interactions involving altProts. These data improve the blueprints of the human PPI network and suggest functional roles for hundreds of altProts.
Affiliation(s)
- Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Jean-François Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Amina M Lekehal
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Andréa Duclos
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Alexia Tremblay
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Alexis Bruggeman-Gascon
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Sondos Samandi
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Mylène Brunelle
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
| | - Alan A Cohen
- Department of Family Medicine, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
| | - Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada.
| |
|
12
|
Inchingolo MA, Diman A, Adamczewski M, Humphreys T, Jaquier-Gubler P, Curran JA. TP53BP1, a dual-coding gene, uses promoter switching and translational reinitiation to express a smORF protein. iScience 2023; 26:106757. [PMID: 37216125 PMCID: PMC10193022 DOI: 10.1016/j.isci.2023.106757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 03/07/2023] [Accepted: 04/24/2023] [Indexed: 05/24/2023] Open
Abstract
The complexity of the metazoan proteome is significantly increased by the expression of small proteins (<100 aa) derived from smORFs within lncRNAs, uORFs, 3' UTRs, and reading frames overlapping the CDS. These smORF-encoded proteins (SEPs) have diverse roles, ranging from the regulation of cellular physiology to essential developmental functions. We report the characterization of a new member of this protein family, SEP53BP1, derived from a small internal ORF that overlaps the CDS encoding 53BP1. Its expression is coupled to the utilization of an alternative, cell-type specific promoter coupled to translational reinitiation events mediated by a uORF in the alternative 5' TL of the mRNA. This uORF-mediated reinitiation at an internal ORF is also observed in zebrafish. Interactome studies indicate that the human SEP53BP1 associates with components of the protein turnover pathway including the proteasome, and the TRiC/CCT chaperonin complex, suggesting that it may play a role in cellular proteostasis.
Affiliation(s)
- Marta A. Inchingolo
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Aurélie Diman
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Maxime Adamczewski
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Faculté de Médecine et Pharmacie, Université Grenoble Alpes, Grenoble, France
| | - Tom Humphreys
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | - Pascale Jaquier-Gubler
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Joseph A. Curran
- Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Institute of Genetics and Genomics of Geneva (iGE3), University of Geneva, Geneva, Switzerland
| |
|
13
|
Muraleedharan A, Vanderperre B. The endo-lysosomal system in Parkinson's disease: expanding the horizon. J Mol Biol 2023:168140. [PMID: 37148997 DOI: 10.1016/j.jmb.2023.168140] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/26/2023] [Revised: 04/22/2023] [Accepted: 04/27/2023] [Indexed: 05/08/2023]
Abstract
Parkinson's disease (PD) is the second most common neurodegenerative disorder after Alzheimer's disease, and its prevalence is increasing with age. A wealth of genetic evidence indicates that the endo-lysosomal system is a major pathway driving PD pathogenesis with a growing number of genes encoding endo-lysosomal proteins identified as risk factors for PD, making it a promising target for therapeutic intervention. However, detailed knowledge and understanding of the molecular mechanisms linking these genes to the disease are available for only a handful of them (e.g. LRRK2, GBA1, VPS35). Taking on the challenge of studying poorly characterized genes and proteins can be daunting, due to the limited availability of tools and knowledge from previous literature. This review aims at providing a valuable source of molecular and cellular insights into the biology of lesser-studied PD-linked endo-lysosomal genes, to help and encourage researchers in filling the knowledge gap around these less popular genetic players. Specific endo-lysosomal pathways discussed range from endocytosis, sorting, and vesicular trafficking to the regulation of membrane lipids of these membrane-bound organelles and the specific enzymatic activities they contain. We also provide perspectives on future challenges that the community needs to tackle and propose approaches to move forward in our understanding of these poorly studied endo-lysosomal genes. This will help harness their potential in designing innovative and efficient treatments to ultimately re-establish neuronal homeostasis in PD but also other diseases involving endo-lysosomal dysfunction.
Affiliation(s)
- Amitha Muraleedharan
- Centre d'Excellence en Recherche sur les Maladies Orphelines - Fondation Courtois and Biological Sciences Department, Université du Québec à Montréal
| | - Benoît Vanderperre
- Centre d'Excellence en Recherche sur les Maladies Orphelines - Fondation Courtois and Biological Sciences Department, Université du Québec à Montréal
| |
|
14
|
Manuel JM, Guilloy N, Khatir I, Roucou X, Laurent B. Re-evaluating the impact of alternative RNA splicing on proteomic diversity. Front Genet 2023; 14:1089053. [PMID: 36845399 PMCID: PMC9947481 DOI: 10.3389/fgene.2023.1089053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Alternative splicing (AS) constitutes a mechanism by which protein-coding genes and long non-coding RNA (lncRNA) genes produce more than a single mature transcript. From plants to humans, AS is a powerful process that increases transcriptome complexity. Importantly, splice variants produced from AS can potentially encode for distinct protein isoforms which can lose or gain specific domains and, hence, differ in their functional properties. Advances in proteomics have shown that the proteome is indeed diverse due to the presence of numerous protein isoforms. For the past decades, with the help of advanced high-throughput technologies, numerous alternatively spliced transcripts have been identified. However, the low detection rate of protein isoforms in proteomic studies raised debatable questions on whether AS contributes to proteomic diversity and on how many AS events are really functional. We propose here to assess and discuss the impact of AS on proteomic complexity in the light of the technological progress, updated genome annotation, and current scientific knowledge.
Affiliation(s)
- Jeru Manoj Manuel
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Inès Khatir
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada; Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada; Quebec Network for Research on Protein Function Structure and Engineering, PROTEO, Québec, QC, Canada
| | - Benoit Laurent
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada; *Correspondence: Benoit Laurent
| |
|
15
|
Cao X, Chen Y, Khitun A, Slavoff SA. BONCAT-based Profiling of Nascent Small and Alternative Open Reading Frame-encoded Proteins. Bio Protoc 2023; 13:e4585. [PMID: 36789088 PMCID: PMC9901453 DOI: 10.21769/bioprotoc.4585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 10/25/2022] [Accepted: 12/14/2022] [Indexed: 01/06/2023] Open
Abstract
RIBO-seq and proteogenomics have revealed that mammalian genomes harbor thousands of unannotated small and alternative open reading frames (smORFs, <100 amino acids, and alt-ORFs, >100 amino acids, respectively). Several dozen mammalian smORF-encoded proteins (SEPs) and alt-ORF-encoded proteins (alt-proteins) have been shown to play important biological roles, while the overwhelming majority of smORFs and alt-ORFs remain uncharacterized, particularly at the molecular level. Functional proteomics has the potential to reveal key properties of unannotated SEPs and alt-proteins in high throughput, and an approach to identify SEPs and alt-proteins undergoing regulated synthesis should be of broad utility. Here, we introduce a chemoproteomic pipeline based on bio-orthogonal non-canonical amino acid tagging (BONCAT) (Dieterich et al., 2006) to profile nascent SEPs and alt-proteins in human cells. This approach is able to identify cellular stress-induced and cell-cycle regulated SEPs and alt-proteins in cells. Graphical abstract: Schematic overview of BONCAT-based chemoproteomic profiling of nascent, unannotated small and alternative open reading frame-encoded proteins (SEPs and alt-proteins).
Affiliation(s)
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Institute of Biomolecular Design and Discovery, Yale University, West Haven, Connecticut 06516, United States
| | - Yanran Chen
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Institute of Biomolecular Design and Discovery, Yale University, West Haven, Connecticut 06516, United States
| | - Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Institute of Biomolecular Design and Discovery, Yale University, West Haven, Connecticut 06516, United States
| | - Sarah A. Slavoff
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Institute of Biomolecular Design and Discovery, Yale University, West Haven, Connecticut 06516, United States; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, United States; *For correspondence:
| |
|
16
|
Álvarez-Urdiola R, Borràs E, Valverde F, Matus JT, Sabidó E, Riechmann JL. Peptidomics Methods Applied to the Study of Flower Development. Methods Mol Biol 2023; 2686:509-536. [PMID: 37540375 DOI: 10.1007/978-1-0716-3299-4_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023]
Abstract
Understanding the global and dynamic nature of plant developmental processes requires not only the study of the transcriptome, but also of the proteome, including its largely uncharacterized peptidome fraction. Recent advances in proteomics and high-throughput analyses of translating RNAs (ribosome profiling) have begun to address this issue, evidencing the existence of novel, uncharacterized, and possibly functional peptides. To validate the accumulation in tissues of sORF-encoded polypeptides (SEPs), the basic setup of proteomic analyses (i.e., LC-MS/MS) can be followed. However, the detection of peptides that are small (up to ~100 aa, 6-7 kDa) and novel (i.e., not annotated in reference databases) presents specific challenges that need to be addressed both experimentally and with computational biology resources. Several methods have been developed in recent years to isolate and identify peptides from plant tissues. In this chapter, we outline two different peptide extraction protocols and the subsequent peptide identification by mass spectrometry using the database search or the de novo identification methods.
Affiliation(s)
- Raquel Álvarez-Urdiola
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Eva Borràs
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Federico Valverde
- Institute for Plant Biochemistry and Photosynthesis CSIC - University of Seville, Seville, Spain
| | - José Tomás Matus
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
- Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, Valencia, Spain
| | - Eduard Sabidó
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - José Luis Riechmann
- Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Edifici CRAG, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain.
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
| |
|
17
|
Vasu K, Khan D, Ramachandiran I, Blankenberg D, Fox P. Analysis of nested alternate open reading frames and their encoded proteins. NAR Genom Bioinform 2022; 4:lqac076. [PMID: 36267124 PMCID: PMC9580016 DOI: 10.1093/nargab/lqac076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/03/2022] [Revised: 08/14/2022] [Accepted: 09/27/2022] [Indexed: 11/22/2022] Open
Abstract
Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number, while maintaining a sequence relationship between original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, ‘alt-proteins’ lack sequence homology with host ORF-derived proteins. We show global amino acid frequencies, and consequent biochemical characteristics of Alt-ORFs nested within host ORFs (nAlt-ORFs), are genetically-driven, and predicted by summation of frequencies of hundreds of encompassing host codon-pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous charged amino acid levels. Also, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus 3, predicted mitochondrial and nuclear localization, and elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploration of these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
Affiliation(s)
- Kommireddy Vasu
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Debjit Khan
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Iyappan Ramachandiran
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Daniel Blankenberg
- Correspondence may also be addressed to Daniel Blankenberg. Tel: +1 216 444 4336;
| | - Paul L Fox
- To whom correspondence should be addressed. Tel: +1 216 444 8053; Fax: +1 216 444 9404;
| |
|
18
|
Brunet MA, Leblanc S, Roucou X. OpenVar: functional annotation of variants in non-canonical open reading frames. Cell Biosci 2022; 12:130. [PMID: 35965322 PMCID: PMC9375913 DOI: 10.1186/s13578-022-00871-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 08/03/2022] [Indexed: 11/12/2022] Open
Abstract
Background: Recent technological advances have revealed thousands of functional open reading frames (ORF) that have eluded reference genome annotations. These overlooked ORFs are found throughout the genome, in any reading frame of transcripts, mature or non-coding, and can overlap annotated ORFs in a different reading frame. The exploration of these novel ORFs in genomic datasets and of their role in genetic traits is hindered by a lack of software. Results: Here, we present OpenVar, a genomic variant annotator that mends that gap and fosters meaningful discoveries. To illustrate the potential of OpenVar, we analysed all variants within SynMicDB, a database of cancer-associated synonymous mutations. By including non-canonical ORFs in the analysis, OpenVar yields a 33.6-fold, 13.8-fold and 8.3-fold increase in high impact variants over Annovar, SnpEff and VEP respectively. We highlighted an overlapping non-canonical ORF in the HEY2 gene where variants significantly clustered. Conclusions: OpenVar integrates non-canonical ORFs in the analysis of genomic variants, unveiling new research avenues to better understand the genotype–phenotype relationships.
|
19
|
Na Z, Dai X, Zheng SJ, Bryant CJ, Loh KH, Su H, Luo Y, Buhagiar AF, Cao X, Baserga SJ, Chen S, Slavoff SA. Mapping subcellular localizations of unannotated microproteins and alternative proteins with MicroID. Mol Cell 2022; 82:2900-2911.e7. [PMID: 35905735 PMCID: PMC9662605 DOI: 10.1016/j.molcel.2022.06.035] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 10/07/2021] [Revised: 04/08/2022] [Accepted: 06/29/2022] [Indexed: 11/15/2022]
Abstract
Proteogenomic identification of translated small open reading frames has revealed thousands of previously unannotated, largely uncharacterized microproteins, or polypeptides of less than 100 amino acids, and alternative proteins (alt-proteins) that are co-encoded with canonical proteins and are often larger. The subcellular localizations of microproteins and alt-proteins are generally unknown but can have significant implications for their functions. Proximity biotinylation is an attractive approach to define the protein composition of subcellular compartments in cells and in animals. Here, we developed a high-throughput technology to map unannotated microproteins and alt-proteins to subcellular localizations by proximity biotinylation with TurboID (MicroID). More than 150 microproteins and alt-proteins are associated with subnuclear organelles. One alt-protein, alt-LAMA3, localizes to the nucleolus and functions in pre-rRNA transcription. We applied MicroID in a mouse model, validating expression of a conserved nuclear microprotein, and establishing MicroID for discovery of microproteins and alt-proteins in vivo.
Affiliation(s)
- Zhenkun Na
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Xiaoyun Dai
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
| | - Shu-Jian Zheng
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Carson J Bryant
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA
| | - Ken H Loh
- Laboratory of Molecular Genetics, Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA
| | - Haomiao Su
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Yang Luo
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Amber F Buhagiar
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA
| | - Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Susan J Baserga
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Therapeutic Radiology, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Sidi Chen
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA.
| |
|
20
|
Cao X, Khitun A, Harold CM, Bryant CJ, Zheng SJ, Baserga SJ, Slavoff SA. Nascent alt-protein chemoproteomics reveals a pre-60S assembly checkpoint inhibitor. Nat Chem Biol 2022; 18:643-651. [PMID: 35393574 PMCID: PMC9423127 DOI: 10.1038/s41589-022-01003-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/03/2021] [Accepted: 02/25/2022] [Indexed: 12/29/2022]
Abstract
Many unannotated microproteins and alternative proteins (alt-proteins) are coencoded with canonical proteins, but few of their functions are known. Motivated by the hypothesis that alt-proteins undergoing regulated synthesis could play important cellular roles, we developed a chemoproteomic pipeline to identify nascent alt-proteins in human cells. We identified 22 actively translated alt-proteins or N-terminal extensions, one of which is post-transcriptionally upregulated by DNA damage stress. We further defined a nucleolar, cell-cycle-regulated alt-protein that negatively regulates assembly of the pre-60S ribosomal subunit (MINAS-60). Depletion of MINAS-60 increases the amount of cytoplasmic 60S ribosomal subunit, upregulating global protein synthesis and cell proliferation. Mechanistically, MINAS-60 represses the rate of late-stage pre-60S assembly and export to the cytoplasm. Together, these results implicate MINAS-60 as a potential checkpoint inhibitor of pre-60S assembly and demonstrate that chemoproteomics enables hypothesis generation for uncharacterized alt-proteins.
Affiliation(s)
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
| | - Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
| | - Cecelia M Harold
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
| | - Carson J Bryant
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Shu-Jian Zheng
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
| | - Susan J Baserga
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Therapeutic Radiology, Yale University School of Medicine, New Haven, CT, USA
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| |
|
21
|
The dark proteome: translation from noncanonical open reading frames. Trends Cell Biol 2022; 32:243-258. [PMID: 34844857 PMCID: PMC8934435 DOI: 10.1016/j.tcb.2021.10.010] [Citation(s) in RCA: 96] [Impact Index Per Article: 32.0] [Received: 09/04/2021] [Revised: 10/26/2021] [Accepted: 10/29/2021] [Indexed: 02/07/2023]
Abstract
Omics-based technologies have revolutionized our understanding of the coding potential of the genome. In particular, these studies revealed widespread unannotated open reading frames (ORFs) throughout genomes and that these regions have the potential to encode novel functional (micro-)proteins and/or hold regulatory roles. However, despite their genomic prevalence, relatively few of these noncanonical ORFs have been functionally characterized, likely in part due to their under-recognition by the broader scientific community. The few that have been investigated in detail have demonstrated their essentiality in critical and divergent biological processes. As such, here we aim to discuss recent advances in understanding the diversity of noncanonical ORFs and their roles, as well as detail biologically important examples within the context of the mammalian genome.
|
22
|
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
|
23
|
Zhang K, Zhang J, Ding N, Zellmer L, Zhao Y, Liu S, Liao DJ. ACTB and GAPDH appear at multiple SDS-PAGE positions, thus not suitable as reference genes for determining protein loading in techniques like Western blotting. Open Life Sci 2021; 16:1278-1292. [PMID: 34966852 PMCID: PMC8669867 DOI: 10.1515/biol-2021-0130] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/02/2021] [Revised: 10/21/2021] [Accepted: 11/01/2021] [Indexed: 11/19/2022] Open
Abstract
We performed polyacrylamide gel electrophoresis of human proteins with sodium dodecyl sulfate, isolated proteins at multiple positions, and then used liquid chromatography and tandem mass spectrometry (LC-MS/MS) to determine the protein identities. Although beta-actin (ACTB) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) are 41.7 and 36 kDa proteins, respectively, LC-MS/MS identified their peptides at all the positions studied. The National Center for Biotechnology Information (USA) database lists only one ACTB mRNA but five GAPDH mRNAs and one noncoding RNA. The five GAPDH mRNAs encode three protein isoforms, while our bioinformatics analysis identified a 17.6 kDa isoform encoded by the noncoding RNA. All LC-MS/MS-identified GAPDH peptides at all positions studied are unique, but some of the identified ACTB peptides are shared by ACTC1, ACTBL2, POTEF, POTEE, POTEI, and POTEJ. ACTC1 and ACTBL2 belong to the ACT family with significant similarities to ACTB in protein sequence, whereas the four POTEs are ACTB-containing chimeric genes with the C-terminus of their proteins highly similar to ACTB. These data lead us to conclude that GAPDH and ACTB are poor reference genes for determining protein loading in techniques such as Western blotting, a leading role these two genes have played for decades in biomedical research.
Affiliation(s)
- Keyin Zhang
- Department of Pathology, School of Clinical Medicine, Guizhou Medical University, Guiyang 550004, Guizhou Province, People’s Republic of China
| | - Ju Zhang
- Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, People’s Republic of China
| | - Nan Ding
- Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, People’s Republic of China
| | - Lucas Zellmer
- Department of Medicine, Hennepin County Medical Center, 730 South 8th St., Minneapolis, MN 55415, United States of America
| | - Yan Zhao
- Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang 550004, Guizhou Province, People’s Republic of China
| | - Siqi Liu
- Beijing Genomic Institute, Building 11 of Beishan Industrial Zone, Tantian District , Shengzhen 518083 , Guangdong Province , People’s Republic of China
| | - Dezhong Joshua Liao
- Department of Pathology, School of Clinical Medicine, Guizhou Medical University , Guiyang 550004 , Guizhou Province , People’s Republic of China
- Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University , Guiyang 550004 , Guizhou Province , People’s Republic of China
- Department of Clinical Biochemistry, Guizhou Medical University Hospital , Guiyang 550004 , Guizhou Province , People’s Republic of China
| |
|
24
|
Abstract
Recent human activity has profoundly transformed Earth biomes on a scale and at rates that are unprecedented. Given the central role of symbioses in ecosystem processes, functions, and services throughout the Earth biosphere, the impacts of human-driven change on symbioses are critical to understand. Symbioses are not merely collections of organisms, but co-evolved partners that arise from the synergistic combination and action of different genetic programs. They function with varying degrees of permanence and selection as emergent units with substantial potential for combinatorial and evolutionary innovation in both structure and function. Following an articulation of operational definitions of symbiosis and related concepts and characteristics of the Anthropocene, we outline a basic typology of anthropogenic change (AC) and a conceptual framework for how AC might mechanistically impact symbioses with select case examples to highlight our perspective. We discuss surprising connections between symbiosis and the Anthropocene, suggesting ways in which new symbioses could arise due to AC, how symbioses could be agents of ecosystem change, and how symbioses, broadly defined, of humans and "farmed" organisms may have launched the Anthropocene. We conclude with reflections on the robustness of symbioses to AC and our perspective on the importance of symbioses as ecosystem keystones and the need to tackle anthropogenic challenges as wise and humble stewards embedded within the system.
Affiliation(s)
- Erik F. Y. Hom
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, MS 38677 USA
| | - Alexandra S. Penn
- Department of Sociology and Centre for Evaluation of Complexity Across the Nexus, University of Surrey, Guildford, Surrey, GU2 7XH UK
| |
|
25
|
Brunet MA, Lekehal AM, Roucou X. How to Illuminate the Dark Proteome Using the Multi-omic OpenProt Resource. Curr Protoc Bioinformatics 2021; 71:e103. [PMID: 32780568 DOI: 10.1002/cpbi.103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Tens of thousands of open reading frames (ORFs) are hidden within genomes. These alternative ORFs, or small ORFs, have eluded annotations because they are either small or within unsuspected locations. They are found in untranslated regions or overlap a known coding sequence in messenger RNA and anywhere in a "non-coding" RNA. Serendipitous discoveries have highlighted these ORFs' importance in biological functions and pathways. With their discovery came the need for deeper ORF annotation and large-scale mining of public repositories to gather supporting experimental evidence. OpenProt, accessible at https://openprot.org/, is the first proteogenomic resource enforcing a polycistronic model of annotation across an exhaustive transcriptome for 10 species. Moreover, OpenProt reports experimental evidence cumulated across a re-analysis of 114 mass spectrometry and 87 ribosome profiling datasets. The multi-omics OpenProt resource also includes the identification of predicted functional domains and evaluation of conservation for all predicted ORFs. The OpenProt web server provides two query interfaces and one genome browser. The query interfaces allow for exploration of the coding potential of genes or transcripts of interest as well as custom downloads of all information contained in OpenProt. © 2020 The Authors. Basic Protocol 1: Using the Search interface. Basic Protocol 2: Using the Downloads interface.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
| | - Amina M Lekehal
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
| |
|
26
|
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022] Open
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
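To make the idea of an alternative ORF in a shifted reading frame concrete, here is a minimal Python sketch that scans the three forward frames of a DNA string for ATG-initiated ORFs. It is only an illustration: the function name `find_orfs`, the `min_codons` threshold, and the toy sequence are assumptions made here, and the scan is far simpler than the statistical detection methods the review discusses.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for ATG-initiated ORFs on the forward strand.

    `start` is the 0-based index of the A in ATG and `end` is the index just
    past the stop codon. `min_codons` counts codons before the stop.
    (Hypothetical helper for illustration only.)
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # close the ORF and keep scanning
    return orfs


# Toy sequence (not a real gene): a frame-0 ORF with a shorter ORF
# nested inside it in frame 2.
toy = "ATGAAATGGCCTGACTAA"
for frame, start, end in find_orfs(toy):
    print(frame, start, end, toy[start:end])
```

On the toy sequence the frame-2 ORF lies entirely within the frame-0 ORF, so any substitution in the shared region can alter both encoded peptides, which is the constraint the abstract calls adaptive conflict.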
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
| |
|
27
|
Ruiz Cuevas MV, Hardy MP, Hollý J, Bonneil É, Durette C, Courcelles M, Lanoix J, Côté C, Staudt LM, Lemieux S, Thibault P, Perreault C, Yewdell JW. Most non-canonical proteins uniquely populate the proteome or immunopeptidome. Cell Rep 2021; 34:108815. [PMID: 33691108 PMCID: PMC8040094 DOI: 10.1016/j.celrep.2021.108815] [Citation(s) in RCA: 126] [Impact Index Per Article: 31.5] [Received: 08/03/2020] [Revised: 01/29/2021] [Accepted: 02/10/2021] [Indexed: 12/16/2022] Open
Abstract
Combining RNA sequencing, ribosome profiling, and mass spectrometry, we elucidate the contribution of non-canonical translation to the proteome and major histocompatibility complex (MHC) class I immunopeptidome. Remarkably, of 14,498 proteins identified in three human B cell lymphomas, 2,503 are non-canonical proteins. Of these, 28% are novel isoforms and 72% are cryptic proteins encoded by ostensibly non-coding regions (60%) or frameshifted canonical genes (12%). Cryptic proteins are translated as efficiently as canonical proteins, have more predicted disordered residues and lower stability, and critically generate MHC-I peptides 5-fold more efficiently per translation event. Translating 5' "untranslated" regions hinders downstream translation of genes involved in transcription, translation, and antiviral responses. Novel protein isoforms show strong enrichment for signaling pathways deregulated in cancer. Only a small fraction of cryptic proteins detected in the proteome contribute to the MHC-I immunopeptidome, demonstrating the high preferential access of cryptic defective ribosomal products to the class I pathway.
Affiliation(s)
- Maria Virginia Ruiz Cuevas
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Biochemistry and Molecular Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Marie-Pierre Hardy
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Jaroslav Hollý
- Cellular Biology Section, Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
| | - Éric Bonneil
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Chantal Durette
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Mathieu Courcelles
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Joël Lanoix
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Caroline Côté
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Louis M Staudt
- Lymphoid Malignancies Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Sébastien Lemieux
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Biochemistry and Molecular Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Pierre Thibault
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Chemistry, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Claude Perreault
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada.
| | - Jonathan W Yewdell
- Cellular Biology Section, Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA.
| |
|
28
|
Cohen AA, Leblanc S, Roucou X. Robust Physiological Metrics From Sparsely Sampled Networks. Front Physiol 2021; 12:624097. [PMID: 33643068 PMCID: PMC7902772 DOI: 10.3389/fphys.2021.624097] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/30/2020] [Accepted: 01/12/2021] [Indexed: 12/14/2022] Open
Abstract
Physiological and biochemical networks are highly complex, involving thousands of nodes as well as a hierarchical structure. True network structure is also rarely known. This presents major challenges for applying classical network theory to these networks. However, complex systems generally share the property of having a diffuse or distributed signal. Accordingly, we should predict that system state can be robustly estimated with sparse sampling, and with limited knowledge of true network structure. In this review, we summarize recent findings from several methodologies to estimate system state via a limited sample of biomarkers, notably Mahalanobis distance, principal components analysis, and cluster analysis. While statistically simple, these methods allow novel characterizations of system state when applied judiciously. Broadly, system state can often be estimated even from random samples of biomarkers. Furthermore, appropriate methods can detect emergent underlying physiological structure from this sparse data. We propose that approaches such as these are a powerful tool to understand physiology, and could lead to a new understanding and mapping of the functional implications of biological variation.
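As a rough illustration of one method named in this abstract, the sketch below computes the Mahalanobis distance of a single two-biomarker observation from a reference sample, using only the Python standard library. The function name `mahalanobis_2d`, the restriction to two biomarkers, and the made-up reference values are assumptions for illustration and are not taken from the reviewed work.

```python
from statistics import mean


def mahalanobis_2d(point, reference):
    """Mahalanobis distance of a 2-biomarker `point` from a `reference` sample.

    `reference` is a list of (x, y) pairs. The covariance matrix is estimated
    from it with the usual n - 1 denominator. (Hypothetical helper, 2D only.)
    """
    xs = [r[0] for r in reference]
    ys = [r[1] for r in reference]
    mx, my = mean(xs), mean(ys)
    n = len(reference)
    # Entries of the 2x2 sample covariance matrix S.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^{-1} [dx dy]^T, using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


# Made-up reference population and one new observation.
reference = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(mahalanobis_2d((1, 1), reference))  # the centroid has distance 0
print(mahalanobis_2d((3, 1), reference))
```

The distance rescales each deviation by the variability and correlation seen in the reference sample, so a larger value marks an observation that is more unusual relative to that population.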
Affiliation(s)
- Alan A. Cohen
- Groupe de Recherche PRIMUS, Département de Médecine de Famille et de Médecine d’Urgence, Université de Sherbrooke, Sherbrooke, QC, Canada
- Centre de Recherche, Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada
- Research Center on Aging, CIUSSS-de-l’Estrie-CHUS, Sherbrooke, QC, Canada
| | - Sebastien Leblanc
- Département de Biochimie et de Génomique Fonctionnelle, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Département de Biochimie et de Génomique Fonctionnelle, Université de Sherbrooke, Sherbrooke, QC, Canada
| |
|
29
|
Neville MDC, Kohze R, Erady C, Meena N, Hayden M, Cooper DN, Mort M, Prabakaran S. A platform for curated products from novel open reading frames prompts reinterpretation of disease variants. Genome Res 2021; 31:327-336. [PMID: 33468550 PMCID: PMC7849405 DOI: 10.1101/gr.263202.120] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/05/2020] [Accepted: 08/26/2020] [Indexed: 11/29/2022]
Abstract
Recent evidence from proteomics and deep massively parallel sequencing studies has revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in diverse regions of the genome, including in long noncoding RNAs, pseudogenes, 3' UTRs, 5' UTRs, and alternative reading frames of canonical protein coding exons. There is therefore a pressing need to evaluate the potential functional importance of these unannotated transcripts and proteins in biological pathways and human disease on a larger scale, rather than one at a time. In this study, we outline the creation of a valuable nORFs data set with experimental evidence of translation for the community, use measures of heritability and selection that reveal signals for functional importance, and show the potential implications for functional interpretation of genetic variants in nORFs. Our results indicate that some variants that were previously classified as being benign or of uncertain significance may have to be reinterpreted.
Affiliation(s)
- Matthew D C Neville
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Robin Kohze
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Chaitanya Erady
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Narendra Meena
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
| | - Matthew Hayden
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
| | - Sudhakaran Prabakaran
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
- St Edmund's College, University of Cambridge, Cambridge CB3 0BN, United Kingdom
| |
|
30
|
Gagnon M, Savard M, Jacques JF, Bkaily G, Geha S, Roucou X, Gobeil F. Potentiation of B2 receptor signaling by AltB2R, a newly identified alternative protein encoded in the human bradykinin B2 receptor gene. J Biol Chem 2021; 296:100329. [PMID: 33497625 PMCID: PMC7949122 DOI: 10.1016/j.jbc.2021.100329] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/19/2020] [Revised: 01/12/2021] [Accepted: 01/21/2021] [Indexed: 12/27/2022] Open
Abstract
Recent functional and proteomic studies in eukaryotes (www.openprot.org) predict the translation of alternative open reading frames (AltORFs) in mature G-protein-coupled receptor (GPCR) mRNAs, including that of bradykinin B2 receptor (B2R). Our main objective was to determine the implication of a newly discovered AltORF resulting protein, termed AltB2R, in the known signaling properties of B2R using complementary methodological approaches. When ectopically expressed in HeLa cells, AltB2R presented predominant punctate cytoplasmic/perinuclear distribution and apparent cointeraction with B2R at plasma and endosomal/vesicular membranes. The presence of AltB2R increases intracellular [Ca2+] and ERK1/2-MAPK activation (via phosphorylation) following B2R stimulation. Moreover, HEK293A cells expressing mutant B2R lacking concomitant expression of AltB2R displayed significantly decreased maximal responses in agonist-stimulated Gαq-Gαi2/3-protein coupling, IP3 generation, and ERK1/2-MAPK activation as compared with wild-type controls. Conversely, there was no difference in cell-surface density as well as ligand-binding properties of B2R and in efficiencies of cognate agonists at promoting B2R internalization and β-arrestin 2 recruitment. Importantly, both AltB2R and B2R proteins were overexpressed in prostate and breast cancers, compared with their normal counterparts suggesting new associative roles of AltB2R in these diseases. Our study shows that BDKRB2 is a dual-coding gene and identifies AltB2R as a novel positive modulator of some B2R signaling pathways. More broadly, it also supports a new, unexpected alternative proteome for GPCRs, which opens new frontiers in fields of GPCR biology, diseases, and drug discovery.
Affiliation(s)
- Maxime Gagnon
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; Institute of Pharmacology, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Martin Savard
- Department of Pharmacology & Physiology, Université de Sherbrooke, Sherbrooke, Québec, Canada; Institute of Pharmacology, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Jean-François Jacques
- Department of Pharmacology & Physiology, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Ghassan Bkaily
- Department of Immunology & Cellular Biology, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Sameh Geha
- Department of Pathology, Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, Québec, Canada
| | - Xavier Roucou
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; Institute of Pharmacology, Université de Sherbrooke, Sherbrooke, Québec, Canada.
| | - Fernand Gobeil
- Department of Pharmacology & Physiology, Université de Sherbrooke, Sherbrooke, Québec, Canada; Institute of Pharmacology, Université de Sherbrooke, Sherbrooke, Québec, Canada.
| |
|
31
|
Alt-RPL36 downregulates the PI3K-AKT-mTOR signaling pathway by interacting with TMEM24. Nat Commun 2021; 12:508. [PMID: 33479206 PMCID: PMC7820019 DOI: 10.1038/s41467-020-20841-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/17/2020] [Accepted: 12/21/2020] [Indexed: 12/11/2022] Open
Abstract
Thousands of human small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been annotated. Many alt-ORFs are co-encoded with canonical proteins in multicistronic configurations, but few of their functions are known. Here, we report the detection of alt-RPL36, a protein co-encoded with human RPL36. Alt-RPL36 partially localizes to the endoplasmic reticulum, where it interacts with TMEM24, which transports the phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2) precursor phosphatidylinositol from the endoplasmic reticulum to the plasma membrane. Knock-out of alt-RPL36 increases plasma membrane PI(4,5)P2 levels, upregulates PI3K-AKT-mTOR signaling, and increases cell size. Alt-RPL36 contains four phosphoserine residues, point mutations of which abolish interaction with TMEM24 and, consequently, alt-RPL36 effects on PI3K signaling and cell size. These results implicate alt-RPL36 as an upstream regulator of PI3K-AKT-mTOR signaling. More broadly, the RPL36 transcript encodes two sequence-independent polypeptides that co-regulate translation via different molecular mechanisms, expanding our knowledge of multicistronic human gene functions. Many alternative ORFs are co-encoded with characterized proteins, but their function is often not understood. Here, the authors discover that ribosomal protein L36 is co-encoded with an alternative protein, which they identify as an upstream regulator of PI3K-AKT-mTOR signaling.
|
32
|
Brunet MA, Lucier JF, Levesque M, Leblanc S, Jacques JF, Al-Saedi HRH, Guilloy N, Grenier F, Avino M, Fournier I, Salzet M, Ouangraoua A, Scott M, Boisvert FM, Roucou X. OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes. Nucleic Acids Res 2021; 49:D380-D388. [PMID: 33179748 PMCID: PMC7779043 DOI: 10.1093/nar/gkaa1036] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 09/14/2020] [Revised: 10/15/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022] Open
Abstract
OpenProt (www.openprot.org) is the first proteogenomic resource supporting a polycistronic annotation model for eukaryotic genomes. It provides a deeper annotation of open reading frames (ORFs) while mining experimental data for supporting evidence using cutting-edge algorithms. This update presents the major improvements since the initial release of OpenProt. All species support recent NCBI RefSeq and Ensembl annotations, with changes in annotations being reported in OpenProt. Using the 131 ribosome profiling datasets re-analysed by OpenProt to date, non-AUG initiation starts are reported alongside a confidence score of the initiating codon. From the 177 mass spectrometry datasets re-analysed by OpenProt to date, the unicity of the detected peptides is controlled at each implementation. Furthermore, to guide the users, detectability statistics and protein relationships (isoforms) are now reported for each protein. Finally, to foster access to deeper ORF annotation independently of one's bioinformatics skills or computational resources, OpenProt now offers a data analysis platform. Users can submit their dataset for analysis and receive the results from the analysis by OpenProt. All data on OpenProt are freely available and downloadable for each species, the release-based format ensuring a continuous access to the data. Thus, OpenProt enables a more comprehensive annotation of eukaryotic genomes and fosters functional proteomic discoveries.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Jean-Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Hassan R H Al-Saedi
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Frederic Grenier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Mariano Avino
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Michel Salzet
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Aïda Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - François-Michel Boisvert
- Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| |
|
33
|
Brunet MA, Jacques J, Nassari S, Tyzack GE, McGoldrick P, Zinman L, Jean S, Robertson J, Patani R, Roucou X. The FUS gene is dual-coding with both proteins contributing to FUS-mediated toxicity. EMBO Rep 2021; 22:e50640. [PMID: 33226175 PMCID: PMC7788448 DOI: 10.15252/embr.202050640] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 04/14/2020] [Revised: 10/08/2020] [Accepted: 10/13/2020] [Indexed: 12/12/2022] Open
Abstract
Novel functional coding sequences (altORFs) are camouflaged within annotated ones (CDS) in a different reading frame. We show here that an altORF is nested in the FUS CDS, encoding a conserved 170 amino acid protein, altFUS. AltFUS is endogenously expressed in human tissues, notably in the motor cortex and motor neurons. Over-expression of wild-type FUS and/or amyotrophic lateral sclerosis-linked FUS mutants is known to trigger toxic mechanisms in different models. These include inhibition of autophagy, loss of mitochondrial potential and accumulation of cytoplasmic aggregates. We find that altFUS, not FUS, is responsible for the inhibition of autophagy, and pivotal in mitochondrial potential loss and accumulation of cytoplasmic aggregates. Suppression of altFUS expression in a Drosophila model of FUS-related toxicity protects against neurodegeneration. Some mutations found in ALS patients are overlooked because of their synonymous effect on the FUS protein. Yet, we show they exert a deleterious effect causing missense mutations in the overlapping altFUS protein. These findings demonstrate that FUS is a bicistronic gene and suggest that both proteins, FUS and altFUS, cooperate in toxic mechanisms.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, QC, Canada
| | - Jean‐Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, QC, Canada
| | - Sonya Nassari
- Immunology and Cell Biology Department, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Giulia E Tyzack
- The Francis Crick Institute, London, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Philip McGoldrick
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, Canada
| | - Lorne Zinman
- Division of Neurology, Department of Medicine, Sunnybrook Health Sciences Centre, University of Toronto, Toronto, ON, Canada
| | - Steve Jean
- Immunology and Cell Biology Department, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Janice Robertson
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, Canada
| | - Rickie Patani
- The Francis Crick Institute, London, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, QC, Canada
| |
|
34
|
Witkowski JM, Bryl E, Fulop T. Proteodynamics and aging of eukaryotic cells. Mech Ageing Dev 2021; 194:111430. [PMID: 33421431 DOI: 10.1016/j.mad.2021.111430] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/29/2020] [Revised: 12/28/2020] [Accepted: 12/30/2020] [Indexed: 12/11/2022]
Abstract
All aspects of each protein's existence in eukaryotic cells, starting from the pre-translation events, through translation, multiple different post-translational modifications, functional life and eventual proteostatic removal after loss of functionality and changes in physico-chemical properties, can be collectively called proteodynamics. With aging, the passing of time as well as the accumulating effects of exposures, interactions and wearing-off lead to problems at each of the above-mentioned stages, eventually leading to general malfunction of the proteome. This work briefly reviews and summarizes current knowledge concerning this important topic.
Affiliation(s)
- Jacek M Witkowski
- Department of Pathophysiology, Medical University of Gdańsk, Gdańsk, Poland.
| | - Ewa Bryl
- Department of Pathology and Experimental Rheumatology, Medical University of Gdańsk, Gdańsk, Poland
| | - Tamas Fulop
- Research Center on Aging, Graduate Program in Immunology, Faculty of Medicine and Health Sciences, University of Sherbrooke, Sherbrooke, Quebec, Canada
| |
|
35
|
Dvorak P, Hlavac V, Soucek P. 5' Untranslated Region Elements Show High Abundance and Great Variability in Homologous ABCA Subfamily Genes. Int J Mol Sci 2020; 21:8878. [PMID: 33238634 PMCID: PMC7700387 DOI: 10.3390/ijms21228878] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/07/2020] [Revised: 11/16/2020] [Accepted: 11/20/2020] [Indexed: 11/16/2022] Open
Abstract
The 12 members of the ABCA subfamily in humans are known for their ability to transport cholesterol and its derivatives, vitamins, and xenobiotics across biomembranes. Several ABCA genes are causatively linked to inborn diseases, and the role in cancer progression and metastasis is studied intensively. The regulation of translation initiation is implicated as the major mechanism in the processes of post-transcriptional modifications determining final protein levels. In the current bioinformatics study, we mapped the features of the 5' untranslated regions (5'UTR) known to have the potential to regulate translation, such as the length of 5'UTRs, upstream ATG codons, upstream open-reading frames, introns, RNA G-quadruplex-forming sequences, stem loops, and Kozak consensus motifs, in the DNA sequences of all members of the subfamily. Subsequently, the conservation of the features, correlations among them, ribosome profiling data as well as protein levels in normal human tissues were examined. The 5'UTRs of ABCA genes contain above-average numbers of upstream ATGs, open-reading frames and introns, as well as conserved ones, and these elements probably play important biological roles in this subfamily, unlike RG4s. Although we found significant correlations among the features, we did not find any correlation between the numbers of 5'UTR features and protein tissue distribution and expression scores. We showed the existence of single nucleotide variants in relation to the 5'UTR features experimentally in a cohort of 105 breast cancer patients. 5'UTR features presumably prepare a complex playground, in which the other elements such as RNA binding proteins and non-coding RNAs play the major role in the fine-tuning of protein expression.
Affiliation(s)
- Pavel Dvorak
- Department of Biology, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
- Correspondence: ; Tel.: +420-377593263
| | - Viktor Hlavac
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
- Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
| | - Pavel Soucek
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
- Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
| |
|
36
|
Xie C, Bekpen C, Künzel S, Keshavarz M, Krebs-Wheaton R, Skrabar N, Ullrich KK, Zhang W, Tautz D. Dedicated transcriptomics combined with power analysis lead to functional understanding of genes with weak phenotypic changes in knockout lines. PLoS Comput Biol 2020; 16:e1008354. [PMID: 33180766 PMCID: PMC7685438 DOI: 10.1371/journal.pcbi.1008354] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/20/2020] [Revised: 11/24/2020] [Accepted: 09/20/2020] [Indexed: 12/26/2022] Open
Abstract
Systematic knockout studies in mice have shown that a large fraction of the gene replacements show no lethal or other overt phenotypes. This has led to the development of more refined analysis schemes, including physiological, behavioral, developmental and cytological tests. However, transcriptomic analyses have not yet been systematically evaluated for non-lethal knockouts. We conducted a power analysis to determine the experimental conditions under which even small changes in transcript levels can be reliably traced. We have applied this to two gene disruption lines of genes for which no function was known so far. Dedicated phenotyping tests informed by the tissues and stages of highest expression of the two genes show small effects on the tested phenotypes. For the transcriptome analysis of these stages and tissues, we used a prior power analysis to determine the number of biological replicates and the sequencing depth. We find that under these conditions, the knockouts have a significant impact on the transcriptional networks, with thousands of genes showing small transcriptional changes. GO analysis suggests that A930004D18Rik is involved in developmental processes through contributing to protein complexes, and A830005F24Rik in extracellular matrix functions. Subsampling analysis of the data reveals that increasing the number of biological replicates was more important than increasing the sequencing depth to arrive at these results. Hence, our proof-of-principle experiment suggests that transcriptomic analysis is indeed an option to study gene functions of genes with weak or no traceable phenotypic effects and it provides the boundary conditions under which this is possible.

Knockout mice benefit the understanding of gene functions in mammals. However, it has proven difficult for many genes to identify clear phenotypes, due to a lack of sufficient assays. As Lewis Wolpert put it in a famous quote, “But did you take them to the opera?”, metaphorically alluding to the need to extend phenotyping efforts. This insight led to the establishment of phenotyping pipelines that are nowadays routinely used to characterize knockout lines. However, transcriptomic approaches based on RNA-Seq have been much less explored for such deep-level studies. Here we conducted both a theoretical power analysis and practical RNA-Seq experiments on two knockout lines with small phenotypic effects to investigate parameters including sample size, sequencing depth, fold change, and dispersion. Our dedicated RNA-Seq studies discovered thousands of genes with small transcriptional changes, enriched in specific functions, in both knockout lines. We find that it is more important to increase the number of samples than to increase the sequencing depth. Our work shows that a deep RNA-Seq study on knockouts is powerful for understanding gene functions in cases of weak phenotypic effects, and provides a guideline for the experimental design of such studies.
Affiliation(s)
- Chen Xie
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Cemalettin Bekpen
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Sven Künzel
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Maryam Keshavarz
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Rebecca Krebs-Wheaton
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Neva Skrabar
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Kristian K. Ullrich
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Wenyu Zhang
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Diethard Tautz
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| |
|
37
|
Vergara D, Verri T, Damato M, Trerotola M, Simeone P, Franck J, Fournier I, Salzet M, Maffia M. A Hidden Human Proteome Signature Characterizes the Epithelial Mesenchymal Transition Program. Curr Pharm Des 2020; 26:372-375. [PMID: 31995001 DOI: 10.2174/1381612826666200129091610] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/19/2019] [Accepted: 01/27/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Molecular changes associated with the initiation of the epithelial to mesenchymal transition (EMT) program involve alterations of large proteome-based networks. The role of protein products mapping to non-coding genomic regions is still unexplored. OBJECTIVE: The goal of this study was the identification of an alternative protein signature in breast cancer cellular models with a distinct expression of EMT markers. METHODS: We profiled MCF-7 and MDA-MB-231 cells using liquid chromatography-tandem mass spectrometry (LC-MS/MS) and interrogated the OpenProt database to identify novel predicted isoforms and novel predicted proteins from alternative open reading frames (AltProts). RESULTS: Our analysis revealed an AltProt and isoform protein signature capable of classifying the two breast cancer cell lines. Among the most highly expressed alternative proteins, we observed proteins potentially associated with inflammation, metabolism and EMT. CONCLUSION: Here, we present an AltProts signature associated with EMT. Further studies will be needed to define their role in cancer progression.
Affiliation(s)
- Daniele Vergara
- Department of Biological and Environmental Sciences and Technologies, University of Salento, Lecce, Italy
| | - Tiziano Verri
- Department of Biological and Environmental Sciences and Technologies, University of Salento, Lecce, Italy
| | - Marina Damato
- Department of Biological and Environmental Sciences and Technologies, University of Salento, Lecce, Italy
| | - Marco Trerotola
- Department of Medical, Oral and Biotechnological Sciences, "G.d'Annunzio" University of Chieti-Pescara, Italy
| | - Pasquale Simeone
- Department of Medicine and Aging Sciences, "G.d'Annunzio" University of Chieti-Pescara, Italy; Laboratory of Cytomorphology, Center for Advanced Studies and Technology (CAST), "G.d'Annunzio" University of Chieti-Pescara, Italy
| | - Julien Franck
- University of Lille, Inserm, U-1192, Laboratoire Proteomique, Reponse Inflammatoire et Spectrometrie de Masse-PRISM, F-59000, Lille, France
| | - Isabelle Fournier
- University of Lille, Inserm, U-1192, Laboratoire Proteomique, Reponse Inflammatoire et Spectrometrie de Masse-PRISM, F-59000, Lille, France
| | - Michel Salzet
- University of Lille, Inserm, U-1192, Laboratoire Proteomique, Reponse Inflammatoire et Spectrometrie de Masse-PRISM, F-59000, Lille, France
| | - Michele Maffia
- Department of Biological and Environmental Sciences and Technologies, University of Salento, Lecce, Italy
| |
|
38
|
Leblanc S, Brunet MA. Modelling of pathogen-host systems using deeper ORF annotations and transcriptomics to inform proteomics analyses. Comput Struct Biotechnol J 2020; 18:2836-2850. [PMID: 33133425 PMCID: PMC7585943 DOI: 10.1016/j.csbj.2020.10.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/29/2020] [Revised: 10/07/2020] [Accepted: 10/08/2020] [Indexed: 01/08/2023] Open
Abstract
The Zika virus is a flavivirus that can cause fulminant outbreaks and lead to Guillain-Barré syndrome, microcephaly and fetal demise. Like other flaviviruses, the Zika virus is transmitted by mosquitoes and provokes neurological disorders. Despite its risk to public health, neither an antiviral nor a vaccine is currently available. In recent years, several studies have set out to identify human host proteins interacting with Zika viral proteins to better understand its pathogenicity. Yet these studies used standard human protein sequence databases. Such databases rely on genome annotations, which enforce a minimal open reading frame (ORF) length criterion. An ever-increasing number of studies have demonstrated the shortcomings of such annotation, which overlooks thousands of functional ORFs. Here we show that the use of a customized database including currently non-annotated proteins led to the identification of 4 alternative proteins as interactors of the viral capsid and NS4A proteins. Furthermore, 12 alternative proteins were identified in the proteome profiling of Zika-infected monocytes, one of which was significantly up-regulated. This study presents a computational framework for the re-analysis of proteomics datasets to better investigate the viral-host protein interplay upon infection with the Zika virus.
Key Words
- AP-MS, affinity-purification mass spectrometry
- Alternative ORFs
- DEP, differentially expressed proteins
- FDR, false discovery rate
- FPKM, fragments per kilobase of exon model per million reads mapped
- Flavivirus
- HCIP, highly confident interacting proteins
- HCMV, human cytomegalovirus
- LFQ, label free quantification
- MS, mass spectrometry
- ORF, open reading frame
- PSM, peptide spectrum match
- Protein network
- Proteogenomics
- Proteome profiling
- ZIKV, Zika virus
- Zika
- altProt, alternative protein
- ncRNA, non-coding RNA
- sORF, small open reading frame
Affiliation(s)
- Sebastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
| | - Marie A. Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
| |
|
39
|
Wu Q, Wright M, Gogol MM, Bradford WD, Zhang N, Bazzini AA. Translation of small downstream ORFs enhances translation of canonical main open reading frames. EMBO J 2020; 39:e104763. [PMID: 32744758 PMCID: PMC7459409 DOI: 10.15252/embj.2020104763] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 02/19/2020] [Revised: 06/23/2020] [Accepted: 06/26/2020] [Indexed: 12/26/2022] Open
Abstract
In addition to canonical open reading frames (ORFs), thousands of translated small ORFs (containing less than 100 codons) have been identified in untranslated mRNA regions (UTRs) across eukaryotes. Small ORFs in 5′ UTRs (upstream (u)ORFs) often repress translation of the canonical ORF within the same mRNA. However, the function of translated small ORFs in the 3′ UTRs (downstream (d)ORFs) is unknown. Contrary to uORFs, we find that translation of dORFs enhances translation of their corresponding canonical ORFs. This translation stimulatory effect of dORFs depends on the number of dORFs, but not the length or peptide they encode. We propose that dORFs represent a new, strong, and universal translation regulatory mechanism in vertebrates.
Affiliation(s)
- Qiushuang Wu
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Matthew Wright
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | | | | | - Ning Zhang
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Ariel A Bazzini
- Stowers Institute for Medical Research, Kansas City, MO, USA; Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS, USA
| |
|
40
|
Translation initiation downstream from annotated start codons in human mRNAs coevolves with the Kozak context. Genome Res 2020; 30:974-984. [PMID: 32669370 PMCID: PMC7397870 DOI: 10.1101/gr.257352.119] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/18/2019] [Accepted: 06/25/2020] [Indexed: 12/13/2022]
Abstract
Eukaryotic translation initiation involves preinitiation ribosomal complex 5′-to-3′ directional probing of mRNA for codons suitable for starting protein synthesis. The recognition of codons as starts depends on the codon identity and on its immediate nucleotide context known as Kozak context. When the context is weak (i.e., nonoptimal), leaky scanning takes place during which a fraction of ribosomes continues the mRNA probing. We explored the relationship between the context of AUG codons annotated as starts of protein-coding sequences and the next AUG codon occurrence. We found that AUG codons downstream from weak starts occur in the same frame more frequently than downstream from strong starts. We suggest that evolutionary selection on in-frame AUGs downstream from weak start codons is driven by the advantage of the reduction of wasteful out-of-frame product synthesis and also by the advantage of producing multiple proteoforms from certain mRNAs. We confirmed translation initiation downstream from weak start codons using ribosome profiling data. We also tested translation of alternative start codons in 10 specific human genes using reporter constructs. In all tested cases, initiation at downstream start codons was more productive than at the annotated ones. In most cases, optimization of Kozak context did not completely abolish downstream initiation, and in the specific example of CMPK1 mRNA, the optimized start remained unproductive. Collectively, our work reveals previously uncharacterized forces shaping the evolution of protein-coding genes and points to the plurality of translation initiation and the existence of sequence features influencing start codon selection, other than Kozak context.
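The relationship described in this abstract (start-codon context strength versus the reading frame of the next AUG) can be illustrated with a minimal Python sketch. It is not the authors' method: the three-level classification below is a common simplification assumed here (purine at position -3 and G at position +4, with the A of AUG as +1), and the function names `kozak_strength` and `next_aug` are hypothetical.

```python
def _as_rna(mrna):
    # Normalise to upper-case RNA letters so DNA input is also accepted.
    return mrna.upper().replace("T", "U")


def kozak_strength(mrna, start):
    """Classify the context of the AUG at index `start` (0-based).

    Assumed rule of thumb: one point for a purine at -3, one point for
    G at +4; 0 points = weak, 1 = moderate, 2 = strong.
    """
    seq = _as_rna(mrna)
    if seq[start:start + 3] != "AUG":
        raise ValueError("no AUG codon at the given start index")
    minus3 = seq[start - 3] if start >= 3 else ""
    plus4 = seq[start + 3] if start + 3 < len(seq) else ""
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "moderate", "strong")[hits]


def next_aug(mrna, start):
    """Return (index, in_frame) for the first AUG downstream of `start`,
    or None when there is no further AUG."""
    seq = _as_rna(mrna)
    nxt = seq.find("AUG", start + 1)
    if nxt == -1:
        return None
    return nxt, (nxt - start) % 3 == 0


strong_example = "GCCACCAUGGCGAUGUAA"   # made-up sequence, annotated AUG at index 6
weak_example = "UUUUUUAUGCAUGA"         # made-up sequence, annotated AUG at index 6
print(kozak_strength(strong_example, 6), next_aug(strong_example, 6))
print(kozak_strength(weak_example, 6), next_aug(weak_example, 6))
```

Both example sequences are invented for illustration. Applied across annotated transcripts, the two functions would give the per-gene pairs (context class, in-frame flag) that a frequency comparison of the kind described above could be built on.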
|
41
|
Brunet MA, Brunelle M, Lucier JF, Delcourt V, Levesque M, Grenier F, Samandi S, Leblanc S, Aguilar JD, Dufour P, Jacques JF, Fournier I, Ouangraoua A, Scott MS, Boisvert FM, Roucou X. OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes. Nucleic Acids Res 2020; 47:D403-D410. [PMID: 30299502 PMCID: PMC6323990 DOI: 10.1093/nar/gky936] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 07/19/2018] [Accepted: 10/04/2018] [Indexed: 01/06/2023] Open
Abstract
Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential.
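The idea of listing every ORF above a length threshold on a transcript, rather than one coding sequence per transcript, can be sketched in a few lines of Python. This is an illustrative sketch only, not OpenProt's pipeline: it assumes ATG-initiated ORFs on the forward strand, takes the first ATG after each stop in a given frame, and counts length in codons excluding the stop codon. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript, min_codons=30):
    """Return (start, end, frame) tuples for ATG-to-stop ORFs.

    `start` is the 0-based index of the A in ATG, `end` is one past the
    last base of the stop codon, and `frame` is 0, 1 or 2. Only ORFs
    with at least `min_codons` codons (stop excluded) are reported.
    """
    seq = transcript.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos          # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None         # close it and keep scanning
    return orfs


# Made-up transcript: a 31-codon ORF starting at index 2.
toy = "CC" + "ATG" + "GCT" * 30 + "TAA" + "G"
print(find_orfs(toy))
```

Because each frame is scanned independently, overlapping ORFs in different frames of the same transcript are all reported, which is the point of a polycistronic view. ORFs lacking an in-frame stop before the transcript end are ignored in this sketch.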
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Mylène Brunelle
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Vivian Delcourt
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France; INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Frédéric Grenier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, Québec, Canada; Biology Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Sondos Samandi
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Sébastien Leblanc
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Jean-David Aguilar
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Pascal Dufour
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Jean-Francois Jacques
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| | - Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Aida Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Michelle S Scott
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | | | - Xavier Roucou
- Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université de Lille, F-59000 Lille, France
| |
|
42
|
Brunet MA, Leblanc S, Roucou X. Reconsidering proteomic diversity with functional investigation of small ORFs and alternative ORFs. Exp Cell Res 2020; 393:112057. [PMID: 32387289 DOI: 10.1016/j.yexcr.2020.112057] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/19/2019] [Revised: 04/21/2020] [Accepted: 05/02/2020] [Indexed: 12/13/2022]
Abstract
The discovery of functional yet non-annotated open reading frames (ORFs) throughout the genome of several species presents an unprecedented challenge in current genome annotation. These novel ORFs are shorter than annotated ones and many can be found on the same RNA, in opposition to current assumptions in annotation methodologies. Whilst the literature lacks consensus, these novel ORFs are commonly referred to as small ORFs (sORFs) or alternative ORFs (alt-ORFs). Unannotated ORFs represent an overlooked layer of complexity in the coding potential of genomes and are transforming our current vision of the nature of coding genes. In this review, we outline what constitutes a sORF or an alt-ORF and emphasize differences between both nomenclatures. We then describe complementary large-scale methods to accurately discover novel ORFs as well as yield functional insights on the novel proteins they encode. While serendipitous discoveries highlighted the functional importance of some novel ORFs, omics methods facilitate and improve their characterization to better understand physiological and pathological pathways. Functional annotation of sORFs, alt-ORFs and their corresponding microproteins will likely help fundamental and clinical research.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada.
| | - Sebastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada.
| |
|
43
|
Kiniry SJ, Michel AM, Baranov PV. Computational methods for ribosome profiling data analysis. Wiley Interdiscip Rev RNA 2020; 11:e1577. [PMID: 31760685 DOI: 10.1002/wrna.1577] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/30/2019] [Revised: 10/12/2019] [Accepted: 10/16/2019] [Indexed: 12/15/2022]
Abstract
Since the introduction of the ribosome profiling technique in 2009, its popularity has greatly increased. It is widely used for the comprehensive assessment of gene expression and for studying the mechanisms of regulation at the translational level. As the number of ribosome profiling datasets being produced continues to grow, so too does the need for reliable software that can provide answers to the biological questions it can address. This review describes the computational methods and tools that have been developed to analyze ribosome profiling data at the different stages of the process. It starts with initial routine processing of raw data and follows with more specific tasks such as the identification of translated open reading frames, differential gene expression analysis, or evaluation of local or global codon decoding rates. The review pinpoints challenges associated with each step and explains the ways in which they are currently addressed. In addition, it provides a comprehensive, albeit incomplete, list of publicly available software applicable to each step, which may be a beneficial starting point to those unexposed to ribosome profiling analysis. The outline of current challenges in ribosome profiling data analysis may inspire computational biologists to search for novel, potentially superior, solutions that will improve and expand the bioinformatician's toolbox for ribosome profiling data analysis. This article is characterized under: Translation > Ribosome Structure/Function; RNA Evolution and Genomics > Computational Analyses of RNA; Translation > Translation Mechanisms; Translation > Translation Regulation.
Affiliation(s)
- Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Audrey M Michel
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
| |
|
44
|
Pavesi A. New insights into the evolutionary features of viral overlapping genes by discriminant analysis. Virology 2020; 546:51-66. [PMID: 32452417 PMCID: PMC7157939 DOI: 10.1016/j.virol.2020.03.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/20/2020] [Accepted: 03/29/2020] [Indexed: 12/18/2022]
Abstract
Overlapping genes originate by a mechanism of overprinting, in which nucleotide substitutions in a pre-existing frame induce the expression of a de novo protein from an alternative frame. In this study, I assembled a dataset of 319 viral overlapping genes, which included 82 overlaps whose expression is experimentally known and the respective 237 homologs. Principal component analysis revealed that overlapping genes have a common pattern of nucleotide and amino acid composition. Discriminant analysis separated overlapping from non-overlapping genes with an accuracy of 97%. When applied to overlapping genes with known genealogy, it separated ancestral from de novo frames with an accuracy close to 100%. This high discriminant power was crucial to computationally design variants of de novo viral proteins known to possess selective anticancer toxicity (apoptin) or protection against neurodegeneration (X protein), as well as to detect two new potential overlapping genes in the genome of the new coronavirus SARS-CoV-2.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
| |
|
45
|
Govek EE, Hatten ME. Tag-Team Genetics of Spinocerebellar Ataxia 6. Neuron 2020; 102:707-709. [PMID: 31121118 DOI: 10.1016/j.neuron.2019.04.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
Abstract
In this issue of Neuron, Du et al. (2019) demonstrate that the bicistronic CACNA1A gene encodes a transcription factor α1ACT, mutations in which are associated with SCA6, that controls expression of genes important for cerebellar Purkinje cell development and excitability. Reduction of α1ACT in the adult is well tolerated, suggesting a potential new therapy for SCA6.
Affiliation(s)
- Eve-Ellen Govek
- Laboratory of Developmental Neurobiology, The Rockefeller University, New York, NY 10065, USA
| | - Mary E Hatten
- Laboratory of Developmental Neurobiology, The Rockefeller University, New York, NY 10065, USA.
| |
|
46
|
Choi S, Ju S, Lee J, Na S, Lee C, Paek E. Proteogenomic Approach to UTR Peptide Identification. J Proteome Res 2019; 19:212-220. [DOI: 10.1021/acs.jproteome.9b00498] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
Affiliation(s)
| | - Shinyeong Ju
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea
| | | | | | - Cheolju Lee
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea
- Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Republic of Korea
- KHU-KIST Department of Converging Science and Technology, Kyung Hee University, Seoul 02447, Republic of Korea
| | | |
|
47
|
Dvorak P, Leupen S, Soucek P. Functionally Significant Features in the 5' Untranslated Region of the ABCA1 Gene and Their Comparison in Vertebrates. Cells 2019; 8:623. [PMID: 31234415 PMCID: PMC6627321 DOI: 10.3390/cells8060623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2019] [Revised: 06/17/2019] [Accepted: 06/19/2019] [Indexed: 02/07/2023] Open
Abstract
Single nucleotide polymorphisms located in 5′ untranslated regions (5′UTRs) can regulate gene expression and have clinical impact. Recognition of functionally significant sequences within 5′UTRs is crucial in next-generation sequencing applications. Furthermore, information about the behavior of 5′UTRs during gene evolution is scarce. Using the example of the ATP-binding cassette transporter A1 (ABCA1) gene (Tangier disease), we describe our algorithm for functionally significant sequence finding. 5′UTR features (upstream start and stop codons, open reading frames (ORFs), GC content, motifs, and secondary structures) were studied using freely available bioinformatics tools in 55 vertebrate orthologous genes obtained from Ensembl and UCSC. The most conserved sequences were suggested as hot spots. Exon and intron enhancers and silencers (sc35, ighg2 cgamma2, ctnt, gh-1, and fibronectin eda exon), transcription factors (TFIIA, TATA, NFAT1, NFAT4, and HOXA13), some of them cancer related, and microRNA (hsa-miR-4474-3p) were localized to these regions. An upstream ORF, overlapping with the main ORF in primates and possibly coding for a small bioactive peptide, was also detected. Moreover, we showed several features of 5′UTRs, such as GC content variation, hairpin structure conservation or 5′UTR segmentation, which are interesting from a phylogenetic point of view and can stimulate further evolutionary oriented research.
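The kinds of 5′UTR features listed in this abstract (upstream start codons, in-frame stop codons, upstream ORFs, GC content) can be illustrated with a minimal Python sketch. This is not the authors' algorithm or any of the bioinformatics tools they used; the function names, the toy sequence, and the restriction to ATG starts and the three standard stop codons are all assumptions made for illustration.

```python
# Illustrative sketch only: a naive scan of a 5'UTR for upstream ATGs,
# in-frame stops, and GC content. Names and the toy sequence are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def find_upstream_orfs(utr):
    """Return a list of (start, end) pairs, one per ATG found in the UTR.

    start is the 0-based index of the A in ATG. end is the index just past
    the first in-frame stop codon, or None when no in-frame stop occurs
    inside the UTR (the reading frame would then continue downstream).
    """
    utr = utr.upper()
    orfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        end = None
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                break
        orfs.append((start, end))
    return orfs


# Hypothetical toy sequence, not taken from ABCA1.
toy_utr = "GGCATGAAATAGCCGATGCCC"
orfs = find_upstream_orfs(toy_utr)
gc = gc_content(toy_utr)
```

In the toy sequence, the first ATG is closed by an in-frame stop inside the UTR, while the second has no in-frame stop before the UTR ends. The second case is the situation in which an upstream ORF could extend into downstream sequence, which is why the sketch reports `None` rather than discarding it.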
Affiliation(s)
- Pavel Dvorak
- Department of Biology, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
- Sarah Leupen
- Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA.
- Pavel Soucek
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
- Toxicogenomics Unit, National Institute of Public Health, Srobarova 48, 100 42 Prague 10, Czech Republic.
|
48
|
Abstract
Classically, phenotype is what is observed, and genotype is the genetic makeup. Statistical studies aim to project phenotypic likelihoods of genotypic patterns. The traditional genotype-to-phenotype theory embraces the view that the encoded protein shape together with gene expression level largely determines the resulting phenotypic trait. Here, we point out that the molecular biology revolution at the turn of the century explained that the gene encodes not one but ensembles of conformations, which in turn spell all possible gene-associated phenotypes. The significance of a dynamic ensemble view is in understanding the linkage between genetic change and the gained observable physical or biochemical characteristics. Thus, despite the transformative shift in our understanding of the basis of protein structure and function, the literature still commonly relates to the classical genotype-phenotype paradigm. This is important because an ensemble view clarifies how even seemingly small genetic alterations can lead to pleiotropic traits in adaptive evolution and in disease, why cellular pathways can be modified in monogenic and polygenic traits, and how the environment may tweak protein function.
Affiliation(s)
- Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung-Jung Tsai
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Hyunbum Jang
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
|
49
|
Delcourt V, Brunelle M, Roy AV, Jacques JF, Salzet M, Fournier I, Roucou X. The Protein Coded by a Short Open Reading Frame, Not by the Annotated Coding Sequence, Is the Main Gene Product of the Dual-Coding Gene MIEF1. Mol Cell Proteomics 2018; 17:2402-2411. [PMID: 30181344 PMCID: PMC6283296 DOI: 10.1074/mcp.ra118.000593] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/23/2018] [Revised: 07/19/2018] [Indexed: 12/18/2022] Open
Abstract
Proteogenomics and ribosome profiling concurrently show that genes may code for both a large and one or more small proteins translated from annotated coding sequences (CDSs) and unannotated alternative open reading frames (named alternative ORFs or altORFs), respectively, but the stoichiometry between large and small proteins translated from the same gene is unknown. MIEF1, a gene recently identified as a dual-coding gene, harbors a CDS and a newly annotated and actively translated altORF located in the 5′UTR. Here, we use absolute quantification with stable isotope-labeled peptides and parallel reaction monitoring to determine levels of both proteins in two human cell lines and in human colon. We report that the main MIEF1 translational product is not the canonical 463 amino acid MiD51 protein but the small 70 amino acid alternative MiD51 protein (altMiD51). These results demonstrate the inadequacy of the single CDS concept and provide a strong argument for incorporating altORFs and small proteins in functional annotations.
Affiliation(s)
- Vivian Delcourt
- Département de Biochimie, Université de Sherbrooke, Québec, Canada; Univ. Lille, INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire and Spectrométrie de Masse (PRISM) F-59000 Lille, France; PROTEO, Québec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
- Mylène Brunelle
- Département de Biochimie, Université de Sherbrooke, Québec, Canada; PROTEO, Québec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
- Annie V Roy
- Département de Biochimie, Université de Sherbrooke, Québec, Canada; PROTEO, Québec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
- Jean-François Jacques
- Département de Biochimie, Université de Sherbrooke, Québec, Canada; PROTEO, Québec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada
- Michel Salzet
- Univ. Lille, INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire and Spectrométrie de Masse (PRISM) F-59000 Lille, France
- Isabelle Fournier
- Univ. Lille, INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire and Spectrométrie de Masse (PRISM) F-59000 Lille, France
- Xavier Roucou
- Département de Biochimie, Université de Sherbrooke, Québec, Canada; PROTEO, Québec Network for Research on Protein Function, Structure, and Engineering, Québec, Canada.
|