1
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023]
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
- Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA.
- Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington, USA
2
Chen Y, Cao X, Loh KH, Slavoff SA. Chemical labeling and proteomics for characterization of unannotated small and alternative open reading frame-encoded polypeptides. Biochem Soc Trans 2023; 51:1071-1082. [PMID: 37171061 PMCID: PMC10317152 DOI: 10.1042/bst20221074] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 05/13/2023]
Abstract
Thousands of unannotated small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been revealed in mammalian genomes. While hundreds of mammalian smORF- and alt-ORF-encoded proteins (SEPs and alt-proteins, respectively) affect cell proliferation, the overwhelming majority of smORFs and alt-ORFs remain uncharacterized at the molecular level. Complicating the task of identifying the biological roles of smORFs and alt-ORFs, the SEPs and alt-proteins that they encode exhibit limited sequence homology to protein domains of known function. Experimental techniques for the functionalization of these gene classes are therefore required. Approaches combining chemical labeling and quantitative proteomics have greatly advanced our ability to identify and characterize functional SEPs and alt-proteins in high throughput. In this review, we briefly describe the principles of proteomic discovery of SEPs and alt-proteins, then summarize how these technologies interface with chemical labeling for identification of SEPs and alt-proteins with specific properties, as well as in defining the interactome of SEPs and alt-proteins.
Affiliation(s)
- Yanran Chen
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Ken H. Loh
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Sarah A. Slavoff
- Department of Chemistry, Yale University, New Haven, CT, U.S.A
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, U.S.A
3
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief
The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
- Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
- Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
4
Bogaert A, Fijalkowska D, Staes A, Van de Steene T, Demol H, Gevaert K. Limited evidence for protein products of non-coding transcripts in the HEK293T cellular cytosol. Mol Cell Proteomics 2022; 21:100264. [PMID: 35788065 PMCID: PMC9396073 DOI: 10.1016/j.mcpro.2022.100264] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/21/2022] [Revised: 06/22/2022] [Accepted: 06/30/2022] [Indexed: 10/25/2022]
Abstract
Ribosome profiling has revealed translation outside of canonical coding sequences (CDSs) including translation of short upstream ORFs, long non-coding RNAs, overlapping ORFs, ORFs in UTRs or ORFs in alternative reading frames. Studies combining mass spectrometry, ribosome profiling and CRISPR-based screens showed that hundreds of ORFs derived from non-coding transcripts produce (micro)proteins, while other studies failed to find evidence for such types of non-canonical translation products. Here, we attempted to discover translation products from non-coding regions by strongly reducing the complexity of the sample prior to mass spectrometric analysis. We used an extended database as the search space and applied stringent filtering of the identified peptides to find evidence for novel translation events. We show that, theoretically, our strategy facilitates the detection of translation events of transcripts from non-coding regions, but experimentally we find only 19 peptides that might originate from such translation events. Finally, Virotrap-based interactome analysis of two N-terminal proteoforms originating from non-coding regions showed the functional potential of these novel proteins.
Affiliation(s)
- Annelies Bogaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Daria Fijalkowska
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- An Staes
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Tessa Van de Steene
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Hans Demol
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Kris Gevaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium.
5
Cheng XF, Wang N, Jiang Z, Chen Z, Niu Y, Tong L, Yu T, Tang B. Quantitative Chemoproteomic Profiling of Targets of Au(I) Complexes by Competitive Activity-Based Protein Profiling. Bioconjug Chem 2022; 33:1131-1137. [PMID: 35576584 DOI: 10.1021/acs.bioconjchem.2c00080] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Owing to their encouraging pharmacological action and acceptable toxicity profile, Au(I) complexes have attracted growing interest for disease treatment. To investigate their potential target proteins and related bioinformation, we screened four Au(I) complexes and explored their binding proteins using a competitive activity-based protein profiling (ABPP) strategy, including identification experiments and reactivity classification experiments, which offers a simple and robust method to identify the target proteins of Au(I) complexes. We quantified the target proteins of the four Au(I) complexes and found that most of the proteins were associated with cancer. In addition, newly identified Au(I)-binding proteins and biological gold-protein interaction pathways are presented. Furthermore, we estimated the correlation between target proteins of Au(I) complexes and various cancers, which will promote the development of gold anticancer drugs.
Affiliation(s)
- Xiu-Fen Cheng
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Nan Wang
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Zhongyao Jiang
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Zhenzhen Chen
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Yaxin Niu
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Lili Tong
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Ting Yu
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
- Bo Tang
- College of Chemistry, Chemical Engineering and Materials Science, Collaborative Innovation Center of Functionalized Probes for Chemical Imaging in Universities of Shandong, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Institute of Biomedical Sciences, Shandong Normal University, Jinan, 250014, P. R. China
6
Cao X, Khitun A, Harold CM, Bryant CJ, Zheng SJ, Baserga SJ, Slavoff SA. Nascent alt-protein chemoproteomics reveals a pre-60S assembly checkpoint inhibitor. Nat Chem Biol 2022; 18:643-651. [PMID: 35393574 PMCID: PMC9423127 DOI: 10.1038/s41589-022-01003-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/03/2021] [Accepted: 02/25/2022] [Indexed: 12/29/2022]
Abstract
Many unannotated microproteins and alternative proteins (alt-proteins) are coencoded with canonical proteins, but few of their functions are known. Motivated by the hypothesis that alt-proteins undergoing regulated synthesis could play important cellular roles, we developed a chemoproteomic pipeline to identify nascent alt-proteins in human cells. We identified 22 actively translated alt-proteins or N-terminal extensions, one of which is post-transcriptionally upregulated by DNA damage stress. We further defined a nucleolar, cell-cycle-regulated alt-protein that negatively regulates assembly of the pre-60S ribosomal subunit (MINAS-60). Depletion of MINAS-60 increases the amount of cytoplasmic 60S ribosomal subunit, upregulating global protein synthesis and cell proliferation. Mechanistically, MINAS-60 represses the rate of late-stage pre-60S assembly and export to the cytoplasm. Together, these results implicate MINAS-60 as a potential checkpoint inhibitor of pre-60S assembly and demonstrate that chemoproteomics enables hypothesis generation for uncharacterized alt-proteins.
Affiliation(s)
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Cecelia M Harold
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Carson J Bryant
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Shu-Jian Zheng
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Susan J Baserga
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Therapeutic Radiology, Yale University School of Medicine, New Haven, CT, USA
- Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
7
Zhang Z, Li Y, Yuan W, Wang Z, Wan C. Proteomic-driven identification of short open reading frame-encoded peptides. Proteomics 2022; 22:e2100312. [PMID: 35384297 DOI: 10.1002/pmic.202100312] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/25/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 11/10/2022]
Abstract
Accumulating evidence has shown that a large number of short open reading frames (sORFs) also have the ability to encode proteins. The discovery of sORFs opens up a new research area, leading to the identification and functional study of sORF-encoded peptides (SEPs) at the omics level. Besides bioinformatics prediction and ribosome profiling, mass spectrometry (MS) has become a significant tool as it directly detects the sequence of SEPs. Though MS-based proteomics methods have proved to be effective for qualitative and quantitative analysis of SEPs, the detection of SEPs is still a great challenge due to their low abundance and short sequence. To illustrate the progress in method development, we describe and discuss the main steps of large-scale proteomics identification of SEPs, including SEP extraction and enrichment, MS detection, data processing and quality control, quantification, and function prediction and validation methods.
Affiliation(s)
- Zheng Zhang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Yujie Li
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Wenqian Yuan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Zhiwei Wang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Cuihong Wan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
8
Cassidy L, Kaulich PT, Maaß S, Bartel J, Becher D, Tholey A. Bottom-up and top-down proteomic approaches for the identification, characterization, and quantification of the low molecular weight proteome with focus on short open reading frame-encoded peptides. Proteomics 2021; 21:e2100008. [PMID: 34145981 DOI: 10.1002/pmic.202100008] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/03/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 01/14/2023]
Abstract
The recent discovery of alternative open reading frames creates a need for suitable analytical approaches to verify their translation and to characterize the corresponding gene products at the molecular level. As the analysis of small proteins within a background proteome by means of classical bottom-up proteomics is challenging, method development for the analysis of small open reading frame-encoded peptides (SEPs) has become a focal point for research. Here, we highlight bottom-up and top-down proteomics approaches established for the analysis of SEPs in both pro- and eukaryotes. Major steps of analysis, including sample preparation and (small) proteome isolation, separation and mass spectrometry, data interpretation and quality control, quantification, the analysis of post-translational modifications, and exploration of functional aspects of the SEPs by means of proteomics technologies are described. These methods do not exclusively cover the analytics of SEPs but simultaneously include the low molecular weight proteome, and moreover, can also be used for the proteome-wide analysis of proteolytic processing events.
Affiliation(s)
- Liam Cassidy
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Philipp T Kaulich
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Sandra Maaß
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Jürgen Bartel
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Dörte Becher
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Andreas Tholey
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
9
Prensner JR, Enache OM, Luria V, Krug K, Clauser KR, Dempster JM, Karger A, Wang L, Stumbraite K, Wang VM, Botta G, Lyons NJ, Goodale A, Kalani Z, Fritchman B, Brown A, Alan D, Green T, Yang X, Jaffe JD, Roth JA, Piccioni F, Kirschner MW, Ji Z, Root DE, Golub TR. Noncanonical open reading frames encode functional proteins essential for cancer cell survival. Nat Biotechnol 2021; 39:697-704. [PMID: 33510483 PMCID: PMC8195866 DOI: 10.1038/s41587-020-00806-2] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 02/18/2020] [Accepted: 12/16/2020] [Indexed: 01/30/2023]
Abstract
Although genomic analyses predict many noncanonical open reading frames (ORFs) in the human genome, it is unclear whether they encode biologically active proteins. Here we experimentally interrogated 553 candidates selected from noncanonical ORF datasets. Of these, 57 induced viability defects when knocked out in human cancer cell lines. Following ectopic expression, 257 showed evidence of protein expression and 401 induced gene expression changes. Clustered regularly interspaced short palindromic repeat (CRISPR) tiling and start codon mutagenesis indicated that their biological effects required translation as opposed to RNA-mediated effects. We found that one of these ORFs, G029442 (renamed glycine-rich extracellular protein-1, GREP1), encodes a secreted protein highly expressed in breast cancer, and its knockout in 263 cancer cell lines showed preferential essentiality in breast cancer-derived lines. The secretome of GREP1-expressing cells has an increased abundance of the oncogenic cytokine GDF15, and GDF15 supplementation mitigated the growth-inhibitory effect of GREP1 knockout. Our experiments suggest that noncanonical ORFs can express biologically active proteins that are potential therapeutic targets.
Affiliation(s)
- John R. Prensner
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115
- Oana M. Enache
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Victor Luria
- Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Karsten Krug
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Karl R. Clauser
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA, USA, 02115
- Li Wang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Vickie M. Wang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Ginevra Botta
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Amy Goodale
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Zohra Kalani
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Adam Brown
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Douglas Alan
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Thomas Green
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Xiaoping Yang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Jacob D. Jaffe
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Inzen Therapeutics, Cambridge, MA, 02139, USA
- Federica Piccioni
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Merck Research Laboratories, Boston, MA, 02115, USA
- Marc W. Kirschner
- Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Zhe Ji
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611; Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL 60628
- David E. Root
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Todd R. Golub
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115; Corresponding author: Todd R. Golub, MD, Chief Scientific Officer, Broad Institute of Harvard and MIT, Room 4013, 415 Main Street, Cambridge, MA, 02142; Phone: 617-714-7050
10
Schlesinger D, Elsässer SJ. Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins. FEBS J 2021; 289:53-74. [PMID: 33595896 DOI: 10.1111/febs.15769] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 08/21/2020] [Revised: 01/17/2021] [Accepted: 02/15/2021] [Indexed: 02/07/2023]
Abstract
Short ORFs (sORFs), that is, occurrences of a start and stop codon within 100 codons or less, can be found in organisms of all domains of life, outnumbering annotated protein-coding ORFs by orders of magnitude. Even though functional proteins smaller than 100 amino acids are known, the coding potential of sORFs has often been overlooked, as it is not trivial to predict and test for functionality within the large number of sORFs. Recent advances in ribosome profiling and mass spectrometry approaches, together with refined bioinformatic predictions, have enabled a huge leap forward in this field and identified thousands of likely coding sORFs. A relatively low number of small proteins or microproteins produced from these sORFs have been characterized so far on the molecular, structural, and/or mechanistic level. These however display versatile and, in some cases, essential cellular functions, allowing for the exciting possibility that many more, previously unknown small proteins might be encoded in the genome, waiting to be discovered. This review will give an overview of the steadily growing microprotein field, focusing on eukaryotic small proteins. We will discuss emerging themes in the molecular action of microproteins, as well as advances and challenges in microprotein identification and characterization.
Affiliation(s)
- Dörte Schlesinger
- Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden; Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
| | - Simon J Elsässer
- Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.,Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
| |
Collapse
|
11
|
Wang B, Hao J, Pan N, Wang Z, Chen Y, Wan C. Identification and analysis of small proteins and short open reading frame encoded peptides in Hep3B cell. J Proteomics 2020; 230:103965. [PMID: 32891891 DOI: 10.1016/j.jprot.2020.103965] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/26/2020] [Revised: 06/25/2020] [Accepted: 08/31/2020] [Indexed: 02/05/2023]
Abstract
Small proteins and short open reading frame-encoded peptides (SEPs) are of fundamental importance because of their essential roles in biological processes. However, annotating or identifying them is challenging, in part owing to the limitations of the traditional genome annotation pipeline and to their inherently low abundance and low molecular weight. To discover and characterize SEPs in the Hep3B cell line, we developed an optimized peptidomic assay by combining different peptide extraction and separation methods. The organic solvent precipitation method improved the enrichment of low-molecular-weight proteins and peptides, and the data clearly showed a beneficial effect from the reduction of sample complexity, resulting in high-quality MS/MS spectra. Furthermore, the different strategies showed good complementarity in improving the total number of small proteins identified and their sequence coverage. In total, 1192 proteins of fewer than 100 amino acids were identified, including 271 newly discovered SEPs that have been annotated in the OpenProt database and 147 SEPs encoded from ncRNA or lincRNA. The results of this work provide robust evidence that the human proteome is more complicated than previously appreciated, which will benefit the discovery of proteins without functional annotation. SIGNIFICANCE: In this work, methods were optimized to identify SEPs in Hep3B cells. The organic solvent precipitation method improved the enrichment of low-molecular-weight proteins and peptides, and the data clearly showed a beneficial effect from the reduction of sample complexity, resulting in high-quality MS/MS spectra. The different strategies showed good complementarity in improving the total number of small proteins identified and their sequence coverage. In total, 1192 proteins of fewer than 100 amino acids were identified, including 271 newly discovered SEPs that have been annotated in the OpenProt database and 147 SEPs encoded from ncRNA or lincRNA. Furthermore, 22 SEPs generated from uORFs may have a potential effect on translation control, and 149 newly identified SEPs have known functional domains or cross-species conservation. The results of this work present robust evidence for the coding potential of ignored regions of the human genome and may provide additional insights into tumor biology.
Affiliation(s)
- Bing Wang
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China
- Junhui Hao
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China
- Ni Pan
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China
- Zhiwei Wang
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China
- Yinxuan Chen
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China
- Cuihong Wan
- Hubei Key Lab of Genetic Regulation & Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, PR China.
12
Cardon T, Hervé F, Delcourt V, Roucou X, Salzet M, Franck J, Fournier I. Optimized Sample Preparation Workflow for Improved Identification of Ghost Proteins. Anal Chem 2019; 92:1122-1129. [PMID: 31829555 DOI: 10.1021/acs.analchem.9b04188] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/30/2022]
Abstract
Large-scale proteomic strategies rely on database interrogation. Thus, only referenced proteins can be identified. Recently, Alternative Proteins (AltProts) translated from nonannotated Alternative Open Reading Frames (AltORFs) were discovered using customized databases. Because their small size confers on them peptide-like physicochemical properties, they are more difficult to detect using standard proteomics strategies. In this study, we tested different preparation workflows for improving the identification of AltProts in the NCH82 human glioma cell line. The highest number of identified AltProts was achieved with RIPA buffer or boiling water extraction followed by acetic acid precipitation.
Affiliation(s)
- Tristan Cardon
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Flore Hervé
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Vivian Delcourt
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France; Department of Biochemistry, Université de Sherbrooke, Quebec, Sherbrooke, Canada
- Xavier Roucou
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France; Department of Biochemistry, Université de Sherbrooke, Quebec, Sherbrooke, Canada
- Michel Salzet
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France; Institut Universitaire de France (IUF), Paris, France
- Julien Franck
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Isabelle Fournier
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France; Institut Universitaire de France (IUF), Paris, France
13
Functions and impact of tal-like genes in animals with regard to applied aspects. Appl Microbiol Biotechnol 2018; 102:6841-6845. [PMID: 29909570 DOI: 10.1007/s00253-018-9159-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/24/2018] [Revised: 06/04/2018] [Accepted: 06/11/2018] [Indexed: 02/03/2023]
Abstract
A large number of DNA sequences in eukaryotic genomes can code for atypical transcripts, and their functions are controversial. It has been reported that these transcripts, which were originally considered non-translatable RNAs, contain many small open reading frames (sORFs). However, increasing evidence suggests that some of these sORFs encode small peptides and that some are conserved across large evolutionary distances. These small peptides have been reported to be functional and may be involved in a variety of cellular processes, playing important roles in development, physiology, and metabolism. Among the sORFs, the non-canonical genes polished rice/tarsal-less (pri/tal) in Drosophila and mille-pattes (mlpt) in Tribolium have been studied most thoroughly. Genes similar to pri/tal in other species have been defined as the tarsal-less-related gene family, or tal-like genes. In this review, we describe recent progress in the discovery and functional characterization of the small peptides encoded by tal-like genes and their possible functions.
14
Budamgunta H, Olexiouk V, Luyten W, Schildermans K, Maes E, Boonen K, Menschaert G, Baggerman G. Comprehensive Peptide Analysis of Mouse Brain Striatum Identifies Novel sORF-Encoded Polypeptides. Proteomics 2018; 18:e1700218. [DOI: 10.1002/pmic.201700218] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/21/2017] [Revised: 03/30/2018] [Indexed: 11/10/2022]
Affiliation(s)
- Volodimir Olexiouk
- BioBix; Lab for Bioinformatics and Computational Genomics; Department of Mathematical Modelling; Statistics and Bio-informatics; Ghent University; Ghent Belgium
- Walter Luyten
- Animal Physiology and Neurobiology; KULeuven; Leuven Belgium
- Evelyne Maes
- Centre for Proteomics; UAntwerp; Antwerp Belgium
- Proteins and Biomaterials; AgResearch; Christchurch New Zealand
- Kurt Boonen
- Centre for Proteomics; UAntwerp; Antwerp Belgium
- Unit Environmental Risk and Health; VITO; Mol Belgium
- Gerben Menschaert
- BioBix; Lab for Bioinformatics and Computational Genomics; Department of Mathematical Modelling; Statistics and Bio-informatics; Ghent University; Ghent Belgium
- Geert Baggerman
- Centre for Proteomics; UAntwerp; Antwerp Belgium
- Unit Environmental Risk and Health; VITO; Mol Belgium
15
Erpf PE, Fraser JA. The Long History of the Diverse Roles of Short ORFs: sPEPs in Fungi. Proteomics 2018; 18:e1700219. [PMID: 29465163 DOI: 10.1002/pmic.201700219] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/27/2017] [Revised: 01/30/2018] [Indexed: 12/30/2022]
Abstract
Since the completion of the genome sequence of the model eukaryote Saccharomyces cerevisiae, there have been significant advancements in the field of genome annotation, in no small part due to the availability of datasets that make large-scale comparative analyses possible. As a result, since its completion there has been a significant change in annotated ORF size distribution in this first eukaryotic genome, especially in short ORFs (sORFs) predicted to encode polypeptides less than 150 amino acids in length. Due to their small size and the difficulties associated with their study, it is only relatively recently that these genomic features and the sORF-encoded peptides (sPEPs) they encode have become a focus of many researchers. Yet while this class of peptides may seem new and exciting, the study of this part of the proteome is nothing new in S. cerevisiae, a species where the biological importance of sPEPs has been elegantly illustrated over the past 30 years. Here the authors showcase a range of different sORFs found in S. cerevisiae and the diverse biological roles of their encoded sPEPs, and provide an insight into the sORFs found in other fungal species, particularly those pathogenic to humans.
Affiliation(s)
- Paige E Erpf
- Australian Infectious Diseases Research Centre, St Lucia, Queensland, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia
- James A Fraser
- Australian Infectious Diseases Research Centre, St Lucia, Queensland, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia
16
Hollerer I, Higdon A, Brar GA. Strategies and Challenges in Identifying Function for Thousands of sORF-Encoded Peptides in Meiosis. Proteomics 2018; 18:e1700274. [PMID: 28929627 PMCID: PMC6135095 DOI: 10.1002/pmic.201700274] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/14/2017] [Indexed: 11/11/2022]
Abstract
Recent genomic analyses have revealed pervasive translation from formerly unrecognized short open reading frames (sORFs) during yeast meiosis. Despite their short length, which has caused these regions to be systematically overlooked by traditional gene annotation approaches, meiotic sORFs share many features with classical genes, implying the potential for similar types of cellular functions. We found that sORF expression accounts for approximately 10-20% of the cellular translation capacity in yeast during meiotic differentiation and occurs within well-defined time windows, suggesting the production of relatively abundant peptides with stage-specific meiotic roles from these regions. Here, we provide arguments supporting this hypothesis and discuss sORF similarities and differences, as a group, to traditional protein coding regions, as well as challenges in defining their specific functions.
Affiliation(s)
- Ina Hollerer
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
- Andrea Higdon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
- Gloria A Brar
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
17
Yeasmin F, Yada T, Akimitsu N. Micropeptides Encoded in Transcripts Previously Identified as Long Noncoding RNAs: A New Chapter in Transcriptomics and Proteomics. Front Genet 2018; 9:144. [PMID: 29922328 PMCID: PMC5996887 DOI: 10.3389/fgene.2018.00144] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 01/23/2018] [Accepted: 04/09/2018] [Indexed: 11/13/2022] Open
Abstract
Integrative analysis using omics-based technologies results in the identification of a large number of putative short open reading frames (sORFs) with protein-coding capacity within transcripts previously identified as long noncoding RNAs (lncRNAs) or transcripts of unknown function (TUFs). sORFs were previously overlooked because of their diminutive size and the difficulty of identification by bioinformatics analyses. There is now growing evidence of the existence of potentially functional micropeptides produced from sORFs within cells of diverse species. Recent characterization of a few of these revealed their significant divergent roles in many fundamental biological processes, where some also show important relationships with pathogenesis. Recent works therefore provide new insights for exploring the wealth of information that may lie within sORF-encoded short proteins. Here, we summarize the current progress and view of micropeptides encoded in sORFs of protein-coding genes.
Affiliation(s)
- Fouzia Yeasmin
- Isotope Science Centre, The University of Tokyo, Tokyo, Japan
- Tetsushi Yada
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Fukuoka, Japan
18
Abstract
A large body of evidence indicates that genome annotation pipelines have biased our view of coding sequences because they generally undersample small proteins and peptides. The recent development of genome-wide translation profiling reveals the prevalence of small/short open reading frames (smORFs or sORFs), which are scattered over all classes of transcripts, including both mRNAs and presumptive long noncoding RNAs. Proteomic approaches further confirm an unexpected variety of smORF-encoded peptides (SEPs), representing an overlooked reservoir of bioactive molecules. Indeed, functional studies in a broad range of species from yeast to humans demonstrate that SEPs can harbor key activities for the control of development, differentiation, and physiology. Here we summarize recent advances in the discovery and functional characterization of smORF/SEPs and discuss why these small players can no longer be ignored with regard to genome function.
Affiliation(s)
- Serge Plaza
- Laboratoire de Recherches en Sciences Végétales, Université de Toulouse, Université Paul Sabatier, 31326 Castanet Tolosan, France; CNRS, UMR5546, Laboratoire de Recherches en Sciences Végétales, 31326 Castanet Tolosan, France
- Gerben Menschaert
- Department of Mathematical Modeling, Statistics and Bioinformatics, University of Ghent, 9000 Gent, Belgium
- François Payre
- Centre de Biologie du Développement, Centre de Biologie Intégrative, Université de Toulouse, CNRS, Université Paul Sabatier, 31062 Toulouse, France
19
Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol 2015; 16:651-64. [PMID: 26465719 DOI: 10.1038/nrm4069] [Citation(s) in RCA: 307] [Impact Index Per Article: 34.1] [Indexed: 12/18/2022]
Abstract
Ribosome profiling, which involves the deep sequencing of ribosome-protected mRNA fragments, is a powerful tool for globally monitoring translation in vivo. The method has facilitated discovery of the regulation of gene expression underlying diverse and complex biological processes, of important aspects of the mechanism of protein synthesis, and even of new proteins, by providing a systematic approach for experimental annotation of coding regions. Here, we introduce the methodology of ribosome profiling and discuss examples in which this approach has been a key factor in guiding biological discovery, including its prominent role in identifying thousands of novel translated short open reading frames and alternative translation products.
20
Housman G, Ulitsky I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2015; 1859:31-40. [PMID: 26265145 DOI: 10.1016/j.bbagrm.2015.07.017] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 03/31/2015] [Revised: 06/18/2015] [Accepted: 07/19/2015] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between coding and noncoding transcripts and assess the conclusions from their recent genome-wide applications. We conclude that the model most consistent with the available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events results in stable and functional peptides. The outcomes of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Gali Housman
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
- Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel.
21
Crappé J, Van Criekinge W, Menschaert G. Little things make big things happen: A summary of micropeptide encoding genes. EUPA OPEN PROTEOMICS 2014. [DOI: 10.1016/j.euprot.2014.02.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
22
Bazzini AA, Johnstone TG, Christiano R, Mackowiak SD, Obermayer B, Fleming ES, Vejnar CE, Lee MT, Rajewsky N, Walther TC, Giraldez AJ. Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J 2014; 33:981-93. [PMID: 24705786 DOI: 10.1002/embj.201488411] [Citation(s) in RCA: 459] [Impact Index Per Article: 45.9] [Indexed: 12/30/2022] Open
Abstract
Identification of the coding elements in the genome is a fundamental step to understanding the building blocks of living systems. Short peptides (< 100 aa) have emerged as important regulators of development and physiology, but their identification has been limited by their size. We have leveraged the periodicity of ribosome movement on the mRNA to define actively translated ORFs by ribosome footprinting. This approach identifies several hundred translated small ORFs in zebrafish and human. Computational prediction of small ORFs from codon conservation patterns corroborates and extends these findings and identifies conserved sequences in zebrafish and human, suggesting functional peptide products (micropeptides). These results identify micropeptide-encoding genes in vertebrates, providing an entry point to define their function in vivo.
Affiliation(s)
- Ariel A Bazzini
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
23
Slavoff SA, Heo J, Budnik BA, Hanakahi LA, Saghatelian A. A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining. J Biol Chem 2014; 289:10950-10957. [PMID: 24610814 DOI: 10.1074/jbc.c113.533968] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Indexed: 01/09/2023] Open
Abstract
The recent discovery of numerous human short open reading frame (sORF)-encoded polypeptides (SEPs) has raised important questions about the functional roles of these molecules in cells. Here, we show that a 69-amino acid SEP, MRI-2, physically interacts with the Ku heterodimer to stimulate DNA double-strand break ligation via nonhomologous end joining. The characterization of MRI-2 suggests that this SEP may participate in DNA repair and underscores the potential of SEPs to serve important biological functions in mammalian cells.
Affiliation(s)
- Sarah A Slavoff
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138
- Jinho Heo
- Department of Medicinal Chemistry and Pharmacognosy, University of Illinois College of Pharmacy, Rockford, Illinois 60612
- Bogdan A Budnik
- Faculty of Arts and Sciences (FAS) Center for Systems Biology, Harvard University, Cambridge, Massachusetts 02138
- Leslyn A Hanakahi
- Department of Medicinal Chemistry and Pharmacognosy, University of Illinois College of Pharmacy, Rockford, Illinois 60612.
- Alan Saghatelian
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138.
24
Ma J, Ward CC, Jungreis I, Slavoff SA, Schwaid AG, Neveu J, Budnik BA, Kellis M, Saghatelian A. Discovery of human sORF-encoded polypeptides (SEPs) in cell lines and tissue. J Proteome Res 2014; 13:1757-65. [PMID: 24490786 PMCID: PMC3993966 DOI: 10.1021/pr401280w] [Citation(s) in RCA: 118] [Impact Index Per Article: 11.8] [Indexed: 11/29/2022]
Abstract
The existence of nonannotated protein-coding human short open reading frames (sORFs) has been revealed through the direct detection of their sORF-encoded polypeptide (SEP) products. The discovery of novel SEPs increases the size of the genome and the proteome and provides insights into the molecular biology of mammalian cells, such as the prevalent usage of non-AUG start codons. Through modifications of the existing SEP-discovery workflow, we discover an additional 195 SEPs in K562 cells and extend this methodology to identify novel human SEPs in additional cell lines and human tissue for a final tally of 237 new SEPs. These results continue to expand the human genome and proteome and demonstrate that SEPs are a ubiquitous class of nonannotated polypeptides that require further investigation.
Affiliation(s)
- Jiao Ma
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
25
Ribosome profiling: new views of translation, from single codons to genome scale. Nat Rev Genet 2014; 15:205-13. [PMID: 24468696 DOI: 10.1038/nrg3645] [Citation(s) in RCA: 420] [Impact Index Per Article: 42.0] [Indexed: 02/06/2023]
Abstract
Genome-wide analyses of gene expression have so far focused on the abundance of mRNA species as measured either by microarray or, more recently, by RNA sequencing. However, neither approach provides information on protein synthesis, which is the true end point of gene expression. Ribosome profiling is an emerging technique that uses deep sequencing to monitor in vivo translation. Studies using ribosome profiling have already provided new insights into the identity and the amount of proteins that are produced by cells, as well as detailed views into the mechanism of protein synthesis itself.