1
Whited AM, Jungreis I, Allen J, Cleveland CL, Mudge JM, Kellis M, Rinn JL, Hough LE. Biophysical characterization of high-confidence, small human proteins. Biophysical Reports 2024; 4:100167. [PMID: 38909903 PMCID: PMC11305224 DOI: 10.1016/j.bpr.2024.100167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/09/2024] [Accepted: 06/20/2024] [Indexed: 06/25/2024]
Abstract
Significant efforts have been made to characterize the biophysical properties of proteins. Small proteins have received less attention because their annotation has historically been less reliable. However, recent improvements in sequencing, proteomics, and bioinformatics techniques have led to the high-confidence annotation of small open reading frames (smORFs) that encode for functional proteins, producing smORF-encoded proteins (SEPs). SEPs have been found to perform critical functions in several species, including humans. While significant efforts have been made to annotate SEPs, less attention has been given to the biophysical properties of these proteins. We characterized the distributions of predicted and curated biophysical properties, including sequence composition, structure, localization, function, and disease association of a conservative list of previously identified human SEPs. We found significant differences between SEPs and both larger proteins and control sets. In addition, we provide an example of how our characterization of biophysical properties can contribute to distinguishing protein-coding smORFs from noncoding ones in otherwise ambiguous cases.
Affiliation(s)
- A M Whited
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado
- Irwin Jungreis
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts
- Jeffre Allen
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Manolis Kellis
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts
- John L Rinn
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado
- Loren E Hough
  - BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Physics, University of Colorado Boulder, Boulder, Colorado.
2
Li Q, Liu F, Ma X, Chen F, Yi Z, Du Y, Huang A, Zhao C, Wang D, Chen Y, Cao X. Proteomic Profiling of Unannotated Microproteins in Human Placenta Reveals XRCC6P1 as a Potential Negative Regulator of Translation. J Proteome Res 2024; 23:4005-4013. [PMID: 39171377 DOI: 10.1021/acs.jproteome.4c00319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Ribosome profiling and mass spectrometry have revealed thousands of previously unannotated small and alternative open reading frames (sm/alt-ORFs) that are translated into micro/alt-proteins in mammalian cells. However, their prevalence across human tissues and biological roles remains largely undefined. The placenta is an ideal model for identifying unannotated microproteins and alt-proteins due to its considerable protein diversity that is required to sustain fetal development during pregnancy. Here, we profiled unannotated microproteins and alt-proteins in human placental tissues from preeclampsia patients or healthy individuals by proteomics, identified 52 unannotated microproteins or alt-proteins, and demonstrated that five microproteins can be translated from overexpression constructs in a heterologous cell line, although several are unstable. We further demonstrated that one microprotein, XRCC6P1, associates with translation initiation factor eIF3 and negatively regulates translation when exogenously overexpressed. Thus, we revealed a hidden sm/alt-ORF-encoded proteome in the human placenta, which may advance the mechanism studies for placenta development as well as placental disorders such as preeclampsia.
Affiliation(s)
- Qiong Li
  - Department of Obstetrics and Gynecology, The First People's Hospital of Chenzhou, Chenzhou 423000, China
  - The First Affiliated Hospital of Jinan University, Guangzhou 510632, China
- Fanrong Liu
  - Department of Orthopedics, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, Zhejiang, China
- Xiaoyu Ma
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Feifei Chen
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Ziying Yi
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Yangyang Du
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Anxin Huang
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Chenyang Zhao
  - Department of Obstetrics and Gynecology, The First People's Hospital of Chenzhou, Chenzhou 423000, China
  - The First Affiliated Hospital of Jinan University, Guangzhou 510632, China
- Da Wang
  - Department of Orthopedic Oncology, Shanghai Changzheng Hospital, Navy Military Medical University, Shanghai 200003, China
- Yanran Chen
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Xiongwen Cao
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
  - Key Laboratory of Brain Functional Genomics, Ministry of Education and Shanghai, School of Life Sciences, East China Normal University, Shanghai 200062, China
3
Mohsen JJ, Mohsen MG, Jiang K, Landajuela A, Quinto L, Isaacs FJ, Karatekin E, Slavoff SA. Cellular function of the GndA small open reading frame-encoded polypeptide during heat shock. bioRxiv: the preprint server for biology 2024:2024.06.29.601336. [PMID: 38979229 PMCID: PMC11230408 DOI: 10.1101/2024.06.29.601336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Over the past 15 years, hundreds of previously undiscovered bacterial small open reading frame (sORF)-encoded polypeptides (SEPs) of fewer than fifty amino acids have been identified, and biological functions have been ascribed to an increasing number of SEPs from intergenic regions and small RNAs. However, despite numbering in the dozens in Escherichia coli, and hundreds to thousands in humans, same-strand nested sORFs that overlap protein coding genes in alternative reading frames remain understudied. In order to provide insight into this enigmatic class of unannotated genes, we characterized GndA, a 36-amino acid, heat shock-regulated SEP encoded within the +2 reading frame of the gnd gene in E. coli K-12 MG1655. We show that GndA pulls down components of respiratory complex I (RCI) and is required for proper localization of an RCI subunit during heat shock. At high temperature, GndA deletion (ΔGndA) cells exhibit perturbations in cell growth, NADH/NAD+ ratio, and expression of a number of genes including several associated with oxidative stress. These findings suggest that GndA may function in maintenance of homeostasis during heat shock. Characterization of GndA therefore supports the nascent but growing consensus that functional, overlapping genes occur in genomes from viruses to humans.
Affiliation(s)
- Jessica J. Mohsen
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
- Michael G. Mohsen
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Howard Hughes Medical Institute, Yale University, New Haven, CT 06511
- Kevin Jiang
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
- Ane Landajuela
  - Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
  - Nanobiology Institute, Yale University, West Haven, CT 06516
- Laura Quinto
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Systems Biology Institute, Yale University, West Haven, CT 06516
- Farren J. Isaacs
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Systems Biology Institute, Yale University, West Haven, CT 06516
- Erdem Karatekin
  - Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
  - Nanobiology Institute, Yale University, West Haven, CT 06516
  - Wu Tsai Institute, Yale University, New Haven, CT 06511
  - Université de Paris, Saints-Pères Paris Institute for the Neurosciences (SPPIN), Centre National de la Recherche Scientifique (CNRS), 75006 Paris, France
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511
4
Nair AM, Jiang T, Mu B, Zhao R. Plastid Molecular Chaperone HSP90C Interacts with the SecA1 Subunit of Sec Translocase for Thylakoid Protein Transport. Plants (Basel, Switzerland) 2024; 13:1265. [PMID: 38732479 PMCID: PMC11085213 DOI: 10.3390/plants13091265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/24/2024] [Accepted: 04/29/2024] [Indexed: 05/13/2024]
Abstract
The plastid stroma-localized chaperone HSP90C plays a crucial role in maintaining optimal proteostasis within chloroplasts and participates in protein translocation processes. While existing studies have revealed HSP90C's direct interaction with the Sec translocase-dependent client pre-protein PsbO1 and the SecY1 subunit of the thylakoid membrane-bound Sec1 translocase channel system, its direct involvement with the extrinsic homodimeric Sec translocase subunit, SecA1, remains elusive. Employing bimolecular fluorescence complementation (BiFC) assay and other in vitro analyses, we unraveled potential interactions between HSP90C and SecA1. Our investigation revealed dynamic interactions between HSP90C and SecA1 at the thylakoid membrane and stroma. The thylakoid membrane localization of this interaction was contingent upon active HSP90C ATPase activity, whereas their stromal interaction was associated with active SecA1 ATPase activity. Furthermore, we observed a direct interaction between these two proteins by analyzing their ATP hydrolysis activities, and their interaction likely impacts their respective functional cycles. Additionally, using PsbO1, a model Sec translocase client pre-protein, we studied the intricacies of HSP90C's possible involvement in pre-protein translocation via the Sec1 system in chloroplasts. The results suggest a complex nature of the HSP90C-SecA1 interaction, possibly mediated by the Sec client protein. Our studies shed light on the nuanced aspects of HSP90C's engagement in orchestrating pre-protein translocation, and we propose a potential collaborative role of HSP90C with SecA1 in actively facilitating pre-protein transport across the thylakoid membrane.
Affiliation(s)
- Rongmin Zhao
  - Department of Biological Sciences, University of Toronto Scarborough, Toronto, ON M1C 1A4, Canada; Department of Cell & Systems Biology, University of Toronto, Toronto, ON M5S 3B2, Canada; (A.M.N.); (T.J.); (B.M.)
5
Whited AM, Jungreis I, Allen J, Cleveland CL, Mudge JM, Kellis M, Rinn JL, Hough LE. Biophysical characterization of high-confidence, small human proteins. bioRxiv: the preprint server for biology 2024:2024.04.12.589296. [PMID: 38659920 PMCID: PMC11042228 DOI: 10.1101/2024.04.12.589296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Significant efforts have been made to characterize the biophysical properties of proteins. Small proteins have received less attention because their annotation has historically been less reliable. However, recent improvements in sequencing, proteomics, and bioinformatics techniques have led to the high-confidence annotation of small open reading frames (smORFs) that encode for functional proteins, producing smORF-encoded proteins (SEPs). SEPs have been found to perform critical functions in several species, including humans. While significant efforts have been made to annotate SEPs, less attention has been given to the biophysical properties of these proteins. We characterized the distributions of predicted and curated biophysical properties, including sequence composition, structure, localization, function, and disease association of a conservative list of previously identified human SEPs. We found significant differences between SEPs and both larger proteins and control sets. Additionally, we provide an example of how our characterization of biophysical properties can contribute to distinguishing protein-coding smORFs from non-coding ones in otherwise ambiguous cases.
Affiliation(s)
- A M Whited
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Irwin Jungreis
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- Jeffre Allen
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
  - Department of Biochemistry, University of Colorado Boulder, CO, USA
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Manolis Kellis
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- John L Rinn
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
  - Department of Biochemistry, University of Colorado Boulder, CO, USA
- Loren E Hough
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
  - Department of Physics, University of Colorado Boulder, CO, USA
6
Mohsen JJ, Martel AA, Slavoff SA. Microproteins-Discovery, structure, and function. Proteomics 2023; 23:e2100211. [PMID: 37603371 PMCID: PMC10841188 DOI: 10.1002/pmic.202100211] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/04/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/22/2023]
Abstract
Advances in proteogenomic technologies have revealed hundreds to thousands of translated small open reading frames (sORFs) that encode microproteins in genomes across evolutionary space. While many microproteins have now been shown to play critical roles in biology and human disease, a majority of recently identified microproteins have little or no experimental evidence regarding their functionality. Computational tools have some limitations for analysis of short, poorly conserved microprotein sequences, so additional approaches are needed to determine the role of each member of this recently discovered polypeptide class. A currently underexplored avenue in the study of microproteins is structure prediction and determination, which delivers a depth of functional information. In this review, we provide a brief overview of microprotein discovery methods, then examine examples of microprotein structures (and, conversely, intrinsic disorder) that have been experimentally determined using crystallography, cryo-electron microscopy, and NMR, which provide insight into their molecular functions and mechanisms. Additionally, we discuss examples of predicted microprotein structures that have provided insight or context regarding their function. Analysis of microprotein structure at the angstrom level, and confirmation of predicted structures, therefore, has potential to identify translated microproteins that are of biological importance and to provide molecular mechanism for their in vivo roles.
Affiliation(s)
- Jessica J. Mohsen
  - Department of Chemistry, Yale University, New Haven, CT, USA
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Alina A. Martel
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT, USA
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
7
Chen Y, Su H, Zhao J, Na Z, Jiang K, Bacchiocchi A, Loh KH, Halaban R, Wang Z, Cao X, Slavoff SA. Unannotated microprotein EMBOW regulates the interactome and chromatin and mitotic functions of WDR5. Cell Rep 2023; 42:113145. [PMID: 37725512 PMCID: PMC10629662 DOI: 10.1016/j.celrep.2023.113145] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/15/2022] [Revised: 07/20/2023] [Accepted: 08/31/2023] [Indexed: 09/21/2023] Open
Abstract
The conserved WD40-repeat protein WDR5 interacts with multiple proteins both inside and outside the nucleus. However, it is currently unclear whether and how the distribution of WDR5 between complexes is regulated. Here, we show that an unannotated microprotein EMBOW (endogenous microprotein binder of WDR5) dually encoded in the human SCRIB gene interacts with WDR5 and regulates its binding to multiple interaction partners, including KMT2A and KIF2A. EMBOW is cell cycle regulated, with two expression maxima at late G1 phase and G2/M phase. Loss of EMBOW decreases WDR5 interaction with KIF2A, aberrantly shortens mitotic spindle length, prolongs G2/M phase, and delays cell proliferation. In contrast, loss of EMBOW increases WDR5 interaction with KMT2A, leading to WDR5 binding to off-target genes, erroneously increasing H3K4me3 levels, and activating transcription of these genes. Together, these results implicate EMBOW as a regulator of WDR5 that regulates its interactions and prevents its off-target binding in multiple contexts.
Affiliation(s)
- Yanran Chen
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China; Key Laboratory of Brain Functional Genomics, Ministry of Education and Shanghai, School of Life Sciences, East China Normal University, Shanghai 200062, China
- Haomiao Su
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
- Jianing Zhao
  - Frontier Innovation Center, Department of Systems Biology for Medicine, School of Basic Medical Sciences, Fudan University, Shanghai 200433, China; Shanghai Fifth People's Hospital, Fudan University, Shanghai 200433, China
- Zhenkun Na
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
- Kevin Jiang
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
- Antonella Bacchiocchi
  - Department of Dermatology, Yale University School of Medicine, New Haven, CT 06520, USA
- Ken H Loh
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
- Ruth Halaban
  - Department of Dermatology, Yale University School of Medicine, New Haven, CT 06520, USA
- Zhentian Wang
  - Frontier Innovation Center, Department of Systems Biology for Medicine, School of Basic Medical Sciences, Fudan University, Shanghai 200433, China; Shanghai Fifth People's Hospital, Fudan University, Shanghai 200433, China
- Xiongwen Cao
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT 06520, USA; Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China; Key Laboratory of Brain Functional Genomics, Ministry of Education and Shanghai, School of Life Sciences, East China Normal University, Shanghai 200062, China.
- Sarah A Slavoff
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA.
8
Chen Y, Cao X, Loh KH, Slavoff SA. Chemical labeling and proteomics for characterization of unannotated small and alternative open reading frame-encoded polypeptides. Biochem Soc Trans 2023; 51:1071-1082. [PMID: 37171061 PMCID: PMC10317152 DOI: 10.1042/bst20221074] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/03/2023] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 05/13/2023]
Abstract
Thousands of unannotated small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been revealed in mammalian genomes. While hundreds of mammalian smORF- and alt-ORF-encoded proteins (SEPs and alt-proteins, respectively) affect cell proliferation, the overwhelming majority of smORFs and alt-ORFs remain uncharacterized at the molecular level. Complicating the task of identifying the biological roles of smORFs and alt-ORFs, the SEPs and alt-proteins that they encode exhibit limited sequence homology to protein domains of known function. Experimental techniques for the functionalization of these gene classes are therefore required. Approaches combining chemical labeling and quantitative proteomics have greatly advanced our ability to identify and characterize functional SEPs and alt-proteins in high throughput. In this review, we briefly describe the principles of proteomic discovery of SEPs and alt-proteins, then summarize how these technologies interface with chemical labeling for identification of SEPs and alt-proteins with specific properties, as well as in defining the interactome of SEPs and alt-proteins.
Affiliation(s)
- Yanran Chen
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Xiongwen Cao
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Ken H. Loh
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, U.S.A
9
Zhang M, Song J, Xiao J, Jin J, Nomura CT, Chen S, Wang Q. Engineered multiple translation initiation sites: a novel tool to enhance protein production in Bacillus licheniformis and other industrially relevant bacteria. Nucleic Acids Res 2022; 50:11979-11990. [PMID: 36382403 PMCID: PMC9723656 DOI: 10.1093/nar/gkac1039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/25/2022] [Revised: 10/18/2022] [Accepted: 10/31/2022] [Indexed: 11/17/2022] Open
Abstract
Gram-positive bacteria are a nascent platform for synthetic biology and metabolic engineering that can provide new opportunities for the production of biomolecules. However, the lack of standardized methods and genetic parts is a major obstacle towards attaining the acceptance and widespread use of Gram-positive bacterial chassis for industrial bioproduction. In this study, we have engineered a novel mRNA leader sequence containing more than one ribosomal binding site (RBS) which could initiate translation from multiple sites, vastly enhancing the translation efficiency of the Gram-positive industrial strain Bacillus licheniformis. This is the first report elucidating the impact of more than one RBS to initiate translation and enhance protein output in B. licheniformis. We also explored the application of more than one RBS for both intracellular and extracellular protein production in B. licheniformis to demonstrate its efficiency, consistency and potential for biotechnological applications. Moreover, we applied these concepts for use in other industrially relevant Gram-positive bacteria, such as Bacillus subtilis and Corynebacterium glutamicum. In all, a highly efficient and robust broad-host expression element has been designed to strengthen and fine-tune the protein outputs for the use of bioproduction in microbial cell factories.
Affiliation(s)
- Manyu Zhang
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Environmental Microbial Technology Center of Hubei Province, College of Life Science, Hubei University, Wuhan 430062, China
- Jun Xiao
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Environmental Microbial Technology Center of Hubei Province, College of Life Science, Hubei University, Wuhan 430062, China
- Jingjie Jin
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
- Christopher T Nomura
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Drive, Moscow, ID 83844, USA
- Shouwen Chen
  - Correspondence may also be addressed to Shouwen Chen.
- Qin Wang
  - To whom correspondence should be addressed. Tel: +86 18507140137;
10
Fijalkowski I, Willems P, Jonckheere V, Simoens L, Van Damme P. Hidden in plain sight: challenges in proteomics detection of small ORF-encoded polypeptides. microLife 2022; 3:uqac005. [PMID: 37223358 PMCID: PMC10117744 DOI: 10.1093/femsml/uqac005] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/13/2021] [Revised: 04/18/2022] [Accepted: 04/29/2022] [Indexed: 05/25/2023]
Abstract
Genomic studies of bacteria have long pointed toward widespread prevalence of small open reading frames (sORFs) encoding short proteins, <100 amino acids in length. Despite the mounting genomic evidence of their robust expression, relatively little progress has been made in their mass spectrometry-based detection, and various blanket statements have been used to explain this observed discrepancy. In this study, we provide a large-scale riboproteogenomics investigation of the challenging nature of proteomic detection of such small proteins as informed by conditional translation data. A panel of physicochemical properties alongside recently developed mass spectrometry detectability metrics was interrogated to provide a comprehensive evidence-based assessment of sORF-encoded polypeptide (SEP) detectability. Moreover, a large-scale proteomics and translatomics compendium of proteins produced by Salmonella Typhimurium (S. Typhimurium), a model human pathogen, across a panel of growth conditions is presented and used in support of our in silico SEP detectability analysis. This integrative approach is used to provide a data-driven census of small proteins expressed by S. Typhimurium across growth phases and infection-relevant conditions. Taken together, our study pinpoints current limitations in proteomics-based detection of novel small proteins currently missing from bacterial genome annotations.
Affiliation(s)
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Patrick Willems
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Veronique Jonckheere
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Laure Simoens
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
11
Abstract
Escherichia coli was one of the first species to have its genome sequenced and remains one of the best-characterized model organisms. Thus, it is perhaps surprising that recent studies have shown that a substantial number of genes have been overlooked. Genes encoding more than 140 small proteins, defined as those containing 50 or fewer amino acids, have been identified in E. coli in the past 10 years, and there is substantial evidence indicating that many more remain to be discovered. This review covers the methods that have been successful in identifying small proteins and the short open reading frames that encode them. The small proteins that have been functionally characterized to date in this model organism are also discussed. It is hoped that the review, along with the associated databases of known as well as predicted but undetected small proteins, will aid in and provide a roadmap for the continued identification and characterization of these proteins in E. coli as well as other bacteria.
|
12
|
Cassidy L, Kaulich PT, Maaß S, Bartel J, Becher D, Tholey A. Bottom-up and top-down proteomic approaches for the identification, characterization, and quantification of the low molecular weight proteome with focus on short open reading frame-encoded peptides. Proteomics 2021; 21:e2100008. [PMID: 34145981 DOI: 10.1002/pmic.202100008] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 05/03/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 01/14/2023]
Abstract
The recent discovery of alternative open reading frames creates a need for suitable analytical approaches to verify their translation and to characterize the corresponding gene products at the molecular level. As the analysis of small proteins within a background proteome by means of classical bottom-up proteomics is challenging, method development for the analysis of small open reading frame encoded peptides (SEPs) has become a focal point for research. Here, we highlight bottom-up and top-down proteomics approaches established for the analysis of SEPs in both pro- and eukaryotes. Major steps of analysis, including sample preparation and (small) proteome isolation, separation and mass spectrometry, data interpretation and quality control, quantification, the analysis of post-translational modifications, and exploration of functional aspects of the SEPs by means of proteomics technologies are described. These methods do not exclusively cover the analytics of SEPs but simultaneously include the low molecular weight proteome, and moreover, can also be used for the proteome-wide analysis of proteolytic processing events.
Affiliation(s)
- Liam Cassidy
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Philipp T Kaulich
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Sandra Maaß
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
| | - Jürgen Bartel
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
| | - Dörte Becher
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
| | - Andreas Tholey
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
|
13
|
Fuchs S, Kucklick M, Lehmann E, Beckmann A, Wilkens M, Kolte B, Mustafayeva A, Ludwig T, Diwo M, Wissing J, Jänsch L, Ahrens CH, Ignatova Z, Engelmann S. Towards the characterization of the hidden world of small proteins in Staphylococcus aureus, a proteogenomics approach. PLoS Genet 2021; 17:e1009585. [PMID: 34061833 PMCID: PMC8195425 DOI: 10.1371/journal.pgen.1009585] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/24/2020] [Revised: 06/11/2021] [Accepted: 05/07/2021] [Indexed: 01/08/2023] Open
Abstract
Small proteins play essential roles in bacterial physiology and virulence; however, automated algorithms for genome annotation are often not yet able to accurately predict the corresponding genes. The accuracy and reliability of genome annotations, particularly for small open reading frames (sORFs), can be significantly improved by integrating protein evidence from experimental approaches. Here we present a highly optimized and flexible bioinformatics workflow for bacterial proteogenomics covering all steps from (i) generation of protein databases, (ii) database searches and (iii) peptide-to-genome mapping to (iv) visualization of results. We used the workflow to identify high quality peptide spectrum matches (PSMs) for small proteins (≤ 100 aa, SP100) in Staphylococcus aureus Newman. Protein extracts from S. aureus were subjected to different experimental workflows for protein digestion and prefractionation and measured with highly sensitive mass spectrometers. In total, 175 proteins with up to 100 aa (SP100) were identified. Of these, 24 (ranging from 9 to 99 aa) were novel and not contained in the used genome annotation. 144 SP100 are highly conserved and were found in at least 50% of the publicly available S. aureus genomes, while 127 are additionally conserved in other staphylococci. Almost half of the identified SP100 were basic, suggesting a role in binding to more acidic molecules such as nucleic acids or phospholipids.
Affiliation(s)
- Stephan Fuchs
- Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany
| | - Martin Kucklick
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Erik Lehmann
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Alexander Beckmann
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Maya Wilkens
- Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Baban Kolte
- University of Hamburg, Institute of Biochemistry and Molecular Biology, Hamburg, Germany
| | - Ayten Mustafayeva
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Tobias Ludwig
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Maurice Diwo
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
| | - Josef Wissing
- Helmholtz Center for Infection Research GmbH, Cellular Proteomics, Braunschweig, Germany
| | - Lothar Jänsch
- Helmholtz Center for Infection Research GmbH, Cellular Proteomics, Braunschweig, Germany
| | - Christian H Ahrens
- Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Zoya Ignatova
- University of Hamburg, Institute of Biochemistry and Molecular Biology, Hamburg, Germany
| | - Susanne Engelmann
- University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany
- Helmholtz Center for Infection Research GmbH, Microbial Proteomics, Braunschweig, Germany
|
14
|
Fijalkowska D, Fijalkowski I, Willems P, Van Damme P. Bacterial riboproteogenomics: the era of N-terminal proteoform existence revealed. FEMS Microbiol Rev 2021; 44:418-431. [PMID: 32386204 DOI: 10.1093/femsre/fuaa013] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Accepted: 05/07/2020] [Indexed: 12/17/2022] Open
Abstract
With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. We here discuss underexplored sources of protein diversity and new methodologies for high-throughput genome reannotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. In consequence, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.
Affiliation(s)
- Daria Fijalkowska
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
| | - Igor Fijalkowski
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
| | - Patrick Willems
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
| | - Petra Van Damme
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
|
15
|
Nasir MA, Nawaz S, Huang J. A Mini-review of Computational Approaches to Predict Functions and Findings of Novel Micro Peptides. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200811130522] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
New techniques in bioinformatics and large-scale transcriptome studies have shown that a larger part of the genome is translated than previously thought, producing a variety of RNAs with protein-coding and noncoding potential. Growing evidence indicates that many RNAs were classified as noncoding for a variety of reasons; for example, many sORFs that encode functional micro peptides were neglected because of their small size.
Recent studies reveal major biological functions of these sORFs and their encoded micro peptides in a wide range of species. Progress in identifying these sORFs and micro peptides is due to advances in bioinformatics and high-throughput sequencing methods, and the field has attracted more attention as large numbers of additional sORFs and micro peptides have been detected. COVID-19 currently commands much scientific attention as a sudden outbreak, and revealing the sORFs of the virus could offer new ways to understand it. This review discusses ongoing progress in methods for the identification of sORFs and micro peptides.
Affiliation(s)
- Mohsin Ali Nasir
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Samia Nawaz
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
|
16
|
Meydan S, Klepacki D, Mankin AS, Vázquez-Laslop N. Identification of Translation Start Sites in Bacterial Genomes. Methods Mol Biol 2021; 2252:27-55. [PMID: 33765270 DOI: 10.1007/978-1-0716-1150-0_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Abstract
The knowledge of translation start sites is crucial for annotation of genes in bacterial genomes. However, systematic mapping of start codons in bacterial genes has mainly relied on predictions based on protein conservation and mRNA sequence features which, although useful, are not always accurate. We recently found that the pleuromutilin antibiotic retapamulin (RET) is a specific inhibitor of translation initiation that traps ribosomes specifically at start codons, and we used it in combination with ribosome profiling to map start codons in the Escherichia coli genome. This genome-wide strategy, that was named Ribo-RET, not only verifies the position of start codons in already annotated genes but also enables identification of previously unannotated open reading frames and reveals the presence of internal start sites within genes. Here, we provide a detailed Ribo-RET protocol for E. coli. Ribo-RET can be adapted for mapping the start codons of the protein-coding sequences in a variety of bacterial species.
Affiliation(s)
- Sezen Meydan
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Dorota Klepacki
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
| | - Alexander S Mankin
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
| | - Nora Vázquez-Laslop
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
|
17
|
Khitun A, Slavoff SA. Proteomic Detection and Validation of Translated Small Open Reading Frames. Curr Protoc Chem Biol 2020; 11:e77. [PMID: 31750990 DOI: 10.1002/cpch.77] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/13/2022]
Abstract
Small open reading frames (smORFs) encode previously unannotated polypeptides or short proteins that regulate translation in cis (eukaryotes) and/or are independently functional (prokaryotes and eukaryotes). Ongoing efforts for complete annotation and functional characterization of smORF-encoded proteins have yielded novel regulators and therapeutic targets. However, because they are excluded from protein databases, initiate at non-AUG start codons, and produce few unique tryptic peptides, unannotated small proteins cannot be detected with standard proteomic methods. Here, we outline a procedure for mass spectrometry-based detection of translated smORFs in cultured human cells from protein extraction, digestion, and LC-MS/MS, to database preparation and data analysis. Following proteomic detection, translation from a unique smORF may be validated via siRNA-based silencing or overexpression and epitope tagging. This is necessary to unambiguously assign a peptide to a smORF within a specific transcript isoform or genomic locus. Provided that sufficient starting material is available, this workflow can be applied to any cell type/organism and adjusted to study specific (patho)physiological contexts including, but not limited to, development, stress, and disease. © 2019 by John Wiley & Sons, Inc.
- Basic Protocol 1: Protein extraction, size selection, and trypsin digestion
- Alternate Protocol 1: In-solution C8 column size selection
- Support Protocol 1: Chloroform/methanol precipitation
- Support Protocol 2: Reduction, alkylation, and in-solution protease digestion
- Support Protocol 3: Peptide de-salting
- Basic Protocol 2: Two-dimensional LC-MS/MS with ERLIC fractionation
- Basic Protocol 3: Transcriptomic database construction
- Alternate Protocol 2: Transcriptomics database generation with gffread
- Basic Protocol 4: Non-annotated peptide identification from LC-MS/MS data
- Basic Protocol 5: Validation using isotopically labeled synthetic peptide standards and siRNA
- Basic Protocol 6: Transcript validation using transient overexpression
Affiliation(s)
- Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
|
18
|
Cao X, Khitun A, Na Z, Dumitrescu DG, Kubica M, Olatunji E, Slavoff SA. Comparative Proteomic Profiling of Unannotated Microproteins and Alternative Proteins in Human Cell Lines. J Proteome Res 2020; 19:3418-3426. [PMID: 32449352 DOI: 10.1021/acs.jproteome.0c00254] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Indexed: 12/22/2022]
Abstract
Ribosome profiling and mass spectrometry have revealed thousands of small and alternative open reading frames (sm/alt-ORFs) that are translated into polypeptides variously termed as microproteins and alt-proteins in mammalian cells. Some micro-/alt-proteins exhibit stress-, cell-type-, and/or tissue-specific expression; understanding this regulated expression will be critical to elucidating their functions. While differential translation has been inferred by ribosome profiling, quantitative mass spectrometry-based proteomics is needed for direct comparison of microprotein and alt-protein expression between samples and conditions. However, while label-free quantitative proteomics has been applied to detect stress-dependent expression of bacterial microproteins, this approach has not yet been demonstrated for analysis of differential expression of unannotated ORFs in the more complex human proteome. Here, we present global micro-/alt-protein quantitation in two human leukemia cell lines, K562 and MOLT4. We identify 12 unannotated proteins that are differentially expressed in these cell lines. The expression of six micro/alt-proteins from cDNA was validated biochemically, and two were found to localize to the nucleus. Thus, we demonstrate that label-free comparative proteomics enables quantitation of micro-/alt-protein expression between human cell lines. We anticipate that this workflow will enable the discovery of regulated sm/alt-ORF products across many biological conditions in human cells.
Affiliation(s)
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States
| | - Alexandra Khitun
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States
| | - Zhenkun Na
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States
| | - Daniel G Dumitrescu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
| | - Marcelina Kubica
- Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States
| | - Elizabeth Olatunji
- Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States; Chemical Biology Institute, Yale University, West Haven, Connecticut 06516, United States; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06529, United States
|
19
|
Cao X, Slavoff SA. Non-AUG start codons: Expanding and regulating the small and alternative ORFeome. Exp Cell Res 2020; 391:111973. [PMID: 32209305 DOI: 10.1016/j.yexcr.2020.111973] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 10/01/2019] [Revised: 03/10/2020] [Accepted: 03/18/2020] [Indexed: 01/17/2023]
Abstract
Recent ribosome profiling and proteomic studies have revealed the presence of thousands of novel coding sequences, referred to as small open reading frames (sORFs), in prokaryotic and eukaryotic genomes. These genes have defied discovery via traditional genomic tools not only because they tend to be shorter than standard gene annotation length cutoffs, but also because they are, as a class, enriched in sequence properties previously assumed to be unusual, including non-AUG start codons. In this review, we summarize what is currently known about the incidence, efficiency, and mechanism of non-AUG start codon usage in prokaryotes and eukaryotes, and provide examples of regulatory and functional sORFs that initiate at non-AUG codons. While only a handful of non-AUG-initiated novel genes have been characterized in detail to date, their participation in important biological processes suggests that an improved understanding of this class of genes is needed.
Affiliation(s)
- Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT, 06520, United States; Chemical Biology Institute, Yale University, West Haven, CT, 06516, United States
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT, 06520, United States; Chemical Biology Institute, Yale University, West Haven, CT, 06516, United States; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06529, United States
|
20
|
Orr MW, Mao Y, Storz G, Qian SB. Alternative ORFs and small ORFs: shedding light on the dark proteome. Nucleic Acids Res 2020; 48:1029-1042. [PMID: 31504789 DOI: 10.1093/nar/gkz734] [Citation(s) in RCA: 183] [Impact Index Per Article: 36.6] [Received: 07/04/2019] [Revised: 08/03/2019] [Accepted: 08/15/2019] [Indexed: 02/06/2023] Open
Abstract
Traditional annotation of protein-encoding genes relied on assumptions, such as one open reading frame (ORF) encodes one protein and minimal lengths for translated proteins. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. The findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs as well as in the understanding of functions and the regulation of their expression and posit questions for future work.
Affiliation(s)
- Mona Wu Orr
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Yuanhui Mao
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
| | - Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Shu-Bing Qian
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
|
21
|
Cardon T, Salzet M, Franck J, Fournier I. Nuclei of HeLa cells interactomes unravel a network of ghost proteins involved in proteins translation. Biochim Biophys Acta Gen Subj 2019; 1863:1458-1470. [PMID: 31128158 DOI: 10.1016/j.bbagen.2019.05.009] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/19/2018] [Revised: 04/18/2019] [Accepted: 05/14/2019] [Indexed: 11/29/2022]
Abstract
Ghost proteins are issued from alternative Open Reading Frames (ORFs) and are missing a genome annotation. Indeed, historical filters applied for the detection of putative translated ORFs led to a wrong classification of transcripts as non-coding, although translated proteins can be detected by proteomics. This Ghost (also called Alternative) proteome was neglected, and one major issue is to identify the involvement of the Ghost proteins in biological processes. In this context, we aimed to identify the protein-protein interactions (PPIs) of the Ghost proteins. For that, we re-explored a cross-linking mass spectrometry (XL-MS) study performed on nuclei of HeLa cells, in association with the HaltOrf database. Among 1679 cross-link interactions identified, 292 involve Ghost proteins. Forty-four of these Ghost proteins are found to interact with 7 Reference proteins related to ribonucleoproteins, ribosome subunits and the zinc finger protein network. We, thus, have focused our attention on the heterotrimer between the ARE/poly(U)-binding/degradation factor 1 (AUF1), the ribosomal protein L10 (RPL10) and AltATAD2. Using I-Tasser software we performed docking models from which we could suggest the attachment of AUF1 on the external part of RPL10 and the interaction of AltATAD2 on the RPL10 region interacting with 5S ribosomal RNA as a mechanism of regulation of the ribosome. Taken together, these results reveal the importance of Ghost proteins within known protein interaction networks.
Affiliation(s)
- Tristan Cardon
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Michel Salzet
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Julien Franck
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Isabelle Fournier
- Inserm, U1192 - Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
|
22
|
Retapamulin-Assisted Ribosome Profiling Reveals the Alternative Bacterial Proteome. Mol Cell 2019; 74:481-493.e6. [PMID: 30904393 DOI: 10.1016/j.molcel.2019.02.017] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Received: 11/19/2018] [Revised: 01/25/2019] [Accepted: 02/12/2019] [Indexed: 12/21/2022]
Abstract
The use of alternative translation initiation sites enables production of more than one protein from a single gene, thereby expanding the cellular proteome. Although several such examples have been serendipitously found in bacteria, genome-wide mapping of alternative translation start sites has been unattainable. We found that the antibiotic retapamulin specifically arrests initiating ribosomes at start codons of the genes. Retapamulin-enhanced Ribo-seq analysis (Ribo-RET) not only allowed mapping of conventional initiation sites at the beginning of the genes, but strikingly, it also revealed putative internal start sites in a number of Escherichia coli genes. Experiments demonstrated that the internal start codons can be recognized by the ribosomes and direct translation initiation in vitro and in vivo. Proteins, whose synthesis is initiated at internal in-frame and out-of-frame start sites, can be functionally important and contribute to the "alternative" bacterial proteome. The internal start sites may also play regulatory roles in gene expression.
|
23
|
Khitun A, Ness TJ, Slavoff SA. Small open reading frames and cellular stress responses. Mol Omics 2019; 15:108-116. [PMID: 30810554 DOI: 10.1039/c8mo00283e] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Indexed: 12/11/2022]
Abstract
Small open reading frames (smORFs) encoding polypeptides of less than 100 amino acids in eukaryotes (50 amino acids in prokaryotes) were historically excluded from genome annotation. However, recent advances in genomics, ribosome footprinting, and proteomics have revealed thousands of translated smORFs in genomes spanning evolutionary space. These smORFs can encode functional polypeptides, or act as cis-translational regulators. Herein we review evidence that some smORF-encoded polypeptides (SEPs) participate in stress responses in both prokaryotes and eukaryotes, and that some upstream ORFs (uORFs) regulate stress-responsive translation of downstream cistrons in eukaryotic cells. These studies provide insight into a regulated subclass of smORFs and suggest that at least some SEPs may participate in maintenance of cellular homeostasis under stress.
Affiliation(s)
- Alexandra Khitun
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA
| | - Travis J Ness
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA
| | - Sarah A Slavoff
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
|
24
|
Meydan S, Vázquez-Laslop N, Mankin AS. Genes within Genes in Bacterial Genomes. Microbiol Spectr 2018; 6:10.1128/microbiolspec.rwr-0020-2018. [PMID: 30003865 PMCID: PMC11633611 DOI: 10.1128/microbiolspec.rwr-0020-2018] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/15/2018] [Indexed: 12/13/2022] Open
Abstract
Genetic coding in bacteria largely operates via the "one gene-one protein" paradigm. However, the peculiarities of the mRNA structure, the versatility of the genetic code, and the dynamic nature of translation sometimes allow organisms to deviate from the standard rules of protein encoding. Bacteria can use several unorthodox modes of translation to express more than one protein from a single mRNA cistron. One such alternative path is the use of additional translation initiation sites within the gene. Proteins whose translation is initiated at different start sites within the same reading frame will differ in their N termini but will have identical C-terminal segments. On the other hand, alternative initiation of translation in a register different from the frame dictated by the primary start codon will yield a protein whose sequence is entirely different from the one encoded in the main frame. The use of internal mRNA codons as translation start sites is controlled by the nucleotide sequence and the mRNA folding. The proteins of the alternative proteome generated via the "genes-within-genes" strategy may carry important functions. In this review, we summarize the currently known examples of bacterial genes encoding more than one protein due to the utilization of additional translation start sites and discuss the known or proposed functions of the alternative polypeptides in relation to the main protein product of the gene. We also discuss recent proteome- and genome-wide approaches that will allow the discovery of novel translation initiation sites in a systematic fashion.
Affiliation(s)
- Sezen Meydan
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL 60607
| | - Nora Vázquez-Laslop
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL 60607
| | - Alexander S Mankin
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL 60607
|