1
Gough L, Miranda R, Hemm M, Norman L, Jara B. Faculty perceptions of a professional development program for developing CUREs and promoting inclusive and equitable teaching. Journal of Microbiology & Biology Education 2025; 26:e0021524. [PMID: 39804063 PMCID: PMC12020812 DOI: 10.1128/jmbe.00215-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2024] [Accepted: 12/07/2024] [Indexed: 04/25/2025]
Abstract
The Diffusion of Innovations (DOI) model can be used to explore how faculty prioritize learning about and adopting new pedagogical approaches. Here, we use the DOI framework to contextualize biology faculty perceptions of a professional development (PD) program designed to help them create a full semester course-based undergraduate research experience (CURE) class at a large, public comprehensive university. PD sessions included exploring self-reflexive identity while fostering inclusive classroom spaces through understanding and interrupting implicit bias and microaggressions. This qualitative study sought to determine 11 biology faculty members' beliefs about the influence of their year-long PD on their CURE development and teaching practices. Findings suggest that faculty were motivated to teach CUREs for a variety of reasons. A common incentive was integrating research into a CURE to bring their passion into their classroom and to engage more students in research. This may be particularly important at institutions where faculty have a heavy teaching load. Faculty also reported modifying their teaching in their CUREs and other courses to be more inclusive and equitable. The importance of peer interactions in the PD was emphasized repeatedly as faculty learned from experts, the literature, and faculty who had already developed a CURE. Our results illustrate that a community of practice structure can enhance the learning aspect of the community, helping faculty consider their implementation of inclusive, equitable, and high-impact practices as an ongoing educational process for themselves and emphasizing the importance of reflection and iteration in a DOI framework.
Affiliation(s)
- Laura Gough
  - Department of Biological Sciences, Towson University, Towson, Maryland, USA
- Rommel Miranda
  - Department of Physics, Astronomy, and Geosciences, Towson University, Towson, Maryland, USA
- Matthew Hemm
  - Department of Biological Sciences, Towson University, Towson, Maryland, USA
- Leann Norman
  - Department of Biological Sciences, Towson University, Towson, Maryland, USA
- Brian Jara
  - Office of Inclusion and Institutional Equity, Towson University, Towson, Maryland, USA

2
Iwadate Y, Slauch JM. The CorC proteins MgpA (YoaE) and CorC protect from excess-cation stress and are required for egg white tolerance and virulence in Salmonella. bioRxiv 2025:2025.03.18.643926. [PMID: 40166170 PMCID: PMC11957008 DOI: 10.1101/2025.03.18.643926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Cation homeostasis is a vital function. In Salmonella, growth in very low Mg2+ induces expression of high-affinity Mg2+ transporters and synthesis of polyamines, organic cations that substitute for Mg2+. Once Mg2+ levels are re-established, the polyamines must be excreted by PaeA. Otherwise, cells lose viability due to a condition we term excess-cation stress. We sought additional tolerance mechanisms for this stress. We show that CorC and MgpA (YoaE) are essential for survival in stationary phase after Mg2+ starvation. Deletion of corC causes a loss of viability additive with the paeA phenotype. Deletion of mgpA causes a synthetic defect in the corC background. This lethality is suppressed by loss of the inducible Mg2+ transporters, suggesting that the corC mgpA mutant is sensitive to changes in intracellular Mg2+. CorC and MgpA function independently of PaeA. A paeA mutant is sensitive to externally added polyamine in stationary phase; loss of CorC and MgpA suppressed this sensitivity. Conversely, the corC mgpA mutant, but not the paeA mutant, exhibited sensitivity to high Mg2+ and egg white. The corC mgpA mutant is also attenuated in a mouse model. The corC and mgpA genes are induced in response to increased Mg2+ concentrations. Thus, CorC and MgpA play some interrelated role in cation homeostasis. It is unlikely that these phenotypes are due to absolute levels of cations. Rather, the cell maintains relative concentrations of various cations that likely compete for binding to anionic components. Imbalance of these cations affects some essential function(s), leading to a loss of viability.
Affiliation(s)
- Yumi Iwadate
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- James M. Slauch
  - Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA

3
Ando Y, Kobo A, Niwa T, Yamakawa A, Konoma S, Kobayashi Y, Nureki O, Taguchi H, Itoh Y, Chadani Y. A mini-hairpin shaped nascent peptide blocks translation termination by a distinct mechanism. Nat Commun 2025; 16:2323. [PMID: 40057501 PMCID: PMC11890864 DOI: 10.1038/s41467-025-57659-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Accepted: 02/25/2025] [Indexed: 05/13/2025] Open
Abstract
Protein synthesis by ribosomes produces functional proteins but also serves diverse regulatory functions, which depend on the coding amino acid sequences. Certain nascent peptides interact with the ribosome exit tunnel to arrest translation and modulate themselves or the expression of downstream genes. However, a comprehensive understanding of the mechanisms of such ribosome stalling and its regulation remains elusive. In this study, we systematically screen for unidentified ribosome arrest peptides through phenotypic evaluation, proteomics, and mass spectrometry analyses, leading to the discovery of the arrest peptides PepNL and NanCL in E. coli. Our cryo-EM study on PepNL reveals a distinct arrest mechanism, in which the N-terminus of PepNL folds back towards the tunnel entrance to prevent the catalytic GGQ motif of the release factor from accessing the peptidyl transferase center, causing translation arrest at the UGA stop codon. Furthermore, unlike sensory arrest peptides that require an arrest inducer, PepNL uses tryptophan as an arrest inhibitor, where Trp-tRNATrp reads through the stop codon. Our findings illuminate the mechanism and regulatory framework of nascent peptide-induced translation arrest, paving the way for exploring regulatory nascent peptides.
Affiliation(s)
- Yushin Ando
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Akinao Kobo
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
- Tatsuya Niwa
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
  - Cell Biology Center, Institute of Integrated Research, Institute of Science Tokyo, Yokohama, Japan
- Ayako Yamakawa
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
- Suzuna Konoma
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
- Yuki Kobayashi
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
- Osamu Nureki
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Hideki Taguchi
  - School of Life Science and Technology, Institute of Science Tokyo, Yokohama, Japan
  - Cell Biology Center, Institute of Integrated Research, Institute of Science Tokyo, Yokohama, Japan
- Yuzuru Itoh
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Yuhei Chadani
  - Faculty of Environmental, Life, Natural Science and Technology, Okayama University, Okayama, Japan

4
Vellappan S, Sun J, Favate J, Jagadeesan P, Cerda D, Shah P, Yadavalli SS. Translation profiling of stress-induced small proteins reveals a novel link among signaling systems. bioRxiv 2024:2024.09.13.612970. [PMID: 39345582 PMCID: PMC11429745 DOI: 10.1101/2024.09.13.612970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Signaling networks allow adaptation to stressful environments by activating genes that counteract stressors. Small proteins (≤ 50 amino acids long) are a rising class of stress response regulators. Escherichia coli encodes over 150 small proteins, most of which lack phenotypes, and their biological roles remain elusive. Using magnesium limitation as a stressor, we identify stress-induced small proteins using ribosome profiling, RNA sequencing, and transcriptional reporter assays. We uncover 17 small proteins with increased translation initiation, several of them transcriptionally upregulated by the PhoQ-PhoP two-component signaling system, which is crucial for magnesium homeostasis. Next, we describe small protein-specific deletion and overexpression phenotypes, underscoring their physiological significance in low magnesium stress. Most remarkably, we elucidate an unusual connection, via the small membrane protein YoaI, between two major signaling networks in E. coli, PhoR-PhoB and EnvZ-OmpR, advancing our understanding of small protein regulators in cellular signaling.
Affiliation(s)
- Sangeevan Vellappan
  - Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ USA
  - Department of Genetics, School of Arts and Sciences, Rutgers University, Piscataway, NJ USA
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
- Junhong Sun
  - Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ USA
- John Favate
  - Department of Genetics, School of Arts and Sciences, Rutgers University, Piscataway, NJ USA
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
- Pranavi Jagadeesan
  - Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ USA
- Debbie Cerda
  - Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ USA
  - Department of Genetics, School of Arts and Sciences, Rutgers University, Piscataway, NJ USA
- Premal Shah
  - Department of Genetics, School of Arts and Sciences, Rutgers University, Piscataway, NJ USA
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
- Srujana S. Yadavalli
  - Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ USA
  - Department of Genetics, School of Arts and Sciences, Rutgers University, Piscataway, NJ USA

5
Tufail MA, Jordan B, Hadjeras L, Gelhausen R, Cassidy L, Habenicht T, Gutt M, Hellwig L, Backofen R, Tholey A, Sharma CM, Schmitz RA. Uncovering the small proteome of Methanosarcina mazei using Ribo-seq and peptidomics under different nitrogen conditions. Nat Commun 2024; 15:8659. [PMID: 39370430 PMCID: PMC11456600 DOI: 10.1038/s41467-024-53008-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 09/25/2024] [Indexed: 10/08/2024] Open
Abstract
The mesophilic methanogenic archaeal model organism Methanosarcina mazei strain Gö1 is crucial for climate and environmental research due to its ability to produce methane. Here, we establish a Ribo-seq protocol for M. mazei strain Gö1 under two growth conditions (nitrogen sufficiency and limitation). The translation of 93 previously annotated and 314 unannotated small ORFs, coding for proteins ≤ 70 amino acids, is predicted with high confidence based on Ribo-seq data. LC-MS analysis validates the translation for 62 annotated small ORFs and 26 unannotated small ORFs. Epitope tagging followed by immunoblotting analysis confirms the translation of 13 out of 16 selected unannotated small ORFs. A comprehensive differential transcription and translation analysis reveals that 29 of 314 unannotated small ORFs are differentially regulated in response to nitrogen availability at the transcriptional and 49 at the translational level. A high number of reported small RNAs are emerging as dual-function RNAs, including sRNA154, the central regulatory small RNA of nitrogen metabolism. Several unannotated small ORFs are conserved in Methanosarcina species and overproducing several (small ORF encoded) small proteins suggests key physiological functions. Overall, the comprehensive analysis opens an avenue to elucidate the function(s) of multitudinous small proteins and dual-function RNAs in M. mazei.
Affiliation(s)
- Britta Jordan
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Lydia Hadjeras
  - Institute of Molecular Infection Biology, University of Würzburg, 97080, Würzburg, Germany
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Kiel University, 24105, Kiel, Germany
- Tim Habenicht
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Miriam Gutt
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Lisa Hellwig
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Kiel University, 24105, Kiel, Germany
- Cynthia M Sharma
  - Institute of Molecular Infection Biology, University of Würzburg, 97080, Würzburg, Germany
- Ruth A Schmitz
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany

6
Vallejo-Schmidt T, Palm C, Obiorah T, Koudjra AR, Schmidt K, Scudder AH, Guzman-Cruz E, Ingram LP, Erickson BC, Akingbehin V, Riddick T, Hamilton S, Riaz T, Alexander Z, Anderson JT, Bader C, Calkins PH, Chaudhry SS, Collins H, Conteh M, Dada TA, David J, Fallah D, De Leon R, Duff R, Eromosele IR, Jones JK, Keshmiri N, Mercanti MA, Onwezi-Nwugwo J, Ojo MA, Pascoe ER, Poteat AM, Price SE, Riedlbauer D, Rolle LTA, Shoemaker P, Stefano A, Sterling MK, Sultana S, Toneygay L, Williams AN, Nallar S, Weldon JE, Snyder GA, Snyder MLD. Characterization of the Structural Requirements for the NADase Activity of Bacterial Toll/IL-1R domains in a Course-based Undergraduate Research Experience. Immunohorizons 2024; 8:563-576. [PMID: 39172026 PMCID: PMC11374754 DOI: 10.4049/immunohorizons.2300062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 07/29/2024] [Indexed: 08/23/2024] Open
Abstract
TLRs initiate innate immune signaling pathways via Toll/IL-1R (TIR) domains on their cytoplasmic tails. Various bacterial species also express TIR domain-containing proteins that contribute to bacterial evasion of the innate immune system. Bacterial TIR domains, along with the mammalian sterile α and TIR motif-containing protein 1 and TIRs from plants, also have been found to exhibit NADase activity. Initial X-ray crystallographic studies of the bacterial TIR from Acinetobacter baumannii provided insight into bacterial TIR structure but were unsuccessful in cocrystallization with the NAD+ ligand, leading to further questions about the TIR NAD binding site. In this study, we designed a Course-Based Undergraduate Research Experience (CURE) involving 16-20 students per year to identify amino acids crucial for NADase activity of A. baumannii TIR domain protein and the TIR from Escherichia coli (TIR domain-containing protein C). Students used structural data to identify amino acids that they hypothesized would play a role in TIR NADase activity, and created plasmids to express mutated TIRs through site-directed mutagenesis. Mutant TIRs were expressed, purified, and tested for NADase activity. The results from these studies provide evidence for a conformational change upon NAD binding, as was predicted by recent cryogenic electron microscopy and hydrogen-deuterium exchange mass spectrometry studies. Along with corroborating recent characterization of TIR NADases that could contribute to drug development for diseases associated with dysregulated TIR activity, this work also highlights the value of CURE-based projects for inclusion of a diverse group of students in authentic research experiences.
Affiliation(s)
- Cheyenne Palm
  - Department of Biological Sciences, Towson University, Towson, MD
- Trinity Obiorah
  - Department of Biological Sciences, Towson University, Towson, MD
- Katrina Schmidt
  - Department of Biological Sciences, Towson University, Towson, MD
- Eber Guzman-Cruz
  - Department of Biological Sciences, Towson University, Towson, MD
- Terra Riddick
  - Department of Biological Sciences, Towson University, Towson, MD
- Sarah Hamilton
  - Department of Biological Sciences, Towson University, Towson, MD
- Tahreem Riaz
  - Department of Biological Sciences, Towson University, Towson, MD
- Charlotte Bader
  - Department of Biological Sciences, Towson University, Towson, MD
- Haley Collins
  - Department of Biological Sciences, Towson University, Towson, MD
- Maimunah Conteh
  - Department of Biological Sciences, Towson University, Towson, MD
- Tope A. Dada
  - Department of Biological Sciences, Towson University, Towson, MD
- Jaira David
  - Department of Biological Sciences, Towson University, Towson, MD
- Daniel Fallah
  - Department of Biological Sciences, Towson University, Towson, MD
- Raquel De Leon
  - Department of Biological Sciences, Towson University, Towson, MD
- Rachel Duff
  - Department of Biological Sciences, Towson University, Towson, MD
- Jaliyl K. Jones
  - Department of Biological Sciences, Towson University, Towson, MD
- Mark A. Mercanti
  - Department of Biological Sciences, Towson University, Towson, MD
- Michael A. Ojo
  - Department of Biological Sciences, Towson University, Towson, MD
- Emily R. Pascoe
  - Department of Biological Sciences, Towson University, Towson, MD
- Ariana M. Poteat
  - Department of Biological Sciences, Towson University, Towson, MD
- Sarah E. Price
  - Department of Biological Sciences, Towson University, Towson, MD
- Payton Shoemaker
  - Department of Biological Sciences, Towson University, Towson, MD
- Alanna Stefano
  - Department of Biological Sciences, Towson University, Towson, MD
- Samina Sultana
  - Department of Biological Sciences, Towson University, Towson, MD
- Lindsey Toneygay
  - Department of Biological Sciences, Towson University, Towson, MD
- Sheeram Nallar
  - Division of Vaccine Research, Institute of Human Virology, Department of Microbiology and Immunology, University of Maryland, School of Medicine, Baltimore, MD
- John E. Weldon
  - Department of Biological Sciences, Towson University, Towson, MD
- Greg A. Snyder
  - Division of Vaccine Research, Institute of Human Virology, Department of Microbiology and Immunology, University of Maryland, School of Medicine, Baltimore, MD

7
Salgado JCS, Alnoch RC, Polizeli MDLTDM, Ward RJ. Microenzymes: Is There Anybody Out There? Protein J 2024; 43:393-404. [PMID: 38507106 DOI: 10.1007/s10930-024-10193-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/08/2024] [Indexed: 03/22/2024]
Abstract
Biological macromolecules are found in different shapes and sizes. Among these, enzymes catalyze biochemical reactions and are essential in all organisms, but is there a size limit below which they cannot function properly? Large enzymes such as catalases reach hundreds of kDa and are formed by multiple subunits, whereas most enzymes are smaller, with molecular weights of 20-60 kDa. Enzymes smaller than 10 kDa could be called microenzymes, and the present literature review brings together evidence of their occurrence in nature. Additionally, bioactive peptides could be a natural source of novel microenzymes hidden in larger peptides, and molecular downsizing could be useful for engineering artificial enzymes of low molecular weight, improving their stability and heterologous expression. An integrative approach is crucial to discover and determine the amino acid sequences of novel microenzymes, together with their genomic identification and their biochemical, biological, and evolutionary functions.
Affiliation(s)
- Jose Carlos Santos Salgado
  - Department of Chemistry, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-900, São Paulo, Brazil
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
- Robson Carlos Alnoch
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil
- Maria de Lourdes Teixeira de Moraes Polizeli
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil
- Richard John Ward
  - Department of Chemistry, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-900, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil

8
Sinha PR, Balasubramanian R, Hegde SR. Integrated sequence and -omic features reveal novel small proteome of Mycobacterium tuberculosis. Front Microbiol 2024; 15:1335310. [PMID: 38812687 PMCID: PMC11133741 DOI: 10.3389/fmicb.2024.1335310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Accepted: 04/15/2024] [Indexed: 05/31/2024] Open
Abstract
Bioinformatic studies on small proteins are under-represented due to difficulties in annotation posed by their small size. However, recent discoveries emphasize the functional significance of small proteins in cellular processes including cell signaling, metabolism, and adaptation to stress. In this study, we utilized a Random Forest classifier trained on sequence features, RNA-Seq, and Ribo-Seq data to uncover small proteins (smORFs) in M. tuberculosis. Independent predictions for the exponential and starvation conditions resulted in 695 potential smORFs. We examined the functional implications of these smORFs using homology searches, LC-MS/MS, and ChIP-seq data, testing their expression in diverse growth conditions, and identifying protein domains. We provide evidence that some of these smORFs could be part of operons, or exist as upstream ORFs. This expanded data resource for the proteins of M. tuberculosis would aid in fine-tuning the existing protein and gene regulatory networks, thereby improving system-wide studies. The primary goal of this study was to uncover and characterize smORFs in M. tuberculosis through bioinformatic analysis, shedding light on their functional roles and genomic organization. Further investigation of these potential smORFs would provide valuable insights into the genome organization and functional diversity of the M. tuberculosis proteome.
Affiliation(s)
- Shubhada R. Hegde
  - Institute of Bioinformatics and Applied Biotechnology (IBAB), Bengaluru, India

9
Miravet-Verde S, Mazzolini R, Segura-Morales C, Broto A, Lluch-Senar M, Serrano L. ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs. Nat Commun 2024; 15:2091. [PMID: 38453908 PMCID: PMC10920889 DOI: 10.1038/s41467-024-46112-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 02/14/2024] [Indexed: 03/09/2024] Open
Abstract
Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectroscopy that can identify SEPs while providing other insights in a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.
Affiliation(s)
- Samuel Miravet-Verde
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Carolina Segura-Morales
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Alicia Broto
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Maria Lluch-Senar
  - Pulmobiotics, Dr Aiguader 88, 08003, Barcelona, Spain
  - Institute of Biotechnology and Biomedicine "Vicent Villar Palasi" (IBB), Universitat Autònoma de Barcelona, Barcelona, Spain
- Luis Serrano
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Pg. Lluis Companys 23, 08010, Barcelona, Spain

10
Haft DH, Badretdin A, Coulouris G, DiCuccio M, Durkin A, Jovenitti E, Li W, Mersha M, O’Neill K, Virothaisakun J, Thibaud-Nissen F. RefSeq and the prokaryotic genome annotation pipeline in the age of metagenomes. Nucleic Acids Res 2024; 52:D762-D769. [PMID: 37962425 PMCID: PMC10767926 DOI: 10.1093/nar/gkad988] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 09/15/2023] [Revised: 10/13/2023] [Accepted: 10/18/2023] [Indexed: 11/15/2023] Open
Abstract
The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains over 315 000 bacterial and archaeal genomes and 236 million proteins with up-to-date and consistent annotation. In the past 3 years, we have expanded the diversity of the RefSeq collection by including the best quality metagenome-assembled genomes (MAGs) submitted to INSDC (DDBJ, ENA and GenBank), while maintaining its quality by adding validation checks. Assemblies are now more stringently evaluated for contamination and for completeness of annotation prior to acceptance into RefSeq. MAGs now account for over 17 000 assemblies in RefSeq, split over 165 orders and 362 families. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP), which is used to annotate nearly all RefSeq assemblies, include better detection of protein-coding genes. Nearly 83% of RefSeq proteins are now named by a curated Protein Family Model, a 4.7% increase in the past three years. In addition to literature citations, Enzyme Commission numbers, and gene symbols, Gene Ontology terms are now assigned to 48% of RefSeq proteins, allowing for easier multi-genome comparison. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/. PGAP is available as a stand-alone tool able to produce GenBank-ready files at https://github.com/ncbi/pgap.
Affiliation(s)
- Daniel H Haft
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Azat Badretdin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- George Coulouris
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Michael DiCuccio
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- A Scott Durkin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eric Jovenitti
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Wenjun Li
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Megdelawit Mersha
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Kathleen R O’Neill
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Joel Virothaisakun
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

11
Simoens L, Fijalkowski I, Van Damme P. Exposing the small protein load of bacterial life. FEMS Microbiol Rev 2023; 47:fuad063. [PMID: 38012116 PMCID: PMC10723866 DOI: 10.1093/femsre/fuad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/10/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023] Open
Abstract
The ever-growing repertoire of genomic techniques continues to expand our understanding of the true diversity and richness of prokaryotic genomes. Riboproteogenomics laid the foundation for dynamic studies of previously overlooked genomic elements. Most strikingly, bacterial genomes were revealed to harbor robust repertoires of small open reading frames (sORFs) encoding a diverse and broadly expressed range of small proteins, or sORF-encoded polypeptides (SEPs). In recent years, continuous efforts led to great improvements in the annotation and characterization of such proteins, yet many challenges remain to fully comprehend the pervasive nature of small proteins and their impact on bacterial biology. In this work, we review the recent developments in the dynamic field of bacterial genome reannotation, catalog the important biological roles carried out by small proteins and identify challenges obstructing the way to full understanding of these elusive proteins.
Affiliation(s)
- Laure Simoens
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
| | - Igor Fijalkowski
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
| | - Petra Van Damme
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
| |
|
12
|
Miranda RJ, Warren C, Mcdougal K, Kimble S, Sanchez J, Norman L, Anderson V, Hemm M. Identifying new small proteins through a molecular biology course-based undergraduate research experience laboratory class. Biochemistry and Molecular Biology Education 2023; 51:574-585. [PMID: 37436109 DOI: 10.1002/bmb.21764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 03/14/2023] [Accepted: 06/08/2023] [Indexed: 07/13/2023]
Abstract
We developed a curriculum for an upper-level molecular biology course-based undergraduate research laboratory class funded by a National Science Foundation CAREER grant that focuses on identifying new small proteins in the bacterium, Escherichia coli. Our CURE class has been continually offered each semester for the last 10 years, with multiple instructors collaboratively developing and implementing their own pedagogical approach while maintaining the same overall scientific goal and experimental strategy. In this paper, we delineate the experimental strategy for our molecular biology CURE laboratory class, describe a range of pedagogical approaches implemented by multiple instructors, and provide recommendations for teaching the class. The purpose of our paper is to share our experiences both in developing and teaching a molecular biology CURE laboratory class based on small protein identification and in creating a curriculum and support system that allows traditional, non-traditional, and under-represented students to participate in authentic research projects.
Affiliation(s)
- Rommel J Miranda
- Department of Physics, Astronomy & Geosciences, Towson University, Towson, Maryland, USA
| | - Cheryl Warren
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| | - Kathryn Mcdougal
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| | - Steven Kimble
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| | - Joseph Sanchez
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
- Merck & Co. Inc., West Point, Pennsylvania, USA
| | - Leann Norman
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| | - Virginia Anderson
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| | - Matthew Hemm
- Department of Biological Sciences, Towson University, Towson, Maryland, USA
| |
|
13
|
Zehentner B, Scherer S, Neuhaus K. Non-canonical transcriptional start sites in E. coli O157:H7 EDL933 are regulated and appear in surprisingly high numbers. BMC Microbiol 2023; 23:243. [PMID: 37653502 PMCID: PMC10469882 DOI: 10.1186/s12866-023-02988-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 08/21/2023] [Indexed: 09/02/2023]
Abstract
Analysis of genome-wide transcription start sites (TSSs) revealed an unexpected complexity since not only canonical TSS of annotated genes are recognized by RNA polymerase. Non-canonical TSS were detected antisense to, or within, annotated genes as well as new intergenic (orphan) TSS, not associated with known genes. Previously, it was hypothesized that many such signals represent noise or pervasive transcription, not associated with a biological function. Here, a modified Cappable-seq protocol allows determining the primary transcriptome of the enterohemorrhagic E. coli O157:H7 EDL933 (EHEC). We used four different growth media, both in exponential and stationary growth phase, replicated each thrice. This yielded 19,975 EHEC canonical and non-canonical TSS, which reproducibly occurred in three biological replicates. This questions the hypothesis of experimental noise or pervasive transcription. Accordingly, conserved promoter motifs were found upstream indicating proper TSSs. More than 50% of 5,567 canonical and between 32% and 47% of 10,355 non-canonical TSS were differentially expressed in different media and growth phases, providing evidence for a potential biological function also of non-canonical TSS. Thus, reproducible and environmentally regulated expression suggests that a substantial number of the non-canonical TSSs may be of unknown function rather than being the result of noise or pervasive transcription.
Affiliation(s)
- Barbara Zehentner
- Chair for Microbial Ecology, TUM School of Life Sciences, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
| | - Siegfried Scherer
- Chair for Microbial Ecology, TUM School of Life Sciences, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- ZIEL - Institute for Food & Health, Technical University of Munich, Freising, Germany
| | - Klaus Neuhaus
- ZIEL - Institute for Food & Health, Technical University of Munich, Freising, Germany
- Core Facility Microbiome, ZIEL - Institute for Food & Health, Technical University of Munich, Freising, Germany
| |
|
14
|
Hadjeras L, Heiniger B, Maaß S, Scheuer R, Gelhausen R, Azarderakhsh S, Barth-Weber S, Backofen R, Becher D, Ahrens CH, Sharma CM, Evguenieva-Hackenberg E. Unraveling the small proteome of the plant symbiont Sinorhizobium meliloti by ribosome profiling and proteogenomics. microLife 2023; 4:uqad012. [PMID: 37223733 PMCID: PMC10117765 DOI: 10.1093/femsml/uqad012] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/15/2022] [Revised: 02/08/2023] [Accepted: 03/07/2023] [Indexed: 05/25/2023]
Abstract
The soil-dwelling plant symbiont Sinorhizobium meliloti is a major model organism of Alphaproteobacteria. Despite numerous detailed OMICS studies, information about small open reading frame (sORF)-encoded proteins (SEPs) is largely missing, because sORFs are poorly annotated and SEPs are hard to detect experimentally. However, given that SEPs can fulfill important functions, identification of translated sORFs is critical for analyzing their roles in bacterial physiology. Ribosome profiling (Ribo-seq) can detect translated sORFs with high sensitivity, but is not yet routinely applied to bacteria because it must be adapted for each species. Here, we established a Ribo-seq procedure for S. meliloti 2011 based on RNase I digestion and detected translation for 60% of the annotated coding sequences during growth in minimal medium. Using ORF prediction tools based on Ribo-seq data, subsequent filtering, and manual curation, the translation of 37 non-annotated sORFs with ≤ 70 amino acids was predicted with confidence. The Ribo-seq data were supplemented by mass spectrometry (MS) analyses from three sample preparation approaches and two integrated proteogenomic search database (iPtgxDB) types. Searches against standard and 20-fold smaller Ribo-seq data-informed custom iPtgxDBs confirmed 47 annotated SEPs and identified 11 additional novel SEPs. Epitope tagging and Western blot analysis confirmed the translation of 15 out of 20 SEPs selected from the translatome map. Overall, by combining MS and Ribo-seq approaches, the small proteome of S. meliloti was substantially expanded by 48 novel SEPs. Several of them are part of predicted operons and/or are conserved from Rhizobiaceae to Bacteria, suggesting important physiological functions.
Affiliation(s)
- Lydia Hadjeras
- Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
| | - Benjamin Heiniger
- Molecular Ecology, Agroscope and SIB Swiss Institute of Bioinformatics, 8046 Zurich, Switzerland
| | - Sandra Maaß
- Institute of Microbiology, University of Greifswald, 17489 Greifswald, Germany
| | - Robina Scheuer
- Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
| | - Rick Gelhausen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
| | - Saina Azarderakhsh
- Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
| | - Susanne Barth-Weber
- Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
| | - Dörte Becher
- Institute of Microbiology, University of Greifswald, 17489 Greifswald, Germany
| | - Christian H Ahrens
- Molecular Ecology, Agroscope and SIB Swiss Institute of Bioinformatics, 8046 Zurich, Switzerland
| | - Cynthia M Sharma
- Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
| | | |
|
15
|
Tantoso E, Eisenhaber B, Sinha S, Jensen LJ, Eisenhaber F. About the dark corners in the gene function space of Escherichia coli remaining without illumination by scientific literature. Biol Direct 2023; 18:7. [PMID: 36855185 PMCID: PMC9976479 DOI: 10.1186/s13062-023-00362-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/11/2022] [Accepted: 02/21/2023] [Indexed: 03/02/2023]
Abstract
BACKGROUND Although Escherichia coli (E. coli) is the most studied prokaryote organism in the history of life sciences, many molecular mechanisms and gene functions encoded in its genome remain to be discovered. This work aims at quantifying the illumination of the E. coli gene function space by the scientific literature and how close we are towards the goal of a complete list of E. coli gene functions. RESULTS The scientific literature about E. coli protein-coding genes has been mapped onto the genome via the mentioning of names for genomic regions in scientific articles both for the case of the strain K-12 MG1655 as well as for the 95%-threshold softcore genome of 1324 E. coli strains with known complete genome. The article match was quantified with the ratio of a given gene name's occurrence to the mentioning of any gene names in the paper. The various genome regions have an extremely uneven literature coverage. A group of elite genes with ≥ 100 full publication equivalents (FPEs, FPE = 1 is an idealized publication devoted to just a single gene) attracts the lion's share of the papers. For K-12, ~ 65% of the literature covers just 342 elite genes; for the softcore genome, ~ 68% of the FPEs is about only 342 elite gene families (GFs). We also find that most genes/GFs have at least one mentioning in a dedicated scientific article (with the exception of at least 137 protein-coding transcripts for K-12 and 26 GFs from the softcore genome). Whereas the literature growth rates were highest for uncharacterized or understudied genes until 2005-2010 compared with other groups of genes, they became negative thereafter. At the same time, literature for anyhow well-studied genes started to grow explosively with threshold T10 (≥ 10 FPEs). Typically, a body of ~ 20 actual articles generated over ~ 15 years of research effort was necessary to reach T10. Lineage-specific co-occurrence analysis of genes belonging to the accessory genome of E. coli together with genomic co-localization and sequence-analytic exploration hints at previously completely uncharacterized genes yahV and yddL being associated with osmotic stress response/motility mechanisms. CONCLUSION If the numbers of scientific articles about uncharacterized and understudied genes remain at least at present levels, full gene function lists for the strain K-12 MG1655 and the E. coli softcore genome are in reach within the next 25-30 years. Once the literature body for a gene crosses 10 FPEs, most of the critical fundamental research risk appears overcome and steady incremental research becomes possible.
Affiliation(s)
- Erwin Tantoso
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
| | - Birgit Eisenhaber
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
| | - Swati Sinha
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Frank Eisenhaber
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Republic of Singapore
| |
|
16
|
Hadjeras L, Bartel J, Maier LK, Maaß S, Vogel V, Svensson SL, Eggenhofer F, Gelhausen R, Müller T, Alkhnbashi OS, Backofen R, Becher D, Sharma CM, Marchfelder A. Revealing the small proteome of Haloferax volcanii by combining ribosome profiling and small-protein optimized mass spectrometry. microLife 2023; 4:uqad001. [PMID: 37223747 PMCID: PMC10117724 DOI: 10.1093/femsml/uqad001] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/01/2022] [Revised: 11/29/2022] [Accepted: 01/13/2023] [Indexed: 05/25/2023]
Abstract
In contrast to extensively studied prokaryotic 'small' transcriptomes (encompassing all small noncoding RNAs), small proteomes (here defined as including proteins ≤70 aa) are only now entering the limelight. The absence of a complete small protein catalogue in most prokaryotes precludes our understanding of how these molecules affect physiology. So far, archaeal genomes have not yet been analyzed broadly with a dedicated focus on small proteins. Here, we present a combinatorial approach, integrating experimental data from small protein-optimized mass spectrometry (MS) and ribosome profiling (Ribo-seq), to generate a high confidence inventory of small proteins in the model archaeon Haloferax volcanii. We demonstrate by MS and Ribo-seq that 67% of the 317 annotated small open reading frames (sORFs) are translated under standard growth conditions. Furthermore, annotation-independent analysis of Ribo-seq data showed ribosomal engagement for 47 novel sORFs in intergenic regions. A total of seven of these were also detected by proteomics, in addition to an eighth novel small protein solely identified by MS. We also provide independent experimental evidence in vivo for the translation of 12 sORFs (annotated and novel) using epitope tagging and western blotting, underlining the validity of our identification scheme. Several novel sORFs are conserved in Haloferax species and might have important functions. Based on our findings, we conclude that the small proteome of H. volcanii is larger than previously appreciated, and that combining MS with Ribo-seq is a powerful approach for the discovery of novel small protein coding genes in archaea.
Affiliation(s)
- Lydia Hadjeras
- Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
| | - Jürgen Bartel
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | | | - Sandra Maaß
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Verena Vogel
- Biology II, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
| | - Sarah L Svensson
- Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
| | - Florian Eggenhofer
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Rick Gelhausen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Teresa Müller
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
| | - Omer S Alkhnbashi
- Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
| | - Dörte Becher
- Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Cynthia M Sharma
- Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
| | - Anita Marchfelder
- Biology II, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
| |
|
17
|
Ventroux M, Noirot-Gros MF. Prophage-encoded small protein YqaH counteracts the activities of the replication initiator DnaA in Bacillus subtilis. Microbiology (Reading, England) 2022; 168. [PMID: 36748575 DOI: 10.1099/mic.0.001268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Bacterial genomes harbour cryptic prophages that are mostly transcriptionally silent with many unannotated genes. Still, cryptic prophages may contribute to their host fitness and phenotypes. In Bacillus subtilis, the yqaF-yqaN operon belongs to the prophage element skin, and is tightly repressed by the Xre-like repressor SknR. This operon contains several small ORFs (smORFs) potentially encoding small-sized proteins. The smORF-encoded peptide YqaH was previously reported to bind to the replication initiator DnaA. Here, using a yeast two-hybrid assay, we found that YqaH binds to the DNA binding domain IV of DnaA and interacts with Spo0A, a master regulator of sporulation. We isolated single amino acid substitutions in YqaH that abolished the interaction with DnaA but not with Spo0A. Then, using a plasmid-based inducible system to overexpress yqaH WT and mutant derivatives, we studied in B. subtilis the phenotypes associated with the specific loss-of-interaction with DnaA (DnaA_LOI). We found that expression of yqaH carrying DnaA_LOI mutations abolished the deleterious effects of yqaH WT expression on chromosome segregation, replication initiation and DnaA-regulated transcription. When YqaH was induced after vegetative growth, DnaA_LOI mutations abolished the drastic effects of YqaH WT on sporulation and biofilm formation. Thus, YqaH inhibits replication, sporulation and biofilm formation mainly by antagonizing DnaA in a manner that is independent of the cell cycle checkpoint Sda.
Affiliation(s)
- Magali Ventroux
- Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350, Jouy-en-Josas, France
| | | |
|
18
|
Gopalkrishnan S, Ross W, Akbari MS, Li X, Haycocks JRJ, Grainger DC, Court DL, Gourse RL. Homologs of the Escherichia coli F Element Protein TraR, Including Phage Lambda Orf73, Directly Reprogram Host Transcription. mBio 2022; 13:e0095222. [PMID: 35583320 PMCID: PMC9239242 DOI: 10.1128/mbio.00952-22] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/04/2022] [Accepted: 04/20/2022] [Indexed: 11/24/2022]
Abstract
Bacterial cells and their associated plasmids and bacteriophages encode numerous small proteins of unknown function. One example, the 73-amino-acid protein TraR, is encoded by the transfer operon of the conjugative F plasmid of Escherichia coli. TraR is a distant homolog of DksA, a protein found in almost all proteobacterial species that is required for ppGpp to regulate transcription during the stringent response. TraR and DksA increase or decrease transcription initiation depending on the kinetic features of the promoter by binding directly to RNA polymerase without binding to DNA. Unlike DksA, whose full activity requires ppGpp as a cofactor, TraR is fully active by itself and unaffected by ppGpp. TraR belongs to a family of divergent proteins encoded by proteobacterial bacteriophages and other mobile elements. Here, we experimentally addressed whether other members of the TraR family function like the F element-encoded TraR. Purified TraR and all 5 homologs that were examined bound to RNA polymerase, functioned at lower concentrations than DksA, and complemented a dksA-null strain for growth on minimal medium. One of the homologs, λ Orf73, encoded by bacteriophage lambda, was examined in greater detail. λ Orf73 slowed host growth and increased phage burst size. Mutational analysis suggested that λ Orf73 and TraR have a similar mechanism for inhibiting rRNA and r-protein promoters. We suggest that TraR and its homologs regulate host transcription to divert cellular resources to phage propagation or conjugation without induction of ppGpp and a stringent response. IMPORTANCE TraR is a distant homolog of the transcription factor DksA and the founding member of a large family of small proteins encoded by proteobacterial phages and conjugative plasmids. Reprogramming transcription during the stringent response requires the interaction of DksA not only with RNA polymerase but also with the stress-induced regulatory nucleotide ppGpp. 
We show here that five phage TraR homologs by themselves, without ppGpp, regulate transcription of host promoters, mimicking the effects of DksA and ppGpp together. During a stringent response, ppGpp independently binds directly to, and inhibits the activities of, many proteins in addition to RNA polymerase, including translation factors, enzymes needed for ribonucleotide biosynthesis, and other metabolic enzymes. Here, we suggest a physiological role for TraR-like proteins: bacteriophages utilize TraR homologs to reprogram host transcription in the absence of ppGpp induction and thus without inhibiting host enzymes needed for phage development.
Affiliation(s)
- Saumya Gopalkrishnan
- University of Wisconsin—Madison, Department of Bacteriology, Madison, Wisconsin, USA
| | - Wilma Ross
- University of Wisconsin—Madison, Department of Bacteriology, Madison, Wisconsin, USA
| | - Madeline S. Akbari
- University of Wisconsin—Madison, Department of Bacteriology, Madison, Wisconsin, USA
| | - Xintian Li
- RNA Biology Laboratory, Center for Cancer Research, The National Cancer Institute at Frederick, Frederick, Maryland, USA
| | - James R. J. Haycocks
- University of Birmingham, Institute of Microbiology and Infection, School of Biosciences, Edgbaston, Birmingham, United Kingdom
| | - David C. Grainger
- University of Birmingham, Institute of Microbiology and Infection, School of Biosciences, Edgbaston, Birmingham, United Kingdom
| | - Donald L. Court
- RNA Biology Laboratory, Center for Cancer Research, The National Cancer Institute at Frederick, Frederick, Maryland, USA
| | - Richard L. Gourse
- University of Wisconsin—Madison, Department of Bacteriology, Madison, Wisconsin, USA
| |
|
19
|
Fijalkowski I, Willems P, Jonckheere V, Simoens L, Van Damme P. Hidden in plain sight: challenges in proteomics detection of small ORF-encoded polypeptides. microLife 2022; 3:uqac005. [PMID: 37223358 PMCID: PMC10117744 DOI: 10.1093/femsml/uqac005] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/13/2021] [Revised: 04/18/2022] [Accepted: 04/29/2022] [Indexed: 05/25/2023]
Abstract
Genomic studies of bacteria have long pointed toward widespread prevalence of small open reading frames (sORFs) encoding short proteins, <100 amino acids in length. Despite the mounting genomic evidence of their robust expression, relatively little progress has been made in their mass spectrometry-based detection and various blanket statements have been used to explain this observed discrepancy. In this study, we provide a large-scale riboproteogenomics investigation of the challenging nature of proteomic detection of such small proteins as informed by conditional translation data. A panel of physicochemical properties alongside recently developed mass spectrometry detectability metrics was interrogated to provide a comprehensive evidence-based assessment of sORF-encoded polypeptide (SEP) detectability. Moreover, a large-scale proteomics and translatomics compendium of proteins produced by Salmonella Typhimurium (S. Typhimurium), a model human pathogen, across a panel of growth conditions is presented and used in support of our in silico SEP detectability analysis. This integrative approach is used to provide a data-driven census of small proteins expressed by S. Typhimurium across growth phases and infection-relevant conditions. Taken together, our study pinpoints current limitations in proteomics-based detection of novel small proteins currently missing from bacterial genome annotations.
Affiliation(s)
- Igor Fijalkowski
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Patrick Willems
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Veronique Jonckheere
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Laure Simoens
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Petra Van Damme
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| |
|
20
|
Smith C, Canestrari JG, Wang AJ, Champion MM, Derbyshire KM, Gray TA, Wade JT. Pervasive translation in Mycobacterium tuberculosis. eLife 2022; 11:e73980. [PMID: 35343439 PMCID: PMC9094748 DOI: 10.7554/elife.73980] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/17/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022]
Abstract
Most bacterial ORFs are identified by automated prediction algorithms. However, these algorithms often fail to identify ORFs lacking canonical features such as a length of >50 codons or the presence of an upstream Shine-Dalgarno sequence. Here, we use ribosome profiling approaches to identify actively translated ORFs in Mycobacterium tuberculosis. Most of the ORFs we identify have not been previously described, indicating that the M. tuberculosis transcriptome is pervasively translated. The newly described ORFs are predominantly short, with many encoding proteins of ≤50 amino acids. Codon usage of the newly discovered ORFs suggests that most have not been subject to purifying selection, and hence are unlikely to contribute to cell fitness. Nevertheless, we identify 90 new ORFs (median length of 52 codons) that bear the hallmarks of purifying selection. Thus, our data suggest that pervasive translation of short ORFs in Mycobacterium tuberculosis serves as a rich source for the evolution of new functional proteins.
Affiliation(s)
- Carol Smith
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
| | - Jill G Canestrari
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
| | - Archer J Wang
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
| | - Matthew M Champion
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, United States
| | - Keith M Derbyshire
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
| | - Todd A Gray
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
| | - Joseph T Wade
- Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
| |
|
21
|
Dimonaco NJ, Aubrey W, Kenobi K, Clare A, Creevey CJ. No one tool to rule them all: prokaryotic gene prediction tool annotations are highly dependent on the organism of study. Bioinformatics 2021; 38:1198-1207. [PMID: 34875010 PMCID: PMC8825762 DOI: 10.1093/bioinformatics/btab827] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/26/2021] [Revised: 11/13/2021] [Accepted: 12/02/2021] [Indexed: 01/06/2023]
Abstract
MOTIVATION The biases in CoDing Sequence (CDS) prediction tools, which have been based on historic genomic annotations from model organisms, impact our understanding of novel genomes and metagenomes. This hinders the discovery of new genomic information as it results in predictions being biased towards existing knowledge. To date, users have lacked a systematic and replicable approach to identify the strengths and weaknesses of any CDS prediction tool and allow them to choose the right tool for their analysis. RESULTS We present an evaluation framework (ORForise) based on a comprehensive set of 12 primary and 60 secondary metrics that facilitate the assessment of the performance of CDS prediction tools. This makes it possible to identify which performs better for specific use-cases. We use this to assess 15 ab initio- and model-based tools representing those most widely used (historically and currently) to generate the knowledge in genomic databases. We find that the performance of any tool is dependent on the genome being analysed, and no individual tool ranked as the most accurate across all genomes or metrics analysed. Even the top-ranked tools produced conflicting gene collections, which could not be resolved by aggregation. The ORForise evaluation framework provides users with a replicable, data-led approach to make informed tool choices for novel genome annotations and for refining historical annotations. AVAILABILITY AND IMPLEMENTATION Code and datasets for reproduction and customisation are available at https://github.com/NickJD/ORForise. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
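To make the idea of scoring a prediction tool against a reference annotation concrete, here is a minimal sketch. It does not reproduce ORForise or any of its 72 metrics: `compare_predictions`, the four summary values, and the toy coordinates are all hypothetical. Genes are assumed to be given as `(start_codon_pos, stop_codon_pos, strand)` tuples, so a shared stop position and strand means the same reading frame end regardless of strand.

```python
def compare_predictions(reference, predicted):
    """Summarise how a set of predicted CDSs agrees with a reference set.

    Both inputs are iterables of (start_codon_pos, stop_codon_pos, strand).
    """
    ref = set(reference)
    pred = set(predicted)

    # Reference genes reproduced with identical start and stop.
    exact = ref & pred

    # Reference genes whose stop was found but whose start differs.
    pred_ends = {(stop, strand) for _, stop, strand in pred}
    same_stop_only = {g for g in ref - exact if (g[1], g[2]) in pred_ends}

    # Reference genes with no prediction sharing their stop.
    missed = ref - exact - same_stop_only

    # Predictions whose stop matches no reference gene.
    ref_ends = {(stop, strand) for _, stop, strand in ref}
    extra = {p for p in pred if (p[1], p[2]) not in ref_ends}

    n = len(ref)
    return {
        "exact_fraction": len(exact) / n if n else 0.0,
        "same_stop_fraction": len(same_stop_only) / n if n else 0.0,
        "missed_fraction": len(missed) / n if n else 0.0,
        "extra_predictions": len(extra),
    }


# Hypothetical reference annotation and output of a hypothetical tool.
reference = [(100, 400, "+"), (900, 500, "-"), (1000, 1150, "+"), (2000, 2600, "+")]
tool_a = [(100, 400, "+"), (870, 500, "-"), (2000, 2600, "+"), (3000, 3300, "+")]
print(compare_predictions(reference, tool_a))
```

Running the same comparison for several tools on several genomes gives a simple table in which, as the abstract reports for the full framework, no single tool need come out on top everywhere.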
Affiliation(s)
- Nicholas J Dimonaco
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth SY23 3PD, UK. To whom correspondence should be addressed.
- Wayne Aubrey
- Department of Computer Science, Aberystwyth University, Aberystwyth SY23 3DB, UK
- Kim Kenobi
- Department of Mathematics, Aberystwyth University, Aberystwyth SY23 3BZ, UK
- Amanda Clare
- Department of Computer Science, Aberystwyth University, Aberystwyth SY23 3DB, UK
|
22
|
Lyapina I, Ivanov V, Fesenko I. Peptidome: Chaos or Inevitability. Int J Mol Sci 2021; 22:13128. [PMID: 34884929 PMCID: PMC8658490 DOI: 10.3390/ijms222313128] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/29/2021] [Revised: 12/01/2021] [Accepted: 12/02/2021] [Indexed: 12/13/2022]
Abstract
Thousands of naturally occurring peptides differing in their origin, abundance and possible functions have been identified in the tissue and biological fluids of vertebrates, insects, fungi, plants and bacteria. These peptide pools are referred to as intracellular or extracellular peptidomes, and besides a small proportion of well-characterized peptide hormones and defense peptides, are poorly characterized. However, a growing body of evidence suggests that unknown bioactive peptides are hidden in the peptidomes of different organisms. In this review, we present a comprehensive overview of the mechanisms of generation and properties of peptidomes across different organisms. Based on their origin, we propose three large peptide groups: functional protein "degradome", small open reading frame (smORF)-encoded peptides (smORFome), and specific precursor-derived peptides. The composition of peptide pools identified by mass-spectrometry analysis in human cells, plants, yeast and bacteria is compared and discussed. The functions of different peptide groups, for example the role of the "degradome" in promoting defense signaling, are also considered.
Affiliation(s)
- Igor Fesenko
- Department of Functional Genomics and Proteomics of Plants, Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry Russian Academy of Sciences, 117997 Moscow, Russia; (I.L.); (V.I.)
|
23
|
Abstract
Escherichia coli was one of the first species to have its genome sequenced and remains one of the best-characterized model organisms. Thus, it is perhaps surprising that recent studies have shown that a substantial number of genes have been overlooked. Genes encoding more than 140 small proteins, defined as those containing 50 or fewer amino acids, have been identified in E. coli in the past 10 years, and there is substantial evidence indicating that many more remain to be discovered. This review covers the methods that have been successful in identifying small proteins and the short open reading frames that encode them. The small proteins that have been functionally characterized to date in this model organism are also discussed. It is hoped that the review, along with the associated databases of known as well as predicted but undetected small proteins, will aid in and provide a roadmap for the continued identification and characterization of these proteins in E. coli as well as other bacteria.
|
24
|
Tharakan R, Sawa A. Minireview: Novel Micropeptide Discovery by Proteomics and Deep Sequencing Methods. Front Genet 2021; 12:651485. [PMID: 34025718 PMCID: PMC8136307 DOI: 10.3389/fgene.2021.651485] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/09/2021] [Accepted: 03/22/2021] [Indexed: 12/12/2022]
Abstract
A novel class of small proteins, called micropeptides, has recently been discovered in the genome. These proteins, which have been found to play important roles in many physiological and cellular systems, are shorter than 100 amino acids and were overlooked during previous genome annotations. Discovery and characterization of more micropeptides has been ongoing, often using -omics methods such as proteomics, RNA sequencing, and ribosome profiling. In this review, we survey the recent advances in the micropeptides field and describe the methodological and conceptual challenges facing future micropeptide endeavors.
Affiliation(s)
- Ravi Tharakan
- National Institute on Aging, National Institutes of Health, Baltimore, MD, United States
- Akira Sawa
- Departments of Psychiatry, Neuroscience, Biomedical Engineering, and Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, United States; Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
|
25
|
Steinberg R, Koch HG. The largely unexplored biology of small proteins in pro- and eukaryotes. FEBS J 2021; 288:7002-7024. [PMID: 33780127 DOI: 10.1111/febs.15845] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 03/26/2021] [Indexed: 12/29/2022]
Abstract
The large abundance of small open reading frames (smORFs) in prokaryotic and eukaryotic genomes and the plethora of smORF-encoded small proteins became only apparent with the constant advancements in bioinformatic, genomic, proteomic, and biochemical tools. Small proteins are typically defined as proteins of < 50 amino acids in prokaryotes and of less than 100 amino acids in eukaryotes, and their importance for cell physiology and cellular adaptation is only beginning to emerge. In contrast to antimicrobial peptides, which are secreted by prokaryotic and eukaryotic cells for combatting pathogens and competitors, small proteins act within the producing cell mainly by stabilizing protein assemblies and by modifying the activity of larger proteins. Production of small proteins is frequently linked to stress conditions or environmental changes, and therefore, cells seem to use small proteins as intracellular modifiers for adjusting cell metabolism to different intra- and extracellular cues. However, the size of small proteins imposes a major challenge for the cellular machinery required for protein folding and intracellular trafficking and recent data indicate that small proteins can engage distinct trafficking pathways. In the current review, we describe the diversity of small proteins in prokaryotes and eukaryotes, highlight distinct and common features, and illustrate how they are handled by the protein trafficking machineries in prokaryotic and eukaryotic cells. Finally, we also discuss future topics of research on this fascinating but largely unexplored group of proteins.
Affiliation(s)
- Ruth Steinberg
- Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
- Hans-Georg Koch
- Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
|
26
|
Petruschke H, Schori C, Canzler S, Riesbeck S, Poehlein A, Daniel R, Frei D, Segessemann T, Zimmerman J, Marinos G, Kaleta C, Jehmlich N, Ahrens CH, von Bergen M. Discovery of novel community-relevant small proteins in a simplified human intestinal microbiome. Microbiome 2021; 9:55. [PMID: 33622394 PMCID: PMC7903761 DOI: 10.1186/s40168-020-00981-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 08/25/2020] [Accepted: 12/16/2020] [Indexed: 05/13/2023]
Abstract
BACKGROUND The intestinal microbiota plays a crucial role in protecting the host from pathogenic microbes, modulating immunity and regulating metabolic processes. We studied the simplified human intestinal microbiota (SIHUMIx) consisting of eight bacterial species with a particular focus on the discovery of novel small proteins with less than 100 amino acids (= sProteins), some of which may contribute to shape the simplified human intestinal microbiota. Although sProteins carry out a wide range of important functions, they are still often missed in genome annotations, and little is known about their structure and function in individual microbes and especially in microbial communities. RESULTS We created a multi-species integrated proteogenomics search database (iPtgxDB) to enable a comprehensive identification of novel sProteins. Six of the eight SIHUMIx species, for which no complete genomes were available, were sequenced and de novo assembled. Several proteomics approaches including two earlier optimized sProtein enrichment strategies were applied to specifically increase the chances for novel sProtein discovery. The search of tandem mass spectrometry (MS/MS) data against the multi-species iPtgxDB enabled the identification of 31 novel sProteins, of which the expression of 30 was supported by metatranscriptomics data. Using synthetic peptides, we were able to validate the expression of 25 novel sProteins. The comparison of sProtein expression in each single strain versus a multi-species community cultivation showed that six of these sProteins were only identified in the SIHUMIx community indicating a potentially important role of sProteins in the organization of microbial communities. Two of these novel sProteins have a potential antimicrobial function. Metabolic modelling revealed that a third sProtein is located in a genomic region encoding several enzymes relevant for the community metabolism within SIHUMIx. 
CONCLUSIONS We outline an integrated experimental and bioinformatics workflow for the discovery of novel sProteins in a simplified intestinal model system that can be generically applied to other microbial communities. The further analysis of novel sProteins uniquely expressed in the SIHUMIx multi-species community is expected to enable new insights into the role of sProteins on the functionality of bacterial communities such as those of the human intestinal tract.
Affiliation(s)
- Hannes Petruschke
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Christian Schori
- Agroscope, Molecular Diagnostics, Genomics & Bioinformatics and SIB Swiss Institute of Bioinformatics, Wädenswil, Switzerland
- Sebastian Canzler
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Sarah Riesbeck
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Anja Poehlein
- Institute of Microbiology and Genetics, Department of Genomic and Applied Microbiology, Georg-August University of Göttingen, Göttingen, Germany
- Rolf Daniel
- Institute of Microbiology and Genetics, Department of Genomic and Applied Microbiology, Georg-August University of Göttingen, Göttingen, Germany
- Daniel Frei
- Agroscope, Molecular Diagnostics, Genomics & Bioinformatics and SIB Swiss Institute of Bioinformatics, Wädenswil, Switzerland
- Tina Segessemann
- Agroscope, Molecular Diagnostics, Genomics & Bioinformatics and SIB Swiss Institute of Bioinformatics, Wädenswil, Switzerland
- Johannes Zimmerman
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-University Kiel, Kiel, Germany
- Georgios Marinos
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-University Kiel, Kiel, Germany
- Christoph Kaleta
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-University Kiel, Kiel, Germany
- Nico Jehmlich
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Christian H Ahrens
- Agroscope, Molecular Diagnostics, Genomics & Bioinformatics and SIB Swiss Institute of Bioinformatics, Wädenswil, Switzerland
- Martin von Bergen
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Institute of Biochemistry, Faculty of Biosciences, Pharmacy and Psychology, University of Leipzig, Leipzig, Germany
|
27
|
Adams PP, Baniulyte G, Esnault C, Chegireddy K, Singh N, Monge M, Dale RK, Storz G, Wade JT. Regulatory roles of Escherichia coli 5' UTR and ORF-internal RNAs detected by 3' end mapping. eLife 2021; 10:62438. [PMID: 33460557 PMCID: PMC7815308 DOI: 10.7554/elife.62438] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 08/25/2020] [Accepted: 11/26/2020] [Indexed: 02/06/2023]
Abstract
Many bacterial genes are regulated by RNA elements in their 5´ untranslated regions (UTRs). However, the full complement of these elements is not known even in the model bacterium Escherichia coli. Using complementary RNA-sequencing approaches, we detected large numbers of 3´ ends in 5´ UTRs and open reading frames (ORFs), suggesting extensive regulation by premature transcription termination. We documented regulation for multiple transcripts, including spermidine induction involving Rho and translation of an upstream ORF for an mRNA encoding a spermidine efflux pump. In addition to discovering novel sites of regulation, we detected short, stable RNA fragments derived from 5´ UTRs and sequences internal to ORFs. Characterization of three of these transcripts, including an RNA internal to an essential cell division gene, revealed that they have independent functions as sRNA sponges. Thus, these data uncover an abundance of cis- and trans-acting RNA regulators in bacterial 5´ UTRs and internal to ORFs.
In most organisms, specific segments of a cell’s genetic information are copied to form single-stranded molecules of various sizes and purposes. Each of these RNA molecules, as they are known, is constructed as a chain that starts at the 5´ end and terminates at the 3´ end. Certain RNAs carry the information present in a gene, which provides the instructions that a cell needs to build proteins. Some, however, are ‘non-coding’ and instead act to fine-tune the activity of other RNAs. These regulatory RNAs can be separate from the RNAs they control, or they can be embedded in the very sequences they regulate; new evidence also shows that certain regulatory RNAs can act in both ways. Many regulatory RNAs are yet to be catalogued, even in simple, well-studied species such as the bacterium Escherichia coli. Here, Adams et al. aimed to better characterize the regulatory RNAs present in E. coli by mapping out the 3´ ends of every RNA molecule in the bacterium. This revealed many new regulatory RNAs and offered insights into where these sequences are located. For instance, the results show that several of these RNAs were embedded within RNA produced from larger genes. Some were nested in coding RNAs, and were parts of a longer RNA sequence that is adjacent to the protein coding segment. Others, however, were present within the instructions that code for a protein. The work by Adams et al. reveals that regulatory RNAs can be located in unexpected places, and provides a method for identifying them. This can be applied to other types of bacteria, in particular in species with few known RNA regulators.
Affiliation(s)
- Philip P Adams
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, United States; Postdoctoral Research Associate Program, National Institute of General Medical Sciences, National Institutes of Health, Bethesda, United States
- Gabriele Baniulyte
- Wadsworth Center, New York State Department of Health, Albany, United States
- Caroline Esnault
- Bioinformatics and Scientific Programming Core, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, United States
- Kavya Chegireddy
- Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, United States
- Navjot Singh
- Wadsworth Center, New York State Department of Health, Albany, United States
- Molly Monge
- Wadsworth Center, New York State Department of Health, Albany, United States
- Ryan K Dale
- Bioinformatics and Scientific Programming Core, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, United States
- Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, United States
- Joseph T Wade
- Wadsworth Center, New York State Department of Health, Albany, United States; Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, United States
|
28
|
Stringer A, Smith C, Mangano K, Wade JT. Identification of novel translated small ORFs in Escherichia coli using complementary ribosome profiling approaches. J Bacteriol 2021; 204:JB0035221. [PMID: 34662240 PMCID: PMC8765432 DOI: 10.1128/jb.00352-21] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/01/2021] [Accepted: 10/12/2021] [Indexed: 11/20/2022]
Abstract
Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Ribosome profiling has been used to infer the existence of small proteins by detecting the translation of the corresponding open reading frames (ORFs). Detection of translated short ORFs by ribosome profiling can be improved by treating cells with drugs that stall ribosomes at specific codons. Here, we combine the analysis of ribosome profiling data for Escherichia coli cells treated with antibiotics that stall ribosomes at either start or stop codons. Thus, we identify ribosome-occupied start and stop codons with high sensitivity for ∼400 novel putative ORFs. The newly discovered ORFs are mostly short, with 365 encoding proteins of <51 amino acids. We validate translation of several selected short ORFs, and show that many likely encode unstable proteins. Moreover, we present evidence that most of the newly identified short ORFs are not under purifying selection, suggesting they do not impact cell fitness, although a small subset have the hallmarks of functional ORFs. IMPORTANCE Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Recent studies have discovered small proteins by mapping the location of translating ribosomes on RNA using a technique known as ribosome profiling. Discovery of translated sORFs using ribosome profiling can be improved by treating cells with drugs that trap initiating ribosomes. Here, we show that combining these data with equivalent data for cells treated with a drug that stalls terminating ribosomes facilitates the discovery of small proteins. 
We use this approach to discover 365 putative genes that encode small proteins in Escherichia coli.
Affiliation(s)
- Anne Stringer
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Carol Smith
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Kyle Mangano
- Center for Biomolecular Sciences, University of Illinois, Chicago, Illinois, USA
- Joseph T. Wade
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
|
29
|
The Small Toxic Salmonella Protein TimP Targets the Cytoplasmic Membrane and Is Repressed by the Small RNA TimR. mBio 2020; 11:mBio.01659-20. [PMID: 33172998 PMCID: PMC7667032 DOI: 10.1128/mbio.01659-20] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/12/2022]
Abstract
Next-generation sequencing (NGS) has enabled the revelation of a vast number of genomes from organisms spanning all domains of life. To reduce complexity when new genome sequences are annotated, open reading frames (ORFs) shorter than 50 codons in length are generally omitted. However, it has recently become evident that this procedure sorts away ORFs encoding small proteins of high biological significance. For instance, tailored small protein identification approaches have shown that bacteria encode numerous small proteins with important physiological functions. As the number of predicted small ORFs increase, it becomes important to characterize the corresponding proteins. In this study, we discovered a conserved but previously overlooked small enterobacterial protein. We show that this protein, which we dubbed TimP, is a potent toxin that inhibits bacterial growth by targeting the cell membrane. Toxicity is relieved by a small regulatory RNA, which binds the toxin mRNA to inhibit toxin synthesis. Small proteins are gaining increased attention due to their important functions in major biological processes throughout the domains of life. However, their small size and low sequence conservation make them difficult to identify. It is therefore not surprising that enterobacterial ryfA has escaped identification as a small protein coding gene for nearly 2 decades. Since its identification in 2001, ryfA has been thought to encode a noncoding RNA and has been implicated in biofilm formation in Escherichia coli and pathogenesis in Shigella dysenteriae. Although a recent ribosome profiling study suggested ryfA to be translated, the corresponding protein product was not detected. In this study, we provide evidence that ryfA encodes a small toxic inner membrane protein, TimP, overexpression of which causes cytoplasmic membrane leakage. TimP carries an N-terminal signal sequence, indicating that its membrane localization is Sec-dependent. 
Expression of TimP is repressed by the small RNA (sRNA) TimR, which base pairs with the timP mRNA to inhibit its translation. In contrast to overexpression, endogenous expression of TimP upon timR deletion permits cell growth, possibly indicating a toxicity-independent function in the bacterial membrane.
|
30
|
Arginine-Rich Small Proteins with a Domain of Unknown Function, DUF1127, Play a Role in Phosphate and Carbon Metabolism of Agrobacterium tumefaciens. J Bacteriol 2020; 202:JB.00309-20. [PMID: 33093235 DOI: 10.1128/jb.00309-20] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/22/2020] [Accepted: 07/21/2020] [Indexed: 02/06/2023]
Abstract
In any given organism, approximately one-third of all proteins have a yet-unknown function. A widely distributed domain of unknown function is DUF1127. Approximately 17,000 proteins with such an arginine-rich domain are found in 4,000 bacteria. Most of them are single-domain proteins, and a large fraction qualifies as small proteins with fewer than 50 amino acids. We systematically identified and characterized the seven DUF1127 members of the plant pathogen Agrobacterium tumefaciens. They all give rise to authentic proteins and are differentially expressed as shown at the RNA and protein levels. The seven proteins fall into two subclasses on the basis of their length, sequence, and reciprocal regulation by the LysR-type transcription factor LsrB. The absence of all three short DUF1127 proteins caused a striking phenotype in later growth phases and increased cell aggregation and biofilm formation. Protein profiling and transcriptome sequencing (RNA-seq) analysis of the wild type and triple mutant revealed a large number of differentially regulated genes in late exponential and stationary growth. The most affected genes are involved in phosphate uptake, glycine/serine homeostasis, and nitrate respiration. The results suggest a redundant function of the small DUF1127 paralogs in nutrient acquisition and central carbon metabolism of A. tumefaciens. They may be required for diauxic switching between carbon sources when sugar from the medium is depleted. We end by discussing how DUF1127 might confer such a global impact on cell physiology and gene expression. IMPORTANCE Despite being prevalent in numerous ecologically and clinically relevant bacterial species, the biological role of proteins with a domain of unknown function, DUF1127, is unclear. Experimental models are needed to approach their elusive function. We used the phytopathogen Agrobacterium tumefaciens, a natural genetic engineer that causes crown gall disease, and focused on its three small DUF1127 proteins. 
They have redundant and pervasive roles in nutrient acquisition, cellular metabolism, and biofilm formation. The study shows that small proteins have important previously missed biological functions. How small basic proteins can have such a broad impact is a fascinating prospect of future research.
|
31
|
Glaub A, Huptas C, Neuhaus K, Ardern Z. Recommendations for bacterial ribosome profiling experiments based on bioinformatic evaluation of published data. J Biol Chem 2020; 295:8999-9011. [PMID: 32385111 PMCID: PMC7335797 DOI: 10.1074/jbc.ra119.012161] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/09/2019] [Revised: 05/05/2020] [Indexed: 02/03/2023]
Abstract
Ribosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of Escherichia coli K12 grown in lysogeny broth medium and particularly focused on the size-selection step. We show that for conventional expression analysis, a size range between 22 and 30 nucleotides is sufficient to obtain protein-coding fragments, which has the advantage of removing many unwanted rRNA and tRNA reads. More specific analyses may require longer reads and a corresponding improvement in rRNA/tRNA depletion. There is no consensus about the appropriate sequencing depth for RIBO-Seq experiments in prokaryotes, and studies vary significantly in total read number. Our analysis suggests that 20 million reads that are not mapping to rRNA/tRNA are required for global detection of translated annotated genes. We also highlight the influence of drug-induced ribosome stalling, which causes bias at translation start sites. The resulting accumulation of reads at the start site may be especially useful for detecting weakly expressed genes. As different methods suit different questions, it may not be possible to produce a "one-size-fits-all" ribosome profiling data set. Therefore, experiments should be carefully designed in light of the scientific questions of interest. We propose some basic characteristics that should be reported with any new RIBO-Seq data sets. Careful attention to the factors discussed should improve prokaryotic gene detection and the comparability of ribosome profiling data sets.
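The two numerical recommendations above (a 22 to 30 nucleotide size window and roughly 20 million reads not mapping to rRNA/tRNA) can be expressed as simple checks. This is a minimal sketch under stated assumptions: `size_select` and `enough_depth` are hypothetical helper names, reads are plain strings, and removal of rRNA/tRNA reads is assumed to have happened upstream.

```python
def size_select(reads, min_len=22, max_len=30):
    """Keep reads whose length lies within [min_len, max_len] nucleotides."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def enough_depth(n_usable_reads, required=20_000_000):
    """Check the count of reads not mapping to rRNA/tRNA against a target depth."""
    return n_usable_reads >= required


# Toy reads of different lengths (sequence content is irrelevant here).
toy_reads = ["A" * n for n in (18, 22, 26, 30, 34)]
kept = size_select(toy_reads)
print([len(r) for r in kept])    # [22, 26, 30]
print(enough_depth(12_500_000))  # False
```

The defaults mirror the values quoted in the abstract; as the authors note, other analyses may call for a wider window, so both bounds are left as parameters.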
Affiliation(s)
- Alina Glaub
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Christopher Huptas
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany; Core Facility Microbiome, ZIEL Institute for Food and Health, Technical University of Munich, Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
|
32
|
The Archaeal Proteome Project advances knowledge about archaeal cell biology through comprehensive proteomics. Nat Commun 2020; 11:3145. [PMID: 32561711 PMCID: PMC7305310 DOI: 10.1038/s41467-020-16784-7] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 12/30/2019] [Accepted: 05/18/2020] [Indexed: 11/08/2022]
Abstract
While many aspects of archaeal cell biology remain relatively unexplored, systems biology approaches like mass spectrometry (MS) based proteomics offer an opportunity for rapid advances. Unfortunately, the enormous amount of MS data generated often remains incompletely analyzed due to a lack of sophisticated bioinformatic tools and field-specific biological expertise for data interpretation. Here we present the initiation of the Archaeal Proteome Project (ArcPP), a community-based effort to comprehensively analyze archaeal proteomes. Starting with the model archaeon Haloferax volcanii, we reanalyze MS datasets from various strains and culture conditions. Optimized peptide spectrum matching, with strict control of false discovery rates, facilitates identifying > 72% of the reference proteome, with a median protein sequence coverage of 51%. These analyses, together with expert knowledge in diverse aspects of cell biology, provide meaningful insights into processes such as N-terminal protein maturation, N-glycosylation, and metabolism. Altogether, ArcPP serves as an invaluable blueprint for comprehensive prokaryotic proteomics.
|
33
|
Yadavalli SS, Goh T, Carey JN, Malengo G, Vellappan S, Nickels BE, Sourjik V, Goulian M, Yuan J. Functional determinants of a small protein controlling a broadly conserved bacterial sensor kinase. J Bacteriol 2020; 202:JB.00305-20. [PMID: 32482726 PMCID: PMC8404706 DOI: 10.1128/jb.00305-20] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/22/2020] [Accepted: 05/22/2020] [Indexed: 12/14/2022] Open
Abstract
The PhoQ/PhoP two-component system plays a vital role in the regulation of Mg2+ homeostasis, resistance to acid and hyperosmotic stress, cationic antimicrobial peptides, and virulence in Escherichia coli, Salmonella and related bacteria. Previous studies have shown that MgrB, a 47 amino acid membrane protein that is part of the PhoQ/PhoP regulon, inhibits the histidine kinase PhoQ. MgrB is part of a negative feedback loop modulating this two-component system that prevents hyperactivation of PhoQ and may also provide an entry point for additional input signals for the PhoQ/PhoP pathway. To explore the mechanism of action of MgrB, we have analyzed the effects of point mutations, C-terminal truncations and transmembrane region swaps on MgrB activity. In contrast with two other known membrane protein regulators of histidine kinases in E. coli, we find that the MgrB TM region is necessary for PhoQ inhibition. Our results indicate that the TM region mediates interactions with PhoQ and that W20 is a key residue for PhoQ/MgrB complex formation. Additionally, mutations of the MgrB cytosolic region suggest that the two N-terminal lysines play an important role in regulating PhoQ activity. Alanine scanning mutagenesis of the periplasmic region of MgrB further indicates that, with the exception of a few highly conserved residues, most residues are not essential for MgrB's function as a PhoQ inhibitor. Our results indicate that the regulatory function of the small protein MgrB depends on distinct contributions from multiple residues spread across the protein. Interestingly, the TM region also appears to interact with other non-cognate histidine kinases in a bacterial two-hybrid assay, suggesting a potential route for evolving new small protein modulators of histidine kinases.
Affiliation(s)
- Srujana S Yadavalli
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Genetics and Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ 08854, USA
| | - Ted Goh
- Department of Biology, Swarthmore College, Swarthmore, Pennsylvania 19081, USA
- Boston University School of Medicine, Boston, Massachusetts 02118, USA
| | - Jeffrey N Carey
- Biochemistry and Molecular Biophysics Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Gabriele Malengo
- Max Planck Institute for Terrestrial Microbiology, 35043 Marburg, Germany
- LOEWE Center for Synthetic Microbiology (SYNMIKRO), 35043 Marburg, Germany
| | - Sangeevan Vellappan
- Molecular Biosciences Graduate Program, Rutgers University, Piscataway, NJ 08854, USA
| | - Bryce E Nickels
- Department of Genetics and Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ 08854, USA
| | - Victor Sourjik
- Max Planck Institute for Terrestrial Microbiology, 35043 Marburg, Germany
- LOEWE Center for Synthetic Microbiology (SYNMIKRO), 35043 Marburg, Germany
| | - Mark Goulian
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Physics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Jing Yuan
- Max Planck Institute for Terrestrial Microbiology, 35043 Marburg, Germany
- LOEWE Center for Synthetic Microbiology (SYNMIKRO), 35043 Marburg, Germany
| |
|
34
|
Clauwaert J, Menschaert G, Waegeman W. DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns. Nucleic Acids Res 2019; 47:e36. [PMID: 30753697 PMCID: PMC6451124 DOI: 10.1093/nar/gkz061] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 09/26/2018] [Revised: 01/02/2019] [Accepted: 01/30/2019] [Indexed: 12/13/2022] Open
Abstract
Annotation of gene expression in prokaryotes often finds itself corrected due to small variations of the annotated gene regions observed between different (sub)-species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, are not able to map the full complexity of the genomic landscape. We present DeepRibo, a novel neural network utilizing features extracted from ribosome profiling information and binding site sequence patterns that proves to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, adapting the information gained from both the high-throughput ribosome profiling data and ribosome binding translation initiation sequence region into one model. DeepRibo is designed as a single model trained on a variety of ribosome profiling experiments, used for the identification of open reading frames in prokaryotes without a priori knowledge of the translational landscape. Through extensive validation of the model trained on various sets of data, multiple species sequence similarity, mass spectrometry and Edman degradation verified proteins, the effectiveness of DeepRibo is highlighted.
Affiliation(s)
- Jim Clauwaert
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| | - Gerben Menschaert
- Biobix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| | - Willem Waegeman
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| |
|
35
|
Weaver J, Mohammad F, Buskirk AR, Storz G. Identifying Small Proteins by Ribosome Profiling with Stalled Initiation Complexes. mBio 2019; 10:e02819-18. [PMID: 30837344 PMCID: PMC6401488 DOI: 10.1128/mbio.02819-18] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 12/15/2018] [Accepted: 01/24/2019] [Indexed: 11/20/2022] Open
Abstract
Small proteins consisting of 50 or fewer amino acids have been identified as regulators of larger proteins in bacteria and eukaryotes. Despite the importance of these molecules, the total number of small proteins remains unknown because conventional annotation pipelines usually exclude small open reading frames (smORFs). We previously identified several dozen small proteins in the model organism Escherichia coli using theoretical bioinformatic approaches based on sequence conservation and matches to canonical ribosome binding sites. Here, we present an empirical approach for discovering new proteins, taking advantage of recent advances in ribosome profiling in which antibiotics are used to trap newly initiated 70S ribosomes at start codons. This approach led to the identification of many novel initiation sites in intergenic regions in E. coli. We tagged 41 smORFs on the chromosome and detected protein synthesis for all but three. Not only are the corresponding genes intergenic but they are also found antisense to other genes, in operons, and overlapping other open reading frames (ORFs), some impacting the translation of larger downstream genes. These results demonstrate the utility of this method for identifying new genes, regardless of their genomic context.
IMPORTANCE: Proteins comprised of 50 or fewer amino acids have been shown to interact with and modulate the functions of larger proteins in a range of organisms. Despite the possible importance of small proteins, the true prevalence and capabilities of these regulators remain unknown as the small size of the proteins places serious limitations on their identification, purification, and characterization. Here, we present a ribosome profiling approach with stalled initiation complexes that led to the identification of 38 new small proteins.
Affiliation(s)
- Jeremy Weaver
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland, USA
| | - Fuad Mohammad
- Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Allen R Buskirk
- Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland, USA
| |
|
36
|
Khitun A, Ness TJ, Slavoff SA. Small open reading frames and cellular stress responses. Mol Omics 2019; 15:108-116. [PMID: 30810554 DOI: 10.1039/c8mo00283e] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Indexed: 12/11/2022]
Abstract
Small open reading frames (smORFs) encoding polypeptides of less than 100 amino acids in eukaryotes (50 amino acids in prokaryotes) were historically excluded from genome annotation. However, recent advances in genomics, ribosome footprinting, and proteomics have revealed thousands of translated smORFs in genomes spanning evolutionary space. These smORFs can encode functional polypeptides, or act as cis-translational regulators. Herein we review evidence that some smORF-encoded polypeptides (SEPs) participate in stress responses in both prokaryotes and eukaryotes, and that some upstream ORFs (uORFs) regulate stress-responsive translation of downstream cistrons in eukaryotic cells. These studies provide insight into a regulated subclass of smORFs and suggest that at least some SEPs may participate in maintenance of cellular homeostasis under stress.
Affiliation(s)
- Alexandra Khitun
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA
| | - Travis J Ness
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA
| | - Sarah A Slavoff
- Chemical Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Chemistry, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| |
|
37
|
Miravet-Verde S, Ferrar T, Espadas-García G, Mazzolini R, Gharrab A, Sabido E, Serrano L, Lluch-Senar M. Unraveling the hidden universe of small proteins in bacterial genomes. Mol Syst Biol 2019; 15:e8290. [PMID: 30796087 PMCID: PMC6385055 DOI: 10.15252/msb.20188290] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 12/14/2022] Open
Abstract
Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with “‐omics” approaches, we were able to describe 109 bacterial small ORFomes. Predictions were first validated by performing an exhaustive search of SEPs present in Mycoplasma pneumoniae proteome via mass spectrometry, which illustrated the limitations of shotgun approaches. Then, RanSEPs predictions were validated and compared with other tools using proteomic datasets from different bacterial species and SEPs from the literature. We found that up to 16 ± 9% of proteins in an organism could be classified as SEPs. Integration of RanSEPs predictions with transcriptomics data showed that some annotated non‐coding RNAs could in fact encode for SEPs. A functional study of SEPs highlighted an enrichment in the membrane, translation, metabolism, and nucleotide‐binding categories. Additionally, 9.7% of the SEPs included a N‐terminus predicted signal peptide. We envision RanSEPs as a tool to unmask the hidden universe of small bacterial proteins.
Affiliation(s)
- Samuel Miravet-Verde
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Tony Ferrar
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Guadalupe Espadas-García
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Rocco Mazzolini
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Anas Gharrab
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Eduard Sabido
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Luis Serrano
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Maria Lluch-Senar
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| |
|
38
|
VanOrsdel CE, Kelly JP, Burke BN, Lein CD, Oufiero CE, Sanchez JF, Wimmers LE, Hearn DJ, Abuikhdair FJ, Barnhart KR, Duley ML, Ernst SEG, Kenerson BA, Serafin AJ, Hemm MR. Identifying New Small Proteins in Escherichia coli. Proteomics 2018; 18:e1700064. [PMID: 29645342 PMCID: PMC6001520 DOI: 10.1002/pmic.201700064] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 09/05/2017] [Revised: 03/05/2018] [Indexed: 12/11/2022]
Abstract
The number of small proteins (SPs) encoded in the Escherichia coli genome is unknown, as current bioinformatics and biochemical techniques make short gene and small protein identification challenging. One method of small protein identification involves adding an epitope tag to the 3′ end of a short open reading frame (sORF) on the chromosome, with synthesis confirmed by immunoblot assays. In this study, this strategy was used to identify new E. coli small proteins: 80 sORFs in the E. coli genome were tagged and assayed for protein synthesis. The selected sORFs represent diverse sequence characteristics, including degrees of sORF conservation, predicted transmembrane domains, sORF direction with respect to flanking genes, ribosome binding site (RBS) prediction, and ribosome profiling results. Of the 80 sORFs, 36 resulted in synthesized proteins, a 45% success rate. Modeling of detected versus non-detected small proteins showed that predictions based on RBS prediction, transcription data, and ribosome profiling had a statistically significant correlation with protein synthesis; however, there was no correlation between current sORF annotation and protein synthesis. These results suggest that substantial numbers of small proteins remain undiscovered in E. coli, and that existing bioinformatics techniques must continue to improve to facilitate identification.
Affiliation(s)
- Caitlin E VanOrsdel
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - John P Kelly
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Brittany N Burke
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Christina D Lein
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | | | - Joseph F Sanchez
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Larry E Wimmers
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - David J Hearn
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Fatimeh J Abuikhdair
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Kathryn R Barnhart
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Michelle L Duley
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Sarah E G Ernst
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Briana A Kenerson
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Aubrey J Serafin
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| | - Matthew R Hemm
- Department of Biological Sciences, Smith Hall, Towson University, Towson, MD, USA
| |
|