1
Ren G, Gu X, Zhang L, Gong S, Song S, Chen S, Chen Z, Wang X, Li Z, Zhou Y, Li L, Yang J, Lai F, Dang Y. Ribosomal frameshifting at normal codon repeats recodes functional chimeric proteins in human. Nucleic Acids Res 2024; 52:2463-2479. [PMID: 38281188 PMCID: PMC10954444 DOI: 10.1093/nar/gkae035] [Received: 08/10/2023] [Revised: 01/04/2024] [Accepted: 01/10/2024] [Indexed: 01/30/2024]
Abstract
Ribosomal frameshifting is the process by which ribosomes slip into the +1 or -1 reading frame, thereby producing chimeric trans-frame proteins. In viruses and bacteria, programmed ribosomal frameshifting can produce essential trans-frame proteins for viral replication or regulation of other biological processes. In humans, however, functional trans-frame proteins derived from ribosomal frameshifting are scarcely documented. Combining multiple assays, we show that short codon repeats can act as cis-acting elements that stimulate ribosomal frameshifting in humans, abbreviated as CRFS hereafter. Using proteomic analyses, we identified many putative CRFS events from 32 normal human tissues supported by trans-frame peptides positioned at codon repeats. Finally, we show that a CRFS-derived trans-frame protein (HDAC1-FS) functions by antagonizing the activities of HDAC1, thus affecting cell migration and apoptosis. These data suggest a novel type of translational recoding associated with codon repeats, which may expand the coding capacity of mRNA and diversify regulation in humans.
Affiliation(s)
- Guiping Ren
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Xiaoqian Gu
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Lu Zhang
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Shimin Gong
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Shuang Song
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Shunkai Chen
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Zhenjing Chen
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Xiaoyan Wang
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Zhanbiao Li
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Yingshui Zhou
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Longxi Li
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Jiao Yang
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Fan Lai
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
- Yunkun Dang
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Key Laboratory for Southwest Microbial Diversity of the Ministry of Education, Yunnan Key Laboratory of Cell Metabolism and Diseases, Center for Life Science, School of Life Sciences, Yunnan University, Kunming 650021, China
- Southwest United Graduate School, Kunming 650092, China
2
Coelho LP, Santos-Júnior CD, de la Fuente-Nunez C. Challenges in computational discovery of bioactive peptides in 'omics data. Proteomics 2024:e2300105. [PMID: 38458994 DOI: 10.1002/pmic.202300105] [Received: 10/13/2023] [Revised: 02/06/2024] [Accepted: 02/06/2024] [Indexed: 03/10/2024]
Abstract
Peptides have a plethora of activities in biological systems that can potentially be exploited biotechnologically. Several peptides are used clinically, as well as in industry and agriculture. The increase in available 'omics data has recently provided a large opportunity for mining novel enzymes, biosynthetic gene clusters, and molecules. While these data primarily consist of DNA sequences, other types of data provide important complementary information. Due to their size, the approaches proven successful at discovering novel proteins of canonical size cannot be naïvely applied to the discovery of peptides. Peptides can be encoded directly in the genome as short open reading frames (smORFs), or they can be derived from larger proteins by proteolysis. Both of these peptide classes pose challenges as simple methods for their prediction result in large numbers of false positives. Similarly, functional annotation of larger proteins, traditionally based on sequence similarity to infer orthology and then transferring functions between characterized proteins and uncharacterized ones, cannot be applied for short sequences. The use of these techniques is much more limited and alternative approaches based on machine learning are used instead. Here, we review the limitations of traditional methods as well as the alternative methods that have recently been developed for discovering novel bioactive peptides with a focus on prokaryotic genomes and metagenomes.
Affiliation(s)
- Luis Pedro Coelho
- Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Woolloongabba, Queensland, Australia
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
- Laboratory of Microbial Processes & Biodiversity - LMPB, Hydrobiology Department, Federal University of São Carlos - UFSCar, São Paulo, Brazil
- Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
3
Miravet-Verde S, Mazzolini R, Segura-Morales C, Broto A, Lluch-Senar M, Serrano L. ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs. Nat Commun 2024; 15:2091. [PMID: 38453908 PMCID: PMC10920889 DOI: 10.1038/s41467-024-46112-2] [Received: 11/18/2022] [Accepted: 02/14/2024] [Indexed: 03/09/2024]
Abstract
Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectroscopy that can identify SEPs while providing other insights in a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.
Affiliation(s)
- Samuel Miravet-Verde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain.
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland.
- Carolina Segura-Morales
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Alicia Broto
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Maria Lluch-Senar
- Pulmobiotics, Dr Aiguader 88, 08003, Barcelona, Spain.
- Institute of Biotechnology and Biomedicine "Vicent Villar Palasi" (IBB), Universitat Autònoma de Barcelona, Barcelona, Spain.
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
- ICREA, Pg. Lluis Companys 23, 08010, Barcelona, Spain.
4
Valdivia-Francia F, Sendoel A. No country for old methods: New tools for studying microproteins. iScience 2024; 27:108972. [PMID: 38333695 PMCID: PMC10850755 DOI: 10.1016/j.isci.2024.108972] [Indexed: 02/10/2024]
Abstract
Microproteins encoded by small open reading frames (sORFs) have emerged as a fascinating frontier in genomics. Although they were traditionally overlooked due to their small size, recent technological advancements such as ribosome profiling, mass spectrometry-based strategies and advanced computational approaches have led to the annotation of more than 7000 sORFs in the human genome. Despite the vast progress, only a tiny portion of these microproteins have been characterized, and an important challenge in the field lies in identifying functionally relevant microproteins and understanding their role in different cellular contexts. In this review, we explore the recent advancements in sORF research, focusing on the new methodologies and computational approaches that have facilitated their identification and functional characterization. Leveraging these new tools holds great promise for dissecting the diverse cellular roles of microproteins and will ultimately pave the way for understanding their role in the pathogenesis of diseases and identifying new therapeutic targets.
Affiliation(s)
- Fabiola Valdivia-Francia
- University of Zurich, Institute for Regenerative Medicine (IREM), Wagistrasse 12, 8952 Schlieren-Zurich, Switzerland
- Life Science Zurich Graduate School, Molecular Life Science Program, University of Zurich/ ETH Zurich, Schlieren-Zurich, Switzerland
- Ataman Sendoel
- University of Zurich, Institute for Regenerative Medicine (IREM), Wagistrasse 12, 8952 Schlieren-Zurich, Switzerland
5
Cao X, Sun S, Xing J. A Massive Proteogenomic Screen Identifies Thousands of Novel Peptides From the Human "Dark" Proteome. Mol Cell Proteomics 2024; 23:100719. [PMID: 38242438 PMCID: PMC10867589 DOI: 10.1016/j.mcpro.2024.100719] [Received: 05/02/2023] [Revised: 01/01/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024]
Abstract
Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.
Affiliation(s)
- Xiaolong Cao
- Department of Anesthesiology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Siqi Sun
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jinchuan Xing
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA.
6
Zhou B, Ji B, Shen C, Zhang X, Yu X, Huang P, Yu R, Zhang H, Dou X, Chen Q, Zeng Q, Wang X, Cao Z, Hu G, Xu S, Zhao H, Yang Y, Zhou Y, Wang J. EVLncRNAs 3.0: an updated comprehensive database for manually curated functional long non-coding RNAs validated by low-throughput experiments. Nucleic Acids Res 2024; 52:D98-D106. [PMID: 37953349 PMCID: PMC10767905 DOI: 10.1093/nar/gkad1057] [Received: 09/13/2023] [Revised: 10/23/2023] [Accepted: 11/01/2023] [Indexed: 11/14/2023]
Abstract
Long noncoding RNAs (lncRNAs) have emerged as crucial regulators across diverse biological processes and diseases. While high-throughput sequencing has enabled lncRNA discovery, functional characterization remains limited. The EVLncRNAs database is the first and exclusive repository for all experimentally validated functional lncRNAs from various species. After previous releases in 2018 and 2021, this update marks a major expansion through exhaustive manual curation of nearly 25 000 publications from 15 May 2020 to 15 May 2023. It incorporates substantial growth across all categories: a 154% increase in functional lncRNAs, 160% in associated diseases, 186% in lncRNA-disease associations, 235% in interactions, 138% in structures, 234% in circular RNAs, 235% in resistant lncRNAs and 4724% in exosomal lncRNAs. More importantly, it incorporates additional information, including functional classifications, detailed interaction pathways, homologous lncRNAs, lncRNA locations, and COVID-19, phase-separation and organoid-related lncRNAs. The web interface was substantially improved for browsing, visualization, and searching. ChatGPT was tested for information extraction and functional overview, with its limitations noted. EVLncRNAs 3.0 represents the most extensive curated resource of experimentally validated functional lncRNAs and will serve as an indispensable platform for unravelling emerging lncRNA functions. The updated database is freely available at https://www.sdklab-biophysics-dzu.net/EVLncRNAs3/.
Affiliation(s)
- Bailing Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Baohua Ji
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou 253023, China
- Congcong Shen
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Xia Zhang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Xue Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Pingping Huang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Ru Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Hongmei Zhang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Life Science, Dezhou University, Dezhou 253023, China
- Xianghua Dou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Qingshuai Chen
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Qiangcheng Zeng
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Life Science, Dezhou University, Dezhou 253023, China
- Xiaoxin Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou 253023, China
- Zanxia Cao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Guodong Hu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Shicai Xu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Huiying Zhao
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou 510120, China
- Yuedong Yang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, China
- Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518038, China
- Jihua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
7
Leblanc S, Yala F, Provencher N, Lucier JF, Levesque M, Lapointe X, Jacques JF, Fournier I, Salzet M, Ouangraoua A, Scott MS, Boisvert FM, Brunet MA, Roucou X. OpenProt 2.0 builds a path to the functional characterization of alternative proteins. Nucleic Acids Res 2024; 52:D522-D528. [PMID: 37956315 PMCID: PMC10767855 DOI: 10.1093/nar/gkad1050] [Received: 09/25/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023]
Abstract
The OpenProt proteogenomic resource (https://www.openprot.org/) provides users with a complete and freely accessible set of non-canonical or alternative open reading frames (AltORFs) within the transcriptome of various species, as well as functional annotations of the corresponding protein sequences not found in standard databases. Enhancements in this update are largely the result of user feedback and include the prediction of structure, subcellular localization, and intrinsic disorder, using cutting-edge algorithms based on machine learning techniques. The mass spectrometry pipeline now integrates a machine learning-based peptide rescoring method to improve peptide identification. We continue to help users explore this cryptic proteome by providing OpenCustomDB, a tool that enables users to build their own customized protein databases, and OpenVar, a genomic annotator including genetic variants within AltORFs and protein sequences. A new interface improves the visualization of all functional annotations, including a spectral viewer and the prediction of multicoding genes. All data on OpenProt are freely available and downloadable. Overall, OpenProt continues to establish itself as an important resource for the exploration and study of new proteins.
Affiliation(s)
- Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Feriel Yala
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Nicolas Provencher
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Department of Biology, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Xavier Lapointe
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Jean-Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Michel Salzet
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Aïda Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- François-Michel Boisvert
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Marie A Brunet
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
8
Lei T, Chang Y, Yao C, Zhang H. A systematic evaluation of computational methods for predicting translated non-canonical ORFs from ribosome profiling data. J Genet Genomics 2024; 51:105-108. [PMID: 37673174 DOI: 10.1016/j.jgg.2023.08.010] [Received: 06/02/2023] [Revised: 08/15/2023] [Accepted: 08/30/2023] [Indexed: 09/08/2023]
Affiliation(s)
- Tianyu Lei
- College of Ecology, Lanzhou University, Lanzhou, Gansu 730000, China
- Yue Chang
- College of Ecology, Lanzhou University, Lanzhou, Gansu 730000, China
- Chao Yao
- Cuiying Honors College, Lanzhou University, Lanzhou, Gansu 730000, China
- Hong Zhang
- College of Ecology, Lanzhou University, Lanzhou, Gansu 730000, China.
9
Wang J, Wang W, Ma F, Qian H. A hidden translatome in tumors-the coding lncRNAs. Sci China Life Sci 2023; 66:2755-2772. [PMID: 37154857 DOI: 10.1007/s11427-022-2289-6] [Received: 11/08/2022] [Accepted: 12/29/2022] [Indexed: 05/10/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been extensively identified in eukaryotic genomes and have been shown to play critical roles in the development of multiple cancers. Through the application and development of ribosome analysis and sequencing technologies, advanced studies have discovered the translation of lncRNAs. Although lncRNAs were originally defined as noncoding RNAs, many lncRNAs actually contain small open reading frames that are translated into peptides. This opens a broad area for the functional investigation of lncRNAs. Here, we introduce prospective methods and databases for screening lncRNAs with functional polypeptides. We also summarize the specific lncRNA-encoded proteins and the molecular mechanisms by which they promote or inhibit cancer. Importantly, the role of lncRNA-encoded peptides/proteins holds promise in cancer research, but some potential challenges remain unresolved. This review includes reports on lncRNA-encoded peptides or proteins in cancer, aiming to provide a theoretical basis and related references to facilitate the discovery of more functional peptides encoded by lncRNAs, and to further develop new anti-cancer therapeutic targets as well as clinical biomarkers for diagnosis and prognosis.
Affiliation(s)
- Jinsong Wang
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Wenna Wang
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Department of Medical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Fei Ma
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
- Department of Medical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
- Haili Qian
- State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
10
Wang L, Cui X, Jiang F, Hu Y, Wan W, Li G, Lin Y, Xiao J. Circular RNA Translation in Cardiovascular Diseases. Curr Genomics 2023; 24:66-71. [PMID: 37994328 PMCID: PMC10662380 DOI: 10.2174/1389202924666230911121358] [Received: 02/26/2023] [Revised: 06/08/2023] [Accepted: 08/09/2023] [Indexed: 11/24/2023]
Abstract
Circular RNAs (circRNAs) are a class of endogenous functional RNA generated by back-splicing. Recently, circRNAs have been found to have certain coding potential. Proteins/peptides translated from circRNAs play essential roles in various diseases. Here, we briefly summarize the basic knowledge and technologies that are usually applied to study circRNA translation. Then, we focus on the research progress of circRNA translation in cardiovascular diseases and discuss the perspective and future direction of translatable circRNA study in cardiovascular diseases.
Affiliation(s)
- Lijun Wang
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong, 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai, 200444, China
| | - Xinxin Cui
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong, 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai, 200444, China
| | - Fei Jiang
- Department of Nursing, Union Hospital, Fujian Medical University Union Hospital, Fuzhou, 350001, China
- Fujian Provincial Special Reserve Talents Laboratory, Fujian Medical University Union Hospital, Fuzhou, 350001, China
| | - Yuxue Hu
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong, 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai, 200444, China
| | - Wensi Wan
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong, 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai, 200444, China
| | - Guoping Li
- Cardiovascular Division of the Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
| | - Yanjuan Lin
- Department of Nursing, Union Hospital, Fujian Medical University Union Hospital, Fuzhou, 350001, China
- Fujian Provincial Special Reserve Talents Laboratory, Fujian Medical University Union Hospital, Fuzhou, 350001, China
| | - Junjie Xiao
- Institute of Geriatrics (Shanghai University), Affiliated Nantong Hospital of Shanghai University (The Sixth People’s Hospital of Nantong), School of Medicine, Shanghai University, Nantong, 226011, China
- Cardiac Regeneration and Ageing Lab, Institute of Cardiovascular Sciences, Shanghai Engineering Research Center of Organ Repair, School of Life Science, Shanghai University, Shanghai, 200444, China
|
11
|
Li W, Yu Y, Zhou G, Hu G, Li B, Ma H, Yan W, Pei H. Large-scale ORF screening based on LC-MS to discover novel lncRNA-encoded peptides responding to ionizing radiation and microgravity. Comput Struct Biotechnol J 2023; 21:5201-5211. [PMID: 37928948 PMCID: PMC10624585 DOI: 10.1016/j.csbj.2023.10.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Revised: 10/12/2023] [Accepted: 10/18/2023] [Indexed: 11/07/2023] Open
Abstract
In the human genome, 98% of genes can be transcribed into non-coding RNAs (ncRNAs), among which lncRNAs and their encoded peptides play important roles in regulating various aspects of cellular processes and may serve as crucial factors in modulating the biological effects induced by ionizing radiation and microgravity. Unfortunately, there are few reports in space radiation biology on lncRNA-encoded peptides below 10kD due to limitations in detection techniques. To fill this gap, we integrated a variety of methods based on genomics and peptidomics, and discovered 22 lncRNA-encoded small peptides that are sensitive to space radiation and microgravity, which have never been reported before. We concurrently validated the transmembrane helix, subcellular localization, and biological function of these small peptides using bioinformatics and molecular biology techniques. More importantly, we found that these small peptides function independently of the lncRNAs that encode them. Our findings have uncovered a previously unknown human proteome encoded by 'non-coding' genes in response to space conditions and elucidated their involvement in biological processes, providing valuable strategies for individual protection mechanisms for astronauts who carry out deep space exploration missions in space radiation environments.
Affiliation(s)
- Wanshi Li
- State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Collaborative Innovation Center of Radiological Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123, China
| | - Yongduo Yu
- State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Collaborative Innovation Center of Radiological Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123, China
| | - Guangming Zhou
- State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Collaborative Innovation Center of Radiological Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123, China
| | - Guang Hu
- Department of Bioinformatics, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Center for Systems Biology, Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
| | - Bingyan Li
- State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Collaborative Innovation Center of Radiological Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123, China
| | - Hong Ma
- Beijing Key Laboratory for Separation and Analysis in Biomedicine and Pharmaceuticals, School of Life Science, Beijing Institute of Technology, Beijing 100081, China
| | - Wenying Yan
- Department of Bioinformatics, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Center for Systems Biology, Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
| | - Hailong Pei
- State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Collaborative Innovation Center of Radiological Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215123, China
|
12
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] Open
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
- Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA.
| | | | - Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
| | - Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
| | - Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, Washington, USA
| | - Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington, USA
|
13
|
Dong X, Zhang K, Xun C, Chu T, Liang S, Zeng Y, Liu Z. Small Open Reading Frame-Encoded Micro-Peptides: An Emerging Protein World. Int J Mol Sci 2023; 24:10562. [PMID: 37445739 DOI: 10.3390/ijms241310562] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/13/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023] Open
Abstract
Small open reading frames (sORFs) are often overlooked features in genomes. In the past, they were labeled as noncoding or "transcriptional noise". However, accumulating evidence from recent years suggests that sORFs may be transcribed and translated to produce sORF-encoded polypeptides (SEPs) with less than 100 amino acids. The vigorous development of computational algorithms, ribosome profiling, and peptidome has facilitated the prediction and identification of many new SEPs. These SEPs were revealed to be involved in a wide range of basic biological processes, such as gene expression regulation, embryonic development, cellular metabolism, inflammation, and even carcinogenesis. To effectively understand the potential biological functions of SEPs, we discuss the history and development of the newly emerging research on sORFs and SEPs. In particular, we review a range of recently discovered bioinformatics tools for identifying, predicting, and validating SEPs as well as a variety of biochemical experiments for characterizing SEP functions. Lastly, this review underlines the challenges and future directions in identifying and validating sORFs and their encoded micropeptides, providing a significant reference for upcoming research on sORF-encoded peptides.
Affiliation(s)
- Xiaoping Dong
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
| | - Kun Zhang
- The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
| | - Chengfeng Xun
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
| | - Tianqi Chu
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
| | - Songping Liang
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
| | - Yong Zeng
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
| | - Zhonghua Liu
- National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
|
14
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief
The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | | | - Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
| | - Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
| | - Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
|
15
|
Yang JE, Zhong WJ, Li JF, Lin YY, Liu FT, Tian H, Chen YJ, Luo XY, Zhuang SM. LINC00998-encoded micropeptide SMIM30 promotes the G1/S transition of cell cycle by regulating cytosolic calcium level. Mol Oncol 2022; 17:901-916. [PMID: 36495128 PMCID: PMC10158777 DOI: 10.1002/1878-0261.13358] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/18/2022] [Revised: 10/04/2022] [Accepted: 12/09/2022] [Indexed: 12/14/2022] Open
Abstract
The biological functions of short open reading frame (sORF)-encoded micropeptides remain largely unknown. Here, we report that LINC00998, a previously annotated lncRNA, was upregulated in multiple cancer types and the sORF on LINC00998 encoded a micropeptide named SMIM30. SMIM30 was localized in the membranes of the endoplasmic reticulum (ER) and mitochondria. Silencing SMIM30 inhibited the proliferation of hepatoma cells in vitro and suppressed the growth of tumor xenografts and N-nitrosodiethylamine-induced hepatoma. Overexpression of the 5'UTR-sORF sequence of LINC00998, encoding wild-type SMIM30, enhanced tumor cell growth, but this was abolished when a premature stop codon was introduced into the sORF via single-base deletion. Gain- and loss-of-function studies revealed that SMIM30 peptide but not LINC00998 reduced cytosolic calcium level, increased CDK4, cyclin E2, phosphorylated-Rb and E2F1, and promoted the G1/S phase transition and cell proliferation. The effect of SMIM30 silencing was attenuated by a calcium chelator or the agonist of sarco/endoplasmic reticulum calcium ATPase (SERCA) pump. These findings suggest a novel function of micropeptide SMIM30 in promoting G1/S transition and cell proliferation by enhancing SERCA activity and reducing cytosolic calcium level.
Affiliation(s)
- Jin-E Yang
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Wang-Jing Zhong
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Jin-Feng Li
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Ying-Ying Lin
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Feng-Ting Liu
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Hao Tian
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Ya-Jing Chen
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xiao-Yu Luo
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Shi-Mei Zhuang
- MOE Key Laboratory of Gene Function and Regulation, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
|
16
|
Jürgens L, Wethmar K. The Emerging Role of uORF-Encoded uPeptides and HLA uLigands in Cellular and Tumor Biology. Cancers (Basel) 2022; 14:6031. [PMID: 36551517 PMCID: PMC9776223 DOI: 10.3390/cancers14246031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/13/2022] Open
Abstract
Recent technological advances have facilitated the detection of numerous non-canonical human peptides derived from regulatory regions of mRNAs, long non-coding RNAs, and other cryptic transcripts. In this review, we first give an overview of the classification of these novel peptides and summarize recent improvements in their annotation and detection by ribosome profiling, mass spectrometry, and individual experimental analysis. A large fraction of the novel peptides originates from translation at upstream open reading frames (uORFs) that are located within the transcript leader sequence of regular mRNA. In humans, uORF-encoded peptides (uPeptides) have been detected in both healthy and malignantly transformed cells and emerge as important regulators in cellular and immunological pathways. In the second part of the review, we focus on various functional implications of uPeptides. As uPeptides frequently act at the transition of translational regulation and individual peptide function, we describe the mechanistic modes of translational regulation through ribosome stalling, the involvement in cellular programs through protein interaction and complex formation, and their role within the human leukocyte antigen (HLA)-associated immunopeptidome as HLA uLigands. We delineate how malignant transformation may lead to the formation of novel uORFs, uPeptides, or HLA uLigands and explain their potential implication in tumor biology. Ultimately, we speculate on a potential use of uPeptides as peptide drugs and discuss how uPeptides and HLA uLigands may facilitate translational inhibition of oncogenic protein messages and immunotherapeutic approaches in cancer therapy.
|
17
|
Liu Q, Peng X, Shen M, Qian Q, Xing J, Li C, Gregory R. Ribo-uORF: a comprehensive data resource of upstream open reading frames (uORFs) based on ribosome profiling. Nucleic Acids Res 2022; 51:D248-D261. [PMID: 36440758 PMCID: PMC9825487 DOI: 10.1093/nar/gkac1094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 10/27/2022] [Accepted: 11/22/2022] [Indexed: 11/29/2022] Open
Abstract
Upstream open reading frames (uORFs) are typically defined as translation sites located within the 5' untranslated region upstream of the main protein coding sequence (CDS) of messenger RNAs (mRNAs). Although uORFs are prevalent in eukaryotic mRNAs and modulate the translation of downstream CDSs, a comprehensive resource for uORFs is currently lacking. We developed Ribo-uORF (http://rnainformatics.org.cn/RiboUORF) to serve as a comprehensive functional resource for uORF analysis based on ribosome profiling (Ribo-seq) data. Ribo-uORF currently supports six species: human, mouse, rat, zebrafish, fruit fly, and worm. Ribo-uORF includes 501 554 actively translated uORFs and 107 914 upstream translation initiation sites (uTIS), which were identified from 1495 Ribo-seq and 77 quantitative translation initiation sequencing (QTI-seq) datasets, respectively. We also developed mRNAbrowse to visualize items such as uORFs, cis-regulatory elements, genetic variations, eQTLs, GWAS-based associations, RNA modifications, and RNA editing. Ribo-uORF provides a very intuitive web interface for conveniently browsing, searching, and visualizing uORF data. Finally, uORFscan and UTR5var were developed in Ribo-uORF to precisely identify uORFs and analyze the influence of genetic mutations on uORFs using user-uploaded datasets. Ribo-uORF should greatly facilitate studies of uORFs and their roles in mRNA translation and posttranscriptional control of gene expression.
Affiliation(s)
- Qi Liu
- To whom correspondence should be addressed. Tel: +86 020 87596559;
| | | | - Mengyuan Shen
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China; Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China; Guangdong Rice Engineering Laboratory, Guangzhou 510640, China; Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
| | - Qian Qian
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China; Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China; Guangdong Rice Engineering Laboratory, Guangzhou 510640, China; Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
| | - Junlian Xing
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China; Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China; Guangdong Rice Engineering Laboratory, Guangzhou 510640, China; Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
| | - Chen Li
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China; Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China; Guangdong Rice Engineering Laboratory, Guangzhou 510640, China; Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Guangzhou 510640, China
| | - Richard I Gregory
- Correspondence may also be addressed to Richard I. Gregory. Tel: +1 617 919 2273;
|
18
|
Zhang M, Zhao J, Li C, Ge F, Wu J, Jiang B, Song J, Song X. csORF-finder: an effective ensemble learning framework for accurate identification of multi-species coding short open reading frames. Brief Bioinform 2022; 23:bbac392. [PMID: 36094083 PMCID: PMC9677467 DOI: 10.1093/bib/bbac392] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/02/2022] [Revised: 08/03/2022] [Accepted: 08/11/2022] [Indexed: 12/14/2022] Open
Abstract
Short open reading frames (sORFs) refer to the small nucleic fragments no longer than 303 nt in length that probably encode small peptides. To date, translatable sORFs have been found in both untranslated regions of messenger ribonucleic acids (RNAs; mRNAs) and long non-coding RNAs (lncRNAs), playing vital roles in a myriad of biological processes. As not all sORFs are translated or essentially translatable, it is important to develop a highly accurate computational tool for characterizing the coding potential of sORFs, thereby facilitating discovery of novel functional peptides. In light of this, we designed a series of ensemble models by integrating Efficient-CapsNet and LightGBM, collectively termed csORF-finder, to differentiate the coding sORFs (csORFs) from non-coding sORFs in Homo sapiens, Mus musculus and Drosophila melanogaster, respectively. To improve the performance of csORF-finder, we introduced a novel feature encoding scheme named trinucleotide deviation from expected mean (TDE) and computed all types of in-frame sequence-based features, such as i-framed-3mer, i-framed-CKSNAP and i-framed-TDE. Benchmarking results showed that these features could significantly boost the performance compared to the original 3-mer, CKSNAP and TDE features. Our performance comparisons showed that csORF-finder achieved a superior performance than the state-of-the-art methods for csORF prediction on multi-species and non-ATG initiation independent test datasets. Furthermore, we applied csORF-finder to screen the lncRNA datasets for identifying potential csORFs. The resulting data serve as an important computational repository for further experimental validation. We hope that csORF-finder can be exploited as a powerful platform for high-throughput identification of csORFs and functional characterization of these csORFs encoded peptides.
Affiliation(s)
- Meng Zhang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Jian Zhao
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Fang Ge
- School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
| | - Jing Wu
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
| | - Bin Jiang
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Xiaofeng Song
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
|
19
|
Zheng Y, Luo H, Teng X, Hao X, Yan X, Tang Y, Zhang W, Wang Y, Zhang P, Li Y, Zhao Y, Chen R, He S. NPInter v5.0: ncRNA interaction database in a new era. Nucleic Acids Res 2022; 51:D232-D239. [PMID: 36373614 PMCID: PMC9825547 DOI: 10.1093/nar/gkac1002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 10/16/2022] [Accepted: 10/21/2022] [Indexed: 11/16/2022] Open
Abstract
Noncoding RNAs (ncRNAs) play key regulatory roles in biological processes by interacting with other biomolecules. With the development of high-throughput sequencing and experimental technologies, extensive ncRNA interactions have been accumulated. Therefore, we updated the NPInter database to a fifth version to document these interactions. ncRNA interaction entries were doubled from 1 100 618 to 2 596 695 by manual literature mining and high-throughput data processing. We integrated global RNA-DNA interactions from iMARGI, ChAR-seq and GRID-seq, greatly expanding the number of RNA-DNA interactions (from 888 915 to 8 329 382). In addition, we collected different types of RNA interaction between SARS-CoV-2 virus and its host from recently published studies. Long noncoding RNA (lncRNA) expression specificity in different cell types from tumor single cell RNA-seq (scRNA-seq) data were also integrated to provide a cell-type level view of interactions. A new module named RBP was built to display the interactions of RNA-binding proteins with annotations of localization, binding domains and functions. In conclusion, NPInter v5.0 (http://bigdata.ibp.ac.cn/npinter5/) provides informative and valuable ncRNA interactions for biological researchers.
Affiliation(s)
- Xinpei Hao
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoyu Yan
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yiheng Tang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Wanyu Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yuanxin Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yi Zhao
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Advanced Computing Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Runsheng Chen
- Correspondence may also be addressed to Runsheng Chen. Tel: +86 10 64888543; Fax: +86 10 64871293
- Shunmin He
- To whom correspondence should be addressed. Tel: +86 10 64887032; Fax: +86 10 64887032
20
Turchetti B, Buzzini P, Baeza M. A genomic approach to analyze the cold adaptation of yeasts isolated from Italian Alps. Front Microbiol 2022; 13:1026102. [DOI: 10.3389/fmicb.2022.1026102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 10/07/2022] [Indexed: 11/11/2022] Open
Abstract
Microorganisms, including yeasts, are responsible for the mineralization of organic matter in cold regions, and their characterization is critical to elucidating the ecology of such environments on Earth. Strategies developed by yeasts to survive in cold environments have been increasingly studied in recent years and applied in different biotechnological applications, but knowledge of them is still limited. Microbial adaptations to cold include the synthesis of cryoprotective compounds, as well as the presence of a high number of genes encoding proteins/enzymes characterized by a reduced proline content and highly flexible and large catalytic active sites. This study is a comparative genomic study of the adaptations of yeasts isolated from the Italian Alps, considering their growth kinetics. The optimal temperature for growth (OTG), growth rate (Gr), and draft genome sizes varied considerably (OTG, 10°C–20°C; Gr, 0.071–0.0726; genomes, 20.7–21.5 Mbp; %GC, 50.9–61.5). A direct relationship was observed between calculated protein flexibilities and OTG, but not Gr. Putative cold stress response genes were found, as well as high numbers of genes for general, oxidative, and osmotic stress responses. The cold response genes found in the studied yeasts play roles in cell membrane adaptation, compatible solute accumulation, RNA structure changes, and protein folding, i.e., dihydrolipoamide dehydrogenase, glycogen synthase, omega-6 fatty acid, stearoyl-CoA desaturase, ATP-dependent RNA helicase, and elongation of very-long-chain fatty acids. A redundancy for several putative genes was found, higher for P-loop containing nucleoside triphosphate hydrolase, alpha/beta hydrolase, armadillo repeat-containing proteins, and the major facilitator superfamily protein. Hundreds of thousands of small open reading frames (SmORFs) were found in all studied yeasts, especially in Phenoliferia glacialis.
Gene clusters encoding the synthesis of secondary metabolites such as terpene, non-ribosomal peptide, and type III polyketide were predicted in four, three, and two studied yeasts, respectively.
21
Li Z, Liu L, Feng C, Qin Y, Xiao J, Zhang Z, Ma L. LncBook 2.0: integrating human long non-coding RNAs with multi-omics annotations. Nucleic Acids Res 2022; 51:D186-D191. [PMID: 36330950 PMCID: PMC9825513 DOI: 10.1093/nar/gkac999] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 09/15/2022] [Revised: 10/11/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022] Open
Abstract
LncBook, a comprehensive resource of human long non-coding RNAs (lncRNAs), has been used in a wide range of lncRNA studies across various biological contexts. Here, we present LncBook 2.0 (https://ngdc.cncb.ac.cn/lncbook), with significant updates and enhancements as follows: (i) incorporation of 119 722 new transcripts, 9632 new genes, and gene structure update of 21 305 lncRNAs; (ii) characterization of conservation features of human lncRNA genes across 40 vertebrates; (iii) integration of lncRNA-encoded small proteins; (iv) enrichment of expression and DNA methylation profiles with more biological contexts and (v) identification of lncRNA-protein interactions and improved prediction of lncRNA-miRNA interactions. Collectively, LncBook 2.0 accommodates a high-quality collection of 95 243 lncRNA genes and 323 950 transcripts and incorporates their abundant annotations at different omics levels, thereby enabling users to decipher functional significance of lncRNAs in different biological contexts.
Affiliation(s)
- Yuxin Qin
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jingfa Xiao
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhang Zhang
- Correspondence may also be addressed to Zhang Zhang. Tel: +86 10 8409 7261; Fax: +86 10 8409 7298
- Lina Ma
- To whom correspondence should be addressed. Tel: +86 10 8409 7845; Fax: +86 10 8409 7298
22
Manske F, Ogoniak L, Jürgens L, Grundmann N, Makałowski W, Wethmar K. The new uORFdb: integrating literature, sequence, and variation data in a central hub for uORF research. Nucleic Acids Res 2022; 51:D328-D336. [PMID: 36305828 PMCID: PMC9825577 DOI: 10.1093/nar/gkac899] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2022] [Revised: 09/28/2022] [Accepted: 10/03/2022] [Indexed: 02/07/2023] Open
Abstract
Upstream open reading frames (uORFs) are initiated by AUG or near-cognate start codons and have been identified in the transcript leader sequences of the majority of eukaryotic transcripts. Functionally, uORFs are implicated in downstream translational regulation of the main protein coding sequence and may serve as a source of non-canonical peptides. Genetic defects in uORF sequences have been linked to the development of various diseases, including cancer. To simplify uORF-related research, the initial release of uORFdb in 2014 provided a comprehensive and manually curated collection of uORF-related literature. Here, we present an updated sequence-based version of uORFdb, accessible at https://www.bioinformatics.uni-muenster.de/tools/uorfdb. The new uORFdb enables users to directly access sequence information, graphical displays, and genetic variation data for over 2.4 million human uORFs. It also includes sequence data of >4.2 million uORFs in 12 additional species. Multiple uORFs can be displayed in transcript- and reading-frame-specific models to visualize the translational context. A variety of filters, sequence-related information, and links to external resources (UCSC Genome Browser, dbSNP, ClinVar) facilitate immediate in-depth analysis of individual uORFs. The database also contains uORF-related somatic variation data obtained from whole-genome sequencing (WGS) analyses of 677 cancer samples collected by the TCGA consortium.
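To make the uORF definition in this abstract concrete, the following is a minimal Python sketch, not the uORFdb pipeline. It scans a transcript leader for AUG or near-cognate start codons followed by an in-frame stop codon. The function names are hypothetical, "near-cognate" is assumed here to mean any codon that differs from AUG at exactly one position, and uORFs that run past the end of the supplied sequence are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def near_cognate_codons(start="ATG"):
    """Codons differing from `start` at exactly one position (assumed definition)."""
    return {
        start[:i] + base + start[i + 1:]
        for i in range(3)
        for base in "ACGT"
        if base != start[i]
    }


def find_uorfs(leader, allow_near_cognate=True):
    """Return (start, end, sequence) for each uORF fully contained in `leader`.

    `leader` is the transcript leader (5' UTR) as DNA or RNA text.
    Coordinates are 0-based and `end` is exclusive (it includes the stop codon).
    """
    seq = leader.upper().replace("U", "T")
    starts = {"ATG"}
    if allow_near_cognate:
        starts |= near_cognate_codons()

    uorfs = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in starts:
            continue
        # Walk codon by codon in the frame set by this start codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j + 3, seq[i:j + 3]))
                break
    return uorfs
```

For example, `find_uorfs("GGAUGAAAUAGCC", allow_near_cognate=False)` reports a single uORF spanning positions 2 to 11. A production tool would also need transcript annotation to know where the leader ends and to handle uORFs that overlap the main coding sequence.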
Affiliation(s)
- Felix Manske
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
- Lynn Ogoniak
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
- Lara Jürgens
- Department of Medicine A, Hematology, Oncology, Hemostaseology and Pneumology, University Hospital Münster, Münster 48149, Germany
- Norbert Grundmann
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
- Wojciech Makałowski
- Correspondence may also be addressed to Wojciech Makałowski. Tel: +49 2518353006
- Klaus Wethmar
- To whom correspondence should be addressed. Tel: +49 2518347587; Fax: +49 2518347588
23
Xiang H, Zhang L, Bu F, Guan X, Chen L, Zhang H, Zhao Y, Chen H, Zhang W, Li Y, Lee LJ, Mei Z, Rao Y, Gu Y, Hou Y, Mu F, Dong X. A Novel Proteogenomic Integration Strategy Expands the Breadth of Neo-Epitope Sources. Cancers (Basel) 2022; 14:3016. [PMID: 35740681 PMCID: PMC9220843 DOI: 10.3390/cancers14123016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/08/2022] [Revised: 06/09/2022] [Accepted: 06/13/2022] [Indexed: 11/16/2022] Open
Abstract
Tumor-specific antigens can activate T cell-based antitumor immune responses and are ideal targets for cancer immunotherapy. However, their identification is still challenging. Although mass spectrometry can directly identify human leukocyte antigen (HLA) binding peptides in tumor cells, it focuses on tumor-specific antigens derived from annotated protein-coding regions constituting only 1.5% of the genome. We developed a novel proteogenomic integration strategy to expand the breadth of tumor-specific epitopes derived from all genomic regions. Using the colorectal cancer cell line HCT116 as a model, we accurately identified 10,737 HLA-presented peptides, 1293 of which were non-canonical peptides that traditional database searches could not identify. Moreover, we found eight tumor neo-epitopes derived from somatic mutations, four of which were not previously reported. Our findings suggest that this new proteogenomic approach holds great promise for increasing the number of tumor-specific antigen candidates, potentially enlarging the tumor target pool and improving cancer immunotherapy.
Affiliation(s)
- Haitao Xiang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; (H.X.); (X.G.); (W.Z.); (Y.L.)
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Le Zhang
- BGI-GenoImmune, BGI-Shenzhen, Shenzhen 518083, China; (L.Z.); (L.J.L.)
- Fanyu Bu
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Xiangyu Guan
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; (H.X.); (X.G.); (W.Z.); (Y.L.)
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Lei Chen
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Haibo Zhang
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Yuntong Zhao
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Huanyi Chen
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Weicong Zhang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; (H.X.); (X.G.); (W.Z.); (Y.L.)
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Yijian Li
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; (H.X.); (X.G.); (W.Z.); (Y.L.)
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, Shenzhen 518083, China
- Leo Jingyu Lee
- BGI-GenoImmune, BGI-Shenzhen, Shenzhen 518083, China; (L.Z.); (L.J.L.)
- Zhanlong Mei
- BGI, Shenzhen 518083, China; (Z.M.); (Y.R.); (Y.H.)
- Yuan Rao
- BGI, Shenzhen 518083, China; (Z.M.); (Y.R.); (Y.H.)
- Ying Gu
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen 518120, China
- Yong Hou
- BGI, Shenzhen 518083, China; (Z.M.); (Y.R.); (Y.H.)
- Feng Mu
- BGI, Shenzhen 518083, China; (Z.M.); (Y.R.); (Y.H.)
- Correspondence: (F.M.); (X.D.)
- Xuan Dong
- BGI-Shenzhen, Shenzhen 518103, China; (F.B.); (L.C.); (H.Z.); (Y.Z.); (H.C.); (Y.G.)
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, Shenzhen 518083, China
- Correspondence: (F.M.); (X.D.)
24
KaKs_Calculator 3.0: Calculating Selective Pressure on Coding and Non-coding Sequences. Genomics, Proteomics & Bioinformatics 2022; 20:536-540. [PMID: 34990803 PMCID: PMC9801026 DOI: 10.1016/j.gpb.2021.12.002] [Citation(s) in RCA: 87] [Impact Index Per Article: 43.5] [Received: 11/24/2021] [Revised: 12/15/2021] [Accepted: 12/21/2021] [Indexed: 01/26/2023]
Abstract
KaKs_Calculator 3.0 is an updated toolkit capable of calculating selective pressure on both coding and non-coding sequences. Analogous to the nonsynonymous/synonymous substitution rate ratio for coding sequences, selection on non-coding sequences can be quantified as the ratio of the non-coding nucleotide substitution rate to the synonymous substitution rate of adjacent coding sequences. As tested on empirical data, KaKs_Calculator 3.0 effectively detects the strength and mode of selection operating on molecular sequences, demonstrating its potential for genome-wide scans of natural selection on diverse sequences and for identifying potentially functional elements at a whole-genome scale. The KaKs_Calculator 3.0 package is freely available for academic use only at https://ngdc.cncb.ac.cn/biocode/tools/BT000001.
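The ratio described in this abstract can be illustrated with a small Python sketch. It is not KaKs_Calculator's implementation: the function names are hypothetical, the non-coding substitution rate is estimated with a Jukes-Cantor correction chosen here only for simplicity, and the synonymous rate Ks of the adjacent coding sequence is assumed to have been computed elsewhere and passed in.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences, skipping gaps and ambiguity codes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected substitutions per site; defined only for p < 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def noncoding_selection_ratio(noncoding_a, noncoding_b, ks):
    """Non-coding substitution rate divided by the synonymous rate of adjacent coding sequence.

    `ks` is assumed to be supplied by a separate Ka/Ks estimation step.
    """
    if ks <= 0:
        raise ValueError("ks must be positive")
    return jukes_cantor(p_distance(noncoding_a, noncoding_b)) / ks
```

Under the usual reading of such ratios, a value well below 1 would suggest that the non-coding region evolves more slowly than neighboring synonymous sites, which is consistent with purifying selection.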
25
Fabre B, Choteau SA, Duboé C, Pichereaux C, Montigny A, Korona D, Deery MJ, Camus M, Brun C, Burlet-Schiltz O, Russell S, Combier JP, Lilley KS, Plaza S. In Depth Exploration of the Alternative Proteome of Drosophila melanogaster. Front Cell Dev Biol 2022; 10:901351. [PMID: 35721519 PMCID: PMC9204603 DOI: 10.3389/fcell.2022.901351] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/21/2022] [Accepted: 04/25/2022] [Indexed: 12/13/2022] Open
Abstract
Recent studies have shown that hundreds of small proteins were overlooked when protein-coding genes were annotated. These proteins, called alternative proteins, escaped annotation notably because of the short length of their open reading frames (fewer than 100 codons) or the enforced rule that messenger RNAs (mRNAs) are monocistronic. Several alternative proteins were shown to be biologically active molecules and seem to be involved in a wide range of biological functions. However, genome-wide exploration of the alternative proteome is still limited to a few species. In the present article, we describe a deep peptidomics workflow which enabled the identification of 401 alternative proteins in Drosophila melanogaster. Subcellular localization, protein domains, and short linear motifs were predicted for 235 of the alternative proteins identified and point toward specific functions of these small proteins. Several alternative proteins had estimated abundances higher than those of their canonical counterparts, suggesting that these alternative proteins are actually the main products of their corresponding genes. Finally, we observed 14 alternative proteins with developmentally regulated expression patterns and 10 induced upon the heat-shock treatment of embryos, demonstrating stage- or stress-specific production of alternative proteins.
Affiliation(s)
- Bertrand Fabre
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France; Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom; *Correspondence: Bertrand Fabre; Serge Plaza
- Sebastien A. Choteau
- Aix-Marseille Université, INSERM, TAGC, Turing Centre for Living Systems, Marseille, France
- Carine Duboé
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
- Carole Pichereaux
- Fédération de Recherche (FR3450), Agrobiosciences, Interactions et Biodiversité (AIB), CNRS, Toulouse, France; Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
- Audrey Montigny
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
- Dagmara Korona
- Cambridge Systems Biology Centre and Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Michael J. Deery
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Mylène Camus
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
- Christine Brun
- Aix-Marseille Université, INSERM, TAGC, Turing Centre for Living Systems, Marseille, France; CNRS, Marseille, France
- Odile Burlet-Schiltz
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
- Steven Russell
- Cambridge Systems Biology Centre and Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Jean-Philippe Combier
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
- Kathryn S. Lilley
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Serge Plaza
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France; *Correspondence: Bertrand Fabre; Serge Plaza