1. Zhang T, Li Z, Li J, Peng Y. Small open reading frame-encoded microproteins in cancer: identification, biological functions and clinical significance. Mol Cancer 2025; 24:105. [PMID: 40170020 PMCID: PMC11963466 DOI: 10.1186/s12943-025-02278-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2025] [Accepted: 02/24/2025] [Indexed: 04/03/2025]
Abstract
The human genome harbors approximately twenty thousand protein-coding genes, and a significant portion of life science research focuses on elucidating their functions and the underlying mechanisms. Recent studies have revealed that small open reading frames (sORFs), originating from non-coding RNAs or the 5' leader sequences of messenger RNAs, can be translated into small peptides called microproteins through cap-dependent or cap-independent mechanisms. These microproteins interact with diverse molecular partners to modulate gene expression at multiple regulatory levels, thereby playing critical roles in various biological processes. Notably, sORF-encoded microproteins exhibit aberrant expression patterns in cancer and are implicated in tumor initiation and progression, expanding our understanding of cancer biology. In this review, we introduce the translational mechanisms and identification methods of microproteins, summarize their dysregulation in cancer and their biological functions in regulating gene expression, and emphasize their roles in driving hallmark events of cancer. Furthermore, we discuss their clinical significance as diagnostic and prognostic biomarkers, as well as therapeutic targets.
Affiliation(s)
- Tingting Zhang
  - Center for Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Zhang Li
  - Center for Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Jiao Li
  - Center for Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Yong Peng
  - Center for Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
2. Xu C, Yu F, Xue M, Huang Z, Jiang N, Li Y, Meng Y, Liu W, Zheng Y, Fan Y, Zhou Y. Proteogenomic analysis of Cyprinid herpesvirus 2 using high-resolution mass spectrometry. J Virol 2025:e0196024. [PMID: 40172206 DOI: 10.1128/jvi.01960-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Accepted: 02/10/2025] [Indexed: 04/04/2025]
Abstract
Cyprinid herpesvirus 2 (CyHV-2) is the main pathogen responsible for the development of herpesviral hematopoietic necrosis disease (HVHND) in crucian carp (Carassius auratus). The CyHV-2 genome encodes approximately 150 genes that are expressed in a well-defined manner during productive infection. However, CyHV-2 open reading frames (ORFs) are primarily derived from sequence and homology analyses, and most lack protein-level evidence to support their properties. In this study, we used high-resolution mass spectrometry followed by proteogenomic mapping to achieve genome re-annotation of CyHV-2. Based on our results, a total of 1,683 MS/MS spectra could be mapped to the CyHV-2 genome through six-frame translation, with 1,665 corresponding to 117 currently annotated protein-coding ORFs. Three of the remaining 18 peptides were mapped to the N-terminal extension region of known ORFs. In addition, 12 novel CyHV-2 ORFs, designated nORF1-12, were identified and characterized for the first time based on the remaining 15 peptides, which mapped to previously unannotated regions of the viral genome. Sequence differences in the novel phosphorylated nORF1, also referred to as ORF25E, among CyHV-2 strains indicate that nORF1 is a prospective molecular marker for monitoring the evolution from the Japan (J) to the China (C) genotype of CyHV-2. These findings further validate existing annotations, expand the genomic landscape of CyHV-2, and provide a rich resource for aquatic virology research.
IMPORTANCE: CyHV-2 is a viral pathogen that poses a significant threat to crucian carp farming. CyHV-2 has a large genome with complex sequence features and diverse coding mechanisms, which complicates accurate genome annotation in the absence of protein-level evidence. Here, we employed various protein extraction and separation methods to increase viral protein coverage and performed an integrated proteogenomic analysis to refine the CyHV-2 genome annotation. A total of 129 viral genes were confidently identified, including 117 currently annotated genes and 12 novel genes. For the first time, we present large-scale evidence of peptide presence and levels in the genome of aquatic viruses and confirm the majority of the predicted proteins in CyHV-2. Our findings enhance the understanding of the CyHV-2 genome structure and provide valuable insights for future studies on CyHV-2 biology.
Affiliation(s)
- Chen Xu
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Fangxing Yu
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
  - College of Life Sciences, Shanghai Normal University, Shanghai, China
- Mingyang Xue
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Zhenyu Huang
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Nan Jiang
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Yiqun Li
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Yan Meng
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Wenzhi Liu
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Ya Zheng
  - College of Life Sciences, Shanghai Normal University, Shanghai, China
- Yuding Fan
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
- Yong Zhou
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, China
3. Zhang Y, Yang Y, Li K, Chen L, Yang Y, Yang C, Xie Z, Wang H, Zhao Q. Enhanced Discovery of Alternative Proteins (AltProts) in Mouse Cardiac Development Using Data-Independent Acquisition (DIA) Proteomics. Anal Chem 2025; 97:1517-1527. [PMID: 39813267 PMCID: PMC11781309 DOI: 10.1021/acs.analchem.4c02924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 11/27/2024] [Accepted: 11/27/2024] [Indexed: 01/18/2025]
Abstract
Alternative proteins (AltProts) are a class of proteins encoded by DNA sequences previously classified as noncoding. Although they were historically overlooked, recent studies have highlighted their widespread presence and distinctive biological roles. So far, direct detection of AltProts has relied on data-dependent acquisition (DDA) mass spectrometry (MS). However, data-independent acquisition (DIA) MS, a method that is rapidly gaining popularity for the analysis of canonical proteins, has seen limited application in AltProt research, largely due to the complexities involved in constructing DIA libraries. In this study, we present a novel DIA workflow that leverages a fragmentation spectra predictor for the efficient construction of DIA libraries, significantly enhancing the detection of AltProts. Our method achieved a 2-fold increase in the identification of AltProts and a 50% reduction in missing values compared to DDA. We conducted a comprehensive comparison of four AltProt databases, four DIA-library construction strategies, and three analytical software tools to establish an optimal workflow for AltProt analysis. Utilizing this workflow, we investigated the mouse heart development process and identified over 50 AltProts with differential expression between embryonic and adult heart tissues. Over 30 unannotated mouse AltProts were validated, including ASDURF, which played a crucial role in cardiac development. Our findings not only provide a practical workflow for MS-based AltProt analysis but also reveal novel AltProts with potential significance in biological functions.
Affiliation(s)
- Yuanliang Zhang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Ying Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Kecheng Li
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Lei Chen
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Yang Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Chenxi Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Hongwei Wang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Qian Zhao
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
4. Naidu P, Holford M. Microscopic marvels: Decoding the role of micropeptides in innate immunity. Immunology 2024; 173:605-621. [PMID: 39188052 DOI: 10.1111/imm.13850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/30/2024] [Indexed: 08/28/2024]
Abstract
The innate immune response is under selection pressures from changing environments and pathogens. While sequence evolution can be studied by comparing rates of amino acid mutations within and between species, how a gene's birth and death contribute to the evolution of immunity is less known. Short open reading frames, once regarded as untranslated or transcriptional noise, can often produce micropeptides of <100 amino acids with a wide array of biological functions. Some micropeptide sequences are well conserved, whereas others have no evolutionary conservation, potentially representing new functional compounds that arise from species-specific adaptations. To date, few reports have described the discovery of novel micropeptides of the innate immune system. The diversity of immune-related micropeptides is a blind spot for gene and functional annotation. Immune-related micropeptides represent a potential reservoir of untapped compounds for understanding and treating disease. This review consolidates what is currently known about the evolution and function of innate immune-related micropeptides to facilitate their investigation.
Affiliation(s)
- Praveena Naidu
  - Graduate Center, Programs in Biology, Biochemistry, Chemistry, City University of New York, New York, New York, USA
  - Department of Chemistry and Biochemistry, City University of New York, Hunter College, Belfer Research Building, New York, New York, USA
- Mandë Holford
  - Graduate Center, Programs in Biology, Biochemistry, Chemistry, City University of New York, New York, New York, USA
  - Department of Chemistry and Biochemistry, City University of New York, Hunter College, Belfer Research Building, New York, New York, USA
  - American Museum of Natural History, Invertebrate Zoology, Sackler Institute for Comparative Genomics, New York, New York, USA
  - Weill Cornell Medicine, Department of Biochemistry, New York, New York, USA
5. Peng M, Wang T, Li Y, Zhang Z, Wan C. Mapping Start Codons of Small Open Reading Frames by N-Terminomics Approach. Mol Cell Proteomics 2024; 23:100860. [PMID: 39419446 PMCID: PMC11602987 DOI: 10.1016/j.mcpro.2024.100860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 10/04/2024] [Accepted: 10/13/2024] [Indexed: 10/19/2024]
Abstract
sORF-encoded peptides (SEPs) are proteins of fewer than 100 amino acids encoded by small open reading frames (sORFs), and they play an important role in various life activities. Analysis of known SEPs showed that non-canonical initiation codons are more common among SEPs. However, the current analysis of SEP sequences mainly relies on bioinformatics prediction, and most predictions use AUG as the start site, which may not be completely correct for SEPs. Chemical labeling was used to systematically analyze the N-terminal sequences of SEPs to accurately define the start sites of SEPs. By comparison, we found that dimethylation and guanidinylation are more efficient than acetylation. The ACN precipitation and heating precipitation performed better in SEP enrichment. As an N-terminal peptide enrichment material, Hexadhexaldehyde was superior to CNBr-activated agarose and NHS-activated agarose. Combining these methods, we identified 128 SEPs with 131 N-terminal sequences. Among them, two-thirds are novel N-terminal sequences, and most of them start from the 11th to 31st amino acids of the original sequence. Some of the novel N-termini were produced by proteolysis or signal peptide removal. Some SEPs' translation start sites were corrected to non-AUG start codons. One novel start codon was validated using GFP-tag vectors. These results demonstrated that chemical labeling approaches are beneficial for identifying the start codons of sORFs and the true N-termini of their encoded peptides, which helps to better characterize SEPs.
Affiliation(s)
- Mingbo Peng
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Tianjing Wang
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Yujie Li
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Zheng Zhang
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Cuihong Wan
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, People's Republic of China
6. Fan SM, Li ZQ, Zhang SZ, Chen LY, Wei XY, Liang J, Zhao XQ, Su C. Multi-integrated approach for unraveling small open reading frames potentially associated with secondary metabolism in Streptomyces. mSystems 2023; 8:e0024523. [PMID: 37712700 PMCID: PMC10654065 DOI: 10.1128/msystems.00245-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 07/20/2023] [Indexed: 09/16/2023]
Abstract
IMPORTANCE: Due to their small size and special chemical features, small open reading frame (smORF)-encoded peptides (SEPs) are often neglected. However, they may play critical roles in regulating gene expression, enzyme activity, and metabolite production. Studies on bacterial microproteins have mainly focused on pathogenic bacteria, so it is important to systematically investigate SEPs in streptomycetes, which are rich sources of bioactive secondary metabolites. Our study is the first to perform a global identification of smORFs in streptomycetes. We established a peptidogenomic workflow for non-model microbial strains and identified multiple novel smORFs that are potentially linked to secondary metabolism in streptomycetes. The multi-integrated approach used in this study helps improve both the quality and the quantity of detected smORFs. Ultimately, the workflow we established could be extended to other organisms and would benefit the genome mining of microproteins with critical functions for regulating and engineering useful microorganisms.
Affiliation(s)
- Si-Min Fan
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
- Ze-Qi Li
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
- Shi-Zhe Zhang
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
- Liang-Yu Chen
  - ProteinT (Tianjin) biotechnology Co. Ltd., Tianjin, China
- Xi-Ying Wei
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
- Jian Liang
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
  - College of Biology and Geography, Yili Normal University, Yining, China
- Xin-Qing Zhao
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Chun Su
  - National Engineering Laboratory for Resource Developing of Endangered Chinese Crude Drugs in Northwest China, College of Life Sciences, Shaanxi Normal University, Shaanxi, China
7. Dong X, Zhang K, Xun C, Chu T, Liang S, Zeng Y, Liu Z. Small Open Reading Frame-Encoded Micro-Peptides: An Emerging Protein World. Int J Mol Sci 2023; 24:10562. [PMID: 37445739 DOI: 10.3390/ijms241310562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/13/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023]
Abstract
Small open reading frames (sORFs) are often overlooked features in genomes. In the past, they were labeled as noncoding or "transcriptional noise". However, accumulating evidence from recent years suggests that sORFs may be transcribed and translated to produce sORF-encoded polypeptides (SEPs) of fewer than 100 amino acids. The vigorous development of computational algorithms, ribosome profiling, and peptidomics has facilitated the prediction and identification of many new SEPs. These SEPs were revealed to be involved in a wide range of basic biological processes, such as gene expression regulation, embryonic development, cellular metabolism, inflammation, and even carcinogenesis. To effectively understand the potential biological functions of SEPs, we discuss the history and development of the newly emerging research on sORFs and SEPs. In particular, we review a range of recently developed bioinformatics tools for identifying, predicting, and validating SEPs as well as a variety of biochemical experiments for characterizing SEP functions. Lastly, this review underlines the challenges and future directions in identifying and validating sORFs and their encoded micropeptides, providing a significant reference for upcoming research on sORF-encoded peptides.
Affiliation(s)
- Xiaoping Dong
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Kun Zhang
  - The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
- Chengfeng Xun
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Tianqi Chu
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Songping Liang
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Yong Zeng
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
  - The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
- Zhonghua Liu
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
8. Hassel KR, Brito-Estrada O, Makarewich CA. Microproteins: Overlooked regulators of physiology and disease. iScience 2023; 26:106781. [PMID: 37213226 PMCID: PMC10199267 DOI: 10.1016/j.isci.2023.106781] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 05/23/2023]
Abstract
Ongoing efforts to generate a complete and accurate annotation of the genome have revealed a significant blind spot for small proteins (<100 amino acids) originating from short open reading frames (sORFs). The recent discovery of numerous sORF-encoded proteins, termed microproteins, that play diverse roles in critical cellular processes has ignited the field of microprotein biology. Large-scale efforts are currently underway to identify sORF-encoded microproteins in diverse cell types and tissues, and specialized methods and tools have been developed to aid in their discovery, validation, and functional characterization. Microproteins that have been identified thus far play important roles in fundamental processes including ion transport, oxidative phosphorylation, and stress signaling. In this review, we discuss the optimized tools available for microprotein discovery and validation, summarize the biological functions of numerous microproteins, outline the promise for developing microproteins as therapeutic targets, and look forward to the future of the field of microprotein biology.
Affiliation(s)
- Keira R. Hassel
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Omar Brito-Estrada
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Catherine A. Makarewich
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
9. Wang Z, Cui Q, Su C, Zhao S, Wang R, Wang Z, Meng J, Luan Y. Unveiling the secrets of non-coding RNA-encoded peptides in plants: A comprehensive review of mining methods and research progress. Int J Biol Macromol 2023:124952. [PMID: 37257526 DOI: 10.1016/j.ijbiomac.2023.124952] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/22/2023] [Revised: 05/15/2023] [Accepted: 05/16/2023] [Indexed: 06/02/2023]
Abstract
Non-coding RNAs (ncRNAs) are not conventionally involved in protein encoding. However, recent findings indicate that ncRNAs possess the capacity to code for proteins or peptides. These ncRNA-encoded peptides (ncPEPs) are vital for diverse plant life processes and exhibit significant potential value. Despite their importance, research on plant ncPEPs is limited, with only a few studies conducted and less information on the underlying mechanisms, and the field remains in its nascent stage. This manuscript provides a comprehensive overview of ncPEPs mining methods in plants, focusing on prediction, identification, and functional analysis. We discuss the strengths and weaknesses of various techniques, identify future research directions in the ncPEPs domain, and elucidate the biological functions and agricultural application prospects of plant ncPEPs. By highlighting the immense potential and research value of ncPEPs, we aim to lay a solid foundation for more in-depth studies in plant science.
Affiliation(s)
- Zhengjie Wang
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Qi Cui
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Chenglin Su
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Siyuan Zhao
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Ruiming Wang
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Zhicheng Wang
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Jun Meng
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Yushi Luan
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
10. Cassidy L, Kaulich PT, Tholey A. Proteoforms expand the world of microproteins and short open reading frame-encoded peptides. iScience 2023; 26:106069. [PMID: 36818287 PMCID: PMC9929600 DOI: 10.1016/j.isci.2023.106069] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 01/28/2023]
Abstract
Microproteins and short open reading frame-encoded peptides (SEPs) can, like all proteins, carry numerous posttranslational modifications. Together with posttranscriptional processes, this leads to a high number of possible distinct protein molecules, the proteoforms, out of a limited number of genes. The identification, quantification, and molecular characterization of proteoforms pose special challenges to established analytical approaches, which are mainly based on bottom-up proteomics (BUP). While BUP methods are powerful, proteins have to be inferred rather than directly identified, which hampers the detection of proteoforms. An alternative approach is top-down proteomics (TDP), which allows the identification of intact proteoforms. This perspective article provides a brief overview of modified microproteins and SEPs, introduces the proteoform terminology, and compares present BUP and TDP workflows, highlighting their major advantages and caveats. Necessary future developments in TDP to fully realize its potential for proteoform-centric analytics of microproteins and SEPs will be discussed.
Affiliation(s)
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Philipp T. Kaulich
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Andreas Tholey (corresponding author)
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
11. Employing non-targeted interactomics approach and subcellular fractionation to increase our understanding of the ghost proteome. iScience 2023; 26:105943. [PMID: 36866041 PMCID: PMC9971881 DOI: 10.1016/j.isci.2023.105943] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/12/2022] [Revised: 11/07/2022] [Accepted: 01/04/2023] [Indexed: 01/09/2023]
Abstract
Eukaryotic mRNA has long been considered monocistronic, but alternative proteins (AltProts) now challenge this tenet. The alternative, or ghost, proteome has largely been neglected, as has the involvement of AltProts in biological processes. Here, we used subcellular fractionation to increase the information about AltProts and facilitate the detection of protein-protein interactions by the identification of crosslinked peptides. In total, 112 unique AltProts were identified, and we were able to identify 220 crosslinks without peptide enrichment. Among these, 16 crosslinks between AltProts and Referenced Proteins (RefProts) were identified. We further focused on specific examples such as the interaction between IP_2292176 (AltFAM227B) and HLA-B, in which this protein could be a potential new immunopeptide, and the interactions between HIST1H4F and several AltProts, which may play a role in mRNA transcription. By studying the interactome and the localization of AltProts, we can reveal more of the importance of the ghost proteome.
12. Vikram, Mishra V, Rana A, Ahire JJ. Riboswitch-mediated regulation of riboflavin biosynthesis genes in prokaryotes. 3 Biotech 2022; 12:278. [PMID: 36275359 PMCID: PMC9474784 DOI: 10.1007/s13205-022-03348-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Accepted: 09/02/2022] [Indexed: 11/01/2022]
Abstract
Prokaryotic organisms frequently use riboswitches to quantify intracellular metabolite concentrations via high-affinity metabolite receptors. Riboswitches possess a metabolite-sensing system that regulates gene expression in a cis-acting fashion at the level of transcription or translation initiation by binding a specific metabolite, thereby controlling various biochemical pathways. The FMN riboswitch binds flavin mononucleotide (FMN), a phosphorylated form of riboflavin, and controls the expression of genes involved in the riboflavin biosynthesis and transport pathways. The first step of the riboflavin biosynthesis pathway is the conversion of guanosine triphosphate (GTP), which is an intermediate of the purine biosynthesis pathway. An alternative branch from the pentose phosphate pathway involves the enzymatic conversion of ribulose-5-phosphate (Ribu5P) into 3,4-dihydroxy-2-butanone-4-phosphate (DHBP) by DHBP synthase. The product of ribAB interferes with both GTP cyclohydrolase II and DHBP synthase activities, which catalyze the cleavage of GTP and the conversion of Ribu5P to DHBP in the initial steps of the two riboflavin biosynthesis branches. Riboswitches are located in the 5' untranslated region (5' UTR) of messenger RNAs and contain an aptamer domain (highly conserved in sequence), where metabolite binding leads to a conformational change that modulates the expression of genes located on the bacterial mRNA. In this review, we focus on how the riboswitch regulates the riboflavin biosynthesis pathway in Bacillus subtilis and Lactobacillus plantarum.
Affiliation(s)
- Vikram
  - Department of Basic and Applied Sciences, National Institute of Food Technology, Entrepreneurship and Management (NIFTEM), Sonipat, Haryana, India
- Vijendra Mishra
  - Department of Basic and Applied Sciences, National Institute of Food Technology, Entrepreneurship and Management (NIFTEM), Sonipat, Haryana, India
- Ananya Rana
  - Department of Basic and Applied Sciences, National Institute of Food Technology, Entrepreneurship and Management (NIFTEM), Sonipat, Haryana, India
- Jayesh J. Ahire
  - Centre for Research and Development, Unique Biotech Ltd., Plot No. 2, Phase II, MN Park, Hyderabad, Telangana, India
|
13
|
Identification and analysis of smORFs in Chlamydomonas reinhardtii. Genomics 2022; 114:110444. [PMID: 35933072 DOI: 10.1016/j.ygeno.2022.110444] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/22/2022] [Revised: 07/06/2022] [Accepted: 07/31/2022] [Indexed: 11/24/2022]
Abstract
Small open reading frames (smORFs) have been acknowledged as important contributors to organism functions, from bacteria to higher eukaryotes. However, smORFs in green algae remain little investigated, despite their importance in ecology and evolution. We applied bioinformatic analysis, ribosome profiling, and small peptide proteomics to provide a genome-wide, high-confidence smORF database for the model green alga Chlamydomonas reinhardtii. The whole genome was first screened to mine potential coding smORFs. Conservation analysis, ribosome profiling, and proteomics data were then processed to identify conserved smORFs and generate translation evidence. The combination of procedures resulted in 2014 smORFs that might exist in the C. reinhardtii genome. The expression of smORFs under Cd treatment suggested that two smORFs might participate in redox reactions, three in inorganic phosphate transport, and one in DNA repair under stress. Our study built a genome-wide database for C. reinhardtii, providing target smORFs for further research.
|
14
|
Zhang Z, Li Y, Yuan W, Wang Z, Wan C. Proteomic-driven identification of short open reading frame-encoded peptides. Proteomics 2022; 22:e2100312. [PMID: 35384297 DOI: 10.1002/pmic.202100312] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/25/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 11/10/2022]
Abstract
Accumulating evidence has shown that a large number of short open reading frames (sORFs) also have the ability to encode proteins. The discovery of sORFs opens up a new research area, leading to the identification and functional study of sORF-encoded peptides (SEPs) at the omics level. Besides bioinformatics prediction and ribosome profiling, mass spectrometry (MS) has become a significant tool because it directly detects the sequence of SEPs. Although MS-based proteomics methods have proved effective for the qualitative and quantitative analysis of SEPs, detecting SEPs remains a great challenge due to their low abundance and short sequences. To illustrate the progress in method development, we describe and discuss the main steps of large-scale proteomic identification of SEPs, including SEP extraction and enrichment, MS detection, data processing and quality control, quantification, and function prediction and validation methods.
Affiliation(s)
- Zheng Zhang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Yujie Li
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Wenqian Yuan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Zhiwei Wang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Cuihong Wan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
|
15
|
Wang Z, Pan N, Yan J, Wan J, Wan C. Systematic Identification of Microproteins during the Development of Drosophila melanogaster. J Proteome Res 2022; 21:1114-1123. [PMID: 35227063 DOI: 10.1021/acs.jproteome.2c00004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
Short open reading frame-encoded peptides (SEPs) are microproteins with fewer than 100 amino acids that play an essential role in the growth and development of organisms. Drosophila melanogaster has many short open reading frames that potentially encode polypeptides. We chose 11 time points during the life cycle of Drosophila to investigate microproteins, particularly those related to development. We identified a total of 410 microproteins, of which 27 were noncoding RNA-encoded proteins. Of the 410 microproteins, 74 were expressed in all stages from embryo to adult, whereas 300 microproteins were found at only one or two time points. Approximately one-third of the microproteins had not been reported previously, and 44 were obtained from de novo sequencing and validated by synthetic peptides. These microproteins are related to the main bioprocesses of growth and development, such as multicellular organism reproduction, postmating behavior, and oviposition. Over half of the microproteins have predicted functional domains and are conserved across species, suggesting that these microproteins have critical functions in fly development. This work enriches the D. melanogaster proteome and provides a significant data resource for growth and development research.
Affiliation(s)
- Zhiwei Wang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Ni Pan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Jiahao Yan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Jian Wan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Cuihong Wan
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
|
16
|
Ahrens CH, Wade JT, Champion MM, Langer JD. A Practical Guide to Small Protein Discovery and Characterization Using Mass Spectrometry. J Bacteriol 2022; 204:e0035321. [PMID: 34748388 PMCID: PMC8765459 DOI: 10.1128/jb.00353-21] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 11/30/2022] Open
Abstract
Small proteins of up to ∼50 amino acids are an abundant class of biomolecules across all domains of life. Yet due to the challenges inherent in their size, they are often missed in genome annotations, and are difficult to identify and characterize using standard experimental approaches. Consequently, we still know few small proteins even in well-studied prokaryotic model organisms. Mass spectrometry (MS) has great potential for the discovery, validation, and functional characterization of small proteins. However, standard MS approaches are poorly suited to the identification of both known and novel small proteins due to limitations at each step of a typical proteomics workflow, i.e., sample preparation, protease digestion, liquid chromatography, MS data acquisition, and data analysis. Here, we outline the major MS-based workflows and bioinformatic pipelines used for small protein discovery and validation. Special emphasis is placed on highlighting the adjustments required to improve detection and data quality for small proteins. We discuss both the unbiased detection of small proteins and the targeted analysis of small proteins of interest. Finally, we provide guidelines to prioritize novel small proteins, and an outlook on methods with particular potential to further improve comprehensive discovery and characterization of small proteins.
Affiliation(s)
- Christian H. Ahrens
  - Agroscope, Method Development and Analytics & SIB Swiss Institute of Bioinformatics, Wädenswil, Switzerland
- Joseph T. Wade
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
  - Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
- Matthew M. Champion
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, USA
- Julian D. Langer
  - Mass Spectrometry and Proteomics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
  - Proteomics, Max Planck Institute for Brain Research, Frankfurt am Main, Germany
|
17
|
Robson B. Computers and preventative diagnosis. A survey with bioinformatics examples of mitochondrial small open reading frame peptides as portents of a new generation of powerful biomarkers. Comput Biol Med 2022; 140:105116. [PMID: 34896883 DOI: 10.1016/j.compbiomed.2021.105116] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/09/2021] [Accepted: 12/02/2021] [Indexed: 12/27/2022]
Abstract
The present brief survey aims to alert developers in data mining, machine learning, inference methods, and other approaches related to diagnostic, predictive, and risk assessment medicine to a relatively new class of bioactive messaging peptides in which there is escalating interest. They provide patterns of communication and cross-chatter about states of health and disease within and, importantly, between cells (they also appear extracellularly in biological fluids). This chatter needs to be analyzed somewhat in the manner of the decryption of the Enigma code in the Second World War. It could lead not only to improved diagnosis but also to predictive diagnosis, prediction of organ failure, and preventative medicine. This involves peptide products of short reading frames that have previously been somewhat neglected as unlikely gene products, with probably many examples in nuclear DNA, but certainly several known in mitochondrial DNA. A great deal of knowledge is now becoming available about the latter, and it is believed that the mRNA can be translated by both the standard cytosolic and the mitochondrial genetic codes, resulting in different peptides. This adds a further level of complexity to the applications of bioinformatics and computational biology, but a higher level of detail and sophistication to preventative diagnosis. The code to crack could be sophisticated and combinatorially complex to analyze by computers. Mitochondria may have combined with proto-eukaryotic cells some 2 billion years ago, only about a 7th of the age of the universe. Cells appeared some 2 billion years before that, also with possible signaling based on similar ideas. This makes life small in space but huge in time, refinement of which centrally involves these signaling processes.
Affiliation(s)
- Barry Robson
  - Ingine Inc., Virginia, USA, and the Dirac Foundation, Oxfordshire, UK
|
18
|
Chen L, Yang Y, Zhang Y, Li K, Cai H, Wang H, Zhao Q. The Small Open Reading Frame-Encoded Peptides: Advances in Methodologies and Functional Studies. Chembiochem 2021; 23:e202100534. [PMID: 34862721 DOI: 10.1002/cbic.202100534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 11/15/2021] [Indexed: 11/07/2022]
Abstract
Small open reading frames (sORFs) are an important class of genes with fewer than 100 codons. They were historically annotated as noncoding or even junk sequences. In recent years, accumulating evidence has suggested that sORFs could encode a considerable number of polypeptides, many of which play important roles in both physiology and disease pathology. However, it has been technically challenging to directly detect sORF-encoded peptides (SEPs). Here, we discuss the latest advances in methodologies for identifying SEPs with mass spectrometry, as well as the progress in functional studies of SEPs.
Affiliation(s)
- Lei Chen
  - State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
  - Laboratory for Synthetic Chemistry and Chemical Biology Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong SAR, 999077, P. R. China
- Ying Yang
  - State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Yuanliang Zhang
  - State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Kecheng Li
  - State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Hongmin Cai
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510623, P. R. China
- Hongwei Wang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510623, P. R. China
- Qian Zhao
  - State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
|
19
|
Pan N, Wang Z, Wang B, Wan J, Wan C. Mapping Microproteins and ncRNA-Encoded Polypeptides in Different Mouse Tissues. Front Cell Dev Biol 2021; 9:687748. [PMID: 34381774 PMCID: PMC8350139 DOI: 10.3389/fcell.2021.687748] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/30/2021] [Accepted: 06/30/2021] [Indexed: 12/30/2022] Open
Abstract
Small open reading frame-encoded peptides (SEPs), also called microproteins, play a vital role in biological processes. Many of their open reading frames are located within non-coding RNAs (ncRNAs). Recent research has demonstrated that ncRNA-encoded polypeptides have essential functions and exist ubiquitously in various tissues. To better understand the role of microproteins, especially ncRNA-encoded proteins, expressed in different tissues, we characterized the proteomes of five mouse tissues by mass spectrometry, using bottom-up, top-down, and de novo sequencing strategies. Bottom-up and top-down approaches with database-dependent searches identified 811 microproteins in the OpenProt database. De novo sequencing identified 290 microproteins, including 12 ncRNA-encoded microproteins that were not found in current databases. In this study, we discovered 1,074 microproteins in total, including 270 ncRNA-encoded microproteins. From the annotation of these microproteins, we found that the brain contains the largest number of neuropeptides, while the spleen contains the most immune-associated microproteins. This suggests that microproteins in different tissues have tissue-specific functions. The unannotated ncRNA-encoded microproteins have predicted domains, such as the macrophage migration inhibitory factor domain and the Prefoldin domain. These results expand the mouse proteome and provide insight into the molecular biology of mouse tissues.
Affiliation(s)
- Ni Pan
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, China
- Zhiwei Wang
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, China
- Bing Wang
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, China
- Jian Wan
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, China
- Cuihong Wan
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, China
|