1
Coughlin TM, Makarewich CA. Emerging roles for microproteins as critical regulators of endoplasmic reticulum function and cellular homeostasis. Semin Cell Dev Biol 2025; 170:103608. [PMID: 40245464 PMCID: PMC12065929 DOI: 10.1016/j.semcdb.2025.103608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Revised: 02/20/2025] [Accepted: 04/04/2025] [Indexed: 04/19/2025]
Abstract
The endoplasmic reticulum (ER) is a multifunctional organelle essential for key cellular processes including protein synthesis, calcium homeostasis, and the cellular stress response. It is composed of distinct domains, such as the rough and smooth ER, as well as membrane regions that facilitate direct communication with other organelles, enabling its diverse functions. While many well-characterized ER proteins contribute to these processes, recent studies have revealed a previously underappreciated class of small proteins that play critical regulatory roles. Microproteins, typically under 100 amino acids in length, were historically overlooked due to size-based biases in genome annotation and often misannotated as noncoding RNAs. Advances in ribosome profiling, mass spectrometry, and computational approaches have now enabled the discovery of numerous previously unrecognized microproteins, significantly expanding our understanding of the proteome. While some ER-associated microproteins, such as phospholamban and sarcolipin, were identified decades ago, newly discovered microproteins share similar fundamental characteristics, underscoring the need to refine our understanding of the coding potential of the genome. Molecular studies have demonstrated that ER microproteins play essential roles in calcium regulation, ER stress response, organelle communication, and protein translocation. Moreover, growing evidence suggests that ER microproteins contribute to cellular homeostasis and are implicated in disease processes, including cardiovascular disease and cancer. This review examines the shared and unique functions of ER microproteins, their implications for health and disease, and their potential as therapeutic targets for conditions associated with ER dysfunction.
Affiliation(s)
- Taylor M Coughlin
- The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Pathobiology and Molecular Medicine Graduate Program, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Catherine A Makarewich
- The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
2
Bae H, Nguyen CM, Ruiz-Orera J, Mills NL, Snyder MP, Jang C, Shah SH, Hübner N, Seldin M. Emerging Technologies and Future Directions in Interorgan Crosstalk Cardiometabolic Research. Circ Res 2025; 136:1494-1506. [PMID: 40403107 PMCID: PMC12101523 DOI: 10.1161/circresaha.125.325515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2025] [Revised: 04/04/2025] [Accepted: 04/15/2025] [Indexed: 05/24/2025]
Abstract
The heart does not work in isolation: cardiac health and disease arise through complex interactions between the heart and multiple other organs. Furthermore, the integration of organ-specific lipid metabolism, blood pressure, insulin sensitivity, and inflammation involves a complex network of signaling pathways between many organs. Dysregulation in these communications is now recognized as a key contributor to many manifestations of cardiovascular disease. Mechanistic characterization of specific molecules mediating interorgan signaling has been pivotal in advancing our understanding of cardiovascular disease. The discovery of insulin, glucagon, and other hormones in the early 20th century illustrated the importance of communication between organs in maintaining physiological homeostasis. For example, elegant studies evaluating insulin signaling and its role in regulating glucose metabolism have shed light on its broader impact on cardiovascular health, hypertension, atherosclerosis, and other cardiovascular disease risks. Recent technological advances have revolutionized our understanding of interorgan signaling. Global approaches such as proteomics and metabolomics applied to blood have enabled the simultaneous profiling of thousands of circulating factors, revealing previously unknown signaling molecules and pathways. These large-scale studies have identified biomarkers linked to early stages of heart disease and offered new therapeutic targets. By understanding how specific cells in the heart interact with cells in other organs, such as the kidney or liver, researchers can identify key pathways that, when disrupted, lead to cardiovascular pathology. The ability to capture a more holistic view of the cardiovascular system positions interorgan signaling at the forefront of cardiovascular research. As we continue to refine our tools for mapping these complex networks, the insights gained hold the potential to not only improve early diagnosis but also to develop more targeted and effective treatments for cardiovascular disease. In this review, we discuss current approaches used to enhance our understanding of organ crosstalk with a specific emphasis on cardiac and cardiovascular physiology.
Affiliation(s)
- Hosung Bae
- Department of Biological Chemistry and Center of Epigenetics and Metabolism, School of Medicine, University of California Irvine School of Medicine (H.B., C.M.N., C.J., M.S.)
- Christy M Nguyen
- Department of Biological Chemistry and Center of Epigenetics and Metabolism, School of Medicine, University of California Irvine School of Medicine (H.B., C.M.N., C.J., M.S.)
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany (J.R.-O., N.H.)
- Nicholas L Mills
- BHF Centre for Cardiovascular Science (N.L.M.), The University of Edinburgh, United Kingdom
- Usher Institute (N.L.M.), The University of Edinburgh, United Kingdom
- Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, CA (M.P.S.)
- Cholsoon Jang
- Department of Biological Chemistry and Center of Epigenetics and Metabolism, School of Medicine, University of California Irvine School of Medicine (H.B., C.M.N., C.J., M.S.)
- Svati H Shah
- Duke Center for Precision Health (S.H.S.), Duke University School of Medicine, Durham, NC
- Duke Molecular Physiology Institute (S.H.S.), Duke University School of Medicine, Durham, NC
- Norbert Hübner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany (J.R.-O., N.H.)
- German Center for Cardiovascular Research (DZHK), Partner Site Berlin, Germany (N.H.)
- Charité-Universitätsmedizin, Berlin, Germany (N.H.)
- Helmholtz Institute for Translational AngioCardioScience, MDC, Heidelberg University, Germany (N.H.)
- Marcus Seldin
- Department of Biological Chemistry and Center of Epigenetics and Metabolism, School of Medicine, University of California Irvine School of Medicine (H.B., C.M.N., C.J., M.S.)
3
Wan F, Torres MDT, Guan C, de la Fuente-Nunez C. Tutorial: guidelines for the use of machine learning methods to mine genomes and proteomes for antibiotic discovery. Nat Protoc 2025:10.1038/s41596-025-01144-w. [PMID: 40369233 DOI: 10.1038/s41596-025-01144-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 01/08/2025] [Indexed: 05/16/2025]
Abstract
Genomes and proteomes constitute a rich reservoir of molecular diversity. However, they have remained underexplored because of a lack of appropriate tools. In recent years, computational approaches have been developed to mine this unexplored biological information, or dark matter, accelerating the discovery of new antibiotic molecules. Such efforts have yielded a wide range of new molecules. These include peptides released via predicted proteolytic cleavage of larger proteins, termed 'encrypted peptides', which have been found to be widespread in nature. Molecules encoded by and translated from small open reading frames within genomic sequences have also been uncovered, further expanding the landscape of bioactive compounds. Here, we discuss computational approaches, including machine learning and artificial intelligence (AI) tools, which have been used to date to identify antimicrobial compounds, with a special emphasis on peptides. We also propose potential avenues for future exploration in this rapidly evolving field. Moreover, we provide an overview of the experimental methods commonly used to validate these computational predictions. We anticipate that efforts combining cutting-edge AI and experimental approaches for biological sequence mining will reveal new insights into host immunity and continue to accelerate discoveries in the fields of antibiotics and infectious diseases.
Affiliation(s)
- Fangping Wan
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
- Marcelo D T Torres
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
- Changge Guan
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
- Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA.
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA.
4
Yang X, Ding A, Wu S, Jiang Z, Li Y, Huang X, Duan L, Cheng S, Zheng S, Gao S. FuHsi regulates rDNA transcription and promotes tumor progression. Sci Bull (Beijing) 2025:S2095-9273(25)00494-3. [PMID: 40374472 DOI: 10.1016/j.scib.2025.04.068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2025] [Revised: 04/16/2025] [Accepted: 04/30/2025] [Indexed: 05/17/2025]
Affiliation(s)
- Xiaohui Yang
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230000, China; Chinese Academy of Sciences (CAS) Key Laboratory of Biomedical Diagnostics, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, China; Zhongda Hospital, School of Life Sciences and Technology, Advanced Institute for Life and Health, Southeast University, Nanjing 210096, China
- Ao Ding
- School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Songzhe Wu
- Zhongda Hospital, School of Life Sciences and Technology, Advanced Institute for Life and Health, Southeast University, Nanjing 210096, China
- Ziyue Jiang
- Zhongda Hospital, School of Life Sciences and Technology, Advanced Institute for Life and Health, Southeast University, Nanjing 210096, China
- Yifei Li
- School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xianting Huang
- Department of Oncology, Jiangyin People's Hospital, Jiangyin 214400, China
- Liqiang Duan
- Shanxi Academy of Advanced Research and Innovation, Taiyuan 030032, China
- Shuwen Cheng
- Medical School Of Nanjing University, Nanjing 210046, China
- Shizhong Zheng
- School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Shan Gao
- Zhongda Hospital, School of Life Sciences and Technology, Advanced Institute for Life and Health, Southeast University, Nanjing 210096, China.
5
Agrawal A, Saghatelian A. Identification of microproteins with transactivation activity by polyalanine motif selection. RSC Chem Biol 2025; 6:800-808. [PMID: 40083654 PMCID: PMC11898273 DOI: 10.1039/d4cb00277f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2024] [Accepted: 02/26/2025] [Indexed: 03/16/2025]
Abstract
Microproteins are an emerging class of proteins that are encoded by small open reading frames (smORFs) less than or equal to 100 amino acids. The functions of several microproteins have been illuminated through phenotypic screening or protein-protein interaction studies, but thousands of microproteins remain uncharacterized. The functional characterization of microproteins is challenging due to a lack of sequence homology. Here, we demonstrate a strategy to enrich microproteins that contain specific motifs as a means to more rapidly characterize microproteins. Specifically, we used the fact that polyalanine motifs are associated with nuclear proteins to select 58 candidate microproteins to screen for transactivation function. We identified three microproteins with transactivation activity when tested as GAL4-fusions in a cell-based luciferase assay. The results support the continued use of the motif selection strategy for the discovery of microprotein function.
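The motif-selection step described in this abstract can be illustrated with a minimal sketch. It assumes that a polyalanine motif means a run of at least four consecutive alanine residues and that candidates arrive as plain amino acid strings; the run-length threshold, the function name, and the toy sequences are illustrative assumptions, not the authors' actual selection criteria.

```python
import re

MAX_MICROPROTEIN_LENGTH = 100  # smORF products of <= 100 amino acids, per the abstract
MIN_ALA_RUN = 4                # assumed threshold; the abstract does not state one


def select_polyalanine_candidates(sequences, min_run=MIN_ALA_RUN):
    """Return names of microproteins that contain a polyalanine run."""
    motif = re.compile("A{%d,}" % min_run)
    return [
        name
        for name, seq in sequences.items()
        if len(seq) <= MAX_MICROPROTEIN_LENGTH and motif.search(seq)
    ]


# Toy sequences, invented for illustration only.
toy = {
    "mp1": "MKTAAAAALLQ",     # contains a run of five alanines
    "mp2": "MKTALALAQ",       # alanines present but never consecutive
    "long": "M" + "A" * 120,  # longer than 100 residues, so not a microprotein
}
print(select_polyalanine_candidates(toy))  # prints ['mp1']
```

Candidates that pass such a filter would still need experimental testing, such as the GAL4-fusion luciferase assay the abstract mentions.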
Affiliation(s)
- Archita Agrawal
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies La Jolla CA USA
- Alan Saghatelian
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies La Jolla CA USA
6
Deshpande A, Mahale S, Kanduri C. Beyond the Transcript: Translating Non-Coding RNAs and Their Impact on Cellular Regulation. Cancers (Basel) 2025; 17:1555. [PMID: 40361481 PMCID: PMC12071610 DOI: 10.3390/cancers17091555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2025] [Revised: 04/30/2025] [Accepted: 05/02/2025] [Indexed: 05/15/2025]
Abstract
Non-coding RNAs (ncRNAs) constitute the majority of the human transcriptome and play diverse structural, catalytic, and regulatory roles. The ability of ncRNAs to be translated into functional peptides and microproteins expands our understanding of their regulatory potential beyond their established non-coding functions. Our comprehensive search identified 86 translating "non-coding" RNAs. While translating ncRNAs have traditionally been categorized as "peptide-encoding", in this study, we introduce a novel classification based on amino acid length, distinguishing their products as ncRNA encoded peptides (ncRNA-PEPs), which are less than 60 amino acids, or ncRNA encoded microproteins (ncRNA-MPs) ranging from 61 to 200 amino acids. These peptides and microproteins act as co-regulators in cell signaling, transcriptional regulation, and protein complex assembly, playing a role in both health and disease. We outline the molecular pathways by which ncRNA-PEPs and ncRNA-MPs could govern cell cycle progression, highlighting their influence on cell cycle transitions, oncogenic and tumor suppressor pathways, metabolic homeostasis, autophagy, and on key cell cycle regulators like PCNA, Rad18, and CDK-cyclin complexes. Furthermore, we highlight recent advancements in their detection and characterization, exploring their evolutionary origins, species-specific conservation, and potential therapeutic applications. Our findings underscore the emerging significance of ncRNA-PEPs and ncRNA-MPs as integral regulators of cellular processes, highlighting their functional versatility and opening promising avenues for further research and potential therapeutic applications.
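The length-based classification introduced in this abstract can be written down as a small sketch. The function name and the "unclassified" label are hypothetical. The abstract defines ncRNA-PEPs as shorter than 60 amino acids and ncRNA-MPs as 61 to 200 amino acids, so a product of exactly 60 residues (or more than 200) falls outside both stated ranges and is left unassigned here rather than guessed.

```python
def classify_ncrna_product(length_aa):
    """Label a translated ncRNA product by its length in amino acids."""
    if length_aa < 1:
        raise ValueError("length must be a positive number of amino acids")
    if length_aa < 60:
        return "ncRNA-PEP"  # peptides: fewer than 60 amino acids
    if 61 <= length_aa <= 200:
        return "ncRNA-MP"   # microproteins: 61 to 200 amino acids
    # Exactly 60 aa or more than 200 aa: not covered by the ranges in the abstract.
    return "unclassified"


for n in (45, 60, 132, 250):
    print(n, classify_ncrna_product(n))
```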
Affiliation(s)
- Chandrasekhar Kanduri
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, SE-40530 Gothenburg, Sweden; (A.D.); (S.M.)
7
Rajinikanth N, Chauhan R, Prabakaran S. Harnessing Noncanonical Proteins for Next-Generation Drug Discovery and Diagnosis. WIREs Mech Dis 2025; 17:e70001. [PMID: 40423871 DOI: 10.1002/wsbm.70001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2024] [Revised: 05/06/2025] [Accepted: 05/07/2025] [Indexed: 05/28/2025]
Abstract
Noncanonical proteins, encoded by previously overlooked genomic regions (part of the "dark genome"), are emerging as crucial players in human health and disease, expanding our understanding of the "dark proteome." This review explores their landscape, including proteins derived from long non-coding RNAs, circular RNAs, and alternative open reading frames. Recent advances in ribosome profiling, mass spectrometry, and proteogenomics have unveiled their involvement in critical cellular processes. We examine their roles in cancer, neurological disorders, cardiovascular diseases, and infectious diseases, highlighting their potential as novel biomarkers and therapeutic targets. The review addresses challenges in identifying and characterizing these proteins, particularly recently evolved ones, and discusses implications for drug discovery, including cancer immunotherapy and neoantigen sources. By synthesizing recent findings, we underscore the significance of noncanonical proteins in expanding our understanding of the human genome and proteome, and their promise in developing innovative diagnostic tools and targeted therapies. This overview aims to stimulate further research into this unexplored biological space, potentially revolutionizing approaches to disease treatment and personalized medicine.
Affiliation(s)
- Nachiket Rajinikanth
- University of Missouri Kansas City School of Medicine, Kansas City, Missouri, USA
- Sudhakaran Prabakaran
- NonExomics, Inc., Acton, Massachusetts, USA
- Northeastern University, Boston, Massachusetts, USA
8
Gadad SS, Camacho CV, Gong X, Thornton M, Malladi VS, Nagari A, Sundaresan A, Nandu T, Koul S, Peng Y, Kraus WL. X-Linked Cancer-Associated Polypeptide (XCP) from lncRNA1456 Cooperates with PHF8 to Regulate Gene Expression and Cellular Pathways in Breast Cancer. bioRxiv 2025:2025.03.21.644649. [PMID: 40196671 PMCID: PMC11974697 DOI: 10.1101/2025.03.21.644649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2025]
Abstract
Recent studies have demonstrated that a subset of long "noncoding" RNAs (lncRNAs) produce functional polypeptides and proteins. In this study, we discovered a 132 amino acid protein in human breast cancer cells named XCP (X-linked Cancer-associated Polypeptide), which is encoded by lncRNA1456 (a.k.a. RHOXF1P3), a transcript previously thought to be noncoding. lncRNA1456 is a pancreas- and testis-specific RNA whose gene is located on chromosome X. We found that the expression of lncRNA1456 and XCP is highly upregulated in the luminal A, luminal B, and HER2 molecular subtypes of breast cancer. XCP modulates both estrogen-dependent and estrogen-independent growth of breast cancer cells by regulating cancer pathways, as shown in cell and xenograft models. XCP shares some homology with homeodomain-containing proteins and interacts with the histone demethylase plant homeodomain finger protein 8 (PHF8), which is also encoded by an X-linked gene. Mechanistically, XCP stimulates the histone demethylase activity of PHF8 to regulate gene expression in breast cancer cells. These findings identify XCP as a coregulator of transcription and emphasize the need to interrogate the potential functional roles of open reading frames originating from noncoding RNAs.
Affiliation(s)
- Shrikanth S. Gadad
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Current address: Center of Emphasis in Cancer, Paul L. Foster School of Medicine, Department of Biomedical Sciences, Texas Tech University Health Sciences Center, El Paso, TX 79905, USA
- These authors contributed equally to this work
- Cristel V. Camacho
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- These authors contributed equally to this work
- Xuan Gong
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Current address: Department of Bone Marrow Transplantation and Cellular Therapy, St. Jude Children’s Research Hospital, Memphis, TN 38105
- Micah Thornton
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Venkat S. Malladi
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Anusha Nagari
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Aishwarya Sundaresan
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Tulip Nandu
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Sneh Koul
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Computational Core Facility, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Yan Peng
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- W. Lee Kraus
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
9
Danner M, Begemann M, Kraft F, Elbracht M, Kurth I, Krause J. Mutational constraint analysis workflow for overlapping short open reading frames and genomic neighbors. BMC Genomics 2025; 26:254. [PMID: 40087590 PMCID: PMC11909976 DOI: 10.1186/s12864-025-11444-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Accepted: 03/04/2025] [Indexed: 03/17/2025]
Abstract
Understanding the dark genome is a priority task following the complete sequencing of the human genome. Short open reading frames (sORFs) are a group of largely unexplored elements of the dark genome with the potential for being translated into microproteins. The definitive number of coding and regulatory sORFs is not known; however, they could account for up to 1-2% of the human genome. This corresponds to an order of magnitude in the range of canonical coding genes. For a few sORFs a clinical relevance has already been demonstrated, but for the majority of potential sORFs the biological function remains unclear. A major limitation in predicting their disease relevance using large-scale genomic data is the fact that no population-level constraint metrics for genetic variants in sORFs are yet available. To overcome this, we used the recently released gnomAD 4.0 dataset and analyzed the constraint of a consensus set of sORFs and their genomic neighbors. We demonstrate that sORFs are mostly embedded into a moderately constrained genomic context, but within the GENCODE dataset we identified a subset of highly constrained sORFs comparable to highly constrained canonical genes.
Affiliation(s)
- Martin Danner
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany
- Scieneers GmbH, Kantstraße 1a, 76137, Karlsruhe, Baden-Wuerttemberg, Germany
- Matthias Begemann
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany
- Florian Kraft
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany
- Miriam Elbracht
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany
- Ingo Kurth
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany
- Jeremias Krause
- Institute for Human Genetics and Genomic Medicine Medical Faculty, RWTH Aachen University Hospital, Pauwelsstrasse 30, D-52074, Aachen, North-Rhine-Westphalia, Germany.
10
Ruiz-Orera J, Hübner N. The non-canonical proteome: a novel contributor to cancer proliferation. Cell Res 2025; 35:155-156. [PMID: 39806169 PMCID: PMC11909265 DOI: 10.1038/s41422-024-01069-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Affiliation(s)
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Norbert Hübner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany.
- DZHK (German Center for Cardiovascular Research), Partner Site Berlin, Berlin, Germany.
- Charité-Universitätsmedizin, Berlin, Germany.
- Helmholtz Institute for Translational AngioCardioScience (HI-TAC) of the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) at Heidelberg University, Heidelberg, Germany.
11
Khanduja A, Mohanty D. SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes. NAR Genom Bioinform 2025; 7:lqae186. [PMID: 39781515 PMCID: PMC11704790 DOI: 10.1093/nargab/lqae186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 11/07/2024] [Accepted: 12/17/2024] [Indexed: 01/12/2025]
Abstract
Small proteins (≤100 amino acids) play important roles across all life forms, ranging from unicellular bacteria to higher organisms. In this study, we have developed SProtFP which is a machine learning-based method for functional annotation of prokaryotic small proteins into selected functional categories. SProtFP uses independent artificial neural networks (ANNs) trained using a combination of physicochemical descriptors for classifying small proteins into antitoxin type 2, bacteriocin, DNA-binding, metal-binding, ribosomal protein, RNA-binding, type 1 toxin and type 2 toxin proteins. We have also trained a model for identification of small open reading frame (smORF)-encoded antimicrobial peptides (AMPs). Comprehensive benchmarking of SProtFP revealed an average area under the receiver operator curve (ROC-AUC) of 0.92 during 10-fold cross-validation and an ROC-AUC of 0.94 and 0.93 on held-out balanced and imbalanced test sets. Utilizing our method to annotate bacterial isolates from the human gut microbiome, we could identify thousands of remote homologs of known small protein families and assign putative functions to uncharacterized proteins. This highlights the utility of SProtFP for large-scale functional annotation of microbiome datasets, especially in cases where sequence homology is low. SProtFP is freely available at http://www.nii.ac.in/sprotfp.html and can be combined with genome annotation tools such as ProsmORF-pred to uncover the functional repertoire of novel small proteins in bacteria.
Affiliation(s)
- Akshay Khanduja
- National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Debasisa Mohanty
- National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi 110067, India
|
12
|
Shi C, Liu F, Su X, Yang Z, Wang Y, Xie S, Xie S, Sun Q, Chen Y, Sang L, Tan M, Zhu L, Lei K, Li J, Yang J, Gao Z, Yu M, Wang X, Wang J, Chen J, Zhuo W, Fang Z, Liu J, Yan Q, Neculai D, Sun Q, Shao J, Lin W, Liu W, Chen J, Wang L, Liu Y, Li X, Zhou T, Lin A. Comprehensive discovery and functional characterization of the noncanonical proteome. Cell Res 2025; 35:186-204. [PMID: 39794466 PMCID: PMC11909191 DOI: 10.1038/s41422-024-01059-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Accepted: 11/14/2024] [Indexed: 01/13/2025] Open
Abstract
The systematic identification and functional characterization of noncanonical translation products, such as novel peptides, will facilitate the understanding of the human genome and provide new insights into cell biology. Here, we constructed a high-coverage peptide sequencing reference library with 11,668,944 open reading frames and employed an ultrafiltration tandem mass spectrometry assay to identify novel peptides. Through these methods, we discovered 8945 previously unannotated peptides from normal gastric tissues, gastric cancer tissues and cell lines, nearly half of which were derived from noncoding RNAs. Moreover, our CRISPR screening revealed that 1161 peptides are involved in tumor cell proliferation. The presence and physiological function of a subset of these peptides, selected based on screening scores, amino acid length, and various indicators, were verified through Flag-knockin and multiple other methods. To further characterize the potential regulatory mechanisms involved, we constructed a framework based on artificial intelligence structure prediction and peptide‒protein interaction network analysis for the top 100 candidates and revealed that these cancer-related peptides have diverse subcellular locations and participate in organelle-specific processes. Further investigation verified the interacting partners of pep1-nc-OLMALINC, pep5-nc-TRHDE-AS1, pep-nc-ZNF436-AS1 and pep2-nc-AC027045.3, and the functions of these peptides in mitochondrial complex assembly, energy metabolism, and cholesterol metabolism, respectively. We showed that pep5-nc-TRHDE-AS1 and pep2-nc-AC027045.3 had substantial impacts on tumor growth in xenograft models. Furthermore, the dysregulation of these four peptides is closely correlated with clinical prognosis. Taken together, our study provides a comprehensive characterization of the noncanonical proteome, and highlights critical roles of these previously unannotated peptides in cancer biology.
Affiliation(s)
- Chengyu Shi
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Fangzhou Liu
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Xinwan Su
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Zuozhen Yang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Ying Wang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Shanshan Xie
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Department of Cell Biology and Program in Molecular Cell Biology, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Department of Gastroenterology, the Second Affiliated Hospital, School of Medicine and Institute of Gastroenterology, Zhejiang University, Hangzhou, Zhejiang, China
| | - Shaofang Xie
- Key Laboratory of Structural Biology of Zhejiang Province, Westlake Laboratory of Life Sciences and Biomedicine, Westlake University, Hangzhou, Zhejiang, China
| | - Qiang Sun
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
| | - Yu Chen
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Lingjie Sang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Manman Tan
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Linyu Zhu
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Kai Lei
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Junhong Li
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Jiecheng Yang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Zerui Gao
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Meng Yu
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Xinyi Wang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Junfeng Wang
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Jing Chen
- Department of Gastrointestinal Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Wei Zhuo
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Department of Cell Biology and Program in Molecular Cell Biology, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Department of Gastroenterology, the Second Affiliated Hospital, School of Medicine and Institute of Gastroenterology, Zhejiang University, Hangzhou, Zhejiang, China
| | - Zhaoyuan Fang
- Zhejiang University-University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, Zhejiang, China
- The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Jian Liu
- Zhejiang University-University of Edinburgh Institute, Zhejiang University School of Medicine, Haining, Zhejiang, China
- Hangzhou Cancer Hospital, Hangzhou, Zhejiang, China
| | - Qingfeng Yan
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Dante Neculai
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
| | - Qiming Sun
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
| | - Jianzhong Shao
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Weiqiang Lin
- Department of Nephrology, Center for Regeneration and Aging Medicine, The Fourth Affiliated Hospital of School of Medicine and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, Zhejiang, China
| | - Wei Liu
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China
| | - Jian Chen
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
- Department of Gastrointestinal Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Liangjing Wang
- Department of Gastroenterology, the Second Affiliated Hospital, School of Medicine and Institute of Gastroenterology, Zhejiang University, Hangzhou, Zhejiang, China
| | - Yang Liu
- Institute of Immunology, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Xu Li
- Key Laboratory of Structural Biology of Zhejiang Province, Westlake Laboratory of Life Sciences and Biomedicine, Westlake University, Hangzhou, Zhejiang, China
| | - Tianhua Zhou
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China.
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China.
- Department of Cell Biology and Program in Molecular Cell Biology, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
| | - Aifu Lin
- The Center for RNA Medicine, International Institutes of Medicine, International School of Medicine, The 4th Affiliated Hospital of Zhejiang University School of Medicine, Yiwu, Zhejiang, China.
- MOE Laboratory of Biosystem Homeostasis and Protection, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China.
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China.
- Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Hangzhou, Zhejiang, China.
- Future Health Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, Zhejiang, China.
- Key Laboratory for Cell and Gene Engineering of Zhejiang Province, Hangzhou, Zhejiang, China.
|
13
|
Razumova E, Makariuk A, Dontsova O, Shepelev N, Rubtsova M. Structural Features of 5' Untranslated Region in Translational Control of Eukaryotes. Int J Mol Sci 2025; 26:1979. [PMID: 40076602 PMCID: PMC11900008 DOI: 10.3390/ijms26051979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2025] [Revised: 02/19/2025] [Accepted: 02/21/2025] [Indexed: 03/14/2025] Open
Abstract
Gene expression is a complex process regulated at multiple levels in eukaryotic cells. Translation frequently represents a pivotal step in the control of gene expression. Among the stages of translation, initiation is particularly important, as it governs ribosome recruitment and the efficiency of protein synthesis. The 5' untranslated region (5' UTR) of mRNA plays a key role in this process, often exhibiting a complicated and structured landscape. Numerous eukaryotic mRNAs possess long 5' UTRs that contain diverse regulatory elements, including RNA secondary structures, specific nucleotide motifs, and chemical modifications. These structural features can independently modulate translation through their intrinsic properties or by serving as platforms for trans-acting factors such as RNA-binding proteins. The dynamic nature of 5' UTR elements allows cells to fine-tune translation in response to environmental and cellular signals. Understanding these mechanisms is not only fundamental to molecular biology but also holds significant biomedical potential. Insights into 5' UTR-mediated regulation could drive advancements in synthetic biology and mRNA-based targeted therapies. This review outlines the current knowledge of the structural elements of the 5' UTR, the interplay between them, and their combined functional impact on translation.
Affiliation(s)
- Elizaveta Razumova
- Chemistry Department, Lomonosov Moscow State University, Moscow 119234, Russia; (E.R.); (O.D.); (N.S.)
| | - Aleksandr Makariuk
- Department of Biology, Lomonosov Moscow State University, Moscow 119234, Russia;
| | - Olga Dontsova
- Chemistry Department, Lomonosov Moscow State University, Moscow 119234, Russia; (E.R.); (O.D.); (N.S.)
- A.N.Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117437, Russia
- Skolkovo Institute of Science and Technology, Center for Molecular and Cellular Biology, Moscow 121205, Russia
| | - Nikita Shepelev
- Chemistry Department, Lomonosov Moscow State University, Moscow 119234, Russia; (E.R.); (O.D.); (N.S.)
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117437, Russia
| | - Maria Rubtsova
- Chemistry Department, Lomonosov Moscow State University, Moscow 119234, Russia; (E.R.); (O.D.); (N.S.)
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117437, Russia
|
14
|
Comtois F, Jacques JF, Métayer L, Ouedraogo WYD, Ouangraoua A, Denault JB, Roucou X. Noncanonical altPIDD1 protein: unveiling the true major translational output of the PIDD1 gene. Life Sci Alliance 2025; 8:e202402910. [PMID: 39532532 PMCID: PMC11557682 DOI: 10.26508/lsa.202402910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 11/04/2024] [Accepted: 11/05/2024] [Indexed: 11/16/2024] Open
Abstract
Proteogenomics has enabled the detection of novel proteins encoded in noncanonical or alternative open reading frames (altORFs) in genes already coding a reference protein. Reanalysis of proteomic and ribo-seq data revealed that the p53-induced death domain-containing protein (or PIDD1) gene encodes a second 171 amino acid protein, altPIDD1, in addition to the known 910-amino acid-long PIDD1 protein. The two ORFs overlap almost completely, and the translation initiation site of altPIDD1 is located upstream of PIDD1. AltPIDD1 has more translational and protein level evidence than PIDD1 across various cell lines and tissues. In HEK293 cells, the altPIDD1 to PIDD1 ratio is 40 to 1, as measured with isotope-labeled (heavy) peptides and targeted proteomics. AltPIDD1 localizes to cytoskeletal structures labeled with phalloidin and interacts with cytoskeletal proteins. Unlike most noncanonical proteins, altPIDD1 is not evolutionarily young but emerged in placental mammals. Overall, we identify PIDD1 as a dual-coding gene, with altPIDD1, not the annotated protein, being the primary product of translation.
Affiliation(s)
- Frédérick Comtois
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Canada
| | - Jean-François Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Canada
| | - Lenna Métayer
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Canada
| | | | - Aïda Ouangraoua
- Department of Informatics, Université de Sherbrooke, Sherbrooke, Canada
| | - Jean-Bernard Denault
- Department of Pharmacology and Physiology, Université de Sherbrooke, Sherbrooke, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
|
15
|
Tong G, Martinez TF. Ribosome profiling reveals hidden world of small proteins. Trends Genet 2025; 41:101-103. [PMID: 39814675 DOI: 10.1016/j.tig.2024.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Accepted: 12/28/2024] [Indexed: 01/18/2025]
Abstract
The development of ribosome profiling (Ribo-seq) by Ingolia et al. introduced a powerful new method for monitoring translation genome-wide. Application of Ribo-seq across multiple organisms has since revealed thousands of unannotated translated small open reading frames (ORFs) and enhanced efforts to study their encoded proteins, called microproteins.
Affiliation(s)
- Gregory Tong
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA 92617, USA
| | - Thomas F Martinez
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA 92617, USA; Department of Biological Chemistry, University of California, Irvine, Irvine, CA 92617, USA; Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, CA 92617, USA.
|
16
|
Jemth P. Protein binding and folding through an evolutionary lens. Curr Opin Struct Biol 2025; 90:102980. [PMID: 39817990 DOI: 10.1016/j.sbi.2024.102980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 12/18/2024] [Accepted: 12/19/2024] [Indexed: 01/18/2025]
Abstract
Protein-protein associations are often mediated by an intrinsically disordered protein region interacting with a folded domain in a coupled binding and folding reaction. Classic physical organic chemistry approaches together with structural biology have shed light on mechanistic aspects of such reactions. Further insight into general principles may be obtained by interpreting the results through an evolutionary lens. This review attempts to provide an overview of how the analysis of binding and folding reactions can benefit from an evolutionary approach, and is aimed at protein scientists without a background in evolution. Evolution constantly reshapes existing proteins by sampling more or less fit variants. Most new variants are weeded out as generations and new species come and go over hundreds to hundreds of millions of years. The huge ongoing genome sequencing efforts have provided us with a snapshot of existing adapted fit-for-purpose protein homologs in thousands of different organisms. Comparison of present-day orthologs and paralogs highlights general principles of the evolution of coupled binding and folding reactions and demonstrates a great potential for evolution to operate on disordered regions and modulate affinity and specificity of the interactions.
Affiliation(s)
- Per Jemth
- Department of Medical Biochemistry and Microbiology, Uppsala University, BMC, Box 582, SE-75123 Uppsala, Sweden.
|
17
|
Tornesello AL, Cerasuolo A, Starita N, Amiranda S, Cimmino TP, Bonelli P, Tuccillo FM, Buonaguro FM, Buonaguro L, Tornesello ML. Emerging role of endogenous peptides encoded by non-coding RNAs in cancer biology. Noncoding RNA Res 2025; 10:231-241. [PMID: 39554691 PMCID: PMC11567935 DOI: 10.1016/j.ncrna.2024.10.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Revised: 09/30/2024] [Accepted: 10/27/2024] [Indexed: 11/19/2024] Open
Abstract
Non-coding RNAs have long been recognized for their regulatory roles in various cellular processes, including cancer development and progression. Recent advancements have shed light on a novel aspect of non-coding RNA biology, revealing their ability to encode endogenous peptides, also named micropeptides or microproteins, through short open reading frames (sORFs). These small proteins play crucial roles in oncogenic processes, acting as either tumour suppressors or tumour promoters, and hold enormous potential as biomarkers for early diagnosis of cancer and as therapeutic targets. This comprehensive review highlights the state of the art on peptides encoded by long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and circular RNAs (circRNAs), elucidating their regulatory functions and implications in different cancer types, including breast cancer, hepatocellular carcinoma and colorectal cancer. The review also discusses challenges and future directions in the exploration of these emerging players in cancer biology, emphasizing the importance of further investigation for their clinical translation in diagnosis and therapy.
Affiliation(s)
- Anna Lucia Tornesello
- Innovative Immunological Models Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Andrea Cerasuolo
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Noemy Starita
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Sara Amiranda
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Tiziana Pecchillo Cimmino
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Patrizia Bonelli
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Franca Maria Tuccillo
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Franco Maria Buonaguro
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Luigi Buonaguro
- Innovative Immunological Models Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
| | - Maria Lina Tornesello
- Molecular Biology and Viral Oncology Unit, Istituto Nazionale Tumori IRCCS Fondazione G. Pascale, Napoli, Italy
|
18
|
Hannon Bozorgmehr J. The De Novo Emergence of Two Brain Genes in the Human Lineage Appears to be Unsupported. J Mol Evol 2025; 93:3-10. [PMID: 39725692 DOI: 10.1007/s00239-024-10227-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Accepted: 12/10/2024] [Indexed: 12/28/2024]
Abstract
Recently, certain studies have claimed that cognitive features and pathologies unique to humans can be traced to certain changes in the nervous system. These are caused by genes that have likely evolved "from scratch," not having any coding precursors. The translated proteins would not appear outside of the human lineage and any orthologs in other species should be non-coding. This contrasts with research that has identified a decisive role for duplication, and modifications to regulatory sequences, for such phenotypic traits. Closer examination, however, reveals that the inferred lineage-specific emergence of at least two of these genes is likely a misinterpretation owing to a lack of peptide verification, experimental oversights, and insufficient species comparisons. A possible pseudogenic origin is proposed for one of them. The implications of these claims for the study of molecular evolution are discussed.
|
19
|
Hofman DA, Prensner JR, van Heesch S. Microproteins in cancer: identification, biological functions, and clinical implications. Trends Genet 2025; 41:146-161. [PMID: 39379206 PMCID: PMC11794034 DOI: 10.1016/j.tig.2024.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 08/19/2024] [Accepted: 09/17/2024] [Indexed: 10/10/2024]
Abstract
Cancer continues to be a major global health challenge, accounting for 10 million deaths annually worldwide. Since the inception of genome-wide cancer sequencing studies 20 years ago, a core set of ~700 oncogenes and tumor suppressor genes has become the basis for cancer research. However, this research has been based largely on an understanding that the human genome encodes ~19 500 protein-coding genes. Complementing this genomic landscape, recent advances have described numerous microproteins which are now poised to redefine our understanding of oncogenic processes and open new avenues for therapeutic intervention. This review explores the emerging evidence for microprotein involvement in cancer mechanisms and discusses potential therapeutic applications, with an emphasis on highlighting recent advances in the field.
Affiliation(s)
- Damon A Hofman
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584, CS, Utrecht, The Netherlands; Oncode Institute, Utrecht, The Netherlands
| | - John R Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology and Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584, CS, Utrecht, The Netherlands; Oncode Institute, Utrecht, The Netherlands.
|
20
|
Azam S, Yang F, Wu X. Finding functional microproteins. Trends Genet 2025; 41:107-118. [PMID: 39753408 PMCID: PMC11794006 DOI: 10.1016/j.tig.2024.12.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2024] [Revised: 12/05/2024] [Accepted: 12/06/2024] [Indexed: 02/06/2025]
Abstract
Genome-wide translational profiling has uncovered the synthesis in human cells of thousands of microproteins, a class of proteins traditionally overlooked in functional studies. Although an increasing number of these microproteins have been found to play critical roles in cellular processes, the functional relevance of the majority remains poorly understood. Studying these low-abundance, often unstable proteins is further complicated by the challenge of disentangling their functions from the noncoding roles of the associated DNA, RNA, and the act of translation. This review highlights recent advances in functional genomics that have led to the discovery of >1000 human microproteins required for optimal cell proliferation. Ongoing technological innovations will continue to clarify the roles and mechanisms of microproteins in both normal physiology and disease, potentially opening new avenues for therapeutic exploration.
Affiliation(s)
- Sikandar Azam
- Department of Medicine and Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
| | - Feiyue Yang
- Department of Medicine and Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
| | - Xuebing Wu
- Department of Medicine and Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA.
|
21
|
Zhang Y, Yang Y, Li K, Chen L, Yang Y, Yang C, Xie Z, Wang H, Zhao Q. Enhanced Discovery of Alternative Proteins (AltProts) in Mouse Cardiac Development Using Data-Independent Acquisition (DIA) Proteomics. Anal Chem 2025; 97:1517-1527. [PMID: 39813267 PMCID: PMC11781309 DOI: 10.1021/acs.analchem.4c02924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 11/27/2024] [Accepted: 11/27/2024] [Indexed: 01/18/2025]
Abstract
Alternative proteins (AltProts) are a class of proteins encoded by DNA sequences previously classified as noncoding. Although they were historically overlooked, recent studies have highlighted their widespread presence and distinctive biological roles. So far, direct detection of AltProts has relied on data-dependent acquisition (DDA) mass spectrometry (MS). However, data-independent acquisition (DIA) MS, a method that is rapidly gaining popularity for the analysis of canonical proteins, has seen limited application in AltProt research, largely due to the complexities involved in constructing DIA libraries. In this study, we present a novel DIA workflow that leverages a fragmentation spectra predictor for the efficient construction of DIA libraries, significantly enhancing the detection of AltProts. Our method achieved a 2-fold increase in the identification of AltProts and a 50% reduction in missing values compared to DDA. We conducted a comprehensive comparison of four AltProt databases, four DIA-library construction strategies, and three analytical software tools to establish an optimal workflow for AltProt analysis. Utilizing this workflow, we investigated the mouse heart development process and identified over 50 AltProts with differential expression between embryonic and adult heart tissues. Over 30 unannotated mouse AltProts were validated, including ASDURF, which played a crucial role in cardiac development. Our findings not only provide a practical workflow for MS-based AltProt analysis but also reveal novel AltProts with potential significance in biological functions.
Affiliation(s)
- Yuanliang Zhang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Ying Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Kecheng Li
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Lei Chen
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Yang Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Chenxi Yang
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Hongwei Wang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Qian Zhao
  - Department of Applied Biology and Chemical Technology, State Key Laboratory of Chemical Biology and Drug Discovery, Hong Kong Polytechnic University, Hong Kong 999077, China
22
Meng K, Li Y, Yuan X, Shen HM, Hu LL, Liu D, Shi F, Zheng D, Shi X, Wen N, Cao Y, Pan YL, He QY, Zhang CZ. The cryptic lncRNA-encoded microprotein TPM3P9 drives oncogenic RNA splicing and tumorigenesis. Signal Transduct Target Ther 2025; 10:43. [PMID: 39865075 PMCID: PMC11770092 DOI: 10.1038/s41392-025-02128-8] [Received: 05/23/2024] [Revised: 12/21/2024] [Accepted: 01/07/2025] [Indexed: 01/28/2025]
Abstract
Emerging evidence demonstrates that cryptic translation from RNAs previously annotated as noncoding might generate microproteins with oncogenic functions. However, the importance and underlying mechanisms of these microproteins in alternative splicing-driven tumor progression have rarely been studied. Here, we show that the novel protein TPM3P9, encoded by the lncRNA tropomyosin 3 pseudogene 9, exhibits oncogenic activity in clear cell renal cell carcinoma (ccRCC) by enhancing oncogenic RNA splicing. Overexpression of TPM3P9 promotes cell proliferation and tumor growth. Mechanistically, TPM3P9 binds to the RRM1 domain of the splicing factor RBM4 to inhibit RBM4-mediated exon skipping in the transcription factor TCF7L2. This results in increased expression of the oncogenic splice variant TCF7L2-L, which activates NF-κB signaling via its interaction with SAM68 to transcriptionally induce RELB expression. From a clinical perspective, TPM3P9 expression is upregulated in cancer tissues and is significantly correlated with the expression of TCF7L2-L and RELB. High TPM3P9 expression or low RBM4 expression is associated with poor survival in patients with ccRCC. Collectively, our findings functionally and clinically characterize the "noncoding RNA"-derived microprotein TPM3P9 and thus identify potential prognostic and therapeutic factors in renal cancer.
Affiliation(s)
- Kun Meng
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
  - Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Hubei Province, 441100, Xiangyang, China
- Yuying Li
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Xiaoyi Yuan
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Hui-Min Shen
  - Department of Obstetrics and Gynecology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Li-Ling Hu
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Danya Liu
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Fujin Shi
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Dandan Zheng
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Xinyu Shi
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China
- Nengqiao Wen
  - Department of Pathology, State Key Laboratory of Oncology in South China, Sun Yat-sen University Cancer Center, 510060, Guangzhou, China
- Yun Cao
  - Department of Pathology, State Key Laboratory of Oncology in South China, Sun Yat-sen University Cancer Center, 510060, Guangzhou, China
- Yun-Long Pan
  - The First Affiliated Hospital of Jinan University, Jinan University, Guangzhou, 510632, China
- Qing-Yu He
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China.
- Chris Zhiyi Zhang
  - MOE Key Laboratory of Tumor Molecular Biology and State Key Laboratory of Bioactive Molecules and Druggability Assessment, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, 510632, China.
23
Mudge J, Carbonell-Sala S, Diekhans M, Martinez J, Hunt T, Jungreis I, Loveland J, Arnan C, Barnes I, Bennett R, Berry A, Bignell A, Cerdán-Vélez D, Cochran K, Cortés L, Davidson C, Donaldson S, Dursun C, Fatima R, Hardy M, Hebbar P, Hollis Z, James B, Jiang Y, Johnson R, Kaur G, Kay M, Mangan R, Maquedano M, Gómez L, Mathlouthi N, Merritt R, Ni P, Palumbo E, Perteghella T, Pozo F, Raj S, Sisu C, Steed E, Sumathipala D, Suner MM, Uszczynska-Ratajczak B, Wass E, Yang Y, Zhang D, Finn R, Gerstein M, Guigó R, Hubbard TP, Kellis M, Kundaje A, Paten B, Tress M, Birney E, Martin F, Frankish A. GENCODE 2025: reference gene annotation for human and mouse. Nucleic Acids Res 2025; 53:D966-D975. [PMID: 39565199 PMCID: PMC11701607 DOI: 10.1093/nar/gkae1078] [Received: 09/13/2024] [Revised: 10/12/2024] [Accepted: 10/23/2024] [Indexed: 11/21/2024]
Abstract
GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result. Meanwhile, we are incorporating data from state-of-the-art proteomics and Ribo-seq experiments to fine-tune our annotation of translated sequences, while further insights into function can be gained from multi-genome alignments that grow richer as more species' genomes are sequenced. Such methodologies are combined into a fully integrated annotation workflow. However, the increasing complexity of our resources can present usability challenges, and we are resolving these with the creation of filtered genesets such as MANE Select and GENCODE Primary. The next challenge is to propagate annotations throughout multiple human and mouse genomes, as we enter the pangenome era. Our resources are freely available at our web portal www.gencodegenes.org, and via the Ensembl and UCSC genome browsers.
Affiliation(s)
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sílvia Carbonell-Sala
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
- Mark Diekhans
  - UC Santa Cruz Genomics Institute, 2300 Delaware Avenue, University of California, Santa Cruz, CA 95060, USA
- Jose Gonzalez Martinez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Toby Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Irwin Jungreis
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139, USA
  - The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- Jane E Loveland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carme Arnan
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
- If Barnes
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ruth Bennett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew Berry
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexandra Bignell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Cerdán-Vélez
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Kelly Cochran
  - Department of Computer Science, Stanford University, 353 Jane Stanford Way, Stanford, CA, USA
- Lucas T Cortés
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Claire Davidson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah Donaldson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cagatay Dursun
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Reham Fatima
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthew Hardy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Prajna Hebbar
  - UC Santa Cruz Genomics Institute, 2300 Delaware Avenue, University of California, Santa Cruz, CA 95060, USA
- Zoe Hollis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Benjamin T James
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139, USA
  - The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- Yunzhe Jiang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Rory Johnson
  - Department of Medical Oncology, Bern University Hospital, Murtenstrasse 35, 3008 Bern, Switzerland
  - School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4 D04 V1W8, Ireland
- Gazaldeep Kaur
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
- Mike Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Riley J Mangan
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139, USA
  - The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
  - Genetics Training Program, Harvard Medical School, Boston, MA 02115, USA
- Miguel Maquedano
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Laura Martínez Gómez
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Nourhen Mathlouthi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ryan Merritt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Pengyu Ni
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Emilio Palumbo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
- Tamara Perteghella
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra (UPF), Carrer de la Mercè, 12, Ciutat Vella 08002 Barcelona, Spain
- Fernando Pozo
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Shriya Raj
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cristina Sisu
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Life Sciences, Brunel University London, Kingston Lane, Uxbridge, London UB8 3PH, UK
- Emily Steed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dulika Sumathipala
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marie-Marthe Suner
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Barbara Uszczynska-Ratajczak
  - Department of Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Elizabeth Wass
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Yucheng T Yang
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, 220 Handan Road, Shanghai 200433, China
- Dingyao Zhang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mark Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003 Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra (UPF), Carrer de la Mercè, 12, Ciutat Vella 08002 Barcelona, Spain
- Tim J P Hubbard
  - Department of Medical and Molecular Genetics, King’s College London, Guys Hospital, Great Maze Pond, London SE1 9RT, UK
  - ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manolis Kellis
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139, USA
  - The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- Anshul Kundaje
  - Department of Computer Science, Stanford University, 353 Jane Stanford Way, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, 2300 Delaware Avenue, University of California, Santa Cruz, CA 95060, USA
- Michael L Tress
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam Frankish
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
24
Kochetov AV. Evaluation of Eukaryotic mRNA Coding Potential. Methods Mol Biol 2025; 2859:319-331. [PMID: 39436610 DOI: 10.1007/978-1-0716-4152-1_18] [Indexed: 10/23/2024]
Abstract
It is widely discussed that eukaryotic mRNAs can encode several functional polypeptides. Recent progress in NGS and proteomics techniques has resulted in a huge volume of information on potential alternative translation initiation sites and open reading frames (altORFs). However, these data are still incomplete, and the vast majority of eukaryotic mRNAs annotated in conventional databases (e.g., GenBank) contain a single ORF (CDS) encoding a protein larger than some arbitrary threshold (commonly 100 amino acid residues). Indeed, some gene functions may relate to the polypeptides encoded by unannotated altORFs, and insufficient information in nucleotide sequence databanks may limit the interpretation of genomics and transcriptomics data. However, despite the need for special experiments to predict altORFs accurately, there are some simple methods for their preliminary mapping.
Affiliation(s)
- Alex V Kochetov
  - Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia.
  - Novosibirsk State Agrarian University, Novosibirsk, Russia.
  - Novosibirsk State University, Novosibirsk, Russia.
25
Aparicio B, Theunissen P, Hervas-Stubbs S, Fortes P, Sarobe P. Relevance of mutation-derived neoantigens and non-classical antigens for anticancer therapies. Hum Vaccin Immunother 2024; 20:2303799. [PMID: 38346926 PMCID: PMC10863374 DOI: 10.1080/21645515.2024.2303799] [Received: 09/29/2023] [Accepted: 01/06/2024] [Indexed: 02/15/2024]
Abstract
Efficacy of cancer immunotherapies relies on correct recognition of tumor antigens by lymphocytes, thereby eliciting functional responses capable of eliminating tumor cells. Therefore, important efforts have been made in antigen identification, with the aim of understanding mechanisms of response to immunotherapy and designing safer and more efficient strategies. In addition to classical tumor-associated antigens identified over recent decades, implementation of next-generation sequencing methodologies is enabling the identification of neoantigens (neoAgs) arising from mutations, leading to the development of new neoAg-directed therapies. Moreover, there are numerous non-classical tumor antigens originating from other sources and identified by new methodologies. Here, we review the relevance of neoAgs in different immunotherapies and the results obtained by applying neoAg-based strategies. In addition, the different types of non-classical tumor antigens and the best approaches for their identification are described. This will help to increase the spectrum of targetable molecules useful in cancer immunotherapies.
Affiliation(s)
- Belen Aparicio
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA) University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
- Patrick Theunissen
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
  - DNA and RNA Medicine Division, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
- Sandra Hervas-Stubbs
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA) University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
- Puri Fortes
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
  - DNA and RNA Medicine Division, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
  - Spanish Network for Advanced Therapies (TERAV ISCIII), Spain
- Pablo Sarobe
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA) University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
26
Libé-Philippot B, Polleux F, Vanderhaeghen P. If you please, draw me a neuron - linking evolutionary tinkering with human neuron evolution. Curr Opin Genet Dev 2024; 89:102260. [PMID: 39357501 PMCID: PMC11625661 DOI: 10.1016/j.gde.2024.102260] [Received: 06/14/2024] [Revised: 08/23/2024] [Accepted: 09/02/2024] [Indexed: 10/04/2024]
Abstract
Animal speciation often involves novel behavioral features that rely on nervous system evolution. Human-specific brain features have been proposed to underlie specialized cognitive functions and to be linked, at least in part, to the evolution of synapses, neurons, and circuits of the cerebral cortex. Here, we review recent results showing that, while the human cortex is composed of a repertoire of cells that appears to be largely similar to the one found in other mammals, human cortical neurons do display specialized features at many levels, from gene expression to intrinsic physiological properties. The molecular mechanisms underlying human species-specific neuronal features remain largely unknown but implicate hominid-specific gene duplicates that encode novel molecular modifiers of neuronal function. The identification of human-specific genetic modifiers of neuronal function brings novel insights into brain evolution and function and could also provide new insights into human species-specific vulnerabilities to brain disorders.
Affiliation(s)
- Baptiste Libé-Philippot
  - VIB-KU Leuven Center for Brain & Disease Research, 3000 Leuven, Belgium; Department of Neurosciences, Leuven Brain Institute, KUL, 3000 Leuven, Belgium; Aix-Marseille Université, CNRS UMR 7288, Developmental Biology Institute of Marseille (IBDM), NeuroMarseille, Marseille, France.
- Franck Polleux
  - Department of Neuroscience, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA. https://twitter.com/@fpolleux
- Pierre Vanderhaeghen
  - VIB-KU Leuven Center for Brain & Disease Research, 3000 Leuven, Belgium; Department of Neurosciences, Leuven Brain Institute, KUL, 3000 Leuven, Belgium.
27
Haseltine WA, Patarca R. The RNA Revolution in the Central Molecular Biology Dogma Evolution. Int J Mol Sci 2024; 25:12695. [PMID: 39684407 DOI: 10.3390/ijms252312695] [Received: 11/11/2024] [Revised: 11/24/2024] [Accepted: 11/25/2024] [Indexed: 12/18/2024]
Abstract
Human genome projects in the 1990s identified about 20,000 protein-coding sequences. We are now in the RNA revolution, propelled by the realization that genes determine phenotype beyond the foundational central molecular biology dogma, stating that inherited linear pieces of DNA are transcribed to RNAs and translated into proteins. Crucially, over 95% of the genome, initially considered junk DNA between protein-coding genes, encodes essential, functionally diverse non-protein-coding RNAs, raising the gene count by at least one order of magnitude. Most inherited phenotype-determining changes in DNA are in regulatory areas that control RNA and regulatory sequences. RNAs can directly or indirectly determine phenotypes by regulating protein and RNA function, transferring information within and between organisms, and generating DNA. RNAs also exhibit high structural, functional, and biomolecular interaction plasticity and are modified via editing, methylation, glycosylation, and other mechanisms, which bestow them with diverse intra- and extracellular functions without altering the underlying DNA. RNA is, therefore, currently considered the primary determinant of cellular to populational functional diversity, disease-linked and biomolecular structural variations, and cell function regulation. As demonstrated by RNA-based coronavirus vaccines' success, RNA technology is transforming medicine, agriculture, and industry, as did the advent of recombinant DNA technology in the 1980s.
Affiliation(s)
- William A Haseltine
  - Access Health International, 384 West Lane, Ridgefield, CT 06877, USA
  - Feinstein Institutes for Medical Research, 350 Community Dr, Manhasset, NY 11030, USA
- Roberto Patarca
  - Access Health International, 384 West Lane, Ridgefield, CT 06877, USA
  - Feinstein Institutes for Medical Research, 350 Community Dr, Manhasset, NY 11030, USA
28
Li R, Qin T, Guo Y, Zhang S, Guo X. CEAM is a mitochondrial-localized, amyloid-like motif-containing microprotein expressed in human cardiomyocytes. Biochem Biophys Res Commun 2024; 734:150737. [PMID: 39388734 DOI: 10.1016/j.bbrc.2024.150737] [Received: 07/25/2024] [Revised: 09/22/2024] [Accepted: 09/22/2024] [Indexed: 10/12/2024]
Abstract
Microproteins synthesized through non-canonical translation pathways are frequently found within mitochondria. However, the functional significance of these mitochondria-localized microproteins in energy-intensive organs such as the heart remains largely unexplored. In this study, we demonstrate that the long non-coding RNA CD63-AS1 encodes a mitochondrial microprotein. Notably, in ribosome profiling data of human hearts, there is a positive correlation between the expression of CD63-AS1 and genes associated with cardiomyopathy. We have termed this microprotein CEAM (CD63-AS1 encoded amyloid-like motif containing microprotein), reflecting its sequence characteristics. Our biochemical assays show that CEAM forms protease-resistant aggregates within mitochondria, whereas deletion of the amyloid-like motif transforms CEAM into a soluble cytosolic protein. Overexpression of CEAM triggers mitochondrial stress responses and adversely affects mitochondrial bioenergetics in cultured cardiomyocytes. In turn, the expression of CEAM is reciprocally inhibited by the activation of mitochondrial stresses induced by oligomycin. When expressed in mouse hearts via adeno-associated virus, CEAM impairs cardiac function. However, under conditions of pressure overload-induced cardiac hypertrophy, CEAM expression appears to offer a protective benefit and mitigates the expression of genes associated with cardiac remodeling, presumably through a mechanism that suppresses stress-induced translation reprogramming. Collectively, our study uncovers a hitherto unexplored amyloid-like microprotein expressed in human cardiomyocytes, offering novel insights into myocardial hypertrophy pathophysiology.
Affiliation(s)
- Ruobing Li
  - Department of Cardiology of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China
- Ti Qin
  - Department of Biochemistry, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Yabo Guo
  - Department of Biochemistry, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Shan Zhang
  - Department of Cardiology of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China; Department of Biochemistry, Zhejiang University School of Medicine, Hangzhou, 310058, China.
- Xiaogang Guo
  - Department of Cardiology of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
29
Cao M, Qiu Q, Zhang X, Zhang W, Shen Z, Ma C, Zhu M, Pan J, Tong X, Cao G, Gong C, Hu X. Identification and characterization of a novel small viral peptide (VSP59) encoded by Bombyx mori cypovirus (BmCPV) that negatively regulates viral replication. Microbiol Spectr 2024; 12:e0082624. [PMID: 39382281 PMCID: PMC11537000 DOI: 10.1128/spectrum.00826-24] [Received: 04/01/2024] [Accepted: 08/16/2024] [Indexed: 10/10/2024]
Abstract
Bombyx mori cypovirus (BmCPV), a member of the Reoviridae family, is a well-established research model for double-stranded RNA (dsRNA) viruses with segmented genomes. Despite its small genome size, the coding potential of BmCPV remains largely unexplored. In this study, we identified a novel small open reading frame within the S10 dsRNA genome, encoding a small viral peptide (VSP59) with 59 amino acid residues. Functional characterization revealed that VSP59 acts as a negative regulator of viral replication. VSP59 predominantly localizes to the cytoplasm, where it interacts with prohibitin 2 (PHB2), an inner membrane mitophagy receptor. This interaction targets mitochondria and triggers caspase 3-dependent apoptosis. Transient expression of vsp59 in BmN cells suppressed viral replication, an effect that was reversed by silencing PHB2 expression. Moreover, recombinant BmCPV with a mutated vsp59 exhibited reduced replication. Our findings demonstrate that VSP59 interacts with PHB2 on mitochondria, inducing apoptosis and thereby diminishing viral replication. This study expands our understanding of the genetic information encoded by the BmCPV genome and highlights the role of novel small peptides in host-virus interactions. IMPORTANCE A novel small open reading frame (sORF) from the viral genome was identified and characterized. The sORF could encode a small viral peptide (VSP59) that targeted mitochondria and induced prohibitin 2-related apoptosis, further attenuating Bombyx mori cypovirus replication.
Affiliation(s)
- Manman Cao
- School of Life Science, Soochow University, Suzhou, China
| | - Qunnan Qiu
- School of Life Science, Soochow University, Suzhou, China
| | - Xing Zhang
- School of Chemistry and Life Science, Suzhou University of Science and Technology, Suzhou, China
| | - Wenxue Zhang
- School of Life Science, Soochow University, Suzhou, China
| | - Zeen Shen
- School of Life Science, Soochow University, Suzhou, China
| | - Chang Ma
- School of Life Science, Soochow University, Suzhou, China
| | - Min Zhu
- School of Life Science, Soochow University, Suzhou, China
| | - Jun Pan
- School of Life Science, Soochow University, Suzhou, China
| | - Xingyu Tong
- School of Life Science, Soochow University, Suzhou, China
| | - Guangli Cao
- School of Life Science, Soochow University, Suzhou, China
| | - Chengliang Gong
- School of Life Science, Soochow University, Suzhou, China
- Institute of Agricultural Biotechnology and Ecological Research, Soochow University, Suzhou, China
| | - Xiaolong Hu
- School of Life Science, Soochow University, Suzhou, China
- Institute of Agricultural Biotechnology and Ecological Research, Soochow University, Suzhou, China
| |
|
30
|
Pai VJ, Lau CJ, Garcia-Ruiz A, Donaldson C, Vaughan JM, Miller B, De Souza EV, Pinto AM, Diedrich J, Gavva NR, Yu S, DeBoever C, Horman SR, Saghatelian A. Microprotein-encoding RNA regulation in cells treated with pro-inflammatory and pro-fibrotic stimuli. BMC Genomics 2024; 25:1034. [PMID: 39497054 PMCID: PMC11536906 DOI: 10.1186/s12864-024-10948-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 10/24/2024] [Indexed: 11/06/2024] Open
Abstract
BACKGROUND: Recent analysis of the human proteome via proteogenomics and ribosome profiling of the transcriptome revealed the existence of thousands of previously unannotated microprotein-coding small open reading frames (smORFs). Most functional microproteins were chosen for characterization because of their evolutionary conservation. However, one example of a non-conserved immunomodulatory microprotein in mice suggests that strict sequence conservation misses some intriguing microproteins. RESULTS: We examine the ability of gene regulation to identify human microproteins with potential roles in inflammation or fibrosis of the intestine. To do this, we collected ribosome profiling data of intestinal cell lines and peripheral blood mononuclear cells and used gene expression of microprotein-encoding transcripts to identify strongly regulated microproteins, including several examples of microproteins that are conserved only in primates. CONCLUSION: This approach reveals a number of new microproteins worthy of additional functional characterization and provides a dataset that can be queried in different ways to find additional gut microproteins of interest.
Affiliation(s)
- Victor J Pai
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA.
| | - Calvin J Lau
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Almudena Garcia-Ruiz
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Cynthia Donaldson
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Joan M Vaughan
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Brendan Miller
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Eduardo V De Souza
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Antonio M Pinto
- Mass Spectrometry Core, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Jolene Diedrich
- Mass Spectrometry Core, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Narender R Gavva
- Takeda Development Center Americas, Inc, San Diego, CA, 92121, USA
| | - Shan Yu
- Takeda Development Center Americas, Inc, San Diego, CA, 92121, USA
| | | | - Shane R Horman
- Takeda Development Center Americas, Inc, San Diego, CA, 92121, USA.
| | - Alan Saghatelian
- Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA, 92037, USA.
| |
|
31
|
Papadopoulos C, Arbes H, Cornu D, Chevrollier N, Blanchet S, Roginski P, Rabier C, Atia S, Lespinet O, Namy O, Lopes A. The ribosome profiling landscape of yeast reveals a high diversity in pervasive translation. Genome Biol 2024; 25:268. [PMID: 39402662 PMCID: PMC11472626 DOI: 10.1186/s13059-024-03403-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 09/26/2024] [Indexed: 10/19/2024] Open
Abstract
BACKGROUND: Pervasive translation is a widespread phenomenon that plays a critical role in the emergence of novel microproteins, but the diversity of translation patterns contributing to their generation remains unclear. Based on 54 ribosome profiling (Ribo-Seq) datasets, we investigated the yeast Ribo-Seq landscape using a representation framework that allows the comprehensive inventory and classification of the entire diversity of Ribo-Seq signals, including non-canonical ones. RESULTS: We show that whereas coding regions occupy specific areas of the Ribo-Seq landscape, noncoding regions encompass a wide diversity of Ribo-Seq signals and, conversely, populate the entire landscape. Our results show that pervasive translation can, nevertheless, be associated with high specificity, with 1055 noncoding ORFs exhibiting canonical Ribo-Seq signals. Using mass spectrometry under standard conditions or proteasome inhibition with an in-house analysis protocol, we report 239 microproteins originating from noncoding ORFs that display canonical but also non-canonical Ribo-Seq signals. Each condition yields dozens of additional microprotein candidates with comparable translation properties, suggesting a larger population of volatile microproteins that are challenging to detect. Our findings suggest that non-canonical translation signals may harbor valuable information and underscore the significance of considering them in proteogenomic studies. Finally, we show that the translation outcome of a noncoding ORF is primarily determined by the initiating codon and the codon distribution in its two alternative frames, rather than features indicative of functionality. CONCLUSION: Our results enable us to propose a topology of a species' Ribo-Seq landscape, opening the way to comparative analyses of this translation landscape under different conditions.
Affiliation(s)
- Chris Papadopoulos
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
- Hospital del Mar Research Institute, Barcelona, Spain
| | - Hugo Arbes
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - David Cornu
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | | | - Sandra Blanchet
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Paul Roginski
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Camille Rabier
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Safiya Atia
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Olivier Lespinet
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Olivier Namy
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France
| | - Anne Lopes
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, Cedex, 91198, France.
| |
|
32
|
Tzani I, Castro-Rivadeneyra M, Kelly P, Strasser L, Zhang L, Clynes M, Karger BL, Barron N, Bones J, Clarke C. Detection of host cell microprotein impurities in antibody drug products. Nat Commun 2024; 15:8605. [PMID: 39366928 PMCID: PMC11452709 DOI: 10.1038/s41467-024-51870-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/21/2024] [Indexed: 10/06/2024] Open
Abstract
Chinese hamster ovary (CHO) cells are used to produce almost 90% of therapeutic monoclonal antibodies (mAbs) and antibody fusion proteins (Fc-fusion). The annotation of non-canonical translation events in these cellular factories remains incomplete, limiting our ability to study CHO cell biology and detect host cell protein (HCP) impurities in the final antibody drug product. We utilised ribosome footprint profiling (Ribo-seq) to identify novel open reading frames (ORFs) including N-terminal extensions and thousands of short ORFs (sORFs) predicted to encode microproteins. Mass spectrometry-based HCP analysis of eight commercial antibody drug products (7 mAbs and 1 Fc-fusion protein) using the extended protein sequence database revealed the presence of microprotein impurities. We present evidence that microprotein abundance varies with growth phase and can be affected by the cell culture environment. In addition, our work provides a vital resource to facilitate future studies of non-canonical translation and the regulation of protein synthesis in CHO cell lines.
Affiliation(s)
- Ioanna Tzani
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
| | - Marina Castro-Rivadeneyra
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
- School of Chemical and Bioprocess Engineering, University College Dublin, Belfield, Dublin, Ireland
| | - Paul Kelly
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
| | - Lisa Strasser
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
| | - Lin Zhang
- Bioprocess R&D, Pfizer Inc. Andover, Massachusetts, USA
| | - Martin Clynes
- National Institute for Cellular Biotechnology, Dublin City University, Dublin, Ireland
| | - Barry L Karger
- Barnett Institute, Northeastern University, 360 Huntington Ave, Boston, MA, USA
| | - Niall Barron
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
- School of Chemical and Bioprocess Engineering, University College Dublin, Belfield, Dublin, Ireland
| | - Jonathan Bones
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland
- School of Chemical and Bioprocess Engineering, University College Dublin, Belfield, Dublin, Ireland
| | - Colin Clarke
- National Institute for Bioprocessing Research and Training, Fosters Avenue, Blackrock, Co, Dublin, Ireland.
- School of Chemical and Bioprocess Engineering, University College Dublin, Belfield, Dublin, Ireland.
| |
|
33
|
Chanut-Delalande H, Zanet J. Small ORFs, Big Insights: Drosophila as a Model to Unraveling Microprotein Functions. Cells 2024; 13:1645. [PMID: 39404408 PMCID: PMC11475943 DOI: 10.3390/cells13191645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 09/27/2024] [Accepted: 10/02/2024] [Indexed: 10/19/2024] Open
Abstract
Recently developed experimental and computational approaches to identify putative coding small ORFs (smORFs) in genomes have revealed thousands of smORFs localized within coding and non-coding RNAs. They can be translated into smORF peptides or microproteins, which are defined as less than 100 amino acids in length. The identification of such a large number of potential biological regulators represents a major challenge, notably for elucidating the in vivo functions of these microproteins. Since the emergence of this field, Drosophila has proved to be a valuable model for studying the biological functions of microproteins in vivo. In this review, we outline how the smORF field emerged and the nomenclature used in this domain. We summarize the technical challenges associated with identifying putative coding smORFs in the genome and the relevant translated microproteins. Finally, recent findings on one of the best studied smORF peptides, Pri, and other microproteins studied so far in Drosophila are described. These studies highlight the diverse roles that microproteins can fulfil in the regulation of various molecular targets involved in distinct cellular processes during animal development and physiology. Given the recent emergence of the microprotein field and the associated discoveries, the microproteome represents an exquisite source of potentially bioactive molecules, whose in vivo biological functions can be explored in the Drosophila model.
Affiliation(s)
| | - Jennifer Zanet
- Unité de Biologie Moléculaire, Cellulaire et du Développement (MCD), UMR 5077, Centre de Biologie Intégrative (CBI), CNRS, UPS, Université de Toulouse, 31062 Toulouse, France;
| |
|
34
|
Ruiz-Orera J, Miller DC, Greiner J, Genehr C, Grammatikaki A, Blachut S, Mbebi J, Patone G, Myronova A, Adami E, Dewani N, Liang N, Hummel O, Muecke MB, Hildebrandt TB, Fritsch G, Schrade L, Zimmermann WH, Kondova I, Diecke S, van Heesch S, Hübner N. Evolution of translational control and the emergence of genes and open reading frames in human and non-human primate hearts. Nature Cardiovascular Research 2024; 3:1217-1235. [PMID: 39317836 PMCID: PMC11473369 DOI: 10.1038/s44161-024-00544-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 08/28/2024] [Indexed: 09/26/2024]
Abstract
Evolutionary innovations can be driven by changes in the rates of RNA translation and the emergence of new genes and small open reading frames (sORFs). In this study, we characterized the transcriptional and translational landscape of the hearts of four primate and two rodent species through integrative ribosome and transcriptomic profiling, including adult left ventricle tissues and induced pluripotent stem cell-derived cardiomyocyte cell cultures. We show here that the translational efficiencies of subunits of the mitochondrial oxidative phosphorylation chain complexes IV and V evolved rapidly across mammalian evolution. Moreover, we discovered hundreds of species-specific and lineage-specific genomic innovations that emerged during primate evolution in the heart, including 551 genes, 504 sORFs and 76 evolutionarily conserved genes displaying human-specific cardiac-enriched expression. Overall, our work describes the evolutionary processes and mechanisms that have shaped cardiac transcription and translation in recent primate evolution and sheds light on how these can contribute to cardiac development and disease.
Affiliation(s)
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany.
| | - Duncan C Miller
- Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Technology Platform Pluripotent Stem Cells, Berlin, Germany
| | - Johannes Greiner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Carolin Genehr
- Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Technology Platform Pluripotent Stem Cells, Berlin, Germany
| | - Aliki Grammatikaki
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Susanne Blachut
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Jeanne Mbebi
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Giannino Patone
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Anna Myronova
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Eleonora Adami
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Nikita Dewani
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Ning Liang
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Oliver Hummel
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Michael B Muecke
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Thomas B Hildebrandt
- Leibniz Institute for Zoo and Wildlife Research, Berlin, Germany
- Freie Universitaet Berlin, Berlin, Germany
| | - Guido Fritsch
- Leibniz Institute for Zoo and Wildlife Research, Berlin, Germany
| | - Lisa Schrade
- Leibniz Institute for Zoo and Wildlife Research, Berlin, Germany
| | - Wolfram H Zimmermann
- Institute of Pharmacology and Toxicology, University Medical Center Göttingen, Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Lower Saxony, Göttingen, Germany
- DZNE (German Center for Neurodegenerative Diseases), Göttingen, Germany
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Göttingen, Germany
| | - Ivanela Kondova
- Biomedical Primate Research Centre (BPRC), Rijswijk, The Netherlands
| | - Sebastian Diecke
- Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Technology Platform Pluripotent Stem Cells, Berlin, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Berlin, Berlin, Germany
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Norbert Hübner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany.
- DZHK (German Center for Cardiovascular Research), Partner Site Berlin, Berlin, Germany.
- Charité-Universitätsmedizin, Berlin, Germany.
- Helmholtz Institute for Translational AngioCardioScience (HI-TAC) of the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) at Heidelberg University, Heidelberg, Germany.
| |
|
35
|
Xiao C, Mo F, Lu Y, Xiao Q, Yao C, Li T, Qi J, Liu X, Chen JY, Zhang L, Guo T, Hu B, An NA, Li CY. Reply to: Identification of old coding regions disproves the hominoid de novo status of genes. Nat Ecol Evol 2024; 8:1831-1834. [PMID: 39187608 DOI: 10.1038/s41559-024-02515-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 07/23/2024] [Indexed: 08/28/2024]
Affiliation(s)
- Chunfu Xiao
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
| | - Fan Mo
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute for Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yingfei Lu
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute for Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Qi Xiao
- Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- School of Medicine, School of Life Sciences, Westlake University, Hangzhou, China
| | - Chao Yao
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
| | - Ting Li
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
| | - Jianhuan Qi
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute for Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xiaoge Liu
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
| | - Jia-Yu Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center, Nanjing University, Nanjing, China
| | - Li Zhang
- Chinese Institute for Brain Research, Beijing, China
| | - Tiannan Guo
- Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China
- School of Medicine, School of Life Sciences, Westlake University, Hangzhou, China
| | - Baoyang Hu
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute for Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
| | - Ni A An
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China.
| | - Chuan-Yun Li
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China.
- Chinese Institute for Brain Research, Beijing, China.
- Southwest United Graduate School, Kunming, China.
| |
|
36
|
Evolution of cardiac genomic elements in humans and non-human primates. Nature Cardiovascular Research 2024; 3:1187-1188. [PMID: 39354158 DOI: 10.1038/s44161-024-00552-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/03/2024]
|
37
|
Das D, Podder S. Microscale marvels: unveiling the macroscopic significance of micropeptides in human health. Brief Funct Genomics 2024; 23:624-638. [PMID: 38706311 DOI: 10.1093/bfgp/elae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 04/07/2024] [Accepted: 04/15/2024] [Indexed: 05/07/2024] Open
Abstract
Non-coding RNA encodes micropeptides from small open reading frames located within the RNA. Interestingly, these micropeptides are involved in a variety of functions within the body. They are emerging as the resolving piece of the puzzle for complex biomolecular signaling pathways within the body. Recent studies highlight the pivotal role of small peptides in regulating important biological processes such as DNA repair, gene expression, muscle regeneration, and immune responses. Conversely, altered expression of micropeptides also plays a pivotal role in the progression of various diseases, including cardiovascular diseases, neurological disorders and several types of cancer, such as colorectal, hepatocellular, and lung cancer. This review delves into the dual impact of micropeptides on health and pathology, exploring their pivotal role in preserving normal physiological homeostasis and probing their involvement in the triggering and progression of diseases.
Affiliation(s)
- Deepyaman Das
- Computational and Systems Biology Laboratory, Department of Microbiology, Raiganj University, Raiganj, Uttar Dinajpur, West Bengal-733134, India
| | - Soumita Podder
- Computational and Systems Biology Laboratory, Department of Microbiology, Raiganj University, Raiganj, Uttar Dinajpur, West Bengal-733134, India
| |
|
38
|
Whited AM, Jungreis I, Allen J, Cleveland CL, Mudge JM, Kellis M, Rinn JL, Hough LE. Biophysical characterization of high-confidence, small human proteins. Biophysical Reports 2024; 4:100167. [PMID: 38909903 PMCID: PMC11305224 DOI: 10.1016/j.bpr.2024.100167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/09/2024] [Accepted: 06/20/2024] [Indexed: 06/25/2024]
Abstract
Significant efforts have been made to characterize the biophysical properties of proteins. Small proteins have received less attention because their annotation has historically been less reliable. However, recent improvements in sequencing, proteomics, and bioinformatics techniques have led to the high-confidence annotation of small open reading frames (smORFs) that encode functional proteins, producing smORF-encoded proteins (SEPs). SEPs have been found to perform critical functions in several species, including humans. While significant efforts have been made to annotate SEPs, less attention has been given to the biophysical properties of these proteins. We characterized the distributions of predicted and curated biophysical properties, including sequence composition, structure, localization, function, and disease association of a conservative list of previously identified human SEPs. We found significant differences between SEPs and both larger proteins and control sets. In addition, we provide an example of how our characterization of biophysical properties can contribute to distinguishing protein-coding smORFs from noncoding ones in otherwise ambiguous cases.
Affiliation(s)
- A M Whited
- BioFrontiers Institute, University of Colorado, Boulder, Colorado
| | - Irwin Jungreis
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts
| | - Jeffre Allen
- BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado
| | | | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts
| | - John L Rinn
- BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado
| | - Loren E Hough
- BioFrontiers Institute, University of Colorado, Boulder, Colorado; Department of Physics, University of Colorado Boulder, Boulder, Colorado.
| |
|
39
|
Deutsch EW, Kok LW, Mudge JM, Ruiz-Orera J, Fierro-Monti I, Sun Z, Abelin JG, Alba MM, Aspden JL, Bazzini AA, Bruford EA, Brunet MA, Calviello L, Carr SA, Carvunis AR, Chothani S, Clauwaert J, Dean K, Faridi P, Frankish A, Hubner N, Ingolia NT, Magrane M, Martin MJ, Martinez TF, Menschaert G, Ohler U, Orchard S, Rackham O, Roucou X, Slavoff SA, Valen E, Wacholder A, Weissman JS, Wu W, Xie Z, Choudhary J, Bassani-Sternberg M, Vizcaíno JA, Ternette N, Moritz RL, Prensner JR, van Heesch S. High-quality peptide evidence for annotating non-canonical open reading frames as human proteins. bioRxiv 2024:2024.09.09.612016. [PMID: 39314370 PMCID: PMC11419116 DOI: 10.1101/2024.09.09.612016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
A major scientific drive is to characterize the protein-coding genome as it provides the primary basis for the study of human health. But the fundamental question remains: what has been missed in prior genomic analyses? Over the past decade, the translation of non-canonical open reading frames (ncORFs) has been observed across human cell types and disease states, with major implications for proteomics, genomics, and clinical science. However, the impact of ncORFs has been limited by the absence of a large-scale understanding of their contribution to the human proteome. Here, we report the collaborative efforts of stakeholders in proteomics, immunopeptidomics, Ribo-seq ORF discovery, and gene annotation, to produce a consensus landscape of protein-level evidence for ncORFs. We show that at least 25% of a set of 7,264 ncORFs give rise to translated gene products, yielding over 3,000 peptides in a pan-proteome analysis encompassing 3.8 billion mass spectra from 95,520 experiments. With these data, we developed an annotation framework for ncORFs and created public tools for researchers through GENCODE and PeptideAtlas. This work will provide a platform to advance ncORF-derived proteins in biomedical discovery and, beyond humans, diverse animals and plants where ncORFs are similarly observed.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, WA, 98109, USA
| | - Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, 3584 CS, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, 13125, Germany
| | - Ivo Fierro-Monti
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Zhi Sun
- Institute for Systems Biology (ISB), Seattle, WA, 98109, USA
| | | | - M Mar Alba
- Hospital del Mar Research Institute, Barcelona, Spain
- Catalan Institute for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Julie L Aspden
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
| | - Ariel A Bazzini
- Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
- Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS, 66160, USA
| | - Elspeth A Bruford
- HUGO Gene Nomenclature Committee (HGNC), Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK
| | - Marie A Brunet
- Pediatrics Department, University of Sherbrooke, Sherbrooke, Québec, Canada
- Centre de Recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Québec, Canada
| | | | - Steven A Carr
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
| | - Sonia Chothani
- Centre for Computational Biology and Program in Cardiovascular and Metabolic Disorders, Duke-NUS (National University of Singapore) Medical School, Singapore
| | - Jim Clauwaert
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Kellie Dean
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pouya Faridi
- Centre for Cancer Research, Hudson Institute of Medical Research, Clayton, VIC, Australia
- Monash Proteomics & Metabolomics Platform, Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC, Australia
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Norbert Hubner
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, 13125, Germany
- Charité-Universitätsmedizin Berlin, Berlin, 10117, Germany
- Helmholtz-Institute for Translational AngioCardioScience (HI-TAC) of the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) at Heidelberg University, Heidelberg, 69117, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Berlin, Berlin, 13347, Germany
| | - Nicholas T Ingolia
- Department of Molecular and Cell Biology, Center for Computational Biology, University of California, Berkeley, Berkeley, CA, 94720-3202, USA
| | - Michele Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Maria Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Thomas F Martinez
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92617, USA
- Department of Biological Chemistry, University of California, Irvine, Irvine, CA, 92617, USA
- Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, CA, 92617, USA
| | - Gerben Menschaert
- Biobix, Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Ghent, Belgium
| | - Uwe Ohler
- Department of Biology, Humboldt University Berlin, Berlin, 10117, Germany
- Berlin Institute of Medical Systems Biology (BIMSB), Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, 10115, Germany
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | | | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT, 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, 06516, USA
| | - Eivind Valen
- Department of Biosciences, University of Oslo, Oslo, Norway
| | - Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, USA
| | - Jonathan S Weissman
- Whitehead Institute for Biomedical Research, Cambridge, MA, 02142, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, 02138, USA
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Wei Wu
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore
- Department of Pharmacy & Pharmaceutical sciences, National University of Singapore (NUS), Singapore
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Jyoti Choudhary
- Functional Proteomics Group, Institute of Cancer Research, Chester Betty Labs, London, SW3 6JB, UK
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, 1005, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Lausanne, 1005, Switzerland
- Agora Cancer Research Centre, Lausanne, 1011, Switzerland
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Nicola Ternette
- School of Life Sciences, Division Cell Signalling and Immunology, University of Dundee, Dundee, DD1 5EH, UK
- Centre for Immuno-Oncology, University of Oxford, Oxford, OX37DQ, UK
| | - Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, WA, 98109, USA
| | - John R Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Utrecht, 3584 CS, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
|
40
|
Nichols C, Do-Thi VA, Peltier DC. Noncanonical microprotein regulation of immunity. Mol Ther 2024; 32:2905-2929. [PMID: 38734902 PMCID: PMC11403233 DOI: 10.1016/j.ymthe.2024.05.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/19/2024] [Accepted: 05/09/2024] [Indexed: 05/13/2024] Open
Abstract
The immune system is highly regulated but, when dysregulated, suboptimal protective or overly robust immune responses can lead to immune-mediated disorders. The genetic and molecular mechanisms of immune regulation are incompletely understood, impeding the development of more precise diagnostics and therapeutics for immune-mediated disorders. Recently, thousands of previously unrecognized noncanonical microprotein genes encoded by small open reading frames have been identified. Many of these microproteins perform critical functions, often in a cell- and context-specific manner. Several microproteins are now known to regulate immunity; however, the vast majority are uncharacterized. Therefore, illuminating what is often referred to as the "dark proteome" may present opportunities to tune immune responses more precisely. Here, we review noncanonical microprotein biology, highlight recently discovered examples regulating immunity, and discuss the potential and challenges of modulating dysregulated immune responses by targeting microproteins.
Affiliation(s)
- Cydney Nichols
- Morris Green Scholars Program, Department of Pediatrics, Riley Hospital for Children, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Van Anh Do-Thi
- Division of Pediatric Hematology and Oncology, Department of Pediatrics, Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Daniel C Peltier
- Division of Pediatric Hematology and Oncology, Department of Pediatrics, Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA; Simon Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
|
41
|
Houghton CJ, Coelho NC, Chiang A, Hedayati S, Parikh SB, Ozbaki-Yagan N, Wacholder A, Iannotta J, Berger A, Carvunis AR, O’Donnell AF. Cellular processing of beneficial de novo emerging proteins. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.28.610198. [PMID: 39257767 PMCID: PMC11384008 DOI: 10.1101/2024.08.28.610198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Novel proteins can originate de novo from non-coding DNA and contribute to species-specific adaptations. It is challenging to conceive how de novo emerging proteins may integrate into pre-existing cellular systems to bring about beneficial traits, given that their sequences are previously unseen by the cell. To address this apparent paradox, we investigated 26 de novo emerging proteins previously associated with growth benefits in yeast. Microscopy revealed that these beneficial emerging proteins preferentially localize to the endoplasmic reticulum (ER). Sequence and structure analyses uncovered a common protein organization among all ER-localizing beneficial emerging proteins, characterized by a short hydrophobic C-terminus immediately preceded by a transmembrane domain. Using genetic and biochemical approaches, we showed that ER localization of beneficial emerging proteins requires the GET and SND pathways, both of which are evolutionarily conserved and known to recognize transmembrane domains to promote post-translational ER insertion. The abundance of ER-localizing beneficial emerging proteins was regulated by conserved proteasome- and vacuole-dependent processes, through mechanisms that appear to be facilitated by the emerging proteins' C-termini. Consequently, we propose that evolutionarily conserved pathways can convergently govern the cellular processing of de novo emerging proteins with unique sequences, likely owing to common underlying protein organization patterns.
Affiliation(s)
- Carly J. Houghton
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Nelson Castilho Coelho
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Annette Chiang
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
| | - Stefanie Hedayati
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
| | - Saurin B. Parikh
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Nejla Ozbaki-Yagan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
| | - Aaron Wacholder
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - John Iannotta
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Alexis Berger
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Anne-Ruxandra Carvunis
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
| | - Allyson F. O’Donnell
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15213, United States
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
|
42
|
Abstract
Long non-coding RNAs (lncRNAs), once considered transcriptional noise, have emerged as critical regulators of gene expression and key players in cancer biology. Recent breakthroughs have revealed that certain lncRNAs can encode small open reading frame (sORF)-derived peptides, which are now understood to contribute to the pathogenesis of various cancers. This review synthesizes current knowledge on the detection, functional roles, and clinical implications of lncRNA-encoded peptides in cancer. We discuss technological advancements in the detection and validation of sORFs, including ribosome profiling and mass spectrometry, which have facilitated the discovery of these peptides. The functional roles of lncRNA-encoded peptides in cancer processes such as gene transcription, translation regulation, signal transduction, and metabolic reprogramming are explored in various types of cancer. The clinical potential of these peptides is highlighted, with a focus on their utility as diagnostic biomarkers, prognostic indicators, and therapeutic targets. The challenges and future directions in translating these findings into clinical practice are also discussed, including the need for large-scale validation, development of sensitive detection methods, and optimization of peptide stability and delivery.
Affiliation(s)
- Yaguang Zhang
- Laboratory of Gastrointestinal Tumor Epigenetics and Genomics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, People's Republic of China.
|
43
|
Vakirlis N, Acar O, Cherupally V, Carvunis AR. Ancestral Sequence Reconstruction as a Tool to Detect and Study De Novo Gene Emergence. Genome Biol Evol 2024; 16:evae151. [PMID: 39004885 PMCID: PMC11299112 DOI: 10.1093/gbe/evae151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 06/17/2024] [Accepted: 07/09/2024] [Indexed: 07/16/2024] Open
Abstract
New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows us to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow to ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.
Affiliation(s)
- Nikolaos Vakirlis
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - Omer Acar
- Pittsburgh Center for Evolutionary Biology and Medicine, Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Vijay Cherupally
- Pittsburgh Center for Evolutionary Biology and Medicine, Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Anne-Ruxandra Carvunis
- Pittsburgh Center for Evolutionary Biology and Medicine, Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
|
44
|
Rich A, Acar O, Carvunis AR. Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome. Genome Biol 2024; 25:183. [PMID: 38978079 PMCID: PMC11232214 DOI: 10.1186/s13059-024-03287-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2023] [Accepted: 05/20/2024] [Indexed: 07/10/2024] Open
Abstract
BACKGROUND Recent studies uncovered pervasive transcription and translation of thousands of noncanonical open reading frames (nORFs) outside of annotated genes. The contribution of nORFs to cellular phenotypes is difficult to infer using conventional approaches because nORFs tend to be short, of recent de novo origins, and lowly expressed. Here we develop a dedicated coexpression analysis framework that accounts for low expression to investigate the transcriptional regulation, evolution, and potential cellular roles of nORFs in Saccharomyces cerevisiae. RESULTS Our results reveal that nORFs tend to be preferentially coexpressed with genes involved in cellular transport or homeostasis but rarely with genes involved in RNA processing. Mechanistically, we discover that young de novo nORFs located downstream of conserved genes tend to leverage their neighbors' promoters through transcription readthrough, resulting in high coexpression and high expression levels. Transcriptional piggybacking also influences the coexpression profiles of young de novo nORFs located upstream of genes, but to a lesser extent and without detectable impact on expression levels. Transcriptional piggybacking influences, but does not determine, the transcription profiles of de novo nORFs emerging near genes. About 40% of nORFs are not strongly coexpressed with any gene but are transcriptionally regulated nonetheless and tend to form entirely new transcription modules. We offer a web browser interface (https://carvunislab.csb.pitt.edu/shiny/coexpression/) to efficiently query, visualize, and download our coexpression inferences. CONCLUSIONS Our results suggest that nORF transcription is highly regulated. Our coexpression dataset serves as an unprecedented resource for unraveling how nORFs integrate into cellular networks, contribute to cellular phenotypes, and evolve.
Affiliation(s)
- April Rich
- Joint Carnegie Mellon University-University of Pittsburgh, University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA
| | - Omer Acar
- Joint Carnegie Mellon University-University of Pittsburgh, University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
- Pittsburgh Center for Evolutionary Biology and Medicine (CEBaM), University of Pittsburgh, Pittsburgh, PA, USA.
|
45
|
Vara C, Montañés JC, Albà MM. High Polymorphism Levels of De Novo ORFs in a Yoruba Human Population. Genome Biol Evol 2024; 16:evae126. [PMID: 38934859 PMCID: PMC11221430 DOI: 10.1093/gbe/evae126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 05/08/2024] [Accepted: 06/01/2024] [Indexed: 06/28/2024] Open
Abstract
During evolution, new open reading frames (ORFs) with the potential to give rise to novel proteins continuously emerge. A recent compilation of noncanonical ORFs with translation signatures in humans has identified thousands of cases with a putative de novo origin. However, their distribution in the population is not known. Are they universally translated? Here, we use ribosome profiling data from 65 lymphoblastoid cell lines from individuals of Yoruba origin to investigate this question. We identify 2,587 de novo ORFs translated in at least one of the cell lines. In line with their de novo origin, the encoded proteins tend to be smaller than 100 amino acids and positively charged. We observe that the de novo ORFs are more polymorphic in the population than the set of canonical proteins, with a substantial fraction of them being translated in only some of the cell lines. Remarkably, this difference remains significant after controlling for differences in the translation levels. These results suggest that variations in the translation level of de novo ORFs could be a relevant source of intraspecies phenotypic diversity in humans.
Affiliation(s)
- Covadonga Vara
- Research Programme on Biomedical Informatics (GRIB),Hospital del Mar Research Institute, Barcelona, Spain
| | - José Carlos Montañés
- Research Programme on Biomedical Informatics (GRIB),Hospital del Mar Research Institute, Barcelona, Spain
| | - M Mar Albà
- Research Programme on Biomedical Informatics (GRIB),Hospital del Mar Research Institute, Barcelona, Spain
- Catalan Institute for Research and Advanced Studies (ICREA), Barcelona, Spain
|
46
|
Bonnet C, Dian AL, Espie-Caullet T, Fabbri L, Lagadec L, Pivron T, Dutertre M, Luco R, Navickas A, Vagner S, Verga D, Uguen P. Post-transcriptional gene regulation: From mechanisms to RNA chemistry and therapeutics. Bull Cancer 2024; 111:782-790. [PMID: 38824069 DOI: 10.1016/j.bulcan.2024.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 03/22/2024] [Accepted: 04/03/2024] [Indexed: 06/03/2024]
Abstract
A better understanding of RNA biology and chemistry is necessary to develop new RNA therapeutic strategies. This review synthesizes a series of conferences that took place during the 6th international course on post-transcriptional gene regulation at Institut Curie. This year, the course placed a special focus on RNA chemistry.
Affiliation(s)
- Clara Bonnet
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Ana Luisa Dian
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Tristan Espie-Caullet
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Lucilla Fabbri
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Lucie Lagadec
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Thibaud Pivron
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Martin Dutertre
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Reini Luco
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Albertas Navickas
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Stephan Vagner
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France
| | - Daniela Verga
- CNRS UMR9187, Inserm U1196, Chemistry and Modelling for the Biology of Cancer, Institut Curie, université Paris-Saclay, 91405 Orsay, France
| | - Patricia Uguen
- CNRS UMR3348 Genome integrity, RNA and Cancer, Institut Curie, University Paris-Saclay, 91401 Orsay, France.
|
47
|
Liang Y, Lv D, Liu K, Yang L, Shu H, Wen L, Lv C, Sun Q, Yin J, Liu H, Xu J, Liu Z, Ding N. MicroProteinDB: A database to provide knowledge on sequences, structures and function of ncRNA-derived microproteins. Comput Biol Med 2024; 177:108660. [PMID: 38820774 DOI: 10.1016/j.compbiomed.2024.108660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 05/08/2024] [Accepted: 05/26/2024] [Indexed: 06/02/2024]
Abstract
Omics-based technologies have revolutionized our comprehension of microproteins encoded by ncRNAs, revealing their abundant presence and pivotal roles within complex functional landscapes. Here, we developed MicroProteinDB (http://bio-bigdata.hrbmu.edu.cn/MicroProteinDB), which collects and visualizes extensive knowledge to aid the retrieval and analysis of computationally predicted and experimentally validated microproteins originating from various ncRNA types. Employing prediction algorithms grounded in diverse deep learning approaches, MicroProteinDB comprehensively documents the fundamental physicochemical properties, secondary and tertiary structures, interactions with functional proteins, family domains, and inter-species conservation of microproteins. With five major analytical modules, it will serve as a valuable knowledge resource for investigating ncRNA-derived microproteins.
Affiliation(s)
- Yinan Liang
- The First Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
| | - Dezhong Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Kefan Liu
- School of Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, 150081, China
| | - Liting Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Huan Shu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Luan Wen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Chongwen Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Qisen Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Jiaqi Yin
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Hui Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Juan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
| | - Zhigang Liu
- Affiliated Foshan Maternity&Child Healthcare Hospital, Southern Medical University, Guangzhou, 510000, China.
| | - Na Ding
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
|
48
|
Mohsen JJ, Mohsen MG, Jiang K, Landajuela A, Quinto L, Isaacs FJ, Karatekin E, Slavoff SA. Cellular function of the GndA small open reading frame-encoded polypeptide during heat shock. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.29.601336. [PMID: 38979229 PMCID: PMC11230408 DOI: 10.1101/2024.06.29.601336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Over the past 15 years, hundreds of previously undiscovered bacterial small open reading frame (sORF)-encoded polypeptides (SEPs) of fewer than fifty amino acids have been identified, and biological functions have been ascribed to an increasing number of SEPs from intergenic regions and small RNAs. However, despite numbering in the dozens in Escherichia coli, and hundreds to thousands in humans, same-strand nested sORFs that overlap protein coding genes in alternative reading frames remain understudied. In order to provide insight into this enigmatic class of unannotated genes, we characterized GndA, a 36-amino acid, heat shock-regulated SEP encoded within the +2 reading frame of the gnd gene in E. coli K-12 MG1655. We show that GndA pulls down components of respiratory complex I (RCI) and is required for proper localization of an RCI subunit during heat shock. At high temperature, GndA deletion (ΔGndA) cells exhibit perturbations in cell growth, NADH/NAD+ ratio, and expression of a number of genes including several associated with oxidative stress. These findings suggest that GndA may function in maintenance of homeostasis during heat shock. Characterization of GndA therefore supports the nascent but growing consensus that functional, overlapping genes occur in genomes from viruses to humans.
Affiliation(s)
- Jessica J. Mohsen
- Department of Chemistry, Yale University, New Haven, CT 06511
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
| | - Michael G. Mohsen
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
- Howard Hughes Medical Institute, Yale University, New Haven, CT 06511
| | - Kevin Jiang
- Department of Chemistry, Yale University, New Haven, CT 06511
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
| | - Ane Landajuela
- Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
- Nanobiology Institute, Yale University, West Haven, CT 06516
| | - Laura Quinto
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
- Systems Biology Institute, Yale University, West Haven, CT 06516
| | - Farren J. Isaacs
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
- Systems Biology Institute, Yale University, West Haven, CT 06516
| | - Erdem Karatekin
- Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
- Nanobiology Institute, Yale University, West Haven, CT 06516
- Wu Tsai Institute, Yale University, New Haven, CT 06511
- Université de Paris, Saints-Pères Paris Institute for the Neurosciences (SPPIN), Centre National de la Recherche Scientifique (CNRS), 75006 Paris, France
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511
| | - Sarah A. Slavoff
- Department of Chemistry, Yale University, New Haven, CT 06511
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511
| |
|
49
|
Sanejouand YH. Are Most Human-Specific Proteins Encoded by Long Noncoding RNAs? J Mol Evol 2024:10.1007/s00239-024-10174-z. [PMID: 38916610 DOI: 10.1007/s00239-024-10174-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/03/2024] [Indexed: 06/26/2024]
Abstract
By looking for a lack of homologs in a reference database of 27 well-annotated proteomes of primates and 52 well-annotated proteomes of other mammals, 170 putative human-specific proteins were identified. While most of them are deemed uncertain, 2 are known at the protein level and 23 at the transcript level, according to UniProt. Interestingly, 23 of these 25 proteins are found to be encoded or to have close homologs in an open reading frame of a long noncoding human RNA. However, half of them are predicted to be at least 80% globular, with a single structural domain, according to IUPred, and with at least 80% of ordered residues, according to flDPnn. Strikingly, there is a near-complete lack of structural knowledge about these proteins, with no tertiary structure presently available in the Protein Data Bank and a fair prediction for one of them in the AlphaFold Protein Structure Database. Moreover, knowledge about the function of these possibly key proteins remains scarce.
Affiliation(s)
- Yves-Henri Sanejouand
- US2B, UMR 6286 of CNRS, Nantes University, 2 rue de la Houssinière, Nantes, 44322, Pays de la Loire, France.
| |
|
50
|
Dasgupta A, Prensner JR. Upstream open reading frames: new players in the landscape of cancer gene regulation. NAR Cancer 2024; 6:zcae023. [PMID: 38774471 PMCID: PMC11106035 DOI: 10.1093/narcan/zcae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 04/29/2024] [Accepted: 05/07/2024] [Indexed: 05/24/2024]
Abstract
The translation of RNA by ribosomes represents a central biological process and one of the most dysregulated processes in cancer. While translation is traditionally thought to occur exclusively in the protein-coding regions of messenger RNAs (mRNAs), recent transcriptome-wide approaches have shown abundant ribosome activity across diverse stretches of RNA transcripts. The most common type of this kind of ribosome activity occurs in gene leader sequences, also known as 5' untranslated regions (UTRs) of the mRNA, that precede the main coding sequence. Translation of these upstream open reading frames (uORFs) is now known to occur in upwards of 25% of all protein-coding genes. With diverse functions from RNA regulation to microprotein generation, uORFs are rapidly igniting a new arena of cancer biology, where they are linked to cancer genetics, cancer signaling, and tumor-immune interactions. This review focuses on the contributions of uORFs and their associated 5'UTR sequences to cancer biology.
Affiliation(s)
- Anwesha Dasgupta
- Chad Carr Pediatric Brain Tumor Center, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - John R Prensner
- Chad Carr Pediatric Brain Tumor Center, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| |
|