1
Duan XY, Song L, Jin Q, Yang XN, Liu HH, Wang C, Lu X, Ji XJ, Wang Z, Tian Y. Enhancing Cordycepin Biosynthesis in Yarrowia lipolytica via Lipid Droplets Compartmentalization Engineering and Optimized Fermentation Strategies. J Agric Food Chem 2025. [PMID: 40367369 DOI: 10.1021/acs.jafc.5c03654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2025]
Abstract
Cordycepin, a physiologically active nucleoside compound with broad applications in healthcare, is biosynthesized in Cordyceps militaris through a protein complex formed by CmCns1 and CmCns2. To enhance cordycepin heterologous production in Yarrowia lipolytica, this study confirmed the colocalization of CmCns1 and CmCns2 on lipid droplets, with CmCns1 dominating this process by recruiting CmCns2 from the cytoplasm to lipid droplets via strong interactions. Critical lipid-droplet-targeting motifs within CmCns1 were identified. On this basis, an engineered strain YL-CD3 was developed by expanding the lipid droplets and CmCns3-NK compartmentalization. Then, the fermentation parameters were optimized to increase the yield of cordycepin to 2008.23 mg/L in shake flasks. Finally, fed-batch fermentation in a 2.4 L bioreactor for 144 h achieved 4780.75 mg/L (150.1 mg/OD600 and 66.57 mg/g glucose), marking the highest reported titer in Y. lipolytica. This work establishes Y. lipolytica as a high-potential platform for efficient cordycepin biosynthesis.
Affiliation(s)
- Xi-Yu Duan
- College of Life Science, Hunan Normal University, No. 36 Lushan Road, Changsha 410081, P. R. China
- Liping Song
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Qing Jin
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Xiao-Na Yang
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Hu-Hu Liu
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Chong Wang
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Xiangyang Lu
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Xiao-Jun Ji
- College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, No. 30 South Puzhu Road, Nanjing 211816, P. R. China
- Zhi Wang
- College of Life Science, Hunan Normal University, No. 36 Lushan Road, Changsha 410081, P. R. China
- Yun Tian
- College of Bioscience and Biotechnology, Hunan Agricultural University, No. 1 Nongda Road, Changsha 410128, P. R. China
- Institute of Agricultural Quality Standard and Testing, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa 850032, P. R. China
2
Floyd BM, Schmidt EL, Till NA, Yang JL, Liao P, George BM, Flynn RA, Bertozzi CR. Mapping the nanoscale organization of the human cell surface proteome reveals new functional associations and surface antigen clusters. bioRxiv 2025:2025.02.12.637979. [PMID: 40027624 PMCID: PMC11870420 DOI: 10.1101/2025.02.12.637979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
The cell surface is a dynamic interface that controls cell-cell communication and signal transduction relevant to organ development, homeostasis and repair, immune reactivity, and pathologies driven by aberrant cell surface phenotypes. The spatial organization of cell surface proteins is central to these processes. High-resolution fluorescence microscopy and proximity labeling have advanced studies of surface protein associations, but the spatial organization of the complete surface proteome remains uncharted. In this study, we systematically mapped the surface proteome of human T-lymphocytes and B-lymphoblasts using proximity labeling of 85 antigens, identified from over 100 antibodies tested for binding to surface-exposed proteins. These experiments were coupled with an optimized data-independent acquisition mass spectrometry workflow to generate a robust dataset. Unsupervised clustering of the resulting interactome revealed functional modules, including well-characterized complexes such as the T-cell receptor and HLA class I/II, alongside novel clusters. Notably, we identified mitochondrial proteins localized to the surface, including the transcription factor TFAM, suggesting previously unappreciated roles for mitochondrial proteins at the plasma membrane. A high-accuracy machine learning classifier predicted over 6,000 surface protein associations, highlighting functional associations such as IL10RB's role as a negative regulator of type I interferon signaling. Spatial modeling of the surface proteome provided insights into protein dispersion patterns, distinguishing widely distributed proteins, such as CD45, from localized antigens, such as CD226, pointing to active mechanisms of regulating surface organization. This work provides a comprehensive map of the human surfaceome and a resource for exploring the spatial and functional dynamics of the cell membrane proteome.
Affiliation(s)
- Brendan M Floyd
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Lead contact
- Elizabeth L Schmidt
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Nicholas A Till
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Jonathan L Yang
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Pinyu Liao
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Benson M George
- Stem Cell Program and Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA
- Ryan A Flynn
- Stem Cell Program and Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA
- Harvard Stem Cell Institute, Harvard University, Cambridge, MA, USA
- Carolyn R Bertozzi
- Sarafan ChEM-H and Department of Chemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford, CA, USA
- Lead contact
3
Afonin DA, Gerasimov ES, Škodová-Sveráková I, Záhonová K, Gahura O, Albanaz ATS, Myšková E, Bykova A, Paris Z, Lukeš J, Opperdoes FR, Horváth A, Zimmer SL, Yurchenko V. Blastocrithidia nonstop mitochondrial genome and its expression are remarkably insulated from nuclear codon reassignment. Nucleic Acids Res 2024; 52:3870-3885. [PMID: 38452217 DOI: 10.1093/nar/gkae168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 02/20/2024] [Accepted: 02/27/2024] [Indexed: 03/09/2024] Open
Abstract
The canonical stop codons of the nuclear genome of the trypanosomatid Blastocrithidia nonstop are recoded. Here, we investigated the effect of this recoding on the mitochondrial genome and gene expression. Trypanosomatids possess a single mitochondrion and protein-coding transcripts of this genome require RNA editing in order to generate open reading frames of many transcripts encoded as 'cryptogenes'. Small RNAs that can number in the hundreds direct editing and produce a mitochondrial transcriptome of unusual complexity. We find B. nonstop to have a typical trypanosomatid mitochondrial genetic code, which presumably requires the mitochondrion to disable utilization of the two nucleus-encoded suppressor tRNAs, which appear to be imported into the organelle. Alterations of the protein factors responsible for mRNA editing were also documented, but they have likely originated from sources other than B. nonstop nuclear genome recoding. The population of guide RNAs directing editing is minimal, yet virtually all genes for the plethora of known editing factors are still present. Most intriguingly, despite lacking complex I cryptogene guide RNAs, these cryptogene transcripts are stochastically edited to high levels.
MESH Headings
- Genome, Mitochondrial
- RNA Editing
- Cell Nucleus/genetics
- Cell Nucleus/metabolism
- RNA, Transfer/genetics
- RNA, Transfer/metabolism
- Open Reading Frames/genetics
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Trypanosomatina/genetics
- Trypanosomatina/metabolism
- Codon/genetics
- Mitochondria/genetics
- Mitochondria/metabolism
- Codon, Terminator/genetics
- RNA, Guide, Kinetoplastida/genetics
- RNA, Guide, Kinetoplastida/metabolism
- Genetic Code
- Protozoan Proteins/genetics
- Protozoan Proteins/metabolism
Affiliation(s)
- Dmitry A Afonin
- Faculty of Biology, Lomonosov Moscow State University, Moscow 119991, Russia
- Evgeny S Gerasimov
- Faculty of Biology, Lomonosov Moscow State University, Moscow 119991, Russia
- Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127051, Russia
- Ingrid Škodová-Sveráková
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czechia
- Department of Biochemistry, Faculty of Natural Sciences, Comenius University, 842 15 Bratislava, Slovakia
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Kristína Záhonová
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czechia
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Department of Parasitology, Faculty of Science, Charles University, BIOCEV 252 50 Vestec, Czechia
- Division of Infectious Diseases, Department of Medicine, University of Alberta, T6G 2R3 Edmonton, Alberta, Canada
- Ondřej Gahura
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Amanda T S Albanaz
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czechia
- Eva Myšková
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Anastassia Bykova
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czechia
- Zdeněk Paris
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Faculty of Science, University of South Bohemia, 370 05 České Budějovice, Czechia
- Julius Lukeš
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, 370 05 České Budějovice, Czechia
- Faculty of Science, University of South Bohemia, 370 05 České Budějovice, Czechia
- Fred R Opperdoes
- De Duve Institute, Université Catholique de Louvain, 1200 Brussels, Belgium
- Anton Horváth
- Department of Biochemistry, Faculty of Natural Sciences, Comenius University, 842 15 Bratislava, Slovakia
- Sara L Zimmer
- University of Minnesota Medical School, Duluth Campus, Duluth, MN 55812, USA
- Vyacheslav Yurchenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czechia
4
Butenko A, Lukeš J, Speijer D, Wideman JG. Mitochondrial genomes revisited: why do different lineages retain different genes? BMC Biol 2024; 22:15. [PMID: 38273274 PMCID: PMC10809612 DOI: 10.1186/s12915-024-01824-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 01/11/2024] [Indexed: 01/27/2024] Open
Abstract
The mitochondria contain their own genome derived from an alphaproteobacterial endosymbiont. From thousands of protein-coding genes originally encoded by their ancestor, only between 1 and about 70 are encoded on extant mitochondrial genomes (mitogenomes). Thanks to a dramatically increasing number of sequenced and annotated mitogenomes a coherent picture of why some genes were lost, or relocated to the nucleus, is emerging. In this review, we describe the characteristics of mitochondria-to-nucleus gene transfer and the resulting varied content of mitogenomes across eukaryotes. We introduce a 'burst-upon-drift' model to best explain nuclear-mitochondrial population genetics with flares of transfer due to genetic drift.
Affiliation(s)
- Anzhelika Butenko
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice (Budweis), Czech Republic
- Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Faculty of Sciences, University of South Bohemia, České Budějovice (Budweis), Czech Republic
- Julius Lukeš
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice (Budweis), Czech Republic
- Faculty of Sciences, University of South Bohemia, České Budějovice (Budweis), Czech Republic
- Dave Speijer
- Medical Biochemistry, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Jeremy G Wideman
- Center for Mechanisms of Evolution, Biodesign Institute, School of Life Sciences, Arizona State University, Tempe, USA.
5
Madera D, Alonso-Gómez A, Delgado MJ, Valenciano AI, Alonso-Gómez ÁL. Gene Characterization of Nocturnin Paralogues in Goldfish: Full Coding Sequences, Structure, Phylogeny and Tissue Expression. Int J Mol Sci 2023; 25:54. [PMID: 38203224 PMCID: PMC10779419 DOI: 10.3390/ijms25010054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/05/2023] [Accepted: 12/14/2023] [Indexed: 01/12/2024] Open
Abstract
The aim of this work is the full characterization of all the nocturnin (noc) paralogues expressed in a teleost, the goldfish. An in silico analysis of the evolutionary origin of noc in Osteichthyes is performed, including the splicing variants and new paralogues appearing after teleostean 3R genomic duplication and the cyprinine 4Rc. After sequencing the full-length mRNA of goldfish, we obtained two isoforms for noc-a (noc-aa and noc-ab) with two splice variants (I and II), and only one for noc-b (noc-bb) with two transcripts (II and III). Using the splicing variant II, the prediction of the secondary and tertiary structures renders a well-conserved 3D distribution of four α-helices and nine β-sheets in the three noc isoforms. A synteny analysis based on the localization of noc genes in the patrilineal or matrilineal subgenomes and a phylogenetic tree of protein sequences were performed to establish a classification and a long-lasting nomenclature of noc in goldfish that can also be extrapolated to allotetraploid Cyprininae. Finally, both goldfish and zebrafish showed a broad tissue expression of all the noc paralogues. Moreover, the enriched expression of specific paralogues in some tissues argues in favour of neo- or subfunctionalization.
Affiliation(s)
- Ángel Luis Alonso-Gómez
- Departamento de Genética, Fisiología y Microbiología, Universidad Complutense de Madrid, 28040 Madrid, Spain; (D.M.); (A.A.-G.); (M.J.D.); (A.I.V.)
6
Zou K, Wang S, Wang Z, Zou H, Yang F. Dual-Signal Feature Spaces Map Protein Subcellular Locations Based on Immunohistochemistry Image and Protein Sequence. Sensors (Basel) 2023; 23:9014. [PMID: 38005402 PMCID: PMC10675401 DOI: 10.3390/s23229014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 10/29/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023]
Abstract
Proteins are among the primary biochemical macromolecular regulators in the compartmentalized cell, and their subcellular locations can therefore provide information on the function of subcellular structures and physiological environments. Recently, data-driven systems have been developed to predict the subcellular location of proteins based on protein sequence, immunohistochemistry (IHC) images, or immunofluorescence (IF) images. However, the fusion of multiple protein signals has received little attention. In this study, we developed a dual-signal computational protocol that combines IHC images with protein sequences to learn protein subcellular localization. The protocol comprises three major steps: first, a benchmark database was built from 281 proteins selected from 4722 proteins of the Human Protein Atlas (HPA) and Swiss-Prot database, covering the endoplasmic reticulum (ER), Golgi apparatus, cytosol, and nucleoplasm; second, discriminative feature operators were employed to quantify protein image-sequence samples that include IHC images and protein sequences; finally, the feature subspaces of the different protein signals were used to construct multiple sub-classifiers via dimensionality reduction and binary relevance (BR), and the confidences derived from these sub-classifiers were combined by a centralized voting mechanism at the decision layer to decide the subcellular location. The experimental results indicated that the dual-signal model combining IHC images and protein sequences outperformed the single-signal models, with accuracy, precision, and recall of 75.41%, 80.38%, and 74.38%, respectively. These results are informative for further research on protein subcellular location prediction under multi-signal fusion.
Affiliation(s)
- Kai Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Simeng Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Ziqian Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Hongliang Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Fan Yang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China
- Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang 330038, China
7
Gómez-Pérez D, Schmid M, Chaudhry V, Hu Y, Velic A, Maček B, Ruhe J, Kemen A, Kemen E. Proteins released into the plant apoplast by the obligate parasitic protist Albugo selectively repress phyllosphere-associated bacteria. New Phytol 2023; 239:2320-2334. [PMID: 37222268 DOI: 10.1111/nph.18995] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/23/2023] [Accepted: 04/11/2023] [Indexed: 05/25/2023]
Abstract
Biotic and abiotic interactions shape natural microbial communities. The mechanisms behind microbe-microbe interactions, particularly those protein based, are not well understood. We hypothesize that released proteins with antimicrobial activity are a powerful and highly specific toolset to shape and defend plant niches. We have studied Albugo candida, an obligate plant parasite from the protist Oomycota phylum, for its potential to modulate the growth of bacteria through release of antimicrobial proteins into the apoplast. Amplicon sequencing and network analysis of Albugo-infected and uninfected wild Arabidopsis thaliana samples revealed an abundance of negative correlations between Albugo and other phyllosphere microbes. Analysis of the apoplastic proteome of Albugo-colonized leaves combined with machine learning predictors enabled the selection of antimicrobial candidates for heterologous expression and study of their inhibitory function. We found for three candidate proteins selective antimicrobial activity against Gram-positive bacteria isolated from A. thaliana and demonstrate that these inhibited bacteria are precisely important for the stability of the community structure. We could ascribe the antibacterial activity of the candidates to intrinsically disordered regions and positively correlate it with their net charge. This is the first report of protist proteins with antimicrobial activity under apoplastic conditions that therefore are potential biocontrol tools for targeted manipulation of the microbiome.
Affiliation(s)
- Daniel Gómez-Pérez
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
- Monja Schmid
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
- Vasvi Chaudhry
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
- Yiheng Hu
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
- Ana Velic
- Department of Biology, Quantitative Proteomics Group, Interfaculty Institute of Cell Biology, University of Tübingen, 72076, Tübingen, Germany
- Boris Maček
- Department of Biology, Quantitative Proteomics Group, Interfaculty Institute of Cell Biology, University of Tübingen, 72076, Tübingen, Germany
- Jonas Ruhe
- Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
- Ariane Kemen
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
- Eric Kemen
- Microbial Interactions in Plant Ecosystems, Center for Plant Molecular Biology, University of Tübingen, 72076, Tübingen, Germany
8
Lear SK, Nunez JA, Shipman SL. High-throughput colocalization pipeline quantifies efficacy of mitochondrial targeting signals across different protein types. bioRxiv 2023:2023.04.03.535288. [PMID: 37066162 PMCID: PMC10103990 DOI: 10.1101/2023.04.03.535288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Efficient metabolic engineering and the development of mitochondrial therapeutics often rely upon the specific and strong import of foreign proteins into mitochondria. Fusing a protein to a mitochondria-bound signal peptide is a common method to localize proteins to mitochondria, but this strategy is not universally effective with particular proteins empirically failing to localize. To help overcome this barrier, this work develops a generalizable and open-source framework to design proteins for mitochondrial import and quantify their specific localization. By using a Python-based pipeline to quantitatively assess the colocalization of different proteins previously used for precise genome editing in a high-throughput manner, we reveal signal peptide-protein combinations that localize well in mitochondria and, more broadly, general trends about the overall reliability of commonly used mitochondrial targeting signals.
Affiliation(s)
- Sierra K Lear
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Graduate Program in Bioengineering, University of California, San Francisco and Berkeley, CA, USA
- Jose A Nunez
- Department of Mechanical Engineering, University of California, Santa Barbara, CA, USA
- Seth L Shipman
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
- Chan Zuckerberg Biohub - San Francisco, San Francisco, CA, USA
9
Nguyen TTT, Katt WP, Cerione RA. Alone and together: current approaches to targeting glutaminase enzymes as part of anti-cancer therapies. Future Drug Discov 2023; 4:FDD79. [PMID: 37009252 PMCID: PMC10051075 DOI: 10.4155/fdd-2022-0011] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/14/2022] [Accepted: 02/10/2023] [Indexed: 03/29/2023] Open
Abstract
Metabolic reprogramming is a major hallmark of malignant transformation in cancer, and part of the so-called Warburg effect, in which the upregulation of glutamine catabolism plays a major role. The glutaminase enzymes convert glutamine to glutamate, which initiates this pathway. Inhibition of different forms of glutaminase (KGA, GAC, or LGA) demonstrated potential as an emerging anti-cancer therapeutic strategy. The regulation of these enzymes, and the molecular basis for their inhibition, have been the focus of much recent research. This review will explore the recent progress in understanding the molecular basis for activation and inhibition of different forms of glutaminase, as well as the recent focus on combination therapies of glutaminase inhibitors with other anti-cancer drugs.
Affiliation(s)
- Thuy-Tien T Nguyen
- Department of Chemistry & Chemical Biology, Cornell University, Ithaca, NY 14853, USA
- William P Katt
- Department of Molecular Medicine, Cornell University, Ithaca, NY 14853, USA
- Richard A Cerione
- Department of Chemistry & Chemical Biology, Cornell University, Ithaca, NY 14853, USA
- Department of Molecular Medicine, Cornell University, Ithaca, NY 14853, USA
10
Xue J, Zhou J, Li J, Du G, Chen J, Wang M, Zhao X. Systematic engineering of Saccharomyces cerevisiae for efficient synthesis of hemoglobins and myoglobins. Bioresour Technol 2023; 370:128556. [PMID: 36586429 DOI: 10.1016/j.biortech.2022.128556] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/22/2022] [Revised: 12/26/2022] [Accepted: 12/27/2022] [Indexed: 05/26/2023]
Abstract
Hemoglobin (Hb) and myoglobin (Mb) are heme-binding proteins that play crucial physiological roles in different organisms. With the rapid development of applications in food processing and biocatalysis, the demand for biosynthetic Hb and Mb is increasing. However, the production of Hb and Mb is limited by the low expression level of globins and insufficient or improper heme supply. After selecting an inducible strategy for the expression of globins, removing the spatial barrier during heme synthesis, increasing the synthesis of 5-aminolevulinate and moderately enhancing heme synthetic rate-limiting steps, the microbial synthesis of bovine and porcine Hb was achieved for the first time. Furthermore, an engineered Saccharomyces cerevisiae obtained a higher titer of soybean (108.2 ± 3.5 mg/L) and clover (13.7 ± 0.5 mg/L) Hb and bovine (68.9 ± 1.6 mg/L) and porcine (85.9 ± 5.0 mg/L) Mb. Therefore, this systematic engineering strategy will be useful to produce other hemoproteins or hemoenzymes with high activities.
Affiliation(s)
- Jike Xue
- School of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jingwen Zhou
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jianghua Li
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Guocheng Du
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jian Chen
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Miao Wang
- School of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Xinrui Zhao
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China.
11
Bayne AN, Dong J, Amiri S, Farhan SMK, Trempe JF. MTSviewer: A database to visualize mitochondrial targeting sequences, cleavage sites, and mutations on protein structures. PLoS One 2023; 18:e0284541. [PMID: 37093842 PMCID: PMC10124841 DOI: 10.1371/journal.pone.0284541] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/16/2022] [Accepted: 04/02/2023] [Indexed: 04/25/2023] Open
Abstract
Mitochondrial dysfunction is implicated in a wide array of human diseases ranging from neurodegenerative disorders to cardiovascular defects. The coordinated localization and import of proteins into mitochondria are essential processes that ensure mitochondrial homeostasis. The localization and import of most mitochondrial proteins are driven by N-terminal mitochondrial targeting sequences (MTS's), which interact with import machinery and are removed by the mitochondrial processing peptidase (MPP). The recent discovery of internal MTS's (those which are distributed throughout a protein and act as import regulators or secondary MPP cleavage sites) has expanded the role of both MTS's and MPP beyond conventional N-terminal regulatory pathways. Still, the global mutational landscape of MTS's remains poorly characterized, both from genetic and structural perspectives. To this end, we have integrated a variety of tools into one harmonized R/Shiny database called MTSviewer (https://neurobioinfo.github.io/MTSvieweR/), which combines MTS predictions, cleavage sites, genetic variants, pathogenicity predictions, and N-terminomics data with structural visualization using AlphaFold models of human and yeast mitochondrial proteomes. Using MTSviewer, we profiled all MTS-containing proteins across human and yeast mitochondrial proteomes and provide multiple case studies to highlight the utility of this database.
Affiliation(s)
- Andrew N Bayne
- Department of Pharmacology & Therapeutics and Centre de Recherche en Biologie Structurale, McGill University, Montréal, Quebec, Canada
- Jing Dong
- Department of Pharmacology & Therapeutics and Centre de Recherche en Biologie Structurale, McGill University, Montréal, Quebec, Canada
- Saeid Amiri
- Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- Sali M K Farhan
- Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- Department of Human Genetics, McGill University, Montréal, Quebec, Canada
- Jean-François Trempe
- Department of Pharmacology & Therapeutics and Centre de Recherche en Biologie Structurale, McGill University, Montréal, Quebec, Canada
12
Cuypers B, Rappuoli R, Brozzi A. A Lean Reverse Vaccinology Pipeline with Publicly Available Bioinformatic Tools. Methods Mol Biol 2023; 2673:341-356. [PMID: 37258926 DOI: 10.1007/978-1-0716-3239-0_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Reverse vaccinology (RV) marked an outstanding improvement in vaccinology by employing bioinformatics tools to extract effective features from protein sequences to drive the selection of potential vaccine candidates (Rappuoli, Curr Opin Microbiol 3(5):445-450, 2000). Pioneered by Rino Rappuoli and first used against serogroup B meningococcus, it has since been applied to several other bacterial vaccines, with the adopted bioinformatics tools varying over time. Based on our experience in the field of RV and following an extensive literature review, we consolidate a lean RV pipeline of publicly available bioinformatic tools whose usage is described in this contribution. The protein features whose extraction is reported in this contribution can also serve as input, in a matrix format, for machine learning-based approaches.
Affiliation(s)
- Bart Cuypers
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), Antwerp, Belgium
13
Anteghini M, Haja A, Martins dos Santos VA, Schomaker L, Saccenti E. OrganelX web server for sub-peroxisomal and sub-mitochondrial protein localization and peroxisomal target signal detection. Comput Struct Biotechnol J 2022; 21:128-133. [PMID: 36544474 PMCID: PMC9747352 DOI: 10.1016/j.csbj.2022.11.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 11/28/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022]
Abstract
We present the OrganelX e-Science Web Server that provides a user-friendly implementation of the In-Pero and In-Mito classifiers for sub-peroxisomal and sub-mitochondrial localization of peroxisomal and mitochondrial proteins and the Is-PTS1 algorithm for detecting and validating potential peroxisomal proteins carrying a PTS1 signal sequence. The OrganelX e-Science Web Server is available at https://organelx.hpc.rug.nl/fasta/.
Affiliation(s)
- Marco Anteghini
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
- LifeGlimmer GmbH, Berlin, Germany
- Asmaa Haja
- Bernoulli Institute, University of Groningen, Groningen, The Netherlands
- Vitor A.P. Martins dos Santos
- LifeGlimmer GmbH, Berlin, Germany
- Bioprocess Engineering, Wageningen University & Research, Wageningen, The Netherlands
- Lambert Schomaker
- Bernoulli Institute, University of Groningen, Groningen, The Netherlands
- Edoardo Saccenti
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
14
Zhang T, Gu J, Wang Z, Wu C, Liang Y, Shi X. Protein Subcellular Localization Prediction Model Based on Graph Convolutional Network. Interdiscip Sci 2022; 14:937-946. [PMID: 35713780 DOI: 10.1007/s12539-022-00529-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2022] [Revised: 05/12/2022] [Accepted: 05/17/2022] [Indexed: 06/15/2023]
Abstract
Protein subcellular localization prediction is an important research area in bioinformatics, which plays an essential role in understanding protein function and mechanism. Many machine learning and deep learning algorithms have been employed for this task, but most of them do not use structural information of proteins. With the advances in protein structure research in recent years, protein contact map prediction has been dramatically enhanced. In this paper, we present GraphLoc, a deep learning model that predicts the localization of proteins at the subcellular level. The cores of the model are a graph convolutional neural network module and a multi-head attention module. The protein topology graph is constructed based on a contact map predicted from protein sequences, which is used as the input of the GCN module to take full advantage of the structural information of proteins. The multi-head attention module learns the weighted contribution of different amino acids to subcellular localization in different feature representation subspaces. Experiments on the benchmark dataset show that the performance of our model is better than others. The code can be accessed at https://github.com/GoodGuy398/GraphLoc. The proposed GraphLoc model consists of three parts. The first part is a graph convolutional network (GCN) module, which utilizes the predicted contact maps to construct a protein graph and thereby exploits protein structural information. The second part is the multi-head attention module, which learns the weighted contribution of different amino acids in different feature representation subspaces and computes a weighted average of the feature map across all amino acid nodes. The last part is a fully connected layer that maps the flattened graph representation vector to another vector whose dimension equals the number of categories, followed by a softmax layer to predict the protein subcellular localization.
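The three-part design described in this abstract (graph convolution over a contact map, attention-weighted pooling over residues, then a fully connected layer with softmax) can be illustrated with a minimal pure-Python sketch. This is not the GraphLoc code: it assumes a single attention head instead of multiple heads, one GCN layer, a hand-written toy contact map instead of a predicted one, and random untrained weights. All function and variable names are hypothetical.

```python
import math
import random

def matmul(a, b):
    # Plain matrix product of two nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def normalize_adjacency(contact):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops.
    n = len(contact)
    a = [[contact[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [sum(row) ** -0.5 for row in a]
    return [[d[i] * a[i][j] * d[j] for j in range(n)] for i in range(n)]

def gcn_layer(a_hat, h, w):
    # One graph convolution: ReLU(A_hat @ H @ W).
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, x) for x in row] for row in z]

def attention_pool(h, q):
    # Single-head attention: score each residue, then average node features
    # using the softmax-normalized scores as weights.
    alpha = softmax([sum(x * y for x, y in zip(row, q)) for row in h])
    pooled = [sum(alpha[i] * h[i][k] for i in range(len(h))) for k in range(len(h[0]))]
    return pooled, alpha

def predict(contact, features, w_gcn, q_att, w_out):
    a_hat = normalize_adjacency(contact)
    h = gcn_layer(a_hat, features, w_gcn)
    pooled, alpha = attention_pool(h, q_att)
    logits = matmul([pooled], w_out)[0]  # fully connected layer (no bias)
    return softmax(logits), alpha

# Toy input: 4 residues in a chain, 3 features per residue, 3 output classes.
random.seed(0)
contact = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
features = [[random.random() for _ in range(3)] for _ in range(4)]
w_gcn = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
q_att = [random.uniform(-1, 1) for _ in range(2)]
w_out = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
probs, alpha = predict(contact, features, w_gcn, q_att, w_out)
print(probs, alpha)
```

Here `probs` is a probability distribution over the hypothetical localization classes, and `alpha` holds the per-residue attention weights, the quantity the abstract describes as each amino acid's weighted contribution.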
Affiliation(s)
- Tianhao Zhang
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Jiawei Gu
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Zeyu Wang
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Chunguo Wu
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Changchun, 130012, China
- Yanchun Liang
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Changchun, 130012, China
- School of Computer Science, Zhuhai College of Science and Technology, Zhuhai, 519041, China
- Xiaohu Shi
- College of Computer Science and Technology, University of Jilin, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Changchun, 130012, China
- School of Computer Science, Zhuhai College of Science and Technology, Zhuhai, 519041, China
15
Rius R, Bennett NK, Bhattacharya K, Riley LG, Yüksel Z, Formosa LE, Compton AG, Dale RC, Cowley MJ, Gayevskiy V, Al Tala SM, Almehery AA, Ryan MT, Thorburn DR, Nakamura K, Christodoulou J. Biallelic pathogenic variants in COX11 are associated with an infantile-onset mitochondrial encephalopathy. Hum Mutat 2022; 43:1970-1978. [PMID: 36030551 PMCID: PMC9771894 DOI: 10.1002/humu.24453] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/20/2022] [Revised: 07/28/2022] [Accepted: 08/22/2022] [Indexed: 01/25/2023]
Abstract
Primary mitochondrial diseases are a group of genetically and clinically heterogeneous disorders resulting from oxidative phosphorylation (OXPHOS) defects. COX11 encodes a copper chaperone that participates in the assembly of complex IV and has not been previously linked to human disease. In a previous study, we identified that COX11 knockdown decreased cellular adenosine triphosphate (ATP) derived from respiration, and that ATP levels could be restored with coenzyme Q10 (CoQ10) supplementation. This finding is surprising since COX11 has no known role in CoQ10 biosynthesis. Here, we report a novel gene-disease association by identifying biallelic pathogenic variants in COX11 associated with infantile-onset mitochondrial encephalopathies in two unrelated families using trio genome and exome sequencing. Functional studies showed that mutant COX11 fibroblasts had decreased ATP levels which could be rescued by CoQ10. These results not only suggest that COX11 variants cause defects in energy production but reveal a potential metabolic therapeutic strategy for patients with COX11 variants.
Affiliation(s)
- Rocio Rius
- Brain and Mitochondrial Research Group, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Australia
- Department of Paediatrics, University of Melbourne, Melbourne, Australia
- Neal K. Bennett
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, California, USA
- Kaustuv Bhattacharya
- Genetic Metabolic Disorders Service, The Children's Hospital at Westmead, Sydney, New South Wales, Australia
- Discipline of Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
- Lisa G. Riley
- Specialty of Child & Adolescent Health, University of Sydney, Sydney, Australia
- Rare Diseases Functional Genomics, The Children's Hospital at Westmead, Sydney, New South Wales, Australia
- Zafer Yüksel
- Department of Human Genetics, Bioscientia Healthcare GmbH, Ingelheim, Germany
- Luke E. Formosa
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
- Alison G. Compton
- Brain and Mitochondrial Research Group, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Australia
- Department of Paediatrics, University of Melbourne, Melbourne, Australia
- Russell C. Dale
- Department of Paediatric Neurology and Clinical school, The Children's Hospital at Westmead, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Mark J. Cowley
- Children's Cancer Institute & School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- Velimir Gayevskiy
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Saeed M. Al Tala
- Pediatric Directorate, Neonatal NICU, Armed Forces Hospital SR, Khamis Mushayt, Saudi Arabia
- Michael T. Ryan
- Department of Human Genetics, Bioscientia Healthcare GmbH, Ingelheim, Germany
- David R. Thorburn
- Brain and Mitochondrial Research Group, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Australia
- Department of Paediatrics, University of Melbourne, Melbourne, Australia
- Victorian Clinical Genetics Services, Royal Children's Hospital, Melbourne, Victoria, Australia
- Ken Nakamura
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, California, USA
- Department of Neurology, University of California, San Francisco, California, USA
- Graduate Programs in Biomedical Sciences and Neuroscience, University of California, San Francisco, California, USA
- John Christodoulou
- Brain and Mitochondrial Research Group, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Australia
- Department of Paediatrics, University of Melbourne, Melbourne, Australia
- Discipline of Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
16
Ebeed HT. Genome-wide analysis of polyamine biosynthesis genes in wheat reveals gene expression specificity and involvement of STRE and MYB-elements in regulating polyamines under drought. BMC Genomics 2022; 23:734. [PMID: 36309637 PMCID: PMC9618216 DOI: 10.1186/s12864-022-08946-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 10/10/2022] [Indexed: 11/11/2022]
Abstract
BACKGROUND Polyamines (PAs) are considered promising biostimulants that have diverse key roles during growth and stress responses in plants. Nevertheless, the molecular basis of these roles is still not fully understood, and the transcription of the biosynthesis pathway in various wheat tissues has not been investigated under normal or stress conditions. In this research, the findings of genome-wide analyses of genes implicated in PA biosynthesis in wheat (ADC, arginine decarboxylase; ODC, ornithine decarboxylase; AIH, agmatine iminohydrolase; NPL1, nitrilase-like protein 1; SAMDC, S-adenosylmethionine decarboxylase; SPDS, spermidine synthase; SPMS, spermine synthase and ACL5, thermospermine synthase) are shown. RESULTS In total, thirty PA biosynthesis genes were identified. Gene structure, subcellular compartmentation and promoters were analysed. Furthermore, gene expression in roots, shoot axis, leaves, and spike tissues was investigated experimentally in adult wheat plants under control and drought conditions. Results revealed structural similarity within each gene family and revealed the identity of two new motifs that were conserved in SPDS, SPMS and ACL5. Analysis of the promoter elements revealed the incidence of conserved elements (STRE, CAAT-box, TATA-box, and MYB TF) in all promoters and highly conserved CREs in >80% of promoters (G-Box, ABRE, TGACG-motif, CGTCA-motif, as1, and MYC). The results of the quantification of PAs revealed higher levels of putrescine (Put) in the leaves and higher spermidine (Spd) in the other tissues. However, no spermine (Spm) was detected in the roots. Drought stress elevated the Put level in the roots and the Spm level in the leaves, shoots and roots, while it decreased Put in spikes and elevated the total PA levels in all tissues. Interestingly, PA biosynthesis genes showed tissue-specificity and some homoeologs of the same gene family showed differential gene expression during wheat development. Additionally, gene expression analysis showed that ODC is the Put biosynthesis path under drought stress in roots. CONCLUSION The information gained by this research offers important insights into the transcriptional regulation of PA biosynthesis in wheat that would result in more successful and consistent plant production.
Affiliation(s)
- Heba Talat Ebeed
- Botany and Microbiology Department, Faculty of Science, Damietta University, Damietta, 34517, Egypt.
17
Rozov SM, Deineko EV. Increasing the Efficiency of the Accumulation of Recombinant Proteins in Plant Cells: The Role of Transport Signal Peptides. PLANTS (BASEL, SWITZERLAND) 2022; 11:2561. [PMID: 36235427 PMCID: PMC9572730 DOI: 10.3390/plants11192561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
The problem with increasing the yield of recombinant proteins is resolvable using different approaches, including the transport of a target protein to cell compartments with a low protease activity. In the cell, protein targeting involves short-signal peptide sequences recognized by intracellular protein transport systems. The main systems of the protein transport across membranes of the endoplasmic reticulum and endosymbiotic organelles are reviewed here, as are the major types and structure of the signal sequences targeting proteins to the endoplasmic reticulum and its derivatives, to plastids, and to mitochondria. The role of protein targeting to certain cell organelles depending on specific features of recombinant proteins and the effect of this targeting on the protein yield are discussed, in addition to the main directions of the search for signal sequences based on their primary structure. This knowledge makes it possible not only to predict a protein localization in the cell but also to reveal the most efficient sequences with potential biotechnological utility.
18
Jiang Y, Wang D, Wang W, Xu D. Computational methods for protein localization prediction. Comput Struct Biotechnol J 2021; 19:5834-5844. [PMID: 34765098 PMCID: PMC8564054 DOI: 10.1016/j.csbj.2021.10.023] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/31/2021] [Revised: 10/12/2021] [Accepted: 10/13/2021] [Indexed: 12/16/2022]
Abstract
The accurate annotation of protein localization is crucial in understanding protein function in tandem with a broad range of applications such as pathological analysis and drug design. Since most proteins do not have experimentally-determined localization information, the computational prediction of protein localization has been an active research area for more than two decades. In particular, recent machine-learning advancements have fueled the development of new methods in protein localization prediction. In this review paper, we first categorize the main features and algorithms used for protein localization prediction. Then, we summarize a list of protein localization prediction tools in terms of their coverage, characteristics, and accessibility to help users find suitable tools based on their needs. Next, we evaluate some of these tools on a benchmark dataset. Finally, we provide an outlook on the future exploration of protein localization methods.
Affiliation(s)
- Yuexu Jiang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Duolin Wang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Weiwei Wang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Dong Xu
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
19
Jiang Y, Wang D, Yao Y, Eubel H, Künzler P, Møller IM, Xu D. MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation. Comput Struct Biotechnol J 2021; 19:4825-4839. [PMID: 34522290 PMCID: PMC8426535 DOI: 10.1016/j.csbj.2021.08.027] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 06/16/2021] [Revised: 08/16/2021] [Accepted: 08/16/2021] [Indexed: 12/18/2022]
Abstract
Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments—the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid’s contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org.
Affiliation(s)
- Yuexu Jiang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, Columbia, MO, USA
- Duolin Wang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, Columbia, MO, USA
- Yifu Yao
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, Columbia, MO, USA
- Holger Eubel
- Institute of Plant Genetics, Leibniz University Hannover, Hannover, Germany
- Patrick Künzler
- Institute of Plant Genetics, Leibniz University Hannover, Hannover, Germany
- Ian Max Møller
- Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, DK-4200 Slagelse, Denmark
- Dong Xu
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, Columbia, MO, USA
20
Computer-Aided Prediction of Protein Mitochondrial Localization. Methods Mol Biol 2021; 2275:433-452. [PMID: 34118055 DOI: 10.1007/978-1-0716-1262-0_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023]
Abstract
Protein sequences, directly translated from genomic data, need functional and structural annotation. Together with molecular function and biological process, subcellular localization is an important feature necessary for understanding the protein role and the compartment where the mature protein is active. In the case of mitochondrial proteins, their precursor sequences translated by the ribosome machinery include specific patterns from which it is possible to recognize not only their final destination within the organelle but also which of the mitochondrial subcompartments the protein is intended for. Four compartments are routinely discriminated: the inner and the outer membranes, the intermembrane space, and the matrix. Here we discuss to what extent it is feasible to develop computational methods for detecting mitochondrial targeting peptides in the precursor sequence and for discriminating their final destination in the organelle. We benchmark two of our methods on the general task of recognizing human mitochondrial proteins endowed with an experimentally characterized targeting peptide (TPpred3) and predicting which submitochondrial compartment is the final destination (DeepMito). We describe how to use our web servers to determine which human proteins are endowed with mitochondrial targeting peptides, the position of their cleavage sites, and which submitochondrial compartment they are intended for. By this, we add a further 1788 human proteins to the 450 already manually annotated in UniProt with a mitochondrial targeting peptide, also providing for each of them a characterization of the suborganellar localization.
21
Anteghini M, Martins dos Santos V, Saccenti E. In-Pero: Exploiting Deep Learning Embeddings of Protein Sequences to Predict the Localisation of Peroxisomal Proteins. Int J Mol Sci 2021; 22:6409. [PMID: 34203866 PMCID: PMC8232616 DOI: 10.3390/ijms22126409] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 05/31/2021] [Accepted: 06/09/2021] [Indexed: 01/28/2023]
Abstract
Peroxisomes are ubiquitous membrane-bound organelles, and aberrant localisation of peroxisomal proteins contributes to the pathogenesis of several disorders. Many computational methods focus on assigning protein sequences to subcellular compartments, but there are no specific tools tailored for the sub-localisation (matrix vs. membrane) of peroxisome proteins. We present here In-Pero, a new method for predicting protein sub-peroxisomal cellular localisation. In-Pero combines standard machine learning approaches with recently proposed multi-dimensional deep-learning representations of the protein amino-acid sequence. It showed a classification accuracy above 0.9 in predicting peroxisomal matrix and membrane proteins. The method is trained and tested using a double cross-validation approach on a curated data set comprising 160 peroxisomal proteins with experimental evidence for sub-peroxisomal localisation. We further show that the proposed approach can be easily adapted (In-Mito) to the prediction of mitochondrial protein localisation obtaining performances for certain classes of proteins (matrix and inner-membrane) superior to existing tools.
Affiliation(s)
- Marco Anteghini
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- LifeGlimmer GmbH, 12163 Berlin, Germany
- Vitor Martins dos Santos
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- LifeGlimmer GmbH, 12163 Berlin, Germany
- Edoardo Saccenti
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
22
Towards a systems-level understanding of mitochondrial biology. Cell Calcium 2021; 95:102364. [PMID: 33601101 DOI: 10.1016/j.ceca.2021.102364] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/22/2020] [Revised: 01/22/2021] [Accepted: 01/23/2021] [Indexed: 11/21/2022]
Abstract
Human mitochondria are complex and highly dynamic biological systems, comprised of over a thousand parts and evolved to fully integrate into the specialized intracellular signaling networks and metabolic requirements of each cell and organ. Over the last two decades, several complementary, top-down computational and experimental approaches have been developed to identify, characterize and modulate the human mitochondrial system, demonstrating the power of integrating classical reductionist and discovery-driven analyses in order to de-orphanize hitherto unknown molecular components of mitochondrial machineries and pathways. To this goal, systematic, multiomics-based surveys of proteome composition, protein networks, and phenotype-to-pathway associations at the tissue, cell and organellar level have been largely exploited to predict the full complement of mitochondrial proteins and their functional interactions, therefore catalyzing data-driven hypotheses. Collectively, these multidisciplinary and integrative research approaches hold the potential to propel our understanding of mitochondrial biology and provide a systems-level framework to unraveling mitochondria-mediated and disease-spanning pathomechanisms.
23
Abstract
The elucidation of the subcellular localization of proteins is very important in order to deeply understand their functions. In fact, protein activities are strictly correlated to the cellular compartment and microenvironment in which the proteins are present. In recent years, several effective and reliable proteomics techniques and computational methods have been developed and implemented in order to identify the subcellular localization of proteins. This process is often time-consuming and expensive, but recent technological and bioinformatics progress has allowed the development of more accurate and simpler workflows to determine the localization, interactions, and functions of proteins. In the following chapter, a brief introduction on the importance of knowing the subcellular localization of proteins will be presented. Then, sample preparation protocols, proteomic methods, data analysis strategies, and software for the prediction of protein localization will be presented and discussed. Finally, the more recent and advanced spatial proteomics techniques will be shown.
Affiliation(s)
- Elettra Barberis
- Department of Translational Medicine, University of Piemonte Orientale, Novara, Italy
- Center for Translational Research on Autoimmune and Allergic Diseases, CAAD, University of Piemonte Orientale, Novara, Italy
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Alessandria, Italy
- Center for Translational Research on Autoimmune and Allergic Diseases, CAAD, University of Piemonte Orientale, Novara, Italy
- Marcello Manfredi
- Department of Translational Medicine, University of Piemonte Orientale, Novara, Italy
- Center for Translational Research on Autoimmune and Allergic Diseases, CAAD, University of Piemonte Orientale, Novara, Italy
24
Imai K, Nakai K. Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences. Front Genet 2020; 11:607812. [PMID: 33324450 PMCID: PMC7723863 DOI: 10.3389/fgene.2020.607812] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/18/2020] [Accepted: 11/03/2020] [Indexed: 12/13/2022]
Abstract
At the time of translation, nascent proteins are thought to be sorted into their final subcellular localization sites, based on parts of their amino acid sequences (i.e., sorting or targeting signals). Thus, it is interesting to computationally recognize these signals from the amino acid sequences of any given proteins and to predict their final subcellular localization with such information, supplemented with additional information (e.g., k-mer frequency). This field has a long history and many prediction tools have been released. Even in this era of proteomic atlas at the single-cell level, researchers continue to develop new algorithms, aiming at assessing the impact of disease-causing mutations/cell type-specific alternative splicing, for example. In this article, we overview the entire field and discuss its future direction.
Affiliation(s)
- Kenichiro Imai
- Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Kenta Nakai
- The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
25
Kaleel M, Zheng Y, Chen J, Feng X, Simpson JC, Pollastri G, Mooney C. SCLpred-EMS: subcellular localization prediction of endomembrane system and secretory pathway proteins by Deep N-to-1 Convolutional Neural Networks. Bioinformatics 2020; 36:3343-3349. [PMID: 32142105 DOI: 10.1093/bioinformatics/btaa156] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/10/2019] [Revised: 02/25/2020] [Accepted: 03/02/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION The subcellular location of a protein can provide useful information for protein function prediction and drug design. Experimentally determining the subcellular location of a protein is an expensive and time-consuming task. Therefore, various computer-based tools have been developed, mostly using machine learning algorithms, to predict the subcellular location of proteins. RESULTS Here, we present a neural network-based algorithm for protein subcellular location prediction. We introduce SCLpred-EMS, a subcellular localization predictor powered by an ensemble of Deep N-to-1 Convolutional Neural Networks. SCLpred-EMS classifies the subcellular location of a protein into two classes, the endomembrane system and secretory pathway versus all others, with a Matthews correlation coefficient of 0.75-0.86, outperforming the other state-of-the-art web servers we tested. AVAILABILITY AND IMPLEMENTATION SCLpred-EMS is freely available for academic users at http://distilldeep.ucd.ie/SCLpred2/. CONTACT catherine.mooney@ucd.ie.
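For readers unfamiliar with the Matthews correlation coefficient (MCC) quoted in this abstract, the following minimal sketch shows how it is computed for a two-class predictor from confusion-matrix counts. The counts used are invented for illustration only and are not taken from the SCLpred-EMS paper; the function name is hypothetical.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    # Returns 0.0 when any marginal is empty (the usual convention).
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Hypothetical counts for a binary "endomembrane/secretory vs. other" test set.
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
print(round(mcc, 3))  # 0.751
```

MCC ranges from -1 to 1, with 1 for perfect agreement and 0 for chance-level prediction, and unlike plain accuracy it stays informative when the two classes are of unequal size.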
Affiliation(s)
- Manaz Kaleel
  - School of Computer Science; UCD Institute for Discovery, University College Dublin, Dublin, Ireland
- Yandan Zheng
  - Beijing-Dublin International College, Beijing University of Technology, Chaoyang, China
- Jialiang Chen
  - Beijing-Dublin International College, Beijing University of Technology, Chaoyang, China
- Xuanming Feng
  - Beijing-Dublin International College, Beijing University of Technology, Chaoyang, China
- Jeremy C Simpson
  - Conway Institute of Biomolecular and Biomedical Research; School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Gianluca Pollastri
  - School of Computer Science; UCD Institute for Discovery, University College Dublin, Dublin, Ireland
- Catherine Mooney
  - School of Computer Science; Beijing-Dublin International College, Beijing University of Technology, Chaoyang, China
|
26
|
Large-scale prediction and analysis of protein sub-mitochondrial localization with DeepMito. BMC Bioinformatics 2020; 21:266. [PMID: 32938368 PMCID: PMC7493403 DOI: 10.1186/s12859-020-03617-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/15/2020] [Accepted: 06/18/2020] [Indexed: 12/31/2022] Open
Abstract
Background The prediction of protein subcellular localization is a key step of the big effort towards protein functional annotation. Many computational methods exist to identify high-level protein subcellular compartments such as nucleus, cytoplasm or organelles. However, many organelles, like mitochondria, have their own internal compartmentalization. Knowing the precise location of a protein inside mitochondria is crucial for its accurate functional characterization. We recently developed DeepMito, a new method based on a 1-Dimensional Convolutional Neural Network (1D-CNN) architecture outperforming other similar approaches available in literature. Results Here, we explore the adoption of DeepMito for the large-scale annotation of four sub-mitochondrial localizations on mitochondrial proteomes of five different species, including human, mouse, fly, yeast and Arabidopsis thaliana. A significant fraction of the proteins from these organisms lacked experimental information about sub-mitochondrial localization. We adopted DeepMito to fill the gap, providing complete characterization of protein localization at sub-mitochondrial level for each protein of the five proteomes. Moreover, we identified novel mitochondrial proteins fishing on the set of proteins lacking any subcellular localization annotation using available state-of-the-art subcellular localization predictors. We finally performed additional functional characterization of proteins predicted by DeepMito as localized into the four different sub-mitochondrial compartments using both available experimental and predicted GO terms. All data generated in this study were collected into a database called DeepMitoDB (available at http://busca.biocomp.unibo.it/deepmitodb), providing complete functional characterization of 4307 mitochondrial proteins from the five species. 
Conclusions DeepMitoDB offers a comprehensive view of mitochondrial proteins, including experimental and predicted fine-grain sub-cellular localization and annotated and predicted functional annotations. The database complements other similar resources providing characterization of new proteins. Furthermore, it is also unique in including localization information at the sub-mitochondrial level. For this reason, we believe that DeepMitoDB can be a valuable resource for mitochondrial research.
|
27
|
Zhao X, Zhou J, Du G, Chen J. Recent Advances in the Microbial Synthesis of Hemoglobin. Trends Biotechnol 2020; 39:286-297. [PMID: 32912649 DOI: 10.1016/j.tibtech.2020.08.004] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/31/2020] [Revised: 07/27/2020] [Accepted: 08/11/2020] [Indexed: 01/08/2023]
Abstract
Hemoglobin is a cofactor-containing protein with heme that plays important roles in transporting and storing oxygen. Hemoglobins have been widely applied as acellular oxygen carriers, bioavailable iron-supplying agents, and food-grade coloring and flavoring agents. To meet increasing demands and overcome the drawbacks of chemical extraction, the biosynthesis of hemoglobin has become an attractive alternative. Several hemoglobins have recently been synthesized by various microorganisms through metabolic engineering and synthetic biology. In this review, we summarize the novel strategies that have been used to biosynthesize hemoglobin. These strategies can also serve as references for producing other heme-binding proteins.
Affiliation(s)
- Xinrui Zhao
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi, Jiangsu 214122, China
- Jingwen Zhou
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi, Jiangsu 214122, China; National Engineering Laboratory of Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu 214122, China.
- Guocheng Du
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi, Jiangsu 214122, China; Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122, China
- Jian Chen
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi, Jiangsu 214122, China; National Engineering Laboratory of Cereal Fermentation Technology, Jiangnan University, Wuxi, Jiangsu 214122, China
|
28
|
Savojardo C, Bruciaferri N, Tartari G, Martelli PL, Casadio R. DeepMito: accurate prediction of protein sub-mitochondrial localization using convolutional neural networks. Bioinformatics 2020; 36:56-64. [PMID: 31218353 PMCID: PMC6956790 DOI: 10.1093/bioinformatics/btz512] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 03/29/2019] [Revised: 05/31/2019] [Accepted: 06/17/2019] [Indexed: 11/18/2022] Open
Abstract
Motivation The correct localization of proteins in cell compartments is a key issue for their function. Particularly, mitochondrial proteins are physiologically active in different compartments and their aberrant localization contributes to the pathogenesis of human mitochondrial pathologies. Many computational methods exist to assign protein sequences to subcellular compartments such as nucleus, cytoplasm and organelles. However, a substantial lack of experimental evidence in public sequence databases hampered so far a finer grain discrimination, including also intra-organelle compartments. Results We describe DeepMito, a novel method for predicting protein sub-mitochondrial cellular localization. Taking advantage of powerful deep-learning approaches, such as convolutional neural networks, our method is able to achieve very high prediction performances when discriminating among four different mitochondrial compartments (matrix, outer, inner and intermembrane regions). The method is trained and tested in cross-validation on a newly generated, high-quality dataset comprising 424 mitochondrial proteins with experimental evidence for sub-organelle localizations. We benchmark DeepMito towards the only one recent approach developed for the same task. Results indicate that DeepMito performances are superior. Finally, genomic-scale prediction on a highly-curated dataset of human mitochondrial proteins further confirms the effectiveness of our approach and suggests that DeepMito is a good candidate for genome-scale annotation of mitochondrial protein subcellular localization. Availability and implementation The DeepMito web server as well as all datasets used in this study are available at http://busca.biocomp.unibo.it/deepmito. A standalone version of DeepMito is available on DockerHub at https://hub.docker.com/r/bolognabiocomp/deepmito. 
DeepMito source code is available on GitHub at https://github.com/BolognaBiocomp/deepmito. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Castrense Savojardo
  - Biocomputing Group, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Niccolò Bruciaferri
  - Biocomputing Group, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Giacomo Tartari
  - Biocomputing Group, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), Bari, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), Bari, Italy
|
29
|
Baysal C, Pérez-González A, Eseverri Á, Jiang X, Medina V, Caro E, Rubio L, Christou P, Zhu C. Recognition motifs rather than phylogenetic origin influence the ability of targeting peptides to import nuclear-encoded recombinant proteins into rice mitochondria. Transgenic Res 2020; 29:37-52. [PMID: 31598902 PMCID: PMC7000509 DOI: 10.1007/s11248-019-00176-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/12/2019] [Accepted: 10/01/2019] [Indexed: 10/30/2022]
Abstract
Mitochondria fulfil essential functions in respiration and metabolism as well as regulating stress responses and apoptosis. Most native mitochondrial proteins are encoded by nuclear genes and are imported into mitochondria via one of several receptors that recognize N-terminal signal peptides. The targeting of recombinant proteins to mitochondria therefore requires the presence of an appropriate N-terminal peptide, but little is known about mitochondrial import in monocotyledonous plants such as rice (Oryza sativa). To gain insight into this phenomenon, we targeted nuclear-encoded enhanced green fluorescent protein (eGFP) to rice mitochondria using six mitochondrial pre-sequences with diverse phylogenetic origins, and investigated their effectiveness by immunoblot analysis as well as confocal and electron microscopy. We found that the ATPA and COX4 (Saccharomyces cerevisiae), SU9 (Neurospora crassa), pFA (Arabidopsis thaliana) and OsSCSb (Oryza sativa) peptides successfully directed most of the eGFP to the mitochondria, whereas the MTS2 peptide (Nicotiana plumbaginifolia) showed little or no evidence of targeting ability even though it is a native plant sequence. Our data therefore indicate that the presence of particular recognition motifs may be required for mitochondrial targeting, whereas the phylogenetic origin of the pre-sequences probably does not play a key role in the success of mitochondrial targeting in dedifferentiated rice callus and plants.
Affiliation(s)
- Can Baysal
  - Department of Plant Production and Forestry Science, University of Lleida-Agrotecnio Center, Av. Alcalde Rovira Roure, 191, 25198, Lleida, Spain
- Ana Pérez-González
  - Centre for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus Montegancedo UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Álvaro Eseverri
  - Centre for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus Montegancedo UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Xi Jiang
  - Centre for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus Montegancedo UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Vicente Medina
  - Department of Plant Production and Forestry Science, University of Lleida-Agrotecnio Center, Av. Alcalde Rovira Roure, 191, 25198, Lleida, Spain
- Elena Caro
  - Centre for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus Montegancedo UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Luis Rubio
  - Centre for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus Montegancedo UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Paul Christou
  - Department of Plant Production and Forestry Science, University of Lleida-Agrotecnio Center, Av. Alcalde Rovira Roure, 191, 25198, Lleida, Spain
  - ICREA, Catalan Institute for Research and Advanced Studies, Passeig Lluís Companys 23, 08010, Barcelona, Spain
- Changfu Zhu
  - Department of Plant Production and Forestry Science, University of Lleida-Agrotecnio Center, Av. Alcalde Rovira Roure, 191, 25198, Lleida, Spain.
|
30
|
Nithya V. SubmitoLoc: Identification of mitochondrial sub cellular locations of proteins using support vector machine. Bioinformation 2019; 15:863-868. [PMID: 32256006 PMCID: PMC7088428 DOI: 10.6026/97320630015863] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/28/2019] [Revised: 12/31/2019] [Accepted: 12/31/2019] [Indexed: 11/23/2022] Open
Abstract
Mitochondria are important subcellular organelles in eukaryotes. Defects in the mitochondrial system lead to a variety of diseases. Therefore, detailed knowledge of the mitochondrial proteome is vital to understanding the mitochondrial system and its function. Sequence databases contain a large number of mitochondrial proteins, but most are not annotated. In this study, we developed a support vector machine approach, SubmitoLoc, to predict the mitochondrial subcellular locations of proteins based on various sequence-derived properties. We evaluated the predictor using 10-fold cross-validation. Our method achieved 88.56% accuracy using all features. The average sensitivity and specificity for four-subclass prediction are 85.37% and 87.25%, respectively. The high prediction accuracy suggests that SubmitoLoc will be useful for researchers studying mitochondrial biology and drug discovery.
Affiliation(s)
- Varadharaju Nithya
  - Department of Animal Health Management, Alagappa University, Karaikudi-630003, India
|
31
|
Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, Nielsen H. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance 2019; 2:2/5/e201900429. [PMID: 31570514 DOI: 10.1101/639203] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/15/2019] [Revised: 09/18/2019] [Accepted: 09/18/2019] [Indexed: 05/25/2023] Open
Abstract
In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
Affiliation(s)
- Jose Juan Almagro Armenteros
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongen Lyngby, Denmark
- Marco Salvatore
  - Science for Life Laboratory, Solna, Sweden
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Olof Emanuelsson
  - Science for Life Laboratory, Solna, Sweden
  - Department of Gene Technology, School of Engineering Sciences in Biotechnology, Chemistry and Health, KTH-Royal Institute of Technology, Stockholm, Sweden
- Ole Winther
  - DTU Compute, Technical University of Denmark, Kongen Lyngby, Denmark
  - Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark
  - Centre for Genomic Medicine, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Gunnar von Heijne
  - Science for Life Laboratory, Solna, Sweden
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Arne Elofsson
  - Science for Life Laboratory, Solna, Sweden
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Henrik Nielsen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongen Lyngby, Denmark
|
32
|
Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, Nielsen H. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance 2019; 2:2/5/e201900429. [PMID: 31570514 PMCID: PMC6769257 DOI: 10.26508/lsa.201900429] [Citation(s) in RCA: 525] [Impact Index Per Article: 87.5] [Received: 05/15/2019] [Revised: 09/18/2019] [Accepted: 09/18/2019] [Indexed: 11/24/2022] Open
Abstract
In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
Affiliation(s)
- Jose Juan Almagro Armenteros
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongen Lyngby, Denmark
- Marco Salvatore
  - Science for Life Laboratory, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Olof Emanuelsson
  - Science for Life Laboratory, Solna, Sweden; Department of Gene Technology, School of Engineering Sciences in Biotechnology, Chemistry and Health, KTH-Royal Institute of Technology, Stockholm, Sweden
- Ole Winther
  - DTU Compute, Technical University of Denmark, Kongen Lyngby, Denmark; Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark; Centre for Genomic Medicine, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Gunnar von Heijne
  - Science for Life Laboratory, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Arne Elofsson
  - Science for Life Laboratory, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Henrik Nielsen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongen Lyngby, Denmark
|
33
|
Cao Z, Pan X, Yang Y, Huang Y, Shen HB. The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier. Bioinformatics 2019; 34:2185-2194. [PMID: 29462250 DOI: 10.1093/bioinformatics/bty085] [Citation(s) in RCA: 287] [Impact Index Per Article: 47.8] [Received: 09/27/2017] [Accepted: 02/14/2018] [Indexed: 01/01/2023] Open
Abstract
Motivation The long non-coding RNA (lncRNA) studies have been hot topics in the field of RNA biology. Recent studies have shown that their subcellular localizations carry important information for understanding their complex biological functions. Considering the costly and time-consuming experiments for identifying subcellular localization of lncRNAs, computational methods are urgently desired. However, to the best of our knowledge, there are no computational tools for predicting the lncRNA subcellular locations to date. Results In this study, we report an ensemble classifier-based predictor, lncLocator, for predicting the lncRNA subcellular localizations. To fully exploit lncRNA sequence information, we adopt both k-mer features and high-level abstraction features generated by unsupervised deep models, and construct four classifiers by feeding these two types of features to support vector machine (SVM) and random forest (RF), respectively. Then we use a stacked ensemble strategy to combine the four classifiers and get the final prediction results. The current lncLocator can predict five subcellular localizations of lncRNAs, including cytoplasm, nucleus, cytosol, ribosome and exosome, and yield an overall accuracy of 0.59 on the constructed benchmark dataset. Availability and implementation The lncLocator is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhen Cao
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- Xiaoyong Pan
  - Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands
- Yang Yang
  - Department of Computer Science, Shanghai Jiao Tong University, and Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
- Yan Huang
  - State Key Laboratory of Infrared Physics, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
|
34
|
Orioli T, Vihinen M. Benchmarking subcellular localization and variant tolerance predictors on membrane proteins. BMC Genomics 2019; 20:547. [PMID: 31307390 PMCID: PMC6631444 DOI: 10.1186/s12864-019-5865-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/01/2023] Open
Abstract
Background: Membrane proteins constitute up to 30% of the human proteome. These proteins have special properties because the transmembrane segments are embedded in the lipid bilayer while the extramembranous parts are in different environments. Membrane proteins have several functions and are involved in numerous diseases. A large number of prediction methods have been introduced to predict protein subcellular localization as well as the tolerance or pathogenicity of amino acid substitutions. Results: We tested the performance of 22 tolerance predictors by collecting information on membrane proteins and variants in them. The analysis indicated that the best tools had similar prediction performance on the transmembrane, inside and outside regions of transmembrane proteins, comparable to their overall prediction performance for all types of proteins. PON-P2 had the highest performance, followed by REVEL, MetaSVM and VEST3. Further, using the high-quality dataset, we also tested the performance of seven subcellular localization predictors on membrane proteins. We assessed the performance separately for single-pass and multi-pass membrane proteins. Predictions for multi-pass proteins were more reliable than those for single-pass proteins. Conclusions: The predictors for variant effects had better performance than the subcellular localization tools. The best tolerance predictors are highly reliable. As there are large differences in the performance of the tools, end-users have to be cautious in method selection.
Affiliation(s)
- Tommaso Orioli
  - International Master in Bioinformatics, School of Science, University of Bologna, Bologna, Italy; Department of Experimental Medical Science, BMC B13, Lund University, SE-22184, Lund, Sweden
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22184, Lund, Sweden.
|
35
|
Abstract
Ever since the signal hypothesis was proposed in 1971, the exact nature of signal peptides has been a focus point of research. The prediction of signal peptides and protein subcellular location from amino acid sequences has been an important problem in bioinformatics since the dawn of this research field, involving many statistical and machine learning technologies. In this review, we provide a historical account of how position-weight matrices, artificial neural networks, hidden Markov models, support vector machines and, lately, deep learning techniques have been used in the attempts to predict where proteins go. Because the secretory pathway was the first one to be studied both experimentally and through bioinformatics, our main focus is on the historical development of prediction methods for signal peptides that target proteins for secretion; prediction methods to identify targeting signals for other cellular compartments are treated in less detail.
Affiliation(s)
- Henrik Nielsen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark.
- Konstantinos D Tsirigos
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark
- Søren Brunak
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark
  - Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Gunnar von Heijne
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
  - Science for Life Laboratory, Stockholm University, Solna, Sweden
|
36
|
Kume K, Amagasa T, Hashimoto T, Kitagawa H. NommPred: Prediction of Mitochondrial and Mitochondrion-Related Organelle Proteins of Nonmodel Organisms. Evol Bioinform Online 2018; 14:1176934318819835. [PMID: 30626996 PMCID: PMC6305954 DOI: 10.1177/1176934318819835] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/01/2018] [Accepted: 11/07/2018] [Indexed: 01/11/2023] Open
Abstract
To estimate the functions of mitochondria of diverse eukaryotic nonmodel organisms in which the mitochondrial proteomes are not available, it is necessary to predict the protein sequence features of the mitochondrial proteins computationally. Various prediction methods that are trained using the proteins of model organisms belonging particularly to animals, plants, and fungi exist. However, such methods may not be suitable for predicting the proteins derived from nonmodel organisms because the sequence features of the mitochondrial proteins of diversified nonmodel organisms can differ from those of model organisms that are present only in restricted parts of the tree of eukaryotes. Here, we proposed NommPred, which predicts the mitochondrial proteins of nonmodel organisms that are widely distributed over eukaryotes. We used a gradient boosting machine to develop 2 predictors-one for predicting the proteins of mitochondria and the other for predicting the proteins of mitochondrion-related organelles that are highly reduced mitochondria. The performance of both predictors was found to be better than that of the best method available.
Affiliation(s)
- Keitaro Kume
  - Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Toshiyuki Amagasa
  - Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
- Tetsuo Hashimoto
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
- Hiroyuki Kitagawa
  - Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
|
37
|
Savojardo C, Martelli PL, Fariselli P, Casadio R. SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments. Bioinformatics 2018; 33:347-353. [PMID: 28172591 PMCID: PMC5408801 DOI: 10.1093/bioinformatics/btw656] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 04/20/2016] [Revised: 06/21/2016] [Accepted: 10/12/2016] [Indexed: 12/12/2022] Open
Abstract
Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Like other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, to which different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localization is crucial for large-scale functional studies. Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments: inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach, which is a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization. Availability and Implementation: The method is available as a web server at http://schloro.biocomp.unibo.it. Contact: gigi@biocomp.unibo.it.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Piero Fariselli
- Department of Comparative Biomedicine and Food Science (BCA), University of Padova, Padova, Italy
- Rita Casadio
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy; Interdepartmental Center «Giorgio Prodi» for Cancer Research, University of Bologna, Bologna, Italy
|
38
|
Savojardo C, Martelli PL, Fariselli P, Profiti G, Casadio R. BUSCA: an integrative web server to predict subcellular localization of proteins. Nucleic Acids Res 2018; 46:W459-W466. [PMID: 29718411 PMCID: PMC6031068 DOI: 10.1093/nar/gky320] [Citation(s) in RCA: 280] [Impact Index Per Article: 40.0] [Received: 01/24/2018] [Revised: 04/12/2018] [Accepted: 04/17/2018] [Indexed: 12/28/2022]
Abstract
Here, we present BUSCA (http://busca.biocomp.unibo.it), a novel web server that integrates different computational tools for predicting protein subcellular localization. BUSCA combines methods for identifying signal and transit peptides (DeepSig and TPpred3), GPI-anchors (PredGPI) and transmembrane domains (ENSEMBLE3.0 and BetAware) with tools for discriminating subcellular localization of both globular and membrane proteins (BaCelLo, MemLoci and SChloro). Outcomes from the different tools are processed and integrated for annotating subcellular localization of both eukaryotic and bacterial protein sequences. We benchmark BUSCA against protein targets derived from recent CAFA experiments and other specific data sets, reporting state-of-the-art performance. BUSCA scores better than all other evaluated methods on 2732 targets from CAFA2, with an F1 value of 0.49, and is among the best methods when predicting targets from CAFA3. We propose BUSCA as an integrated and accurate resource for the annotation of protein subcellular localization.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40100, Italy
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40100, Italy
- Piero Fariselli
- Department of Comparative Biomedicine and Food Science, University of Padova, Padova 35020, Italy
- Giuseppe Profiti
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40100, Italy
- Institute of Biomembrane, Bioenergetics and Molecular Biotechnologies, Italian National Research Council (CNR), Bari 70126, Italy
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40100, Italy
- Institute of Biomembrane, Bioenergetics and Molecular Biotechnologies, Italian National Research Council (CNR), Bari 70126, Italy
|
39
|
Zhou H, Yang Y, Shen HB. Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features. Bioinformatics 2017; 33:843-853. [PMID: 27993784 DOI: 10.1093/bioinformatics/btw723] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 07/28/2016] [Accepted: 11/17/2016] [Indexed: 11/13/2022]
Abstract
Motivation: Protein subcellular localization prediction has been an important research topic in computational biology over the last decade. Various automatic methods have been proposed to predict locations for large-scale protein datasets, where statistical machine learning algorithms are widely used for model construction. A key step in these predictors is encoding the amino acid sequences into feature vectors. Many studies have shown that features extracted from biological domains, such as gene ontology and functional domains, can be very useful for improving prediction accuracy. However, domain knowledge usually results in redundant features and high-dimensional feature spaces, which may degrade the performance of machine learning models. Results: In this paper, we propose a new amino acid sequence-based human protein subcellular location prediction approach, Hum-mPLoc 3.0, which covers 12 human subcellular localizations. The sequences are represented by multi-view complementary features, i.e. context vocabulary annotation-based gene ontology (GO) terms, peptide-based functional domains, and residue-based statistical features. To systematically reflect the structural hierarchy of the domain knowledge bases, we propose a novel feature representation protocol denoted as HCM (Hidden Correlation Modeling), which creates more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms. Experimental results on four benchmark datasets show that HCM improves prediction accuracy by 5-11% and F1 by 8-19% compared with conventional GO-based methods. A large-scale application of Hum-mPLoc 3.0 on the whole human proteome reveals protein co-localization preferences in the cell. Availability and Implementation: www.csbio.sjtu.edu.cn/bioinf/Hum-mPLoc3/. Contact: hbshen@sjtu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hang Zhou
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Ministry of Education of China, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- Yang Yang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Ministry of Education of China, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
|
40
|
Calvo SE, Julien O, Clauser KR, Shen H, Kamer KJ, Wells JA, Mootha VK. Comparative Analysis of Mitochondrial N-Termini from Mouse, Human, and Yeast. Mol Cell Proteomics 2017; 16:512-523. [PMID: 28122942 DOI: 10.1074/mcp.m116.063818] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 09/13/2016] [Revised: 01/06/2017] [Indexed: 01/08/2023]
Abstract
The majority of mitochondrial proteins are encoded in the nuclear genome, translated in the cytoplasm, and directed to the mitochondria by an N-terminal presequence that is cleaved upon import. Recently, N-proteome catalogs have been generated for mitochondria from yeast and from human U937 cells. Here, we applied the subtiligase method to determine N-termini for 327 proteins in mitochondria isolated from mouse liver and kidney. Comparative analysis between mitochondrial N-termini from mouse, human, and yeast proteins shows that whereas presequences are poorly conserved at the sequence level, other presequence properties are extremely conserved, including a length of ∼20-60 amino acids, a net charge between +3 and +6, and the presence of stabilizing amino acids at the N-terminus of mature proteins that follow the N-end rule from bacteria. As in yeast, ∼80% of mouse presequence cleavage sites match canonical motifs for three mitochondrial peptidases (MPP, Icp55, and Oct1), whereas the remainder do not match any known peptidase motifs. We show that mature mitochondrial proteins often exist with a spectrum of N-termini, consistent with a model of multiple cleavage events by MPP and Icp55. In addition to analysis of canonical targeting presequences, our N-terminal dataset allows the exploration of other cleavage events and provides support for polypeptide cleavage into two distinct enzymes (Hsd17b4), protein cleavages key for signaling (Oma1, Opa1, Htra2, Mavs, and Bcl2l13), and in several cases suggests novel protein isoforms (Scp2, Acadm, Adck3, Hsdl2, Dlst, and Ogdh). We present an integrated catalog of mammalian mitochondrial N-termini that can be used as a community resource to investigate individual proteins, to elucidate mechanisms of mammalian mitochondrial processing, and to allow researchers to engineer tags distally to the presequence cleavage.
Affiliation(s)
- Sarah E Calvo
- Howard Hughes Medical Institute, Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114; Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115; Broad Institute, Cambridge, Massachusetts 02141
- Hongying Shen
- Howard Hughes Medical Institute, Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114; Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115
- Kimberli J Kamer
- Howard Hughes Medical Institute, Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114; Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115
- James A Wells
- Departments of Pharmaceutical Chemistry and Cellular and Molecular Pharmacology, University of California, San Francisco, California 94143
- Vamsi K Mootha
- Howard Hughes Medical Institute, Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114; Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115
|