401
|
Cope KR, Prates ET, Miller JI, Demerdash ON, Shah M, Kainer D, Cliff A, Sullivan KA, Cashman M, Lane M, Matthiadis A, Labbé J, Tschaplinski TJ, Jacobson DA, Kalluri UC. Exploring the role of plant lysin motif receptor-like kinases in regulating plant-microbe interactions in the bioenergy crop Populus. Comput Struct Biotechnol J 2022; 21:1122-1139. [PMID: 36789259 PMCID: PMC9900275 DOI: 10.1016/j.csbj.2022.12.052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2022] [Revised: 12/18/2022] [Accepted: 12/30/2022] [Indexed: 01/02/2023] Open
Abstract
For plants, distinguishing between mutualistic and pathogenic microbes is a matter of survival. All microbes contain microbe-associated molecular patterns (MAMPs) that are perceived by plant pattern recognition receptors (PRRs). Lysin motif receptor-like kinases (LysM-RLKs) are PRRs attuned for binding and triggering a response to specific MAMPs, including chitin oligomers (COs) in fungi, lipo-chitooligosaccharides (LCOs), which are produced by mycorrhizal fungi and nitrogen-fixing rhizobial bacteria, and peptidoglycan in bacteria. The identification and characterization of LysM-RLKs in candidate bioenergy crops including Populus are limited compared to other model plant species, thus inhibiting our ability to both understand and engineer microbe-mediated gains in plant productivity. As such, we performed a sequence analysis of LysM-RLKs in the Populus genome and predicted their function based on phylogenetic analysis with known LysM-RLKs. Then, using predictive models, molecular dynamics simulations, and comparative structural analysis with previously characterized CO and LCO plant receptors, we identified probable ligand-binding sites in Populus LysM-RLKs. Using several machine learning models, we predicted remarkably consistent binding affinity rankings of Populus proteins to CO. In addition, we used a modified Random Walk with Restart network-topology based approach to identify a subset of Populus LysM-RLKs that are functionally related and propose a corresponding signal transduction cascade. Our findings provide the first look into the role of LysM-RLKs in Populus-microbe interactions and establish a crucial jumping-off point for future research efforts to understand specificity and redundancy in microbial perception mechanisms.
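The abstract mentions a modified Random Walk with Restart (RWR) approach but does not describe the modification. As a rough illustration only, the standard, unmodified RWR iteration on a toy network might look like the sketch below. The function name, the restart probability of 0.3, and the four-node path graph are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Standard RWR (an assumption, not the paper's modified variant).

    Iterates p <- (1 - restart) * W @ p + restart * p0 until the L1 change
    between successive vectors falls below `tol`.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid dividing by zero for isolated nodes
    W = adjacency / col_sums           # column-normalised transition matrix
    p0 = np.zeros(adjacency.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds) # restart distribution over the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy undirected path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_graph = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
scores = random_walk_with_restart(toy_graph, seeds=[0])
print(scores)  # nodes closer to the seed receive higher scores
```

Nodes are then ranked by their steady-state score, so genes topologically close to the seed set rise to the top of the list.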
Affiliation(s)
- Kevin R. Cope
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Erica T. Prates
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - John I. Miller
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Omar N.A. Demerdash
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Manesh Shah
- Genome Science and Technology, The University of Tennessee–Knoxville, Knoxville, TN 37996, USA
| | - David Kainer
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Ashley Cliff
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville 37996, USA
| | - Kyle A. Sullivan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Mikaela Cashman
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Matthew Lane
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville 37996, USA
| | - Anna Matthiadis
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Jesse Labbé
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | | | - Daniel A. Jacobson
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville 37996, USA
| | - Udaya C. Kalluri
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| |
|
402
|
Calloni RD, Muchut RJ, Garay AS, Arias DG, Iglesias AA, Guerrero SA. Functional and structural characterization of an endo-β-1,3-glucanase from Euglena gracilis. Biochimie 2022; 208:117-128. [PMID: 36586565 DOI: 10.1016/j.biochi.2022.12.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 12/20/2022] [Accepted: 12/23/2022] [Indexed: 12/29/2022]
Abstract
Endo-β-1,3-glucanases from several organisms have attracted much attention in recent years because of their capability to degrade β-1,3-glucan in vitro, a critical step for both biofuel production and short-chain oligosaccharide synthesis. In this study, we biochemically characterized a putative endo-β-1,3-glucanase (EgrGH64) belonging to the family GH64 from the single-cell protist Euglena gracilis. The gene coding for the enzyme was heterologously expressed in a prokaryotic expression system supplemented with 3% (v/v) ethanol to optimize correct folding of the recombinant protein. The enzyme produced was then purified to a high degree by immobilized-metal affinity and gel filtration chromatography. The enzymatic study demonstrated that EgrGH64 could hydrolyze laminarin (KM 23.5 mg ml-1, kcat 1.20 s-1) and also, but with lower enzymatic efficiency, paramylon (KM 20.2 mg ml-1, kcat 0.23 ml mg-1 s-1). The major product of the hydrolysis of both substrates was laminaripentaose. The enzyme could also use ramified β-glucan from the baker's yeast cell wall as a substrate (KM 2.10 mg ml-1, kcat 0.88 ml mg-1 s-1). This latter result, combined with interfacial kinetic analysis, evidenced the protein's greater efficiency for the yeast polysaccharide and a higher number of hydrolysis sites in the β-1,3/β-1,6-glucan. Concurrently, the enzyme efficiently inhibited fungal growth when used at 1.0 mg/mL (15.4 μM). This study contributes to assigning a correct function and determining the enzymatic specificity of EgrGH64, which emerges as a relevant biotechnological tool for processing β-glucans.
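Two of the figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below computes the catalytic efficiency (kcat/KM) for laminarin from the stated KM and kcat, and the molar mass implied by the statement that 1.0 mg/mL corresponds to 15.4 μM. The function names are hypothetical, and the kinetic values for the other two substrates are left exactly as reported rather than reinterpreted.

```python
def catalytic_efficiency(kcat_per_s, km_mg_per_ml):
    """Return kcat / KM in ml mg^-1 s^-1 for a mass-based KM."""
    return kcat_per_s / km_mg_per_ml

def implied_molar_mass_kda(conc_mg_per_ml, conc_micromolar):
    """Molar mass (kDa) implied by two expressions of the same concentration."""
    grams_per_litre = conc_mg_per_ml         # 1 mg/mL equals 1 g/L
    mol_per_litre = conc_micromolar * 1e-6   # micromolar to molar
    return grams_per_litre / mol_per_litre / 1000.0

# Laminarin values as quoted in the abstract: KM 23.5 mg/ml, kcat 1.20 1/s.
print(round(catalytic_efficiency(1.20, 23.5), 4))   # 0.0511 ml mg^-1 s^-1
# Concentration equivalence quoted in the abstract: 1.0 mg/mL = 15.4 uM.
print(round(implied_molar_mass_kda(1.0, 15.4), 1))  # 64.9 kDa
```

The implied molar mass of roughly 65 kDa is only what the two concentration figures require; the abstract itself does not state the size of the protein.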
Affiliation(s)
- Rodrigo D Calloni
- Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Santa Fe, Argentina; Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe, Argentina
| | - Robertino J Muchut
- Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Santa Fe, Argentina
| | - Alberto S Garay
- Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe, Argentina
| | - Diego G Arias
- Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Santa Fe, Argentina; Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe, Argentina
| | - Alberto A Iglesias
- Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Santa Fe, Argentina; Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe, Argentina
| | - Sergio A Guerrero
- Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Santa Fe, Argentina; Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe, Argentina.
| |
|
403
|
Zhao W, Zhong B, Zheng L, Tan P, Wang Y, Leng H, de Souza N, Liu Z, Hong L, Xiao X. Proteome-wide 3D structure prediction provides insights into the ancestral metabolism of ancient archaea and bacteria. Nat Commun 2022; 13:7861. [PMID: 36543797 PMCID: PMC9772386 DOI: 10.1038/s41467-022-35523-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/21/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
Ancestral metabolism has remained controversial due to a lack of evidence beyond sequence-based reconstructions. Although prebiotic chemists have provided hints that metabolism might originate from non-enzymatic protometabolic pathways, gaps between ancestral reconstruction and prebiotic processes mean there is much that is still unknown. Here, we apply proteome-wide 3D structure predictions and comparisons to investigate the ancestral metabolism of ancient bacteria and archaea, to provide information beyond sequence as a bridge to the prebiotic processes. We compare representative bacterial and archaeal strains, which reveal surprisingly similar physiological and metabolic characteristics via microbiological and biophysical experiments. Pairwise comparison of protein structures identifies the conserved metabolic modules in bacteria and archaea, despite interference from overly variable sequences. The conserved modules (for example, middle of glycolysis, partial TCA, proton/sulfur respiration, building block biosynthesis) constitute the basic functions that possibly existed in the archaeal-bacterial common ancestor, which are remarkably consistent with the experimentally confirmed protometabolic pathways. These structure-based findings provide a new perspective on reconstructing the ancestral metabolism and understanding its origin, which suggests high-throughput protein 3D structure prediction is a promising approach, deserving broader application in future ancestral exploration.
Affiliation(s)
- Weishu Zhao
- State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Bozitao Zhong
- State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
- Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Lirong Zheng
- Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Pan Tan
- Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Yinzhao Wang
- State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Hao Leng
- State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Nicolas de Souza
- Australian Nuclear Science and Technology (ANSTO), Locked Bag 2001, Kirrawee DC, Sydney, NSW, 2232, Australia
| | - Zhuo Liu
- Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
- Shanghai Artificial Intelligence Laboratory, 200232, Shanghai, China
- School of Physics and Astronomy, Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Liang Hong
- Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, 200232, Shanghai, China.
- School of Physics and Astronomy, Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, 200240, Shanghai, China.
| | - Xiang Xiao
- State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China.
- Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, Guangdong, China.
| |
|
404
|
Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
|
405
|
Schäfer T, Kramer K, Werten S, Rupp B, Hoffmeister D. Characterization of the Gateway Decarboxylase for Psilocybin Biosynthesis. Chembiochem 2022; 23:e202200551. [PMID: 36327140 DOI: 10.1002/cbic.202200551] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/18/2022] [Revised: 11/01/2022] [Indexed: 11/06/2022]
Abstract
The l-tryptophan decarboxylase PsiD catalyzes the initial step of the metabolic cascade to psilocybin, the major indoleethylamine natural product of the "magic" mushrooms and a candidate drug against major depressive disorder. Unlike numerous pyridoxal phosphate (PLP)-dependent decarboxylases for natural product biosyntheses, PsiD is PLP-independent and resembles type II phosphatidylserine decarboxylases. Here, we report on the in vitro biochemical characterization of Psilocybe cubensis PsiD along with in silico modeling of the PsiD structure. A non-canonical serine protease triad for autocatalytic cleavage of the pro-protein was predicted and experimentally verified by site-directed mutagenesis.
Affiliation(s)
- Tim Schäfer
- Department Pharmaceutical Microbiology at the Hans-Knöll-Institute, Friedrich-Schiller-Universität, Beutenbergstrasse 11a, 07745, Jena, Germany
| | - Kristina Kramer
- Department Pharmaceutical Microbiology at the Hans-Knöll-Institute, Friedrich-Schiller-Universität, Beutenbergstrasse 11a, 07745, Jena, Germany
| | - Sebastiaan Werten
- Institute of Genetic Epidemiology, Medizinische Universität Innsbruck, Schöpfstrasse 41, 6020, Innsbruck, Austria
| | - Bernhard Rupp
- Institute of Genetic Epidemiology, Medizinische Universität Innsbruck, Schöpfstrasse 41, 6020, Innsbruck, Austria; k.-k. Hofkristallamt, 991 Audrey Place, Vista, CA, 92084, USA
| | - Dirk Hoffmeister
- Department Pharmaceutical Microbiology at the Hans-Knöll-Institute, Friedrich-Schiller-Universität, Beutenbergstrasse 11a, 07745, Jena, Germany
| |
|
406
|
Mérida-Quesada F, Vergara-Valladares F, Rubio-Meléndez ME, Hernández-Rojas N, González-González A, Michard E, Navarro-Retamal C, Dreyer I. TPC1-Type Channels in Physcomitrium patens: Interaction between EF-Hands and Ca2+. Plants (Basel) 2022; 11:3527. [PMID: 36559639 PMCID: PMC9783492 DOI: 10.3390/plants11243527] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2022] [Revised: 12/12/2022] [Accepted: 12/14/2022] [Indexed: 05/26/2023]
Abstract
Two-pore channels (TPCs) are members of the superfamily of ligand-gated and voltage-sensitive ion channels in the membranes of intracellular organelles of eukaryotic cells. The evolution of ordinary plant TPC1 essentially followed a very conservative pattern, with no changes in the characteristic structural footprints of these channels, such as the cytosolic and luminal regions involved in Ca2+ sensing. In contrast, the genomes of mosses and liverworts encode also TPC1-like channels with larger variations at these sites (TPC1b channels). In the genome of the model plant Physcomitrium patens we identified nine non-redundant sequences belonging to the TPC1 channel family, two ordinary TPC1-type, and seven TPC1b-type channels. The latter show variations in critical amino acids in their EF-hands essential for Ca2+ sensing. To investigate the impact of these differences between TPC1 and TPC1b channels, we generated structural models of the EF-hands of PpTPC1 and PpTPC1b channels. These models were used in molecular dynamics simulations to determine the frequency with which calcium ions were present in a coordination site and also to estimate the average distance of the ions from the center of this site. Our analyses indicate that the EF-hand domains of PpTPC1b-type channels have a lower capacity to coordinate calcium ions compared with those of common TPC1-like channels.
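The two quantities described here, how often a calcium ion sits in the coordination site and its average distance from the site center, can both be derived from a per-frame distance series. The following is a minimal sketch of that bookkeeping, assuming the distances have already been extracted from a trajectory. The 3.0 Å cutoff, the function name, and the sample values are hypothetical and are not taken from the paper.

```python
import numpy as np

def site_occupancy(distances, cutoff=3.0):
    """Summarise ion-to-site-center distances (one value per trajectory frame).

    Returns (occupancy, mean_distance), where occupancy is the fraction of
    frames in which the ion lies within `cutoff` of the site center.
    The cutoff is an illustrative assumption.
    """
    d = np.asarray(distances, dtype=float)
    occupancy = float((d <= cutoff).mean())
    mean_distance = float(d.mean())
    return occupancy, mean_distance

# Synthetic distances in angstroms for four frames, for illustration only.
occupancy, mean_distance = site_occupancy([2.0, 2.5, 4.0, 6.0])
print(occupancy, mean_distance)  # 0.5 3.625
```

Comparing these two numbers between channel variants is one simple way to express a difference in coordination capacity; the analysis in the paper may differ in its details.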
Affiliation(s)
- Franko Mérida-Quesada
- Programa de Doctorado en Ciencias mención Modelado de Sistemas Químicos y Biológicos, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
| | - Fernando Vergara-Valladares
- Programa de Doctorado en Ciencias mención Modelado de Sistemas Químicos y Biológicos, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
| | - María Eugenia Rubio-Meléndez
- Electrical Signaling in Plants (ESP) Laboratory–Centro de Bioinformática y Simulación Molecular (CBSM), Facultad de Ingeniería, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
| | - Naomí Hernández-Rojas
- Electrical Signaling in Plants (ESP) Laboratory–Centro de Bioinformática y Simulación Molecular (CBSM), Facultad de Ingeniería, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
| | - Angélica González-González
- Programa de Doctorado en Ciencias mención Biología Vegetal y Biotecnología, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
- Instituto de Ciencias Biológicas, Universidad de Talca, Campus Talca, Avenida Lircay, Talca CL-3460000, Chile
| | - Erwan Michard
- Instituto de Ciencias Biológicas, Universidad de Talca, Campus Talca, Avenida Lircay, Talca CL-3460000, Chile
| | - Carlos Navarro-Retamal
- Instituto de Ciencias Biológicas, Universidad de Talca, Campus Talca, Avenida Lircay, Talca CL-3460000, Chile
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742-5815, USA
| | - Ingo Dreyer
- Electrical Signaling in Plants (ESP) Laboratory–Centro de Bioinformática y Simulación Molecular (CBSM), Facultad de Ingeniería, Universidad de Talca, 2 Norte 685, Talca CL-3460000, Chile
| |
|
407
|
Weyer R, Hellmann MJ, Hamer-Timmermann SN, Singh R, Moerschbacher BM. Customized chitooligosaccharide production: controlling their length via engineering of rhizobial chitin synthases and the choice of expression system. Front Bioeng Biotechnol 2022; 10:1073447. [PMID: 36588959 PMCID: PMC9795070 DOI: 10.3389/fbioe.2022.1073447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 11/28/2022] [Indexed: 12/15/2022] Open
Abstract
Chitooligosaccharides (COS) have attracted attention from industry and academia in various fields due to their diverse bioactivities. However, their conventional chemical production is environmentally unfriendly and, in addition, defined and pure molecules are both scarce and expensive. A promising alternative is the in vivo synthesis of desired COS in microbial platforms with specific chitin synthases, enabling more sustainable production. Hence, we examined the whole-cell factory approach with two well-established microorganisms, Escherichia coli and Corynebacterium glutamicum, to produce defined COS with the chitin synthase NodC from Rhizobium sp. GRH2. Moreover, based on an in silico model of the synthase, two amino acids potentially relevant for COS length were identified and mutated to direct the production. Experimental validation showed the influence of the expression system, the mutations, and their combination on COS length, steering production from the original pentamers towards tetramers or hexamers, the latter virtually pure. Possible explanations are given by molecular dynamics simulations. These findings pave the way for a better understanding of chitin synthases, thus allowing a more targeted production of defined COS. This will, in turn, first allow better research into COS bioactivities and subsequently enable sustainable large-scale production of oligomers.
|
408
|
Palukaitis P, Akbarimotlagh M, Baek E, Yoon JY. The Secret Life of the Inhibitor of Virus Replication. Viruses 2022; 14:2782. [PMID: 36560786 PMCID: PMC9787567 DOI: 10.3390/v14122782] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/15/2022] [Revised: 12/11/2022] [Accepted: 12/12/2022] [Indexed: 12/15/2022] Open
Abstract
The inhibitor of virus replication (IVR) is an inducible protein that is not virus-target-specific and can be induced by several viruses. The GenBank was interrogated for sequences closely related to the tobacco IVR. Various RNA fragments from tobacco, tomato, and potato and their genomic DNA contained IVR-like sequences. However, IVRs were part of larger proteins encoded by these genomic DNA sequences, which were identified in Arabidopsis as being related to the cyclosome protein designated anaphase-promoting complex 7 (APC7). Sequence analysis of the putative APC7s of nine plant species showed proteins of 558-561 amino acids highly conserved in sequence containing at least six protein-binding elements of 34 amino acids called tetratricopeptide repeats (TPRs), which form helix-turn-helix structures. The structures of Arabidopsis APC7 and the tobacco IVR proteins were modeled using the AlphaFold program and superimposed, showing that IVR had the same structure as the C-terminal 34% of APC7, indicating that IVR was a product of the APC7 gene. Based on the presence of various transcription factor binding sites in the APC7 sequences upstream of the IVR coding sequences, we propose that IVR could be expressed by these APC7 gene sequences involving the transcription factor SHE1.
Affiliation(s)
- Peter Palukaitis
- Department of Horticulture Sciences, Seoul Women’s University, Seoul 01797, Republic of Korea
| | - Masoud Akbarimotlagh
- Plant Pathology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran 14115-111, Iran
| | - Eseul Baek
- Department of Horticulture Sciences, Seoul Women’s University, Seoul 01797, Republic of Korea
| | - Ju-Yeon Yoon
- Department of Plant Protection and Quarantine, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Department of Agricultural Convergence Technology, Jeonbuk National University, Jeonju 54896, Republic of Korea
| |
|
409
|
Delzell S, Nelson SW, Frost MP, Klingbeil MM. Trypanosoma brucei Mitochondrial DNA Polymerase POLIB Contains a Novel Polymerase Domain Insertion That Confers Dominant Exonuclease Activity. Biochemistry 2022; 61:2751-2765. [PMID: 36399653 PMCID: PMC9731263 DOI: 10.1021/acs.biochem.2c00392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 10/31/2022] [Indexed: 11/19/2022]
Abstract
Trypanosoma brucei and related parasites contain an unusual catenated mitochondrial genome known as kinetoplast DNA (kDNA) composed of maxicircles and minicircles. The kDNA structure and replication mechanism are divergent and essential for parasite survival. POLIB is one of three Family A DNA polymerases independently essential to maintain the kDNA network. However, the division of labor among the paralogs, particularly which might be a replicative, proofreading enzyme, remains enigmatic. De novo modeling of POLIB suggested a structure that is divergent from all other Family A polymerases, in which the thumb subdomain contains a 369 amino acid insertion with homology to DEDDh DnaQ family 3'-5' exonucleases. Here we demonstrate recombinant POLIB 3'-5' exonuclease prefers DNA vs RNA substrates and degrades single- and double-stranded DNA nonprocessively. Exonuclease activity prevails over polymerase activity on DNA substrates at pH 8.0, while DNA primer extension is favored at pH 6.0. Mutations that ablate POLIB polymerase activity slow the exonuclease rate suggesting crosstalk between the domains. We show that POLIB extends an RNA primer more efficiently than a DNA primer in the presence of dNTPs but does not incorporate rNTPs efficiently using either primer. Immunoprecipitation of Pol I-like paralogs from T. brucei corroborates the pH selectivity and RNA primer preferences of POLIB and revealed that the other paralogs efficiently extend a DNA primer. The enzymatic properties of POLIB suggest this paralog is not a replicative kDNA polymerase, and the noncanonical polymerase domain provides another example of exquisite diversity among DNA polymerases for specialized function.
Affiliation(s)
- Stephanie B. Delzell
- Department of Microbiology, University of Massachusetts, Amherst, Massachusetts 01003, United States
| | - Scott W. Nelson
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
| | - Matthew P. Frost
- Department of Microbiology, University of Massachusetts, Amherst, Massachusetts 01003, United States
| | - Michele M. Klingbeil
- Department of Microbiology, University of Massachusetts, Amherst, Massachusetts 01003, United States
- The Institute for Applied Life Sciences, University of Massachusetts, Amherst, Massachusetts 01003, United States
| |
|
410
|
Roney JP, Ovchinnikov S. State-of-the-Art Estimation of Protein Model Accuracy Using AlphaFold. Phys Rev Lett 2022; 129:238101. [PMID: 36563190 DOI: 10.1103/physrevlett.129.238101] [Citation(s) in RCA: 100] [Impact Index Per Article: 33.3] [Received: 06/17/2022] [Accepted: 10/18/2022] [Indexed: 06/17/2023]
Abstract
The problem of predicting a protein's 3D structure from its primary amino acid sequence is a longstanding challenge in structural biology. Recently, approaches like AlphaFold have achieved remarkable performance on this task by combining deep learning techniques with coevolutionary data from multiple sequence alignments of related protein sequences. The use of coevolutionary information is critical to these models' accuracy, and without it their predictive performance drops considerably. In living cells, however, the 3D structure of a protein is fully determined by its primary sequence and the biophysical laws that cause it to fold into a low-energy configuration. Thus, it should be possible to predict a protein's structure from only its primary sequence by learning an approximate biophysical energy function. We provide evidence that AlphaFold has learned such an energy function, and uses coevolution data to solve the global search problem of finding a low-energy conformation. We demonstrate that AlphaFold's learned energy function can be used to rank the quality of candidate protein structures with state-of-the-art accuracy, without using any coevolution data. Finally, we explore several applications of this energy function, including the prediction of protein structures without multiple sequence alignments.
Affiliation(s)
- James P Roney
- Harvard University, Cambridge, Massachusetts 02138, USA
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, Massachusetts 02138, USA
| |
|
411
|
Sánchez Rodríguez F, Chojnowski G, Keegan RM, Rigden DJ. Using deep-learning predictions of inter-residue distances for model validation. Acta Crystallogr D Struct Biol 2022; 78:1412-1427. [PMID: 36458613 PMCID: PMC9716559 DOI: 10.1107/s2059798322010415] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/11/2022] [Accepted: 10/28/2022] [Indexed: 11/27/2022] Open
Abstract
Determination of protein structures typically entails building a model that satisfies the collected experimental observations and its deposition in the Protein Data Bank. Experimental limitations can lead to unavoidable uncertainties during the process of model building, which result in the introduction of errors into the deposited model. Many metrics are available for model validation, but most are limited to consideration of the physico-chemical aspects of the model or its match to the experimental data. The latest advances in the field of deep learning have enabled the increasingly accurate prediction of inter-residue distances, an advance which has played a pivotal role in the recent improvements observed in the field of protein ab initio modelling. Here, new validation methods are presented based on the use of these precise inter-residue distance predictions, which are compared with the distances observed in the protein model. Sequence-register errors are particularly clearly detected and the register shifts required for their correction can be reliably determined. The method is available in the ConKit package (https://www.conkit.org).
Affiliation(s)
- Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
| | - Ronan M. Keegan
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| |
|
412
|
Zhao C, Liu T, Wang Z. Predicting residue-specific qualities of individual protein models using residual neural networks and graph neural networks. Proteins 2022; 90:2091-2102. [PMID: 35842895 PMCID: PMC9796650 DOI: 10.1002/prot.26400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 06/24/2022] [Accepted: 07/08/2022] [Indexed: 01/02/2023]
Abstract
The estimation of protein model accuracy (EMA) or model quality assessment (QA) is important for protein structure prediction. An accurate EMA algorithm can guide the refinement of models or pick the best model or best parts of models from a pool of predicted tertiary structures. We developed two novel methods: MASS2 and LAW, for predicting residue-specific or local qualities of individual models, which incorporate residual neural networks and graph neural networks, respectively. These two methods use similar features extracted from protein models but different architectures of neural networks to predict the local accuracies of single models. MASS2 and LAW participated in the QA category of CASP14, and according to our evaluations based on CASP14 official criteria, MASS2 and LAW are the best and second-best methods based on the Z-scores of ASE/100, AUC, and ULR-1.F1. We also evaluated MASS2, LAW, and the residue-specific predicted deviations (between model and native structure) generated by AlphaFold2 on CASP14 AlphaFold2 tertiary structure (TS) models. LAW achieved comparable or better performances compared to the predicted deviations generated by AlphaFold2 on AlphaFold2 TS models, even though LAW was not trained on any AlphaFold2 TS models. Specifically, LAW performed better on AUC and ULR scores, and AlphaFold2 performed better on ASE scores. This means that AlphaFold2 is better at predicting deviations, but LAW is better at classifying accurate and inaccurate residues and detecting unreliable local regions. MASS2 and LAW can be freely accessed from http://dna.cs.miami.edu/MASS2-CASP14/ and http://dna.cs.miami.edu/LAW-CASP14/, respectively.
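The abstract refers to ASE scores without defining them. A commonly cited form from CASP assessments is ASE = 100 × (1 − mean |S(e_i) − S(d_i)|), with S(d) = 1 / (1 + (d/d0)²) and d0 = 5 Å, where e_i is the predicted and d_i the actual per-residue deviation. The sketch below assumes that definition; the paper's exact evaluation setup may differ, and the function names and sample values are hypothetical.

```python
import numpy as np

def s_score(deviation, d0=5.0):
    """Map a deviation in angstroms to a 0..1 score; d0 = 5 A is assumed."""
    d = np.asarray(deviation, dtype=float)
    return 1.0 / (1.0 + (d / d0) ** 2)

def ase(predicted_dev, actual_dev, d0=5.0):
    """Accuracy of self-estimates under the assumed CASP-style definition."""
    diff = np.abs(s_score(predicted_dev, d0) - s_score(actual_dev, d0))
    return float(100.0 * (1.0 - diff.mean()))

# A perfect self-estimate scores 100; a single 0 A vs 5 A mismatch scores 50.
print(ase([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 100.0
print(ase([0.0], [5.0]))                      # 50.0
```

Because the S-score saturates for large deviations, ASE rewards getting small deviations right, which is consistent with the abstract's distinction between predicting deviations (ASE) and classifying residues as accurate or inaccurate (AUC).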
Affiliation(s)
- Chenguang Zhao
  - Department of Computer Science, University of Miami, Coral Gables, Florida, USA
- Tong Liu
  - Department of Computer Science, University of Miami, Coral Gables, Florida, USA
- Zheng Wang
  - Department of Computer Science, University of Miami, Coral Gables, Florida, USA
|
413
|
Leipart V, Enger Ø, Turcu DC, Dobrovolska O, Drabløs F, Halskau Ø, Amdam GV. Resolving the zinc binding capacity of honey bee vitellogenin and locating its putative binding sites. Insect Mol Biol 2022; 31:810-820. [PMID: 36054587 PMCID: PMC9804912 DOI: 10.1111/imb.12807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 08/08/2022] [Indexed: 06/15/2023]
Abstract
The protein vitellogenin (Vg) plays a central role in lipid transportation in most egg-laying animals. High Vg levels correlate with stress resistance and lifespan potential in honey bees (Apis mellifera). Vg is the primary circulating zinc-carrying protein in honey bees. Zinc is an essential metal ion in numerous biological processes, including the function and structure of many proteins. Measurements of Zn2+ suggest a variable number of ions per Vg molecule in different animal species, but the molecular implications of zinc binding by this protein are not well understood. We used inductively coupled plasma mass spectrometry to determine that, on average, each honey bee Vg molecule binds 3 Zn2+ ions. Our full-length protein structure and sequence analysis revealed seven potential zinc-binding sites. These are located in the β-barrel and α-helical subdomains of the N-terminal domain, the lipid binding site, and the cysteine-rich C-terminal region of unknown function. Interestingly, two potential zinc-binding sites in the β-barrel can support a proposed role for this structure in DNA-binding. Overall, our findings suggest that honey bee Vg binds zinc at several functional regions, indicating that Zn2+ ions are important for many of the activities of this protein. In addition to being potentially relevant for other egg-laying species, these insights provide a platform for studies of metal ions in bee health, which is of global interest due to recent declines in pollinator numbers.
Affiliation(s)
- Vilde Leipart
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
- Øyvind Enger
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU – Norwegian University of Science and Technology, Trondheim, Norway
- Øyvind Halskau
  - Department of Biological Sciences, University of Bergen, Bergen, Norway
- Gro V. Amdam
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
|
414
|
Delgado-Cunningham K, López T, Khatib F, Arias CF, DuBois RM. Structure of the divergent human astrovirus MLB capsid spike. Structure 2022; 30:1573-1581.e3. [PMID: 36417907 PMCID: PMC9722636 DOI: 10.1016/j.str.2022.10.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/16/2022] [Revised: 08/30/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022]
Abstract
Despite their worldwide prevalence and association with human disease, the molecular bases of human astrovirus (HAstV) infection and evolution remain poorly characterized. Here, we report the structure of the capsid protein spike of the divergent HAstV MLB clade (HAstV MLB). While the structure shares a similar folding topology with that of classical-clade HAstV spikes, it is otherwise strikingly different. We find no evidence of a conserved receptor-binding site between the MLB and classical HAstV spikes, suggesting that MLB and classical HAstVs utilize different receptors for host-cell attachment. We provide evidence for this hypothesis using a novel HAstV infection competition assay. Comparisons of the HAstV MLB spike structure with structures predicted from its sequence reveal poor matches, but template-based predictions were surprisingly accurate relative to machine-learning-based predictions. Our data provide a foundation for understanding the mechanisms of infection by diverse HAstVs and can support structure determination in similarly unstudied systems.
Affiliation(s)
- Kevin Delgado-Cunningham
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Tomás López
  - Departamento de Genética del Desarrollo y Fisiología Molecular, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, 62210, Mexico
- Firas Khatib
  - Department of Computer and Information Science, University of Massachusetts Dartmouth, Dartmouth, MA 02747, USA
- Carlos F Arias
  - Departamento de Genética del Desarrollo y Fisiología Molecular, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, 62210, Mexico
- Rebecca M DuBois
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
|
415
|
Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:giac118. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Sreenath Nair
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Ian Sillitoe
  - Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
- Gerardo Tauriello
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Stephen Anyango
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Stefan Bienert
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Clemente Borges
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Mandar Deshpande
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
- Andras Hatos
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
  - Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Swiss Cancer Center Leman, Lausanne 1005, Switzerland
- Tamas Hegedus
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
- Robbie Joosten
  - Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
- Dmitry Molodenskiy
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Edoardo Salladini
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Steven L Salzberg
  - Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
- Markus J Sommer
  - Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
- Martin Steinegger
  - School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
- Erzsebet Suhajda
  - Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
- Dmitri Svergun
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
  - European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
- Luiggi Tenorio-Ku
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Silvio Tosatto
  - Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Andrew Mark Waterhouse
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel 4056, Switzerland
  - Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Christine Orengo
  - Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
- Sameer Velankar
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
|
416
|
Structural and functional insights into iron acquisition from lactoferrin and transferrin in Gram-negative bacterial pathogens. Biometals 2022; 36:683-702. [PMID: 36418809 PMCID: PMC10182148 DOI: 10.1007/s10534-022-00466-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2022] [Accepted: 11/05/2022] [Indexed: 11/25/2022]
Abstract
Iron is an essential element for various lifeforms but is largely insoluble due to the oxygenation of Earth’s atmosphere and oceans during the Proterozoic era. Metazoans evolved iron transport glycoproteins, like transferrin (Tf) and lactoferrin (Lf), to keep iron in a non-toxic, usable form, while maintaining a low free iron concentration in the body that is unable to sustain bacterial growth. To survive on the mucosal surfaces of the human respiratory tract where it exclusively resides, the Gram-negative bacterial pathogen Moraxella catarrhalis utilizes surface receptors for acquiring iron directly from human Tf and Lf. The receptors are comprised of a surface lipoprotein to capture iron-loaded Tf or Lf and deliver it to a TonB-dependent transporter (TBDT) for removal of iron and transport across the outer membrane. The subsequent transport of iron into the cell is normally mediated by a periplasmic iron-binding protein and inner membrane transport complex, which has yet to be determined for Moraxella catarrhalis. We identified two potential periplasm-to-cytoplasm transport systems and performed structural and functional studies with the periplasmic binding proteins (FbpA and AfeA) to evaluate their role. Growth studies with strains deleted in the fbpA or afeA gene demonstrated that FbpA, but not AfeA, was required for growth on human Tf or Lf. The crystal structure of FbpA with bound iron in the open conformation was obtained, identifying three tyrosine ligands that were required for growth on Tf or Lf. Computational modeling of the YfeA homologue, AfeA, revealed conserved residues involved in metal binding.
|
417
|
Suchland RJ, Carrell SJ, Ramsey SA, Hybiske K, Debrine AM, Sanchez J, Celum C, Rockey DD. Genomic Analysis of MSM Rectal Chlamydia trachomatis Isolates Identifies Predicted Tissue-Tropic Lineages Generated by Intraspecies Lateral Gene Transfer-Mediated Evolution. Infect Immun 2022; 90:e0026522. [PMID: 36214558 PMCID: PMC9670952 DOI: 10.1128/iai.00265-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Accepted: 09/12/2022] [Indexed: 11/18/2022] Open
Abstract
Chlamydia trachomatis is an obligate intracellular bacterium that causes serious diseases in humans. Rectal infection and disease caused by this pathogen are important yet understudied aspects of C. trachomatis natural history. The University of Washington Chlamydia Repository has a large collection of male-rectal-sourced strains (MSM rectal strains) isolated in Seattle, USA and Lima, Peru. Initial characterization of strains collected over 30 years in both Seattle and Lima led to an association of serovars G and J with male rectal infections. Serovar D, E, and F strains were also collected from MSM patients. Genome sequence analysis of a subset of MSM rectal strains identified a clade of serovar G and J strains that had high overall genomic identity. A genome-wide association study was then used to identify genomic loci that were correlated with tissue tropism in a collection of serovar-matched male rectal and female cervical strains. The polymorphic membrane protein PmpE had the strongest correlation, and amino acid sequence alignments identified a set of PmpE variable regions (VRs) that were correlated with host or tissue tropism. Examination of the positions of VRs by the protein structure-predicting AlphaFold2 algorithm demonstrated that the VRs were often present in predicted surface-exposed loops in both the PmpE and PmpH protein structures. Collectively, these studies identify possible tropism-predictive loci for MSM rectal C. trachomatis infections and identify predicted surface-exposed variable regions of Pmp proteins that may function in MSM rectal versus cervical tropism differences.
Affiliation(s)
- Robert J. Suchland
  - Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, Washington, USA
- Steven J. Carrell
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
- Stephen A. Ramsey
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
- Kevin Hybiske
  - Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, Washington, USA
- Abigail M. Debrine
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
- Jorge Sanchez
  - Centro de Investigaciones Tecnológicas, Universidad Nacional Mayor San Marcos, Lima, Peru
- Connie Celum
  - Departments of Global Health and Medicine, University of Washington, Seattle, Washington, USA
- Daniel D. Rockey
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
|
418
|
Oeffner RD, Croll TI, Millán C, Poon BK, Schlicksup CJ, Read RJ, Terwilliger TC. Putting AlphaFold models to work with phenix.process_predicted_model and ISOLDE. Acta Crystallogr D Struct Biol 2022; 78:1303-1314. [PMID: 36322415 PMCID: PMC9629492 DOI: 10.1107/s2059798322010026] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Received: 07/22/2022] [Accepted: 10/13/2022] [Indexed: 11/23/2022] Open
Abstract
AlphaFold has recently become an important tool in providing models for experimental structure determination by X-ray crystallography and cryo-EM. Large parts of the predicted models typically approach the accuracy of experimentally determined structures, although there are frequently local errors and errors in the relative orientations of domains. Importantly, residues in the model of a protein predicted by AlphaFold are tagged with a predicted local distance difference test score, informing users about which regions of the structure are predicted with less confidence. AlphaFold also produces a predicted aligned error matrix indicating its confidence in the relative positions of each pair of residues in the predicted model. The phenix.process_predicted_model tool downweights or removes low-confidence residues and can break a model into confidently predicted domains in preparation for molecular replacement or cryo-EM docking. These confidence metrics are further used in ISOLDE to weight torsion and atom-atom distance restraints, allowing the complete AlphaFold model to be interactively rearranged to match the docked fragments and reducing the need for the rebuilding of connecting regions.
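The downweighting and removal of low-confidence residues described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Phenix implementation: it assumes the empirical relation rmsd ≈ 1.5 · exp(4 · (0.7 − pLDDT)) with pLDDT on a 0 to 1 scale, the standard isotropic conversion B = (8π²/3) · rmsd², and a trimming cutoff of 70. The constants, the cutoff, and all function names are hypothetical choices that the abstract does not specify.

```python
import math


def plddt_to_rmsd(plddt):
    """Estimate coordinate error in angstroms from pLDDT given on a 0..100 scale.

    The constants 1.5, 4.0 and 0.7 are assumed empirical values.
    """
    fraction = plddt / 100.0
    return 1.5 * math.exp(4.0 * (0.7 - fraction))


def rmsd_to_b_factor(rmsd):
    """Convert an estimated coordinate error to an isotropic B-value (angstroms squared)."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsd ** 2


def trim_low_confidence(plddts, cutoff=70.0):
    """Return the indices of residues whose pLDDT is at or above the cutoff."""
    return [i for i, value in enumerate(plddts) if value >= cutoff]


if __name__ == "__main__":
    scores = [92.0, 85.5, 41.0, 70.0]  # hypothetical per-residue pLDDT values
    for i in trim_low_confidence(scores):
        b = rmsd_to_b_factor(plddt_to_rmsd(scores[i]))
        print(f"residue {i}: pLDDT={scores[i]:.1f} B={b:.1f}")
```

Converting confidence into B-values lets downstream likelihood targets weight each atom by its expected error, while trimming removes residues whose predicted error is too large to contribute useful signal.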
Affiliation(s)
- Robert D. Oeffner
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Tristan I. Croll
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Claudia Millán
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Billy K. Poon
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory (LBNL), Building 33R0349, Berkeley, CA 94720-8235, USA
- Christopher J. Schlicksup
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory (LBNL), Building 33R0349, Berkeley, CA 94720-8235, USA
- Randy J. Read
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge Biomedical Campus, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Tom C. Terwilliger
  - New Mexico Consortium, Los Alamos National Laboratory, 100 Entrada Drive, Los Alamos, NM 87544, USA
|
419
|
Terwilliger TC, Poon BK, Afonine PV, Schlicksup CJ, Croll TI, Millán C, Richardson JS, Read RJ, Adams PD. Improved AlphaFold modeling with implicit experimental information. Nat Methods 2022; 19:1376-1382. [PMID: 36266465 PMCID: PMC9636017 DOI: 10.1038/s41592-022-01645-6] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Received: 02/02/2022] [Accepted: 09/09/2022] [Indexed: 12/02/2022]
Abstract
Machine-learning prediction algorithms such as AlphaFold and RoseTTAFold can create remarkably accurate protein models, but these models usually have some regions that are predicted with low confidence or poor accuracy. We hypothesized that by implicitly including new experimental information such as a density map, a greater portion of a model could be predicted accurately, and that this might synergistically improve parts of the model that were not fully addressed by either machine learning or experiment alone. An iterative procedure was developed in which AlphaFold models are automatically rebuilt on the basis of experimental density maps and the rebuilt models are used as templates in new AlphaFold predictions. We show that including experimental information improves prediction beyond the improvement obtained with simple rebuilding guided by the experimental data. This procedure for AlphaFold modeling with density has been incorporated into an automated procedure for interpretation of crystallographic and electron cryo-microscopy maps.
Affiliation(s)
- Thomas C Terwilliger
  - New Mexico Consortium, Los Alamos, NM, USA
  - Los Alamos National Laboratory, Los Alamos, NM, USA
- Billy K Poon
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Pavel V Afonine
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christopher J Schlicksup
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Tristan I Croll
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Claudia Millán
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Randy J Read
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Paul D Adams
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Bioengineering, University of California, Berkeley, CA, USA
|
420
|
Barger J, Adhikari B. New Labeling Methods for Deep Learning Real-Valued Inter-Residue Distance Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3586-3594. [PMID: 34559660 DOI: 10.1109/tcbb.2021.3115053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
BACKGROUND: Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction, a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions to reflect inter-residue distances in nature. Despite these promises, the accurate prediction of real-valued distances remains relatively unexplored, possibly due to classification being better suited to machine and deep learning algorithms.
METHODS: Can regression methods be designed to predict real-valued distances as precise as binary contacts? To investigate this, we propose multiple novel methods of input label engineering, which is different from feature engineering, with the goal of optimizing the distribution of distances to cater to the loss function of the deep-learning model. Since an important utility of predicted contacts or distances is to build three-dimensional models, we also tested if predicted distances can reconstruct more accurate models than contacts.
RESULTS: Our results demonstrate, for the first time, that deep learning methods for real-valued protein distance prediction can deliver distances as precise as binary classification methods. When using an optimal distance transformation function on the standard PSICOV dataset consisting of 150 representative proteins, the precision of 'top-all' long-range contacts improves from 60.9% to 61.4% when predicting real-valued distances instead of contacts. When building three-dimensional models we observed an average TM-score increase from 0.61 to 0.72, highlighting the advantage of predicting real-valued distances.
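The two ideas in this abstract, transforming distance labels before regression and scoring predicted distances as contacts, can be illustrated with a short Python sketch. The reciprocal transform below is only an illustrative choice and is not claimed to be the transformation the authors found optimal. The contact cutoff of 8 Å and the long-range sequence separation of 24 residues are conventional values assumed here, and all function names are hypothetical.

```python
def transform(distance):
    """Illustrative label transform: the reciprocal compresses long distances."""
    return 1.0 / distance


def inverse_transform(label):
    """Recover a distance in angstroms from a transformed label."""
    return 1.0 / label


def top_k_long_range_precision(pred, true, k, min_sep=24, cutoff=8.0):
    """Precision of the k shortest predicted long-range distances, scored as contacts.

    pred, true: dicts mapping residue index pairs (i, j) with i < j to distances
    in angstroms. A pair is long-range when j - i >= min_sep, and it counts as a
    true contact when its true distance is below cutoff.
    """
    pairs = [p for p in pred if p[1] - p[0] >= min_sep]
    pairs.sort(key=lambda p: pred[p])  # shortest predicted distance first
    chosen = pairs[:k]
    if not chosen:
        return 0.0
    hits = sum(1 for p in chosen if true[p] < cutoff)
    return hits / len(chosen)


if __name__ == "__main__":
    # Hypothetical predicted and true distances for four residue pairs.
    pred = {(0, 30): 6.0, (0, 40): 7.5, (5, 50): 12.0, (1, 3): 4.0}
    true = {(0, 30): 7.0, (0, 40): 9.0, (5, 50): 6.0, (1, 3): 4.5}
    print(top_k_long_range_precision(pred, true, k=2))
```

Ranking pairs by predicted distance lets a regression model be compared with a contact classifier on the same precision metric, which is the kind of comparison the reported 60.9% versus 61.4% figures describe.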
|
421
|
Ahlqvist J, Linares-Pastén JA, Jasilionis A, Welin M, Håkansson M, Svensson LA, Wang L, Watzlawick H, Ævarsson A, Friðjónsson ÓH, Hreggviðsson GÓ, Ketelsen Striberny B, Glomsaker E, Lanes O, Al-Karadaghi S, Nordberg Karlsson E. Crystal structure of DNA polymerase I from Thermus phage G20c. Acta Crystallogr D Struct Biol 2022; 78:1384-1398. [PMID: 36322421 PMCID: PMC9629493 DOI: 10.1107/s2059798322009895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Accepted: 10/10/2022] [Indexed: 11/06/2022] Open
Abstract
This study describes the structure of DNA polymerase I from Thermus phage G20c, termed PolI_G20c. This is the first structure of a DNA polymerase originating from a group of related thermophilic bacteriophages infecting Thermus thermophilus, including phages G20c, TSP4, P74-26, P23-45 and phiFA and the novel phage Tth15-6. Sequence and structural analysis of PolI_G20c revealed a 3'-5' exonuclease domain and a DNA polymerase domain, and activity screening confirmed that both domains were functional. No functional 5'-3' exonuclease domain was present. Structural analysis also revealed a novel specific structure motif, here termed SβαR, that was not previously identified in any polymerase belonging to the DNA polymerases I (or the DNA polymerase A family). The SβαR motif did not show any homology to the sequences or structures of known DNA polymerases. The exception was the sequence conservation of the residues in this motif in putative DNA polymerases encoded in the genomes of a group of thermophilic phages related to Thermus phage G20c. The structure of PolI_G20c was determined with the aid of another structure that was determined in parallel and was used as a model for molecular replacement. This other structure was of a 3'-5' exonuclease termed ExnV1. The cloned and expressed gene encoding ExnV1 was isolated from a thermophilic virus metagenome that was collected from several hot springs in Iceland. The structure of ExnV1, which contains the novel SβαR motif, was first determined to 2.19 Å resolution. With these data at hand, the structure of PolI_G20c was determined to 2.97 Å resolution. The structures of PolI_G20c and ExnV1 are most similar to those of the Klenow fragment of DNA polymerase I (PDB entry 2kzz) from Escherichia coli, DNA polymerase I from Geobacillus stearothermophilus (PDB entry 1knc) and Taq polymerase (PDB entry 1bgx) from Thermus aquaticus.
Affiliation(s)
- Josefin Ahlqvist
  - Division of Biotechnology, Department of Chemistry, Lund University, PO Box 124, 221 00 Lund, Sweden
- Javier A. Linares-Pastén
  - Division of Biotechnology, Department of Chemistry, Lund University, PO Box 124, 221 00 Lund, Sweden
- Andrius Jasilionis
  - Division of Biotechnology, Department of Chemistry, Lund University, PO Box 124, 221 00 Lund, Sweden
- Martin Welin
  - SARomics Biostructures (Sweden), Medicon Village, 223 81 Lund, Sweden
- Maria Håkansson
  - SARomics Biostructures (Sweden), Medicon Village, 223 81 Lund, Sweden
- Lei Wang
  - Institute of Biomedical Genetics, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
- Hildegard Watzlawick
  - Institute of Biomedical Genetics, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
- Guðmundur Ó. Hreggviðsson
  - Matís, Vínlandsleið 12, 113 Reykjavík, Iceland
  - Department of Biology, School of Engineering and Natural Sciences, University of Iceland, Sturlugata 7, 102 Reykjavík, Iceland
- Olav Lanes
  - ArcticZymes Technologies, PO Box 6463, 9294 Tromsø, Norway
- Eva Nordberg Karlsson
  - Division of Biotechnology, Department of Chemistry, Lund University, PO Box 124, 221 00 Lund, Sweden
|
422
|
Heo L, Feig M. Multi-state modeling of G-protein coupled receptors at experimental accuracy. Proteins 2022; 90:1873-1885. [PMID: 35510704 PMCID: PMC9561049 DOI: 10.1002/prot.26382] [Citation(s) in RCA: 114] [Impact Index Per Article: 38.0] [Received: 01/06/2022] [Revised: 04/07/2022] [Accepted: 04/26/2022] [Indexed: 12/30/2022]
Abstract
The family of G-protein coupled receptors (GPCRs) is one of the largest protein families in the human genome. GPCRs transduce chemical signals from extracellular to intracellular regions via a conformational switch between active and inactive states upon ligand binding. While experimental structures of GPCRs remain limited, high-accuracy computational predictions are now possible with AlphaFold2. However, AlphaFold2 only predicts one state and is biased toward either the active or inactive conformation depending on the GPCR class. Here, a multi-state prediction protocol is introduced that extends AlphaFold2 to predict either active or inactive states at very high accuracy using state-annotated templated GPCR databases. The predicted models accurately capture the main structural changes upon activation of the GPCR at the atomic level. For most of the benchmarked GPCRs (10 out of 15), models in the active and inactive states were closer to their corresponding activation state structures. Median RMSDs of the transmembrane regions were 1.12 Å and 1.41 Å for the active and inactive state models, respectively. The models were more suitable for protein-ligand docking than the original AlphaFold2 models and template-based models. Finally, our prediction protocol predicted accurate GPCR structures and GPCR-peptide complex structures in GPCR Dock 2021, a blind GPCR-ligand complex modeling competition. We expect that high-accuracy GPCR models in both activation states will promote understanding of GPCR activation mechanisms and drug discovery for GPCRs. At the same time, the new protocol paves the way towards capturing the dynamics of proteins at high accuracy via machine-learning methods.
Affiliation(s)
- Lim Heo
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
|
423
|
Medina A, Jiménez E, Caballero I, Castellví A, Triviño Valls J, Alcorlo M, Molina R, Hermoso JA, Sammito MD, Borges R, Usón I. Verification: model-free phasing with enhanced predicted models in ARCIMBOLDO_SHREDDER. Acta Crystallogr D Struct Biol 2022; 78:1283-1293. [PMID: 36322413 PMCID: PMC9629495 DOI: 10.1107/s2059798322009706] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/20/2022] [Accepted: 10/03/2022] [Indexed: 11/23/2022] Open
Abstract
Structure predictions have matched the accuracy of experimental structures from close homologues, providing suitable models for molecular replacement phasing. Even in predictions that present large differences due to the relative movement of domains or poorly predicted areas, very accurate regions tend to be present. These are suitable for successful fragment-based phasing as implemented in ARCIMBOLDO. The particularities of predicted models are inherently addressed in the new predicted_model mode, rendering preliminary treatment superfluous but also harmless. B-value conversion from predicted LDDT or error estimates, the removal of unstructured polypeptide, hierarchical decomposition of structural units from domains to local folds and systematically probing the model against the experimental data will ensure the optimal use of the model in phasing. Concomitantly, the exhaustive use of models and stereochemistry in phasing, refinement and validation raises the concern of crystallographic model bias and the need to critically establish the information contributed by the experiment. Therefore, in its predicted_model mode ARCIMBOLDO_SHREDDER will first determine whether the input model already constitutes a solution or provides a straightforward solution with Phaser. If not, extracted fragments will be located. If the landscape of solutions reveals numerous, clearly discriminated and consistent probes or if the input model already constitutes a solution, model-free verification will be activated. Expansions with SHELXE will omit the partial solution seeding phases and all traces outside their respective masks will be combined in ALIXE, as far as consistent. This procedure completely eliminates the molecular replacement search model in favour of the inferences derived from this model. In the case of fragments, an incorrect starting hypothesis impedes expansion. The predicted_model mode has been tested in different scenarios.
Affiliation(s)
- Ana Medina
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Elisabet Jiménez
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Iracema Caballero
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Albert Castellví
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Josep Triviño Valls
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Martin Alcorlo
- Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Rafael Molina
- Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Juan A. Hermoso
- Department of Crystallography and Structural Biology, Institute of Physical Chemistry ‘Rocasolano’, Spanish National Research Council (CSIC), Madrid, Spain
- Massimo D. Sammito
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- Rafael Borges
- Department of Biophysics and Pharmacology, Biosciences Institute, São Paulo State University (UNESP), Botucatu, Sao Paulo 18618-689, Brazil
- Isabel Usón
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys 23, 08003 Barcelona, Spain
|
424
|
Farhana R, Lei R, Pham K, Derrien V, Cedeño J, Rodriquez V, Bernad S, Lima FF, Miksovska J. Globin X: A highly stable intrinsically hexacoordinate globin. J Inorg Biochem 2022; 236:111976. [PMID: 36058051 DOI: 10.1016/j.jinorgbio.2022.111976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 08/10/2022] [Accepted: 08/18/2022] [Indexed: 12/15/2022]
Abstract
Several novel members of the vertebrate globin family were recently discovered with unique structural features that are not found in traditional penta-coordinate globins. Here we combine structural tools to better understand and recognize molecular determinants that contribute to the stability of hexacoordinate globin X (GbX) from Danio rerio (zebrafish). pH-induced unfolding data indicate increased stability of GbX, with a pHmid of 1.9 ± 0.1 for metGbXWT, 2.4 ± 0.1 for metGbXC65A, and 3.4 ± 0.1 for GbXH90V. These results are in good agreement with GbX unfolding experiments using GuHCl, where ΔGunf values of 13.8 ± 2.5 kcal mol⁻¹ and 16.3 ± 2.6 kcal mol⁻¹ are observed for the metGbXWT and metGbXC65A constructs, respectively, and diminished stability is measured for GbXH90V (ΔGunf = 9.5 ± 3.6 kcal mol⁻¹). The metGbXWT and metGbXC65A constructs also exhibit high thermal stability (melting points of 118 °C and 107 °C, respectively). Native ion mobility-mass spectrometry (IM-MS) experiments showed a narrow charge state distribution (9-12+) characteristic of a native, structured protein; a single mobility band was observed for the native states. Collision induced unfolding IM-MS experiments showed a two-state transition, in good agreement with the solution studies. GbXWT retains the heme over a wide range of charge states, suggesting strong interactions between the prosthetic group and the apoprotein. The above results indicate that in addition to the disulfide bond and the heme iron hexa-coordination, other structural determinants enhance stability of this protein.
Affiliation(s)
- Rifat Farhana
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America
- Ruipeng Lei
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America
- Khoa Pham
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America
- Valerie Derrien
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, UMR8000, 91405 Orsay, France
- Jonathan Cedeño
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America
- Veronica Rodriquez
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America
- Sophie Bernad
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, UMR8000, 91405 Orsay, France
- Francisco Fernandez Lima
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America; Biomedical Science Institute, Florida International University, Miami, FL, United States of America
- Jaroslava Miksovska
- Department of Chemistry and Biochemistry, Florida International University, Miami, FL, United States of America; Biomedical Science Institute, Florida International University, Miami, FL, United States of America.
|
425
|
He J, Turzo SBA, Seffernick JT, Kim SS, Lindert S. Prediction of Intrinsic Disorder Using Rosetta ResidueDisorder and AlphaFold2. J Phys Chem B 2022; 126:8439-8446. [PMID: 36251522 DOI: 10.1021/acs.jpcb.2c05508] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/11/2023]
Abstract
The combination of deep learning and sequence data has transformed protein structure prediction and modeling, evidenced in the success of AlphaFold (AF). For this reason, many methods have been developed to take advantage of this success in areas where inaccurate structural modeling may limit computational predictiveness. For example, many methods have been developed to predict protein intrinsic disorder from sequence, including our Rosetta ResidueDisorder (RRD) approach. Intrinsically disordered regions in proteins are parts of the sequence that do not form ordered, folded structures under typical physiological conditions. In the original implementation of RRD, Rosetta ab initio models were generated, and disordered regions were predicted based on residue scores (disordered residues typically exist in regions of unfavorable scores). In this work, we show that by (i) replacing the ab initio modeling with AF (using the same scoring and disorder assignment approach) and (ii) updating the score function, the predictiveness improved significantly. Residues were better ranked by the order/disorder, evidenced by an improvement in receiver operating characteristic area-under-the-curve from 0.69 to 0.78 on a large (229 protein) and balanced data set (relatively even ordered versus disordered residues). Finally, the binary prediction accuracy also improved from 62% to 74% on the same data set. Our results show that the combined AF-RRD approach was as good as or better than all existing methods by these metrics (AF-RRD had the highest prediction accuracy).
Affiliation(s)
- Jiadi He
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Sm Bargeen Alam Turzo
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Justin T Seffernick
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Stephanie S Kim
- School of Biological Sciences, Seoul National University, Seoul 08826, South Korea
- Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
|
426
|
A structure and evolutionary-based classification of solute carriers. iScience 2022; 25:105096. [PMID: 36164651 PMCID: PMC9508557 DOI: 10.1016/j.isci.2022.105096] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/10/2022] [Revised: 07/22/2022] [Accepted: 09/04/2022] [Indexed: 11/22/2022] Open
Abstract
Solute carriers are an operationally defined diverse family of membrane proteins involved in the transport of nutrients, metabolites, xenobiotics, and drugs. Here, we provide an integrative classification of solute carriers by combining evolutionary information with proteome-wide structure models recently made available through the AlphaFold resource. Analyses of orthologous relations among 455 protein-coding genes currently classified as human solute carriers, over the fully sequenced genomes of 2,100 species, suggest no more than approximately 180 independent evolutionary origins. Structural comparative analyses provided further insight revealing a total of 24 structurally distinct transmembrane folds, increasing by approximately 40% the number of previously described SLC structural folds. In addition, a structural comparative analysis identified a new human solute carrier member and revealed details of noncanonical ones. Our analyses uncover new ancestral relations between solute carrier genes, provide insights into the evolution of remote homologs and a platform to test hypotheses of functional deorphanization.
|
427
|
Sim J, Kwon S, Seok C. HProteome-BSite: predicted binding sites and ligands in human 3D proteome. Nucleic Acids Res 2022; 51:D403-D408. [PMID: 36243970 PMCID: PMC9825455 DOI: 10.1093/nar/gkac873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 09/20/2022] [Accepted: 09/29/2022] [Indexed: 01/29/2023] Open
Abstract
Atomic-level knowledge of protein-ligand interactions allows a detailed understanding of protein functions and provides critical clues to discovering molecules regulating the functions. While recent innovative deep learning methods for protein structure prediction dramatically increased the structural coverage of the human proteome, molecular interactions remain largely unknown. A new database, HProteome-BSite, provides predictions of binding sites and ligands in the enlarged 3D human proteome. The model structures for human proteins from the AlphaFold Protein Structure Database were processed to structural domains of high confidence to maximize the coverage and reliability of interaction prediction. For ligand binding site prediction, an updated version of a template-based method GalaxySite was used. A high-level performance of the updated GalaxySite was confirmed. HProteome-BSite covers 80.74% of the UniProt entries in the AlphaFold human 3D proteome. Predicted binding sites and binding poses of potential ligands are provided for effective applications to further functional studies and drug discovery. The HProteome-BSite database is available at https://galaxy.seoklab.org/hproteome-bsite/database and is free and open to all users.
Affiliation(s)
- Jiho Sim
- Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea, Galux Inc, Gwanak-gu, Seoul 08738, Republic of Korea
- Chaok Seok
- To whom correspondence should be addressed. Tel: +82 2 880 9197; Fax: +82 2 889 1568;
|
428
|
Wicky BIM, Milles LF, Courbet A, Ragotte RJ, Dauparas J, Kinfu E, Tipps S, Kibler RD, Baek M, DiMaio F, Li X, Carter L, Kang A, Nguyen H, Bera AK, Baker D. Hallucinating symmetric protein assemblies. Science 2022; 378:56-61. [PMID: 36108048 PMCID: PMC9724707 DOI: 10.1126/science.add1964] [Citation(s) in RCA: 100] [Impact Index Per Article: 33.3] [Indexed: 01/23/2023]
Abstract
Deep learning generative approaches provide an opportunity to broadly explore protein structure space beyond the sequences and structures of natural proteins. Here, we use deep network hallucination to generate a wide range of symmetric protein homo-oligomers given only a specification of the number of protomers and the protomer length. Crystal structures of seven designs are very similar to the computational models (median root mean square deviation: 0.6 angstroms), as are three cryo-electron microscopy structures of giant 10-nanometer rings with up to 1550 residues and C33 symmetry; all differ considerably from previously solved structures. Our results highlight the rich diversity of new protein structures that can be generated using deep learning and pave the way for the design of increasingly complex components for nanomachines and biomaterials.
Affiliation(s)
- B. I. M. Wicky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- L. F. Milles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- R. J. Ragotte
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- J. Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- E. Kinfu
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- S. Tipps
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- R. D. Kibler
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- M. Baek
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- F. DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- X. Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- L. Carter
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- H. Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. K. Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- D. Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
429
|
Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, Wicky BIM, Courbet A, de Haas RJ, Bethel N, Leung PJY, Huddy TF, Pellock S, Tischer D, Chan F, Koepnick B, Nguyen H, Kang A, Sankaran B, Bera AK, King NP, Baker D. Robust deep learning-based protein sequence design using ProteinMPNN. Science 2022; 378:49-56. [PMID: 36108050 PMCID: PMC9997061 DOI: 10.1126/science.add2187] [Citation(s) in RCA: 524] [Impact Index Per Article: 174.7] [Indexed: 01/05/2023]
Abstract
Although deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here, we describe a deep learning-based protein sequence design method, ProteinMPNN, that has outstanding performance in both in silico and experimental tests. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4% compared with 32.9% for Rosetta. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. We demonstrate the broad utility and high accuracy of ProteinMPNN using x-ray crystallography, cryo-electron microscopy, and functional studies by rescuing previously failed designs, which were made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target-binding proteins.
Affiliation(s)
- J. Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- I. Anishchenko
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- N. Bennett
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Molecular Engineering Graduate Program, University of Washington, Seattle, WA, USA
- H. Bai
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- R. J. Ragotte
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- L. F. Milles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- B. I. M. Wicky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- R. J. de Haas
- Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen, The Netherlands
- N. Bethel
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- P. J. Y. Leung
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Molecular Engineering Graduate Program, University of Washington, Seattle, WA, USA
- T. F. Huddy
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- S. Pellock
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- D. Tischer
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- F. Chan
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- B. Koepnick
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- H. Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- B. Sankaran
- Berkeley Center for Structural Biology, Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- A. K. Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- N. P. King
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- D. Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
430
|
Shao C, Bittrich S, Wang S, Burley SK. Assessing PDB macromolecular crystal structure confidence at the individual amino acid residue level. Structure 2022; 30:1385-1394.e3. [PMID: 36049478 PMCID: PMC9547844 DOI: 10.1016/j.str.2022.08.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/16/2022] [Revised: 06/24/2022] [Accepted: 08/05/2022] [Indexed: 11/22/2022]
Abstract
Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT-predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Sijian Wang
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Statistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
|
431
|
Tajwar R, Bradley DP, Ponzar NL, Tavis JE. Predicted structure of the hepatitis B virus polymerase reveals an ancient conserved protein fold. Protein Sci 2022; 31:e4421. [PMID: 36173165 PMCID: PMC9601786 DOI: 10.1002/pro.4421] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/18/2022] [Revised: 07/31/2022] [Accepted: 08/02/2022] [Indexed: 11/12/2022]
Abstract
Hepatitis B virus (HBV) chronically infects >250 million people. It replicates by a unique protein-primed reverse transcription mechanism, and the primary anti-HBV drugs are nucleos(t)ide analogs targeting the viral polymerase (P). P has four domains compared to only two in most reverse transcriptases: the terminal protein (TP) that primes DNA synthesis, a spacer, the reverse transcriptase (RT), and the ribonuclease H (RNase H). Despite being a major drug target and catalyzing a reverse transcription pathway very different from the retroviruses, HBV P has resisted structural analysis for decades. Here, we exploited computational advances to model P. The TP wrapped around the RT domain rather than forming the anticipated globular domain, with the priming tyrosine poised over the RT active site. The orientation of the RT and RNase H domains resembled that of the retroviral enzymes despite the lack of sequences analogous to the retroviral linker region. The model was validated by mapping residues with known surface exposures, docking nucleic acids, mechanistically interpreting mutations with strong phenotypes, and docking inhibitors into the RT and RNase H active sites. The HBV P fold, including the orientation of the TP domain, was conserved among hepadnaviruses infecting rodent to fish hosts and a nackednavirus, but not in other non-retroviral RTs. Therefore, this protein fold has persisted since the hepadnaviruses diverged from nackednaviruses >400 million years ago. This model will advance mechanistic analyses into the poorly understood enzymology of HBV reverse transcription and will enable drug development against non-active site targets for the first time.
Affiliation(s)
- Razia Tajwar
- Department of Molecular Microbiology and Immunology, School of Medicine and Institute for Drug and Biotherapeutic Innovation, Saint Louis University, Saint Louis, Missouri, USA
- Daniel P. Bradley
- Department of Molecular Microbiology and Immunology, School of Medicine and Institute for Drug and Biotherapeutic Innovation, Saint Louis University, Saint Louis, Missouri, USA
- Nathan L. Ponzar
- Department of Molecular Microbiology and Immunology, School of Medicine and Institute for Drug and Biotherapeutic Innovation, Saint Louis University, Saint Louis, Missouri, USA
- John E. Tavis
- Department of Molecular Microbiology and Immunology, School of Medicine and Institute for Drug and Biotherapeutic Innovation, Saint Louis University, Saint Louis, Missouri, USA
|
432
|
Tan ZW, Tee WV, Guarnera E, Berezovsky IN. AlloMAPS 2: allosteric fingerprints of the AlphaFold and Pfam-trRosetta predicted structures for engineering and design. Nucleic Acids Res 2022; 51:D345-D351. [PMID: 36169226 PMCID: PMC9825619 DOI: 10.1093/nar/gkac828] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/25/2022] [Revised: 08/26/2022] [Accepted: 09/15/2022] [Indexed: 01/29/2023] Open
Abstract
AlloMAPS 2 is an update of the Allosteric Mutation Analysis and Polymorphism of Signalling database, which contains data on allosteric communication obtained for predicted structures in the AlphaFold database (AFDB) and trRosetta-predicted Pfam domains. The data update contains Allosteric Signalling Maps (ASMs) and Allosteric Probing Maps (APMs) quantifying allosteric effects of mutations and of small probe binding, respectively. To ensure quality of the ASMs and APMs, we performed careful and accurate selection of protein sets containing high-quality predicted structures in both databases for each organism/structure, and the data is available for browsing and download. The data for remaining structures are available for download and should be used at user's discretion and responsibility. We believe these massive data can facilitate both diagnostics and drug design within the precision medicine paradigm. Specifically, it can be instrumental in the analysis of allosteric effects of pathological and rescue mutations, providing starting points for fragment-based design of allosteric effectors. The exhaustive character of allosteric signalling and probing fingerprints will be also useful in future developments of corresponding machine learning applications. The database is freely available at: http://allomaps.bii.a-star.edu.sg.
Affiliation(s)
- Zhen Wah Tan
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
| | - Wei-Ven Tee
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
| | - Enrico Guarnera
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
| | - Igor N Berezovsky
- To whom correspondence should be addressed. Tel: +65 6478 8269; Fax: +65 6478 9047;
| |
|
433
|
Teixeira LR, Fernandes TM, Silva MA, Morgado L, Salgueiro CA. Characterization of a Novel Cytochrome Involved in Geobacter sulfurreducens’ Electron Harvesting Pathways. Chemistry 2022; 28:e202202333. [DOI: 10.1002/chem.202202333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Liliana R. Teixeira
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO – Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University Lisbon, 2829-516 Caparica, Portugal
| | - Tomás M. Fernandes
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO – Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University Lisbon, 2829-516 Caparica, Portugal
| | - Marta A. Silva
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO – Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University Lisbon, 2829-516 Caparica, Portugal
| | - Leonor Morgado
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO – Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University Lisbon, 2829-516 Caparica, Portugal
| | - Carlos A. Salgueiro
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO – Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University Lisbon, 2829-516 Caparica, Portugal
| |
|
434
|
Pasquadibisceglie A, Leccese A, Polticelli F. A computational study of the structure and function of human Zrt and Irt-like proteins metal transporters: An elevator-type transport mechanism predicted by AlphaFold2. Front Chem 2022; 10:1004815. [PMID: 36204150 PMCID: PMC9530640 DOI: 10.3389/fchem.2022.1004815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 09/05/2022] [Indexed: 11/22/2022] Open
Abstract
The ZIP (Zrt and Irt-like proteins) protein family includes transporters responsible for the translocation of zinc and other transition metals, such as iron and cadmium, between the extracellular space (or the lumen of organelles) and the cytoplasm. This protein family is present at all the phylogenetic levels, including bacteria, fungi, plants, insects, and mammals. ZIP proteins are responsible for the homeostasis of metals essential for the cell physiology. The human ZIP family consists of fourteen members (hZIP1-hZIP14), divided into four subfamilies: LIV-1, containing nine hZIPs, the subfamily I, with only one member, the subfamily II, which includes three members and the subfamily gufA, which has only one member. Apart from the extracellular domain, typical of the LIV-1 subfamily, the highly conserved transmembrane domain, containing the binuclear metal center (BMC), and the histidine-rich intracellular loop are the common features characterizing the ZIP family. Here is presented a computational study of the structure and function of human ZIP family members. Multiple sequence alignment and structural models were obtained for the 14 hZIP members. Moreover, a full-length three-dimensional model of the hZIP4-homodimer complex was also produced. Different conformations of the representative hZIP transporters were obtained through a modified version of the AlphaFold2 algorithm. The inward and outward-facing conformations obtained suggest that the hZIP proteins function with an “elevator-type” mechanism.
Affiliation(s)
| | | | - Fabio Polticelli
- Department of Sciences, Roma Tre University, Rome, Italy
- National Institute of Nuclear Physics, Roma Tre Section, Rome, Italy
- *Correspondence: Fabio Polticelli,
| |
|
435
|
Liang H, Lopez IJ, Sánchez-Hidalgo M, Genilloud O, van der Donk WA. Mechanistic Studies on Dehydration in Class V Lanthipeptides. ACS Chem Biol 2022; 17:2519-2527. [PMID: 36044589 PMCID: PMC9486802 DOI: 10.1021/acschembio.2c00458] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/26/2022] [Accepted: 08/18/2022] [Indexed: 01/19/2023]
Abstract
Lanthipeptides are ribosomally synthesized and post-translationally modified peptides characterized by lanthionine (Lan) and/or methyllanthionine (MeLan) residues. Four classes of enzymes have been identified to install these structures in a substrate peptide. Recently, a novel class of lanthipeptides was discovered that lack genes for known class I-IV lanthionine synthases in their biosynthetic gene cluster (BGC). In this study, the dehydration of Ser/Thr during the biosynthesis of the class V lanthipeptide cacaoidin was reconstituted in vitro. The aminoglycoside phosphotransferase-like enzyme CaoK iteratively phosphorylates Ser/Thr residues on the precursor peptide CaoA, followed by phosphate elimination catalyzed by the HopA1 effector-like protein CaoY to achieve eight successive dehydrations. CaoY shows sequence similarity to the OspF family proteins and the lyase domains of class III/IV lanthionine synthetases, and mutagenesis studies identified residues that are critical for catalysis. An AlphaFold prediction of the structure of the dehydration enzyme complex engaged with its substrate suggests the importance of hydrophobic interactions between the CaoA leader peptide and CaoK in enzyme-substrate recognition. This model is supported by site-directed mutagenesis studies.
Affiliation(s)
- Haoqian Liang
- Department of Biochemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States
| | - Isaiah J. Lopez
- Department of Biochemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States
| | - Marina Sánchez-Hidalgo
- Fundación MEDINA Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, Avenida del Conocimiento, 34 Parque Tecnológico de Ciencias de la Salud, Armilla, 18016 Granada, Spain
| | - Olga Genilloud
- Fundación MEDINA Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, Avenida del Conocimiento, 34 Parque Tecnológico de Ciencias de la Salud, Armilla, 18016 Granada, Spain
| | - Wilfred A. van der Donk
- Department of Biochemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States
- Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana-Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801, United States
| |
|
436
|
Chen JY, Mumtaz A, Gonzales-Vigil E. Evolution and molecular basis of substrate specificity in a 3-ketoacyl-CoA synthase gene cluster from Populus trichocarpa. J Biol Chem 2022; 298:102496. [PMID: 36115459 PMCID: PMC9574513 DOI: 10.1016/j.jbc.2022.102496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 09/08/2022] [Accepted: 09/10/2022] [Indexed: 10/26/2022] Open
Abstract
Very-long-chain fatty acids (VLCFAs) are precursors to sphingolipids, glycerophospholipids, and plant cuticular waxes. In plants, members of a large 3-ketoacyl-CoA synthase (KCS) gene family catalyze the substrate-specific elongation of VLCFAs. Although it is well understood that KCSs have evolved to use diverse substrates, the underlying molecular determinants of their specificity are still unclear. In this study, we exploited the sequence similarity of a KCS gene cluster from Populus trichocarpa to examine the evolution and molecular determinants of KCS substrate specificity. Functional characterization of five members (PtKCS1, 2, 4, 8, 9) in yeast showed divergent product profiles based on VLCFA length, saturation, and position of the double bond. In addition, homology models, rationally designed chimeras, and site-directed mutants were used to identify two key regions (helix-4 and position 277) as being major determinants of substrate specificity. These results were corroborated with chimeras involving a more distantly related KCS, PtCER6 (the poplar ortholog of the Arabidopsis CER6), and used to show that helix-4 is necessary for the modulatory effect of PtCER2-like 5 on KCS substrate specificity. The role of position 277 in limiting product length was further tested by substitution with smaller amino acids, which shifted specificity towards longer products. Finally, treatment with KCS inhibitors (K3 herbicides) showed varying inhibitor sensitivities between the duplicated paralogs despite their sequence similarity. Together, this work sheds light on the molecular mechanisms driving substrate diversification in the KCS family and lays the groundwork for tailoring the production of specific VLCFAs.
Affiliation(s)
- Jeff Y Chen
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, M1C 1A4 Canada; Department of Cell and Systems Biology, University of Toronto, Toronto, M5S 3G5, Canada
| | - Arishba Mumtaz
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, M1C 1A4 Canada
| | - Eliana Gonzales-Vigil
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, M1C 1A4 Canada; Department of Cell and Systems Biology, University of Toronto, Toronto, M5S 3G5, Canada.
| |
|
437
|
Van Cauwenberghe J, Santamaría RI, Bustos P, González V. Novel lineages of single-stranded DNA phages that coevolved with the symbiotic bacteria Rhizobium. Front Microbiol 2022; 13:990394. [PMID: 36177468 PMCID: PMC9512667 DOI: 10.3389/fmicb.2022.990394] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/09/2022] [Accepted: 08/19/2022] [Indexed: 11/27/2022] Open
Abstract
This study describes novel single-stranded DNA phages isolated from common bean agriculture soils by infection of the nitrogen-fixing symbiotic bacteria Rhizobium etli and R. phaseoli. The 29 phages analyzed have genomes of 4.3-6 kb with a GC content of 59-60%. They belong to different clades unrelated to other Microviridae subfamilies. Three-dimensional models of the major capsid protein (MCP) showed a conserved β-barrel structural "jelly-roll" fold. A variable-length loop in the MCPs distinguished three Rhizobium microvirus groups. Microviridae subfamilies were consistent with viral clusters determined by the protein-sharing network. All viral clusters, except for Bullavirinae, included mostly microviruses identified in metagenomes from distinct ecosystems. Two Rhizobium microvirus clusters, chaparroviruses and chicoviruses, were included within large viral unknown clusters with microvirus genomes identified in diverse metagenomes. A third Rhizobium microvirus cluster belonged to the subfamily Amoyvirinae. Phylogenetic analysis of the MCP confirms the divergence of the Rhizobium microviruses into separate clades. The phylogeny of the bacterial hosts matches the microvirus MCP phylogeny, suggesting a coevolutionary history between the phages and their bacterial host. This study provided essential biological information on cultivated microvirus for understanding the evolution and ecological diversification of the Microviridae family in diverse microbial ecosystems.
Affiliation(s)
- Jannick Van Cauwenberghe
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
- Department of Integrative Biology, University of California, Berkeley, CA, United States
| | - Rosa I. Santamaría
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Patricia Bustos
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| | - Víctor González
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
| |
|
438
|
Liang YY, Yan LQ, Tan MH, Li GH, Fang JH, Peng JY, Li KT. Isolation, characterization, and genome sequencing of a novel chitin deacetylase producing Bacillus aryabhattai TCI-16. Front Microbiol 2022; 13:999639. [PMID: 36171752 PMCID: PMC9511218 DOI: 10.3389/fmicb.2022.999639] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/21/2022] [Accepted: 08/23/2022] [Indexed: 11/23/2022] Open
Abstract
Chitin deacetylase (CDA) is a chitin degradation enzyme that catalyzes the conversion of chitin to chitosan by the deacetylation of N-acetyl-D-glucosamine residues, playing an important role in the high-value utilization of waste chitin. The shells of shrimp and crab are rich in chitin, and mangroves are usually recognized as an active habitat for shrimp and crab. In the present study, a CDA-producing bacterium, strain TCI-16, was isolated and screened from the mangrove soil. Strain TCI-16 was identified and named as Bacillus aryabhattai TCI-16, and the maximum CDA activity in fermentation broth reached 120.35 ± 2.40 U/mL at 36 h of cultivation. Furthermore, the complete genome analysis of B. aryabhattai TCI-16 revealed the chitin-degrading enzyme system at genetic level, in which a total of 13 putative genes were associated with carbohydrate esterase 4 (CE4) family enzymes, including one gene coding CDA, seven genes encoding polysaccharide deacetylases, and five genes encoding peptidoglycan-N-acetyl glucosamine deacetylases. Amino acid sequence analysis showed that the predicted CDA of B. aryabhattai TCI-16 was composed of 236 amino acid residues with a molecular weight of 27.3 kDa, which possessed a conserved CDA active site like the known CDAs. However, the CDA of B. aryabhattai TCI-16 showed low homology (approximately 30%) with other microbial CDAs, and its phylogenetic tree belonged to a separate clade in bacteria, suggesting a high probability in structural novelty. In conclusion, the present study indicated that the novel CDA produced by B. aryabhattai TCI-16 might be a promising option for bioconversion of chitin to the value-added chitosan.
Affiliation(s)
- Ying-yin Liang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Lu-qi Yan
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Ming-hui Tan
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Gang-hui Li
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Jian-hao Fang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Jie-ying Peng
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| | - Kun-tai Li
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Technology Research Center of Seafood, Guangdong Province Engineering Laboratory for Marine Biological Products, College of Food Science and Technology, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
| |
|
439
|
Towards Molecular Understanding of the Functional Role of UbiJ-UbiK2 Complex in Ubiquinone Biosynthesis by Multiscale Molecular Modelling Studies. Int J Mol Sci 2022; 23:ijms231810323. [PMID: 36142227 PMCID: PMC9499169 DOI: 10.3390/ijms231810323] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Revised: 08/30/2022] [Accepted: 09/01/2022] [Indexed: 11/17/2022] Open
Abstract
Ubiquinone (UQ) is a polyisoprenoid lipid found in the membranes of bacteria and eukaryotes. UQ has important roles, notably in respiratory metabolisms which sustain cellular bioenergetics. Most steps of UQ biosynthesis take place in the cytosol of E. coli within a multiprotein complex called the Ubi metabolon, that contains five enzymes and two accessory proteins, UbiJ and UbiK. The SCP2 domain of UbiJ was proposed to bind the hydrophobic polyisoprenoid tail of UQ biosynthetic intermediates in the Ubi metabolon. How the newly synthesised UQ might be released in the membrane is currently unknown. In this paper, we focused on better understanding the role of the UbiJ-UbiK2 heterotrimer forming part of the metabolon. Given the difficulties to gain functional insights using biophysical techniques, we applied a multiscale molecular modelling approach to study the UbiJ-UbiK2 heterotrimer. Our data show that UbiJ-UbiK2 interacts closely with the membrane and suggests possible pathways to enable the release of UQ into the membrane. This study highlights the UbiJ-UbiK2 complex as the likely interface between the membrane and the enzymes of the Ubi metabolon and supports that the heterotrimer is key to the biosynthesis of UQ8 and its release into the membrane of E. coli.
|
440
|
del Alamo D, DeSousa L, Nair RM, Rahman S, Meiler J, Mchaourab HS. Integrated AlphaFold2 and DEER investigation of the conformational dynamics of a pH-dependent APC antiporter. Proc Natl Acad Sci U S A 2022; 119:e2206129119. [PMID: 35969794 PMCID: PMC9407458 DOI: 10.1073/pnas.2206129119] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/07/2022] [Accepted: 07/08/2022] [Indexed: 11/18/2022] Open
Abstract
The Amino Acid-Polyamine-Organocation (APC) transporter GadC contributes to the survival of pathogenic bacteria under extreme acid stress by exchanging extracellular glutamate for intracellular γ-aminobutyric acid (GABA). Its structure, determined in an inward-facing conformation at alkaline pH, consists of the canonical LeuT-fold with a conserved five-helix inverted repeat, thereby resembling functionally divergent transporters such as the serotonin transporter SERT and the glucose-sodium symporter SGLT1. However, despite this structural similarity, it is unclear if the conformational dynamics of antiporters such as GadC follow the blueprint of these or other LeuT-fold transporters. Here, we used double electron-electron resonance (DEER) spectroscopy to monitor the conformational dynamics of GadC in lipid bilayers in response to acidification and substrate binding. To guide experimental design and facilitate the interpretation of the DEER data, we generated an ensemble of structural models in multiple conformations using a recently introduced modification of AlphaFold2. Our experimental results reveal acid-induced conformational changes that dislodge the C-terminus from the permeation pathway coupled with rearrangement of helices that enables isomerization between inward- and outward-facing states. The substrate glutamate, but not GABA, modulates the dynamics of an extracellular thin gate without shifting the equilibrium between inward- and outward-facing conformations. In addition to introducing an integrated methodology for probing transporter conformational dynamics, the congruence of the DEER data with patterns of structural rearrangements deduced from ensembles of AlphaFold2 models illuminates the conformational cycle of GadC underpinning transport and exposes yet another example of the divergence between the dynamics of different families in the LeuT-fold.
Affiliation(s)
- Diego del Alamo
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37212
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212
| | - Lillian DeSousa
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37212
| | - Rahul M. Nair
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37212
| | - Suhaila Rahman
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37212
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212
- Institute for Drug Discovery, Leipzig University, Leipzig, Germany 04109
| | - Hassane S. Mchaourab
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37212
| |
|
441
|
Innovative Hybrid-Alignment Annotation Method for Bioinformatics Identification and Functional Verification of a Novel Nitric Oxide Synthase in Trichomonas vaginalis. BIOLOGY 2022; 11:biology11081210. [PMID: 36009837 PMCID: PMC9404748 DOI: 10.3390/biology11081210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 08/06/2022] [Accepted: 08/08/2022] [Indexed: 11/17/2022]
Simple Summary
Both the annotation and identification of genes in pathogenic parasites remain challenging. As a survival factor, nitric oxide (NO) has been proven to be synthesized in Trichomonas vaginalis (TV). However, nitric oxide synthase (NOS) has not yet been annotated in the TV genome. By aligning whole coding sequences of TV against a thousand sequences of known proteins from other organisms via the Smith–Waterman and Needleman–Wunsch algorithms, we developed a witness-to-suspect strategy to identify incorrectly annotated genes in TV. A novel NOS of TV (TV NOS) with a high witness-to-suspect ratio, which was originally annotated as a hydrogenase in the NCBI database, was successfully identified. We then performed in silico modeling of the protein structure and the molecular docking of all cofactors (NADPH, tetrahydrobiopterin (BH4), heme and flavin adenine dinucleotide (FAD)), cloned the gene, expressed and purified the protein, and ultimately performed mass spectrometry analysis and enzymatic activity assays. We clearly showed that although the predicted structure of TV NOS is not similar to that of NOS proteins of other species, all cofactor-binding motifs can interact with their ligands with high affinities. Most importantly, the purified protein is a functional NOS, as it has a high enzymatic activity for generating NO in vitro. This study provides an innovative approach to identify incorrectly annotated genes.
Abstract
Both the annotation and identification of genes in pathogenic parasites are still challenging. Although, as a survival factor, nitric oxide (NO) has been proven to be synthesized in Trichomonas vaginalis (TV), nitric oxide synthase (NOS) has not yet been annotated in the TV genome. We developed a witness-to-suspect strategy to identify incorrectly annotated genes in TV via the Smith–Waterman and Needleman–Wunsch algorithms through in-depth and repeated alignment of whole coding sequences of TV against thousands of sequences of known proteins from other organisms. A novel NOS of TV (TV NOS), which was annotated as hydrogenase in the NCBI database, was successfully identified; this TV NOS had a high witness-to-suspect ratio and contained all the NOS cofactor-binding motifs (NADPH, tetrahydrobiopterin (BH4), heme and flavin adenine dinucleotide (FAD) motifs). To confirm this identification, we performed in silico modeling of the protein structure and cofactor docking, cloned the gene, expressed and purified the protein, performed mass spectrometry analysis, and ultimately performed an assay to measure enzymatic activity. Our data showed that although the predicted structure of the TV NOS protein was not similar to the structure of NOSs of other species, all cofactor-binding motifs could interact with their ligands with high affinities. We clearly showed that the purified protein had high enzymatic activity for generating NO in vitro. This study provides an innovative approach to identify incorrectly annotated genes in TV and highlights a novel NOS that might serve as a virulence factor of TV.
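The global-alignment step named in this abstract (Needleman–Wunsch) can be illustrated with a minimal Python sketch. The function name and the scoring scheme (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; the abstract does not state the parameters or the software the authors used, and the witness-to-suspect ratio itself is not reproduced here.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Illustrative scoring only (hypothetical defaults): a real protein
    comparison would normally use a substitution matrix and affine gaps.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # First column: a[:i] aligned against an empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
        prev = curr
    return prev[-1]
```

Only two rows of the dynamic-programming table are kept, which is enough when the score alone is needed; recovering the alignment itself would require the full table or a traceback structure.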
|
442
|
Pervaiz I, Zahra FT, Mikelis C, Al-Ahmad AJ. An in vitro model of glucose transporter 1 deficiency syndrome at the blood-brain barrier using induced pluripotent stem cells. J Neurochem 2022; 162:483-500. [PMID: 35943296 DOI: 10.1111/jnc.15684] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/15/2022] [Revised: 07/08/2022] [Accepted: 08/03/2022] [Indexed: 11/28/2022]
Abstract
Glucose is an important source of energy for the central nervous system. Its uptake at the blood-brain barrier (BBB) is mostly mediated via glucose transporter 1 (GLUT1), a facilitated transporter encoded by the SLC2A1 gene. GLUT1 Deficiency Syndrome (GLUT1DS) is a haploinsufficiency characterized by mutations in the SLC2A1 gene, resulting in impaired glucose uptake at the BBB and clinically characterized by epileptic seizures and movement disorder. A major limitation is an absence of in vitro models of the BBB reproducing the disease. This study aimed to characterize an in vitro model of GLUT1DS using human induced pluripotent stem cells (iPSCs). Two GLUT1DS clones were generated (GLUT1-iPSC) from their original parental clone iPS(IMR90)-c4 by CRISPR/Cas9 and differentiated into brain microvascular endothelial cells (iBMECs). Cells were characterized in terms of SLC2A1 expression, changes in the barrier function, glucose uptake and metabolism, and angiogenesis. GLUT1DS iPSCs and iBMECs showed a comparable phenotype to their parental control, with the exception of reduced GLUT1 expression at the protein level. Although no major disruption in the barrier function was reported in the two clones, a significant reduction in glucose uptake accompanied by an increase in glycolysis and mitochondrial respiration was reported in both GLUT1DS-iBMECs. Finally, impaired angiogenic features were reported in such clones compared to the parental clone. Our study provides the first documented characterization of GLUT1DS-iBMECs generated by CRISPR-Cas9, suggesting that GLUT1 truncation appears detrimental to brain angiogenesis and brain endothelial bioenergetics, but may not be detrimental to iBMEC differentiation and barriergenesis. Our future direction is to further characterize the functional outcome of such truncated product, as well as its impact on other cells of the neurovascular unit.
Affiliation(s)
- Iqra Pervaiz
- Texas Tech University Health Sciences Center, Jerry H. Hodge School of Pharmacy, Department of Pharmaceutical Sciences, Amarillo, Texas, United States of America
| | - Fatema Tuz Zahra
- Texas Tech University Health Sciences Center, Jerry H. Hodge School of Pharmacy, Department of Pharmaceutical Sciences, Amarillo, Texas, United States of America
| | - Constantinos Mikelis
- Texas Tech University Health Sciences Center, Jerry H. Hodge School of Pharmacy, Department of Pharmaceutical Sciences, Amarillo, Texas, United States of America
| | - Abraham Jacob Al-Ahmad
- Texas Tech University Health Sciences Center, Jerry H. Hodge School of Pharmacy, Department of Pharmaceutical Sciences, Amarillo, Texas, United States of America
| |
|
443
|
Lee C, Su BH, Tseng YJ. Comparative studies of AlphaFold, RoseTTAFold and Modeller: a case study involving the use of G-protein-coupled receptors. Brief Bioinform 2022; 23:6658852. [PMID: 35945035 PMCID: PMC9487610 DOI: 10.1093/bib/bbac308] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 03/24/2022] [Revised: 06/22/2022] [Accepted: 07/07/2022] [Indexed: 11/13/2022] Open
Abstract
Neural network (NN)-based protein modeling methods have improved significantly in recent years. Although the overall accuracy of the two non-homology-based modeling methods, AlphaFold and RoseTTAFold, is outstanding, their performance for specific protein families has remained unexamined. G-protein-coupled receptor (GPCR) proteins are particularly interesting since they are involved in numerous pathways. This work directly compares the performance of these novel deep learning-based protein modeling methods for GPCRs with the most widely used template-based software—Modeller. We collected the experimentally determined structures of 73 GPCRs from the Protein Data Bank. The official AlphaFold repository and RoseTTAFold web service were used with default settings to predict five structures of each protein sequence. The predicted models were then aligned with the experimentally solved structures and evaluated by the root-mean-square deviation (RMSD) metric. If only looking at each program’s top-scored structure, Modeller had the smallest average modeling RMSD of 2.17 Å, which is better than AlphaFold’s 5.53 Å and RoseTTAFold’s 6.28 Å, probably since Modeller already included many known structures as templates. However, the NN-based methods (AlphaFold and RoseTTAFold) outperformed Modeller in 21 and 15 out of the 73 cases with the top-scored model, respectively, where no good templates were available for Modeller. The larger RMSD values generated by the NN-based methods were primarily due to the differences in loop prediction compared to the crystal structures.
Affiliation(s)
- Chien Lee
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Yufeng Jane Tseng
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
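The RMSD metric used in the abstract above can be illustrated with a minimal sketch. It assumes the predicted and experimental coordinates are already superposed and paired atom by atom; the superposition step itself is not shown. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes both structures are already superposed and that atoms are
    paired by index (an assumption of this sketch, not of the paper).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each paired atom.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms, for illustration only.
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 3.0, 0.0)]

print(round(rmsd(experimental, predicted), 3))
```

A lower value means the predicted model sits closer to the experimental structure; a single badly placed atom or loop can dominate the score because deviations are squared before averaging.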
|
444
|
Mugunthan SP, Harish MC. In silico structural homology modeling and functional characterization of Mycoplasma gallisepticum variable lipoprotein hemagglutin proteins. Front Vet Sci 2022; 9:943831. [PMID: 35990271 PMCID: PMC9386052 DOI: 10.3389/fvets.2022.943831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Accepted: 07/01/2022] [Indexed: 11/13/2022] Open
Abstract
Mycoplasma gallisepticum variable lipoprotein hemagglutin (vlhA) proteins are crucial for immune evasion from the host cells, permitting the persistence and survival of the pathogen. However, the exact molecular mechanism behind the immune evasion function is still not clear. In silico physicochemical analysis, domain analysis, subcellular localization, and homology modeling studies have been carried out to predict the structural and functional properties of these proteins. The outcomes of this study provide significant preliminary data for understanding immune evasion by vlhA proteins. In this study, we report the primary, secondary, and tertiary structural characteristics, the subcellular localization, the presence of a transmembrane helix and signal peptide, and the functional characteristics of vlhA proteins from M. gallisepticum strain R low. The results show variation between the structural and functional components of the proteins, signifying the role and diverse molecular mechanisms of vlhA proteins in host immune evasion. Moreover, the 3D structure predicted in this study will pave the way for understanding vlhA protein function and its interaction with other molecules during immune evasion. This study forms the basis for future experimental studies improving our understanding of the molecular mechanisms used by vlhA proteins.
|
445
|
Yin R, Feng BY, Varshney A, Pierce BG. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci 2022; 31:e4379. [PMID: 35900023 PMCID: PMC9278006 DOI: 10.1002/pro.4379] [Citation(s) in RCA: 187] [Impact Index Per Article: 62.3] [Received: 01/06/2022] [Revised: 06/06/2022] [Accepted: 06/09/2022] [Indexed: 12/17/2022]
Abstract
High-resolution experimental structural determination of protein-protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases (43%) had near-native models (medium or high Critical Assessment of Predicted Interactions [CAPRI] accuracy) generated as top-ranked predictions by AlphaFold, greatly surpassing the performance of unbound protein-protein docking (9% success rate for near-native top-ranked models); however, AlphaFold modeling of antibody-antigen complexes within our set was unsuccessful. We identified sequence and structural features associated with lack of AlphaFold success, and we also investigated the impact of multiple sequence alignment input. Benchmarking of a multimer-optimized version of AlphaFold (AlphaFold-Multimer) with a set of recently released antibody-antigen structures confirmed a low rate of success for antibody-antigen complexes (11% success), and we found that T cell receptor-antigen complexes are likewise not accurately modeled by that algorithm, showing that adaptive immune recognition poses a challenge for the current AlphaFold algorithm and model. Overall, our study demonstrates that end-to-end deep learning can accurately model many transient protein complexes, and highlights areas of improvement for future developments to reliably model any protein-protein interaction of interest.
Affiliation(s)
- Rui Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brandon Y. Feng
- Department of Computer Science, University of Maryland, College Park, Maryland, USA
- Amitabh Varshney
- Department of Computer Science, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Marlene and Stewart Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, Maryland, USA
|
446
|
Brems MA, Runkel R, Yeates TO, Virnau P. AlphaFold predicts the most complex protein knot and composite protein knots. Protein Sci 2022; 31:e4380. [PMID: 35900026 PMCID: PMC9278004 DOI: 10.1002/pro.4380] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/06/2022] [Revised: 06/09/2022] [Accepted: 06/11/2022] [Indexed: 11/06/2022]
Abstract
The computer artificial intelligence system AlphaFold has recently predicted previously unknown three-dimensional structures of thousands of proteins. Focusing on the subset with high-confidence scores, we algorithmically analyze these predictions for cases where the protein backbone exhibits rare topological complexity, that is, knotting. Amongst others, we discovered a 7₁-knot, the most topologically complex knot ever found in a protein, as well as several six-crossing composite knots comprised of two methyltransferase or carbonic anhydrase domains, each containing a simple trefoil knot. These deeply embedded composite knots occur evidently by gene duplication and interconnection of knotted dimers. Finally, we report two new five-crossing knots including the first 5₁-knot. Our list of analyzed structures forms the basis for future experimental studies to confirm these novel knotted topologies and to explore their complex folding mechanisms.
Affiliation(s)
- Maarten A. Brems
- Department of Physics, Johannes Gutenberg University Mainz, Mainz, Germany
- Robert Runkel
- Department of Physics, Johannes Gutenberg University Mainz, Mainz, Germany
- Todd O. Yeates
- UCLA-DOE Institute for Genomics and Proteomics, University of California Los Angeles, Los Angeles, California, USA
- UCLA Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, California, USA
- Peter Virnau
- Department of Physics, Johannes Gutenberg University Mainz, Mainz, Germany
|
447
|
Sen N, Anishchenko I, Bordin N, Sillitoe I, Velankar S, Baker D, Orengo C. Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs. Brief Bioinform 2022; 23:bbac187. [PMID: 35641150 PMCID: PMC9294430 DOI: 10.1093/bib/bbac187] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/05/2021] [Revised: 04/23/2022] [Accepted: 04/27/2022] [Indexed: 12/12/2022] Open
Abstract
Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Data Bank. We noticed that the model quality was higher and the root-mean-square deviation (RMSD) lower between AlphaFold and RoseTTAFold models for domains that could be assigned to CATH families as compared to those which could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein-protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, and predicted as destabilizing and pathogenic. Using models from the two state-of-the-art techniques provides better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.
Affiliation(s)
- Neeladri Sen
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Ivan Anishchenko
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
|
448
|
Hsueh SCC, Aina A, Roman AY, Cashman NR, Peng X, Plotkin SS. Optimizing Epitope Conformational Ensembles Using α-Synuclein Cyclic Peptide "Glycindel" Scaffolds: A Customized Immunogen Method for Generating Oligomer-Selective Antibodies for Parkinson's Disease. ACS Chem Neurosci 2022; 13:2261-2280. [PMID: 35840132 DOI: 10.1021/acschemneuro.1c00567] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022] Open
Abstract
Effectively presenting epitopes on immunogens, in order to raise conformationally selective antibodies through active immunization, is a central problem in treating protein misfolding diseases, particularly neurodegenerative diseases such as Alzheimer's disease or Parkinson's disease. We seek to selectively target conformations enriched in toxic, oligomeric propagating species while sparing the healthy forms of the protein that are often more abundant. To this end, we computationally modeled scaffolded epitopes in cyclic peptides by inserting/deleting a variable number of flanking glycines ("glycindels") to best mimic a misfolding-specific conformation of an epitope of α-synuclein enriched in the oligomer ensemble, as characterized by a region most readily disordered and solvent-exposed in a stressed, partially denatured protofibril. We screen and rank the cyclic peptide scaffolds of α-synuclein in silico based on their ensemble overlap properties with the fibril, oligomer-model and isolated monomer ensembles. We present experimental data of seeded aggregation that support nucleation rates consistent with computationally predicted cyclic peptide conformational similarity. We also introduce a method for screening against structured off-pathway targets in the human proteome by selecting scaffolds with minimal conformational similarity between their epitope and the same solvent-exposed primary sequence in structured human proteins. Different cyclic peptide scaffolds with variable numbers of glycines are predicted computationally to have markedly different conformational ensembles. Ensemble comparison and overlap were quantified by the Jensen-Shannon divergence and a new measure introduced here, the embedding depth, which determines the extent to which a given ensemble is subsumed by another ensemble and which may be a more useful measure in developing immunogens that confer conformational selectivity to an antibody.
Affiliation(s)
- Shawn C C Hsueh
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Adekunle Aina
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Andrei Yu Roman
- Djavad Mowafaghian Centre for Brain Health, The University of British Columbia, Vancouver, BC V6T 2B5, Canada
- Neil R Cashman
- Djavad Mowafaghian Centre for Brain Health, The University of British Columbia, Vancouver, BC V6T 2B5, Canada
- Xubiao Peng
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Steven S Plotkin
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada; Genome Science and Technology Program, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
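The Jensen-Shannon divergence named in the abstract above can be sketched for two discrete distributions, for example histograms of a conformational coordinate. This is only an illustration of the general formula with base-2 logarithms; the binning, the function names, and the toy histograms are assumptions, and the authors' embedding depth measure is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    With base-2 logarithms the value lies between 0 (identical) and
    1 (non-overlapping support). Inputs are assumed to be normalized.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Mixture distribution; it is non-zero wherever p or q is non-zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical normalized histograms over four conformational bins.
ensemble_a = [0.7, 0.2, 0.1, 0.0]
ensemble_b = [0.1, 0.2, 0.3, 0.4]

print(jensen_shannon_divergence(ensemble_a, ensemble_b))
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and stays finite when one ensemble has empty bins, which makes it convenient for comparing sampled ensembles.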
|
449
|
Hryc CF, Baker ML. AlphaFold2 and CryoEM: Revisiting CryoEM modeling in near-atomic resolution density maps. iScience 2022; 25:104496. [PMID: 35733789 PMCID: PMC9207676 DOI: 10.1016/j.isci.2022.104496] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/04/2022] [Revised: 04/07/2022] [Accepted: 05/24/2022] [Indexed: 11/27/2022] Open
Abstract
With the advent of new artificial intelligence and machine learning algorithms, predictive modeling can, in some cases, produce structures on par with experimental methods. The combination of predictive modeling and experimental structure determination by electron cryomicroscopy (cryoEM) offers a tantalizing approach for producing robust atomic models of macromolecular assemblies. Here, we apply AlphaFold2 to a set of community standard data sets and compare the results with the corresponding reference maps and models. Moreover, we present three unique case studies from previously determined cryoEM density maps of viruses. Our results show that AlphaFold2 can not only produce reasonably accurate models for analysis and additional hypotheses testing, but can also potentially yield incorrect structures if not properly validated with experimental data. Whereas we outline numerous shortcomings and potential pitfalls of predictive modeling, the obvious synergy between predictive modeling and cryoEM will undoubtedly result in new computational modeling tools.
Affiliation(s)
- Corey F. Hryc
- Department of Biochemistry and Molecular Biology, Structural Biology Imaging Center, McGovern Medical School at The University of Texas Health Science Center at Houston, 6431 Fannin Street, Houston, TX 77030, USA
- Matthew L. Baker
- Department of Biochemistry and Molecular Biology, Structural Biology Imaging Center, McGovern Medical School at The University of Texas Health Science Center at Houston, 6431 Fannin Street, Houston, TX 77030, USA
|
450
|
Ros-Lucas A, Martinez-Peinado N, Bastida J, Gascón J, Alonso-Padilla J. The Use of AlphaFold for In Silico Exploration of Drug Targets in the Parasite Trypanosoma cruzi. Front Cell Infect Microbiol 2022; 12:944748. [PMID: 35909956 PMCID: PMC9329570 DOI: 10.3389/fcimb.2022.944748] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/15/2022] [Accepted: 06/23/2022] [Indexed: 11/23/2022] Open
Abstract
Chagas disease is a devastating neglected disease caused by the parasite Trypanosoma cruzi, which affects millions of people worldwide. The two anti-parasitic drugs available, nifurtimox and benznidazole, have good efficacy against the acute stage of the infection. However, this stage is short, usually asymptomatic, and often goes undiagnosed. Access to treatment is mostly achieved during the chronic stage, when the life-threatening cardiac and/or digestive symptoms manifest. By then, the efficacy of both drugs is diminished, and their long administration regimens frequently involve adverse effects that compromise treatment compliance. Therefore, the discovery of safer and more effective drugs is an urgent need. Despite its advantages over the phenotypic screening used lately, target-based identification of new anti-parasitic molecules has been hampered by incomplete annotation and lack of structures of the parasite protein space. Presently, the AlphaFold Protein Structure Database is home to 19,036 protein models from T. cruzi, which could hold the key not only to describing new therapeutic approaches, but also to shedding light on molecular mechanisms of action for known compounds. In this proof-of-concept study, we screened the AlphaFold T. cruzi set of predicted protein models to find prospective targets for a pre-selected list of compounds with known anti-trypanosomal activity using docking-based inverse virtual screening. The best receptors (targets) for the most promising ligands were analyzed in detail to address molecular interactions and potential drugs’ mode of action. The results provide insight into the mechanisms of action of the compounds and their targets, and pave the way for new strategies for finding novel compounds or optimizing existing ones.
Affiliation(s)
- Albert Ros-Lucas
- Barcelona Institute for Global Health (ISGlobal), Hospital Clinic - University of Barcelona, Barcelona, Spain
- *Correspondence: Albert Ros-Lucas; Nieves Martinez-Peinado; Julio Alonso-Padilla
- Nieves Martinez-Peinado
- Barcelona Institute for Global Health (ISGlobal), Hospital Clinic - University of Barcelona, Barcelona, Spain
- *Correspondence: Albert Ros-Lucas; Nieves Martinez-Peinado; Julio Alonso-Padilla
- Jaume Bastida
- Departament de Biologia, Sanitat i Medi Ambient, Facultat de Farmàcia i Ciències de l'Alimentació, Universitat de Barcelona, Barcelona, Spain
- Joaquim Gascón
- Barcelona Institute for Global Health (ISGlobal), Hospital Clinic - University of Barcelona, Barcelona, Spain
- CIBERINFEC, ISCIII, CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
- Julio Alonso-Padilla
- Barcelona Institute for Global Health (ISGlobal), Hospital Clinic - University of Barcelona, Barcelona, Spain
- CIBERINFEC, ISCIII, CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
- *Correspondence: Albert Ros-Lucas; Nieves Martinez-Peinado; Julio Alonso-Padilla
|