301
Wallner B. AFsample: improving multimer prediction with AlphaFold using massive sampling. Bioinformatics 2023; 39:btad573. [PMID: 37713472 PMCID: PMC10534052 DOI: 10.1093/bioinformatics/btad573] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 03/13/2023] [Revised: 05/29/2023] [Accepted: 09/14/2023] [Indexed: 09/17/2023]
Abstract
Summary: The AlphaFold2 neural network model has revolutionized structural biology with unprecedented performance. We demonstrate that by stochastically perturbing the neural network by enabling dropout at inference combined with massive sampling, it is possible to improve the quality of the generated models. We generated ∼6000 models per target compared with 25 default for AlphaFold-Multimer, with v1 and v2 multimer network models, with and without templates, and increased the number of recycles within the network. The method was benchmarked in CASP15, and compared with AlphaFold-Multimer v2 it improved the average DockQ from 0.41 to 0.55 using identical input and was ranked at the very top in the protein assembly category when compared with all other groups participating in CASP15. The simplicity of the method should facilitate the adaptation by the field, and the method should be useful for anyone interested in modeling multimeric structures, alternate conformations, or flexible structures. Availability and implementation: AFsample is available online at http://wallnerlab.org/AFsample.
Affiliation(s)
- Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83 Linköping, Sweden

302
Simpkin AJ, Caballero I, McNicholas S, Stevenson K, Jiménez E, Sánchez Rodríguez F, Fando M, Uski V, Ballard C, Chojnowski G, Lebedev A, Krissinel E, Usón I, Rigden DJ, Keegan RM. Predicted models and CCP4. Acta Crystallogr D Struct Biol 2023; 79:806-819. [PMID: 37594303 PMCID: PMC10478639 DOI: 10.1107/s2059798323006289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/12/2023] [Accepted: 07/19/2023] [Indexed: 08/19/2023]
Abstract
In late 2020, the results of CASP14, the 14th event in a series of competitions to assess the latest developments in computational protein structure-prediction methodology, revealed the giant leap forward that had been made by Google's DeepMind in tackling the prediction problem. The level of accuracy in their predictions was the first instance of a competitor achieving a global distance test score of better than 90 across all categories of difficulty. This achievement represents both a challenge and an opportunity for the field of experimental structural biology. For structure determination by macromolecular X-ray crystallography, access to highly accurate structure predictions is of great benefit, particularly when it comes to solving the phase problem. Here, details of new utilities and enhanced applications in the CCP4 suite, designed to allow users to exploit predicted models in determining macromolecular structures from X-ray diffraction data, are presented. The focus is mainly on applications that can be used to solve the phase problem through molecular replacement.
Affiliation(s)
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Iracema Caballero
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
- Stuart McNicholas
- York Structural Biology Laboratory, Department of Chemistry, The University of York, York YO10 5DD, United Kingdom
- Kyle Stevenson
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Elisabet Jiménez
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
- Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- York Structural Biology Laboratory, Department of Chemistry, The University of York, York YO10 5DD, United Kingdom
- Maria Fando
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Ville Uski
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Charles Ballard
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
- Andrey Lebedev
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Eugene Krissinel
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Isabel Usón
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona, Spain
- ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys 23, 08003 Barcelona, Spain
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Ronan M. Keegan
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom

303
Emser SV, Spielvogel CP, Millesi E, Steinborn R. Mitochondrial polymorphism m.3017C>T of SHLP6 relates to heterothermy. Front Physiol 2023; 14:1207620. [PMID: 37675281 PMCID: PMC10478271 DOI: 10.3389/fphys.2023.1207620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023]
Abstract
Heterothermic thermoregulation requires intricate regulation of metabolic rate and activation of pro-survival factors. Eliciting these responses and coordinating the necessary energy shifts likely involves retrograde signalling by mitochondrial-derived peptides (MDPs). Members of the group were suggested before to play a role in heterothermic physiology, a key component of hibernation and daily torpor. Here we studied the mitochondrial single-nucleotide polymorphism (SNP) m.3017C>T that resides in the evolutionarily conserved gene MT-SHLP6. The substitution occurring in several mammalian orders causes truncation of SHLP6 peptide size from twenty to nine amino acids. Public mass spectrometric (MS) data of human SHLP6 indicated a canonical size of 20 amino acids, but not the use of alternative translation initiation codons that would expand the peptide. The shorter isoform of SHLP6 was found in heterothermic rodents at higher frequency compared to homeothermic rodents (p < 0.001). In heterothermic mammals it was associated with lower minimal body temperature (T_b, p < 0.001). In the thirteen-lined ground squirrel, brown adipose tissue, a key organ required for hibernation, showed dynamic changes of the steady-state transcript level of mt-Shlp6. The level was significantly higher before hibernation and during interbout arousal and lower during torpor and after hibernation. Our finding argues to further explore the mode of action of SHLP6 size isoforms with respect to mammalian thermoregulation and possibly mitochondrial retrograde signalling.
Affiliation(s)
- Sarah V. Emser
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- Genomics Core Facility, VetCore, University of Veterinary Medicine, Vienna, Austria
- Clemens P. Spielvogel
- Department of Biomedical Imaging and Image-Guided Therapy, Division of Nuclear Medicine, Medical University of Vienna, Vienna, Austria
- Eva Millesi
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- Ralf Steinborn
- Genomics Core Facility, VetCore, University of Veterinary Medicine, Vienna, Austria
- Department of Microbiology, Immunobiology and Genetics, University of Vienna, Vienna, Austria

304
Abstract
Drug development is a wide scientific field that faces many challenges these days. Among them are extremely high development costs, long development times, and a small number of new drugs that are approved each year. New and innovative technologies are needed to solve these problems that make the drug discovery process of small molecules more time and cost efficient, and that allow previously undruggable receptor classes to be targeted, such as protein-protein interactions. Structure-based virtual screenings (SBVSs) have become a leading contender in this context. In this review, we give an introduction to the foundations of SBVSs and survey their progress in the past few years with a focus on ultralarge virtual screenings (ULVSs). We outline key principles of SBVSs, recent success stories, new screening techniques, available deep learning-based docking methods, and promising future research directions. ULVSs have an enormous potential for the development of new small-molecule drugs and are already starting to transform early-stage drug discovery.
Affiliation(s)
- Christoph Gorgulla
- Harvard Medical School and Physics Department, Harvard University, Boston, Massachusetts, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Current affiliation: Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA

305
Ahdritz G, Bouatta N, Kadyan S, Jarosch L, Berenberg D, Fisk I, Watkins AM, Ra S, Bonneau R, AlQuraishi M. OpenProteinSet: Training data for structural biology at scale. arXiv 2023:arXiv:2308.05326v1. [PMID: 37608940 PMCID: PMC10441447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Multiple sequence alignments (MSAs) of proteins encode rich biological information and have been workhorses in bioinformatic methods for tasks like protein design and protein structure prediction for decades. Recent breakthroughs like AlphaFold2 that use transformers to attend directly over large quantities of raw MSAs have reaffirmed their importance. Generation of MSAs is highly computationally intensive, however, and no datasets comparable to those used to train AlphaFold2 have been made available to the research community, hindering progress in machine learning for proteins. To remedy this problem, we introduce OpenProteinSet, an open-source corpus of more than 16 million MSAs, associated structural homologs from the Protein Data Bank, and AlphaFold2 protein structure predictions. We have previously demonstrated the utility of OpenProteinSet by successfully retraining AlphaFold2 on it. We expect OpenProteinSet to be broadly useful as training and validation data for 1) diverse tasks focused on protein structure, function, and design and 2) large-scale multimodal machine learning research.
Affiliation(s)
- Nazim Bouatta
- Laboratory of Systems Pharmacology, Harvard Medical School
- Daniel Berenberg
- Prescient Design, Genentech & Department of Computer Science, New York University

306
Moussad B, Roche R, Bhattacharya D. The transformative power of transformers in protein structure prediction. Proc Natl Acad Sci U S A 2023; 120:e2303499120. [PMID: 37523536 PMCID: PMC10410766 DOI: 10.1073/pnas.2303499120] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/24/2023] [Accepted: 06/27/2023] [Indexed: 08/02/2023]
Abstract
Transformer neural networks have revolutionized structural biology with the ability to predict protein structures at unprecedented high accuracy. Here, we report the predictive modeling performance of the state-of-the-art protein structure prediction methods built on transformers for 69 protein targets from the recently concluded 15th Critical Assessment of Structure Prediction (CASP15) challenge. Our study shows the power of transformers in protein structure modeling and highlights future areas of improvement.
Affiliation(s)
- Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061

307
Aina A, Hsueh SCC, Gibbs E, Peng X, Cashman NR, Plotkin SS. De Novo Design of a β-Helix Tau Protein Scaffold: An Oligomer-Selective Vaccine Immunogen Candidate for Alzheimer's Disease. ACS Chem Neurosci 2023; 14:2603-2617. [PMID: 37458595 DOI: 10.1021/acschemneuro.3c00007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 08/03/2023]
Abstract
Tau pathology is associated with many neurodegenerative disorders, including Alzheimer's disease (AD), where the spatio-temporal pattern of tau neurofibrillary tangles strongly correlates with disease progression, which motivates therapeutics selective for misfolded tau. Here, we introduce a new avidity-enhanced, multi-epitope approach for protein-misfolding immunogen design, which is predicted to mimic the conformational state of an exposed epitope in toxic tau oligomers. A predicted oligomer-selective tau epitope 343KLDFK347 was scaffolded by designing a β-helix structure that incorporated multiple instances of the 16-residue tau fragment 339VKSEKLDFKDRVQSKI354. Large-scale conformational ensemble analyses involving Jensen-Shannon Divergence and the embedding depth D showed that the multi-epitope scaffolding approach, employed in designing the β-helix scaffold, was predicted to better discriminate toxic tau oligomers than other "monovalent" strategies utilizing a single instance of an epitope for vaccine immunogen design. Using Rosetta, 10,000 sequences were designed and screened for the linker portions of the β-helix scaffold, along with a C-terminal stabilizing α-helix that interacts with the linkers, to optimize the folded structure and stability of the scaffold. Structures were ranked by energy, and the lowest 1% (82 unique sequences) were verified using AlphaFold. Several selection criteria involving AlphaFold are implemented to obtain a lead-designed sequence. The structure was further predicted to have free energetic stability by using Hamiltonian replica exchange molecular dynamics (MD) simulations. The synthesized β-helix scaffold showed direct binding in surface plasmon resonance (SPR) experiments to several antibodies that were raised to the structured epitope using a designed cyclic peptide. Moreover, the strength of binding of these antibodies to in vitro tau oligomers correlated with the strength of binding to the β-helix construct, suggesting that the construct presents an oligomer-like conformation and may thus constitute an effective oligomer-selective immunogen.
Affiliation(s)
- Adekunle Aina
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Shawn C C Hsueh
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Ebrima Gibbs
- Djavad Mowafaghian Centre for Brain Health, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Xubiao Peng
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Neil R Cashman
- Djavad Mowafaghian Centre for Brain Health, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Steven S Plotkin
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Genome Science and Technology Program, The University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada

308
Ajith A, Subbiah U. In silico prediction of deleterious non-synonymous SNPs in STAT3. Asian Biomed 2023; 17:185-199. [PMID: 37860678 PMCID: PMC10584383 DOI: 10.2478/abm-2023-0059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
Background: STAT3, a pleiotropic transcription factor, plays a critical role in the pathogenesis of autoimmunity, cancer, and many aspects of the immune system, as well as having a link with inflammatory bowel disease. Changes caused by non-synonymous single nucleotide polymorphisms (nsSNPs) have the potential to damage the protein's structure and function. Objective: We identified disease susceptible single nucleotide polymorphisms (SNPs) in STAT3 and predicted structural changes associated with mutants that disrupt normal protein-protein interactions using different computational algorithms. Methods: Several in silico tools, such as SIFT, PolyPhen v2, PROVEAN, PhD-SNP, and SNPs&GO, were used to determine nsSNPs of the STAT3. Further, the potentially deleterious SNPs were evaluated using I-Mutant, ConSurf, and other computational tools like DynaMut for structural prediction. Result: 417 nsSNPs of STAT3 were identified, 6 of which are considered deleterious by in silico SNP prediction algorithms. Amino acid changes in V507F, R335W, E415K, K591M, F561Y, and Q32K were identified as the most deleterious nsSNPs based on the conservation profile, structural conformation, relative solvent accessibility, secondary structure prediction, and protein-protein interaction tools. Conclusion: The in silico prediction analysis could be beneficial as a diagnostic tool for both genetic counseling and mutation confirmation. The 6 deleterious nsSNPs of STAT3 may serve as potential targets for different proteomic studies, large population-based studies, diagnoses, and therapeutic interventions.
Affiliation(s)
- Athira Ajith
- Human Genetics Research Centre, Sree Balaji Dental College and Hospital, Bharath Institute of Higher Education and Research, Chennai 600100, Tamil Nadu, India
- Usha Subbiah
- Human Genetics Research Centre, Sree Balaji Dental College and Hospital, Bharath Institute of Higher Education and Research, Chennai 600100, Tamil Nadu, India

309
Sala D, Engelberger F, Mchaourab HS, Meiler J. Modeling conformational states of proteins with AlphaFold. Curr Opin Struct Biol 2023; 81:102645. [PMID: 37392556 DOI: 10.1016/j.sbi.2023.102645] [Citation(s) in RCA: 83] [Impact Index Per Article: 41.5] [Received: 03/21/2023] [Revised: 05/16/2023] [Accepted: 06/01/2023] [Indexed: 07/03/2023]
Abstract
Many proteins exert their function by switching among different structures. Knowing the conformational ensembles affiliated with these states is critical to elucidate key mechanistic aspects that govern protein function. While experimental determination efforts are still bottlenecked by cost, time, and technical challenges, the machine-learning technology AlphaFold showed near experimental accuracy in predicting the three-dimensional structure of monomeric proteins. However, an AlphaFold ensemble of models usually represents a single conformational state with minimal structural heterogeneity. Consequently, several pipelines have been proposed to either expand the structural breadth of an ensemble or bias the prediction toward a desired conformational state. Here, we analyze how those pipelines work, what they can and cannot predict, and future directions.
Affiliation(s)
- D Sala
- Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany. https://twitter.com/sala_davide
- F Engelberger
- Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany. https://twitter.com/fengel97
- H S Mchaourab
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA. https://twitter.com/Mchaourablab
- J Meiler
- Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany; Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA; Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden/Leipzig, Germany.

310
Cochereau B, Le Strat Y, Ji Q, Pawtowski A, Delage L, Weill A, Mazéas L, Hervé C, Burgaud G, Gunde-Cimerman N, Pouchus YF, Demont-Caulet N, Roullier C, Meslet-Cladiere L. Heterologous Expression and Biochemical Characterization of a New Chloroperoxidase Isolated from the Deep-Sea Hydrothermal Vent Black Yeast Hortaea werneckii UBOCC-A-208029. Marine Biotechnology (New York, N.Y.) 2023; 25:519-536. [PMID: 37354383 PMCID: PMC10427571 DOI: 10.1007/s10126-023-10222-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 06/04/2023] [Indexed: 06/26/2023]
Abstract
The initiation of this study relies on a targeted genome-mining approach to highlight the presence of a putative vanadium-dependent haloperoxidase-encoding gene in the deep-sea hydrothermal vent fungus Hortaea werneckii UBOCC-A-208029. To date, only three fungal vanadium-dependent haloperoxidases have been described, one from the terrestrial species Curvularia inaequalis, one from the fungal plant pathogen Botrytis cinerea, and one from a marine derived isolate identified as Alternaria didymospora. In this study, we describe a new vanadium chloroperoxidase from the black yeast H. werneckii, successfully cloned and overexpressed in a bacterial host, which possesses higher affinity for bromide (Km = 26 µM) than chloride (Km = 237 mM). The enzyme was biochemically characterized, and we have evaluated its potential for biocatalysis by determining its stability and tolerance in organic solvents. We also describe its potential three-dimensional structure by building a model using the AlphaFold 2 artificial intelligence tool. This model shows some conservation of the 3D structure of the active site compared to the vanadium chloroperoxidase from C. inaequalis but it also highlights some differences in the active site entrance and the volume of the active site pocket, underlining its originality.
Affiliation(s)
- Bastien Cochereau
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Institut des Substances et Organismes de la Mer, Nantes Université, ISOMER, UR, 2160, F-44000, Nantes, France
- Yoran Le Strat
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Institut des Substances et Organismes de la Mer, Nantes Université, ISOMER, UR, 2160, F-44000, Nantes, France
- Qiaolin Ji
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Institut des Substances et Organismes de la Mer, Nantes Université, ISOMER, UR, 2160, F-44000, Nantes, France
- Audrey Pawtowski
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Ludovic Delage
- Integrative Biology of Marine Models (LBI2M), UMR8227, Station Biologique de Roscoff (SBR), CNRS, Université, 29680, Roscoff, Sorbonne, France
- Amélie Weill
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Univ Brest, UBO Culture Collection (UBOCC), F-29280, Plouzané, France
- Lisa Mazéas
- Integrative Biology of Marine Models (LBI2M), UMR8227, Station Biologique de Roscoff (SBR), CNRS, Université, 29680, Roscoff, Sorbonne, France
- Cécile Hervé
- Integrative Biology of Marine Models (LBI2M), UMR8227, Station Biologique de Roscoff (SBR), CNRS, Université, 29680, Roscoff, Sorbonne, France
- Gaëtan Burgaud
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France
- Nina Gunde-Cimerman
- Molecular Genetics and Biology of Microorganisms, Dept. Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Yves François Pouchus
- Institut des Substances et Organismes de la Mer, Nantes Université, ISOMER, UR, 2160, F-44000, Nantes, France
- Nathalie Demont-Caulet
- INRAE, University of Paris, UMR ECOSYS, INRAE, Université Paris-Saclay, 78026, Versailles, AgroParisTech, France
- Catherine Roullier
- Institut des Substances et Organismes de la Mer, Nantes Université, ISOMER, UR, 2160, F-44000, Nantes, France.
- Laurence Meslet-Cladiere
- Univ Brest, INRAE, Laboratoire Universitaire de Biodiversité et Ecologie Microbienne, F-29280, Plouzané, France.

311
Simmons JR, Gasmi-Seabrook G, Rainey JK. Structural features, intrinsic disorder, and modularity of a pyriform spidroin 1 core repetitive domain. Biochem Cell Biol 2023; 101:271-283. [PMID: 36802452 DOI: 10.1139/bcb-2022-0338] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/23/2023]
Abstract
Orb-weaving spiders produce up to seven silk types, each with distinct biological roles, protein compositions, and mechanics. Pyriform (or piriform) silk is composed of pyriform spidroin 1 (PySp1) and is the fibrillar component of attachment discs that attach webs to substrates and to each other. Here, we characterize the 234-residue repeat unit (the "Py unit") from the core repetitive domain of Argiope argentata PySp1. Solution-state nuclear magnetic resonance (NMR) spectroscopy-based backbone chemical shift and dynamics analysis demonstrate a structured core flanked by disordered tails, structuring that is maintained in a tandem protein of two connected Py units, indicative of structural modularity of the Py unit in the context of the repetitive domain. Notably, AlphaFold2 predicts the Py unit structure with low confidence, echoing low confidence and poor agreement to the NMR-derived structure for the Argiope trifasciata aciniform spidroin (AcSp1) repeat unit. Rational truncation, validated through NMR spectroscopy, provided a 144-residue construct retaining the Py unit core fold, enabling near-complete backbone and side chain 1H, 13C, and 15N resonance assignment. A six α-helix globular core is inferred, flanked by regions of intrinsic disorder that would link helical bundles in tandem repeat proteins in a beads-on-a-string architecture.
Affiliation(s)
- Jeffrey R Simmons
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS B3H 4R2, Canada
- Jan K Rainey
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS B3H 4R2, Canada
- Department of Chemistry, Dalhousie University, Halifax, NS B3H 4R2, Canada
- School of Biomedical Engineering, Dalhousie University, Halifax, NS B3H 4R2, Canada

312
Arrault C, Monneau YR, Martin M, Cantrelle FX, Boll E, Chirot F, Comby Zerbino C, Walker O, Hologne M. The battle for silver binding: How the interplay between the SilE, SilF, and SilB proteins contributes to the silver efflux pump mechanism. J Biol Chem 2023; 299:105004. [PMID: 37394004 PMCID: PMC10407283 DOI: 10.1016/j.jbc.2023.105004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Revised: 06/27/2023] [Accepted: 06/28/2023] [Indexed: 07/04/2023]
Abstract
The resistance of gram-negative bacteria to silver ions is mediated by a silver efflux pump, which mainly relies on a tripartite efflux complex SilCBA, a metallochaperone SilF and an intrinsically disordered protein SilE. However, the precise mechanism by which silver ions are extruded from the cell and the different roles of SilB, SilF, and SilE remain poorly understood. To address these questions, we employed nuclear magnetic resonance and mass spectrometry to investigate the interplay between these proteins. We first solved the solution structures of SilF in its free and Ag+-bound forms, and we demonstrated that SilB exhibits two silver binding sites in its N and C termini. Conversely to the homologous Cus system, we determined that SilF and SilB interact without the presence of silver ions and that the rate of silver dissociation is eight times faster when SilF is bound to SilB, indicating the formation of a SilF-Ag-SilB intermediate complex. Finally, we have shown that SilE does not bind to either SilF or SilB, regardless of the presence or absence of silver ions, further corroborating that it merely acts as a regulator that prevents the cell from being overloaded with silver. Collectively, we have provided further insights into protein interactions within the sil system that contribute to bacterial resistance to silver ions.
Affiliation(s)
- Cyrielle Arrault
- Université de Lyon, CNRS, UCB Lyon1, Institut des Sciences Analytiques, UMR5280, Villeurbanne, France
- Yoan Rocky Monneau
- Université de Lyon, CNRS, UCB Lyon1, Institut des Sciences Analytiques, UMR5280, Villeurbanne, France; Department of Structural Biology, St Jude Children's Research Hospital, Memphis, Tennessee, USA
- Marie Martin
- Université de Lyon, CNRS, UCB Lyon1, Institut des Sciences Analytiques, UMR5280, Villeurbanne, France
- François-Xavier Cantrelle
- Université de Lille, CNRS, UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Emmanuelle Boll
- Université de Lille, CNRS, UMR8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Fabien Chirot
- Univ Lyon 1, Université Claude Bernard Lyon 1, CNRS, Institut Lumière Matière, UMR5306, Cité Lyonnaise de l'Environnement et de l'Analyse, Villeurbanne, France
- Clothilde Comby Zerbino
- Univ Lyon 1, Université Claude Bernard Lyon 1, CNRS, Institut Lumière Matière, UMR5306, Cité Lyonnaise de l'Environnement et de l'Analyse, Villeurbanne, France
- Olivier Walker
- Université de Lyon, CNRS, UCB Lyon1, Institut des Sciences Analytiques, UMR5280, Villeurbanne, France
- Maggy Hologne
- Université de Lyon, CNRS, UCB Lyon1, Institut des Sciences Analytiques, UMR5280, Villeurbanne, France.

313
Ihlenburg RBJ, Petracek D, Schrank P, Davari MD, Taubert A, Rothenstein D. Identification of the First Sulfobetaine Hydrogel-Binding Peptides via Phage Display Assay. Macromol Rapid Commun 2023; 44:e2200896. [PMID: 36703485 DOI: 10.1002/marc.202200896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 01/11/2023] [Indexed: 01/28/2023]
Abstract
Using the M13 phage display, a series of 7- and 12-mer peptides which interact with new sulfobetaine hydrogels are identified. Two peptides each from the 7- and 12-mer peptide libraries bind to the new sulfobetaine hydrogels with high affinity compared to the wild-type phage lacking a dedicated hydrogel binding peptide. This is the first report of peptides binding to zwitterionic sulfobetaine hydrogels and the study therefore opens up the pathway toward new phage or peptide/hydrogel hybrids with high application potential.
Affiliation(s)
- Ramona B J Ihlenburg
- Institute of Chemistry, University of Potsdam, Karl-Liebknecht-Straße 24-25, D-14476, Potsdam, Germany
| | - David Petracek
- Department Bioinspired Materials, Institute for Materials Science, University of Stuttgart, Heisenbergstraße 3, D-70569, Stuttgart, Germany
| | - Paul Schrank
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, D-06120, Halle, Germany
| | - Mehdi D Davari
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, D-06120, Halle, Germany
| | - Andreas Taubert
- Institute of Chemistry, University of Potsdam, Karl-Liebknecht-Straße 24-25, D-14476, Potsdam, Germany
| | - Dirk Rothenstein
- Department Bioinspired Materials, Institute for Materials Science, University of Stuttgart, Heisenbergstraße 3, D-70569, Stuttgart, Germany
314
Dendooven T, Zhang Z, Yang J, McLaughlin SH, Schwab J, Scheres SHW, Yatskevich S, Barford D. Cryo-EM structure of the complete inner kinetochore of the budding yeast point centromere. Sci Adv 2023; 9:eadg7480. [PMID: 37506202 PMCID: PMC10381965 DOI: 10.1126/sciadv.adg7480] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 01/18/2023] [Accepted: 06/26/2023] [Indexed: 07/30/2023]
Abstract
The point centromere of budding yeast specifies assembly of the large kinetochore complex to mediate chromatid segregation. Kinetochores comprise the centromere-associated inner kinetochore (CCAN) complex and the microtubule-binding outer kinetochore KNL1-MIS12-NDC80 (KMN) network. The budding yeast inner kinetochore also contains the DNA binding centromere-binding factor 1 (CBF1) and CBF3 complexes. We determined the cryo-electron microscopy structure of the yeast inner kinetochore assembled onto the centromere-specific centromere protein A nucleosomes (CENP-ANuc). This revealed a central CENP-ANuc with extensively unwrapped DNA ends. These free DNA duplexes bind two CCAN protomers, one of which entraps DNA topologically, positioned on the centromere DNA element I (CDEI) motif by CBF1. The two CCAN protomers are linked through CBF3 forming an arch-like configuration. With a structural mechanism for how CENP-ANuc can also be linked to KMN involving only CENP-QU, we present a model for inner kinetochore assembly onto a point centromere and how it organizes the outer kinetochore for chromosome attachment to the mitotic spindle.
Affiliation(s)
| | | | - Jing Yang
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
315
Medvedev KE, Schaeffer RD, Chen KS, Grishin NV. Pan-cancer structurome reveals overrepresentation of beta sandwiches and underrepresentation of alpha helical domains. Sci Rep 2023; 13:11988. [PMID: 37491511 PMCID: PMC10368619 DOI: 10.1038/s41598-023-39273-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 07/22/2023] [Indexed: 07/27/2023] Open
Abstract
The recent progress in the prediction of protein structures marked a historical milestone. AlphaFold predicted 200 million protein models with an accuracy comparable to experimental methods. Protein structures are widely used to understand evolution and to identify potential drug targets for the treatment of various diseases, including cancer. Thus, these recently predicted structures might convey previously unavailable information about cancer biology. Evolutionary classification of protein domains is challenging and different approaches exist. Recently our team presented a classification of domains from human protein models released by AlphaFold. Here we evaluated the pan-cancer structurome, domains from over- and underexpressed proteins in 21 cancer types, using the broadest levels of the ECOD classification: the architecture (A-groups) and possible homology (X-groups) levels. Our analysis reveals that AlphaFold has greatly increased the three-dimensional structural landscape for proteins that are differentially expressed in these 21 cancer types. We show that beta sandwich domains are significantly overrepresented and alpha helical domains are significantly underrepresented in the majority of cancer types. Our data suggest that the prevalence of the beta sandwiches is due to the high levels of immunoglobulins and immunoglobulin-like domains that arise during tumor development-related inflammation. On the other hand, proteins with exclusively alpha domains are important elements of homeostasis, apoptosis and transmembrane transport. Therefore, cancer cells tend to reduce representation of these proteins to promote successful oncogenesis.
Affiliation(s)
- Kirill E Medvedev
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
| | - R Dustin Schaeffer
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Kenneth S Chen
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Children's Medical Center Research Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
316
Saint-Vincent PMB, Furches A, Galanie S, Teixeira Prates E, Aldridge JL, Labbe A, Zhao N, Martin MZ, Ranjan P, Jones P, Kainer D, Kalluri UC, Chen JG, Muchero W, Jacobson DA, Tschaplinski TJ. Validation of a metabolite-GWAS network for Populus trichocarpa family 1 UDP-glycosyltransferases. Front Plant Sci 2023; 14:1210146. [PMID: 37546246 PMCID: PMC10402742 DOI: 10.3389/fpls.2023.1210146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 06/05/2023] [Indexed: 08/08/2023]
Abstract
Metabolite genome-wide association studies (mGWASs) are increasingly used to discover the genetic basis of target phenotypes in plants such as Populus trichocarpa, a biofuel feedstock and model woody plant species. Despite their growing importance in plant genetics and metabolomics, few mGWASs are experimentally validated. Here, we present a functional genomics workflow for validating mGWAS-predicted enzyme-substrate relationships. We focus on uridine diphosphate-glycosyltransferases (UGTs), a large family of enzymes that catalyze sugar transfer to a variety of plant secondary metabolites involved in defense, signaling, and lignification. Glycosylation influences physiological roles, localization within cells and tissues, and metabolic fates of these metabolites. UGTs have substantially expanded in P. trichocarpa, presenting a challenge for large-scale characterization. Using a high-throughput assay, we produced substrate acceptance profiles for 40 previously uncharacterized candidate enzymes. Assays confirmed 10 of 13 leaf mGWAS associations, and a focused metabolite screen demonstrated varying levels of substrate specificity among UGTs. A substrate binding model case study of UGT-23 rationalized observed enzyme activities and mGWAS associations, including glycosylation of trichocarpinene to produce trichocarpin, a major higher-order salicylate in P. trichocarpa. We identified UGTs putatively involved in lignan, flavonoid, salicylate, and phytohormone metabolism, with potential implications for cell wall biosynthesis, nitrogen uptake, and biotic and abiotic stress response that determine sustainable biomass crop production. Our results provide new support for in silico analyses and evidence-based guidance for in vivo functional characterization.
Affiliation(s)
- Patricia M. B. Saint-Vincent
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Anna Furches
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - Stephanie Galanie
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Protein Engineering, Merck & Co., Inc., Rahway, NJ, United States
| | - Erica Teixeira Prates
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Jessa L. Aldridge
- Department of Biomedical Sciences, Quillen College of Medicine, East Tennessee State University, Johnson City, TN, United States
| | - Audrey Labbe
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Nan Zhao
- School of Electrical Engineering, Southeast University, Nanjing, China
| | - Madhavi Z. Martin
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Priya Ranjan
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Piet Jones
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - David Kainer
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Udaya C. Kalluri
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - Jin-Gui Chen
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - Wellington Muchero
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - Daniel A. Jacobson
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, TN, United States
| | - Timothy J. Tschaplinski
- Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
317
Wu KE, Zou JY, Chang H. Machine learning modeling of RNA structures: methods, challenges and future perspectives. Brief Bioinform 2023; 24:bbad210. [PMID: 37280185 DOI: 10.1093/bib/bbad210] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2023] [Revised: 05/12/2023] [Accepted: 05/17/2023] [Indexed: 06/08/2023] Open
Abstract
The three-dimensional structure of RNA molecules plays a critical role in a wide range of cellular processes encompassing functions from riboswitches to epigenetic regulation. These RNA structures are incredibly dynamic and can indeed be described aptly as an ensemble of structures that shifts in distribution depending on different cellular conditions. Thus, the computational prediction of RNA structure poses a unique challenge, even as computational protein folding has seen great advances. In this review, we focus on a variety of machine learning-based methods that have been developed to predict RNA molecules' secondary structure, as well as more complex tertiary structures. We survey commonly used modeling strategies, and how many are inspired by or incorporate thermodynamic principles. We discuss the shortcomings that various design decisions entail and propose future directions that could build off these methods to yield more robust, accurate RNA structure predictions.
Affiliation(s)
- Kevin E Wu
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - James Y Zou
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Howard Chang
- Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
318
Matos ADS, Soares IF, Baptista BDO, de Souza HADS, Chaves LB, Perce-da-Silva DDS, Riccio EKP, Albrecht L, Totino PRR, Rodrigues-da-Silva RN, Daniel-Ribeiro CT, Pratt-Riccio LR, Lima-Junior JDC. Construction, Expression, and Evaluation of the Naturally Acquired Humoral Immune Response against Plasmodium vivax RMC-1, a Multistage Chimeric Protein. Int J Mol Sci 2023; 24:11571. [PMID: 37511330 PMCID: PMC10380678 DOI: 10.3390/ijms241411571] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/07/2023] [Revised: 07/06/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023] Open
Abstract
The PvCelTOS, PvCyRPA, and Pvs25 proteins play important roles during the three stages of the P. vivax lifecycle. In this study, we designed and expressed a P. vivax recombinant modular chimeric protein (PvRMC-1) composed of the main antigenic regions of these vaccine candidates. After structure modelling by prediction, the chimeric protein was expressed, and the antigenicity was assessed by IgM and IgG (total and subclass) ELISA in 301 naturally exposed individuals from the Brazilian Amazon. The recombinant protein was recognized by IgG (54%) and IgM (40%) antibodies in the studied individuals, confirming the natural immunogenicity of the epitopes that composed PvRMC-1 as well as its maintenance in the chimeric structure. Among responders, a predominant cytophilic response mediated by IgG1 (70%) and IgG3 (69%) was observed. IgM levels were inversely correlated with age and time of residence in endemic areas (p < 0.01). By contrast, the IgG and IgM reactivity indexes were positively correlated with each other, and both were inversely correlated with the time of the last malaria episode. The study demonstrates that PvRMC-1 was successfully expressed and targeted by natural antibodies, providing important insights into the construction of a multistage chimeric recombinant protein and the use of naturally acquired antibodies to validate the construction.
Affiliation(s)
- Ada da Silva Matos
- Laboratório de Imunoparasitologia, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Isabela Ferreira Soares
- Laboratório de Imunoparasitologia, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Barbara de Oliveira Baptista
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Hugo Amorim Dos Santos de Souza
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Lana Bitencourt Chaves
- Laboratório de Imunoparasitologia, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Daiana de Souza Perce-da-Silva
- Laboratório de Imunologia Básica e Aplicada, Centro Universitário Arthur Sá Earp Neto/Faculdade de Medicina de Petrópolis (UNIFASE/FMP), Petrópolis 25680-120, RJ, Brazil
- Laboratório de Imunologia Clínica, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Evelyn Kety Pratt Riccio
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Letusa Albrecht
- Laboratório de Pesquisa em Apicomplexa, Instituto Carlos Chagas, Curitiba 81350-010, PR, Brazil
| | - Paulo Renato Rivas Totino
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
| | - Rodrigo Nunes Rodrigues-da-Silva
- Laboratório de Tecnologia Imunológica, Instituto de Tecnologia em Imunobiológicos (Bio-Manguinhos), Fiocruz, Rio de Janeiro 21040-900, RJ, Brazil
| | - Cláudio Tadeu Daniel-Ribeiro
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
- Centro de Pesquisa, Diagnóstico e Treinamento em Malária (CPD-Mal), Fiocruz e Secretaria de Vigilância em Saúde, Ministério da Saúde, Rio de Janeiro 21040-900, RJ, Brazil
| | - Lilian Rose Pratt-Riccio
- Laboratório de Pesquisa em Malária, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
- Centro de Pesquisa, Diagnóstico e Treinamento em Malária (CPD-Mal), Fiocruz e Secretaria de Vigilância em Saúde, Ministério da Saúde, Rio de Janeiro 21040-900, RJ, Brazil
| | - Josué da Costa Lima-Junior
- Laboratório de Imunoparasitologia, Instituto Oswaldo Cruz (IOC), Fundação Oswaldo Cruz (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
319
Rocha ST, Shah DD, Zhu Q, Shrivastava A. The prevalence of motility within the human oral microbiota. bioRxiv 2023:2023.07.17.549387. [PMID: 37503047 PMCID: PMC10370060 DOI: 10.1101/2023.07.17.549387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
The human oral and nasal microbiota contains approximately 770 cultivable bacterial species. More than 2000 genome sequences of these bacteria can be found in the expanded Human Oral Microbiome Database (eHOMD). We developed HOMDscrape, a freely available Python software tool to programmatically retrieve and process amino acid sequences and sequence identifiers from BLAST results acquired from the eHOMD website. Using the data obtained through HOMDscrape, the phylogeny of proteins involved in bacterial flagellar motility, Type 4 pilus driven twitching motility, and Type 9 Secretion system (T9SS) driven gliding motility was constructed. A comprehensive phylogenetic analysis was conducted for all components of the rotary T9SS, a machinery responsible for secreting various enzymes, virulence factors, and enabling bacterial gliding motility. Results revealed that the T9SS outer membrane β-barrel protein SprA of human oral microbes underwent horizontal evolution. Overall, we catalog motile microbes that inhabit the human oral microbiota and document their evolutionary connections. These results will serve as a guide for further studies exploring the impact of motility on shaping of the human oral microbiota.
320
Vallat B, Tauriello G, Bienert S, Haas J, Webb BM, Žídek A, Zheng W, Peisach E, Piehl DW, Anischanka I, Sillitoe I, Tolchard J, Varadi M, Baker D, Orengo C, Zhang Y, Hoch JC, Kurisu G, Patwardhan A, Velankar S, Burley SK, Sali A, Schwede T, Berman HM, Westbrook JD. ModelCIF: An Extension of PDBx/mmCIF Data Representation for Computed Structure Models. J Mol Biol 2023; 435:168021. [PMID: 36828268 PMCID: PMC10293049 DOI: 10.1016/j.jmb.2023.168021] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/29/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 02/24/2023]
Abstract
ModelCIF (github.com/ihmwg/ModelCIF) is a data information framework developed for and by computational structural biologists to enable delivery of Findable, Accessible, Interoperable, and Reusable (FAIR) data to users worldwide. ModelCIF describes the specific set of attributes and metadata associated with macromolecular structures modeled by solely computational methods and provides an extensible data representation for deposition, archiving, and public dissemination of predicted three-dimensional (3D) models of macromolecules. It is an extension of the Protein Data Bank Exchange / macromolecular Crystallographic Information Framework (PDBx/mmCIF), which is the global data standard for representing experimentally-determined 3D structures of macromolecules and associated metadata. The PDBx/mmCIF framework and its extensions (e.g., ModelCIF) are managed by the Worldwide Protein Data Bank partnership (wwPDB, wwpdb.org) in collaboration with relevant community stakeholders such as the wwPDB ModelCIF Working Group (wwpdb.org/task/modelcif). This semantically rich and extensible data framework for representing computed structure models (CSMs) accelerates the pace of scientific discovery. Herein, we describe the architecture, contents, and governance of ModelCIF, and tools and processes for maintaining and extending the data standard. Community tools and software libraries that support ModelCIF are also described.
Affiliation(s)
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Juergen Haas
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Benjamin M Webb
- Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA
| | | | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ivan Anischanka
- Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Ian Sillitoe
- Department of Structural and Molecular Biology, UCL, London, UK
| | - James Tolchard
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Mihaly Varadi
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - David Baker
- Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | | | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, Farmington, CT 06030, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
| | - Ardan Patwardhan
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Sameer Velankar
- AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA. https://twitter.com/salilab_ucsf
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
321
Bittrich S, Bhikadiya C, Bi C, Chao H, Duarte JM, Dutta S, Fayazi M, Henry J, Khokhriakov I, Lowe R, Piehl DW, Segura J, Vallat B, Voigt M, Westbrook JD, Burley SK, Rose Y. RCSB Protein Data Bank: Efficient Searching and Simultaneous Access to One Million Computed Structure Models Alongside the PDB Structures Enabled by Architectural Advances. J Mol Biol 2023; 435:167994. [PMID: 36738985 PMCID: PMC11514064 DOI: 10.1016/j.jmb.2023.167994] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/18/2022] [Revised: 01/27/2023] [Accepted: 01/28/2023] [Indexed: 02/05/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides open access to experimentally-determined three-dimensional (3D) structures of biomolecules. The RCSB PDB RCSB.org research-focused web portal is used annually by many millions of users around the world. They access biostructure information, run complex queries utilizing various search services (e.g., full-text, structural and chemical attribute, chemical, sequence, and structure similarity searches), and visualize macromolecules in 3D, all at no charge and with no limitations on data usage. Notwithstanding more than 24,000-fold growth of the PDB over the past five decades, experimentally-determined structures are only available for a small subset of the millions of proteins of known sequence. Recently developed machine learning software tools can predict 3D structures of proteins at accuracies comparable to lower-resolution experimental methods. The RCSB PDB now provides access to ∼1,000,000 Computed Structure Models (CSMs) of proteins coming from AlphaFold DB and the ModelArchive alongside ∼200,000 experimentally-determined PDB structures. Both CSMs and PDB structures are available on RCSB.org and via well-established RCSB PDB Data, Search, and 1D-Coordinates application programming interfaces (APIs). Simultaneous delivery of PDB data and CSMs provides users with access to complementary structural information across the human proteome and those of model organisms and selected pathogens. API enhancements are backwards-compatible and programmatic users can "opt in" to access CSMs with minimal effort. Herein, we describe modifications to RCSB PDB cyberinfrastructure required to support sixfold scaling of 3D biostructure data delivery and lay the groundwork for scaling to accommodate hundreds of millions of CSMs.
Affiliation(s)
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA.
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
322
Reggiano G, Lugmayr W, Farrell D, Marlovits TC, DiMaio F. Residue-level error detection in cryoelectron microscopy models. Structure 2023; 31:860-869.e4. [PMID: 37253357 PMCID: PMC10330749 DOI: 10.1016/j.str.2023.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/16/2023] [Accepted: 05/03/2023] [Indexed: 06/01/2023]
Abstract
Building accurate protein models into moderate resolution (3-5 Å) cryoelectron microscopy (cryo-EM) maps is challenging and error prone. We have developed MEDIC (Model Error Detection in Cryo-EM), a robust statistical model that identifies local backbone errors in protein structures built into cryo-EM maps by combining local fit-to-density with deep-learning-derived structural information. MEDIC is validated on a set of 28 structures that were subsequently solved to higher resolutions, where we identify the differences between low- and high-resolution structures with 68% precision and 60% recall. We additionally use this model to fix over 100 errors in 12 deposited structures and to identify errors in 4 refined AlphaFold predictions with 80% precision and 60% recall. As modelers more frequently use deep learning predictions as a starting point for refinement and rebuilding, MEDIC's ability to handle errors in structures derived from hand-building and machine learning methods makes it a powerful tool for structural biologists.
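For readers less familiar with the two metrics quoted in this abstract: precision is the fraction of flagged residues that are real errors, and recall is the fraction of real errors that were flagged. The sketch below illustrates only those standard definitions. The function name and the residue numbers are hypothetical and are not taken from MEDIC or its benchmark set.

```python
def precision_recall(flagged, true_errors):
    """Return (precision, recall) for a set of flagged residue indices.

    flagged      -- residues a detector marked as erroneous (hypothetical input)
    true_errors  -- residues that really differ from the reference structure
    """
    flagged, true_errors = set(flagged), set(true_errors)
    true_positives = len(flagged & true_errors)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_errors) if true_errors else 0.0
    return precision, recall


# Toy example: 5 residues flagged, 6 real errors, 3 in common.
example_flagged = {10, 11, 12, 40, 41}
example_errors = {11, 12, 13, 41, 42, 43}
print(precision_recall(example_flagged, example_errors))  # (0.6, 0.5)
```

Read this way, the reported 68% precision and 60% recall mean that roughly two in three flagged regions were genuine differences, while about two in five genuine differences went unflagged.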
Affiliation(s)
- Gabriella Reggiano
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Wolfgang Lugmayr
- University Medical Center Hamburg-Eppendorf (UKE), Institute of Structural and Systems Biology, Hamburg, Germany; CSSB Centre for Structural Systems Biology, Hamburg, Germany; Deutsches Elektronen Synchrotron (DESY), Hamburg, Germany
- Thomas C Marlovits
- University Medical Center Hamburg-Eppendorf (UKE), Institute of Structural and Systems Biology, Hamburg, Germany; CSSB Centre for Structural Systems Biology, Hamburg, Germany; Deutsches Elektronen Synchrotron (DESY), Hamburg, Germany
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
323
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr F Struct Biol Commun 2023; 79:166-168. [PMID: 37358500 PMCID: PMC10327576 DOI: 10.1107/s2053230x23004934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2023] Open
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
- Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
- Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
324
Chu LS, Ruffolo JA, Harmalkar A, Gray JJ. Flexible Protein-Protein Docking with a Multi-Track Iterative Transformer. bioRxiv 2023:2023.06.29.547134. [PMID: 37425754 PMCID: PMC10327054 DOI: 10.1101/2023.06.29.547134] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/11/2023]
Abstract
Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and re-ranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, e.g., structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multi-track iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments (MSAs), GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. For a benchmark set of rigid targets, GeoDock obtains a 41% success rate, outperforming all the other tested methods. For a more challenging benchmark set of flexible targets, GeoDock achieves a similar number of top-model successes as the traditional method ClusPro [1], but fewer than ReplicaDock2 [2]. GeoDock attains an average inference speed of under one second on a single GPU, enabling its application in large-scale structure screening. Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.
Affiliation(s)
- Lee-Shin Chu
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Jeffrey A Ruffolo
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
- Ameya Harmalkar
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
325
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. IUCrJ 2023; 10:377-379. [PMID: 37358477 PMCID: PMC10324484 DOI: 10.1107/s2052252523004943] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/27/2023]
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
- Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
- Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
326
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr D Struct Biol 2023; 79:556-558. [PMID: 37378959 DOI: 10.1107/s2059798323004928] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/29/2023] Open
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
- Edward N Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Charles S Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
- Elspeth F Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
- Mark J van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
327
Zhu W, Shenoy A, Kundrotas P, Elofsson A. Evaluation of AlphaFold-Multimer prediction on multi-chain protein complexes. Bioinformatics 2023; 39:btad424. [PMID: 37405868 PMCID: PMC10348836 DOI: 10.1093/bioinformatics/btad424] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Received: 12/06/2022] [Revised: 05/25/2023] [Accepted: 07/04/2023] [Indexed: 07/07/2023] Open
Abstract
MOTIVATION Despite near-experimental accuracy on single-chain predictions, there is still scope for improvement among multimeric predictions. Methods like AlphaFold-Multimer and FoldDock can accurately model dimers. However, how well these methods fare on larger complexes is still unclear. Further, evaluation methods of the quality of multimeric complexes are not well established. RESULTS We analysed the performance of AlphaFold-Multimer on a homology-reduced dataset of homo- and heteromeric protein complexes. We highlight the differences between the pairwise and multi-interface evaluation of chains within a multimer. We describe why certain complexes perform well on one metric (e.g. TM-score) but poorly on another (e.g. DockQ). We propose a new score, Predicted DockQ version 2 (pDockQ2), to estimate the quality of each interface in a multimer. Finally, we modelled protein complexes (from CORUM) and identified two highly confident structures that do not have sequence homology to any existing structures. AVAILABILITY AND IMPLEMENTATION All scripts, models, and data used to perform the analysis in this study are freely available at https://gitlab.com/ElofssonLab/afm-benchmark.
Affiliation(s)
- Wensi Zhu
- Science for Life Laboratory and Department of Biochemistry and Biophysics, Stockholm University, Solna 171 21, Sweden
- Aditi Shenoy
- Science for Life Laboratory and Department of Biochemistry and Biophysics, Stockholm University, Solna 171 21, Sweden
- Petras Kundrotas
- Science for Life Laboratory and Department of Biochemistry and Biophysics, Stockholm University, Solna 171 21, Sweden
- Center for Computational Biology, The University of Kansas, Lawrence, KS 66047, United States
- Arne Elofsson
- Science for Life Laboratory and Department of Biochemistry and Biophysics, Stockholm University, Solna 171 21, Sweden
328
Camponeschi C, Righino B, Pirolli D, Semeraro A, Ria F, De Rosa MC. Prediction of CD44 Structure by Deep Learning-Based Protein Modeling. Biomolecules 2023; 13:1047. [PMID: 37509083 PMCID: PMC10376988 DOI: 10.3390/biom13071047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/29/2023] [Revised: 06/19/2023] [Accepted: 06/24/2023] [Indexed: 07/30/2023] Open
Abstract
CD44 is a cell surface glycoprotein transmembrane receptor that is involved in cell-cell and cell-matrix interactions. It crucially associates with several molecules composing the extracellular matrix, the main one of which is hyaluronic acid. It is ubiquitously expressed in various types of cells and is involved in the regulation of important signaling pathways, thus playing a key role in several physiological and pathological processes. Structural information about CD44 is, therefore, fundamental for understanding the mechanism of action of this receptor and developing effective treatments against its aberrant expression and dysregulation frequently associated with pathological conditions. To date, only the structure of the hyaluronan-binding domain (HABD) of CD44 has been experimentally determined. To elucidate the nature of CD44s, the most frequently expressed isoform, we employed the recently developed deep-learning-based tools D-I-TASSER, AlphaFold2, and RoseTTAFold for an initial structural prediction of the full-length receptor, accompanied by molecular dynamics simulations on the most promising model. All three approaches correctly predicted the HABD, with AlphaFold2 outperforming D-I-TASSER and RoseTTAFold in the structural comparison with the crystallographic HABD structure and confidence in predicting the transmembrane helix. Low confidence regions were also predicted, which largely corresponded to the disordered regions of CD44s. These regions allow the receptor to perform its unconventional activity.
Affiliation(s)
- Chiara Camponeschi
- Institute of Chemical Sciences and Technologies "Giulio Natta" (SCITEC)-CNR, 00168 Rome, Italy
- Benedetta Righino
- Institute of Chemical Sciences and Technologies "Giulio Natta" (SCITEC)-CNR, 00168 Rome, Italy
- Davide Pirolli
- Institute of Chemical Sciences and Technologies "Giulio Natta" (SCITEC)-CNR, 00168 Rome, Italy
- Alessandro Semeraro
- Department of Chemistry and Technology of Drugs, Sapienza University of Rome, 00185 Rome, Italy
- Francesco Ria
- Department of Translational Medicine and Surgery, Section of General Pathology, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
- Maria Cristina De Rosa
- Institute of Chemical Sciences and Technologies "Giulio Natta" (SCITEC)-CNR, 00168 Rome, Italy
329
Pasqualetto G, Mack A, Lewis E, Cooper R, Holland A, Borucu U, Mantell J, Davies T, Weckener M, Clare D, Green T, Kille P, Muhlhozl A, Young MT. CryoEM structure and Alphafold molecular modelling of a novel molluscan hemocyanin. PLoS One 2023; 18:e0287294. [PMID: 37347755 PMCID: PMC10286996 DOI: 10.1371/journal.pone.0287294] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/27/2023] [Accepted: 06/03/2023] [Indexed: 06/24/2023] Open
Abstract
Hemocyanins are multimeric oxygen transport proteins present in the blood of arthropods and molluscs, containing up to 8 oxygen-binding functional units per monomer. In molluscs, hemocyanins are assembled in decamer 'building blocks' formed of 5 dimer 'plates', routinely forming didecamer or higher-order assemblies with d5 or c5 symmetry. Here we describe the cryoEM structures of the didecamer (20-mer) and tridecamer (30-mer) forms of a novel hemocyanin from the slipper limpet Crepidula fornicata (SLH) at 7.0 and 4.7 Å resolution respectively. We show that two decamers assemble in a 'tail-tail' configuration, forming a partially capped cylinder, with an additional decamer adding on in 'head-tail' configuration to make the tridecamer. Analysis of SLH samples shows substantial heterogeneity, suggesting the presence of many higher-order multimers including tetra- and pentadecamers, formed by successive addition of decamers in head-tail configuration. Retrieval of sequence data for a full-length isoform of SLH enabled the use of Alphafold to produce a molecular model of SLH, which indicated the formation of dimer slabs with high similarity to those found in keyhole limpet hemocyanin. The fit of the molecular model to the cryoEM density was excellent, showing an overall structure where the final two functional units of the subunit (FU-g and FU-h) form the partial cap at one end of the decamer, and permitting analysis of the subunit interfaces governing the assembly of tail-tail and head-tail decamer interactions as well as potential sites for N-glycosylation. Our work contributes to the understanding of higher-order oligomer formation in molluscan hemocyanins and demonstrates the utility of Alphafold for building accurate structural models of large oligomeric proteins.
Affiliation(s)
- Gaia Pasqualetto
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Andrew Mack
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Emily Lewis
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Ryan Cooper
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Alistair Holland
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Ufuk Borucu
- Faculty of Life Sciences, GW4 Facility for High-Resolution Electron Cryo-Microscopy, Wolfson Bioimaging Facility, University of Bristol, Bristol, United Kingdom
- Judith Mantell
- Faculty of Life Sciences, GW4 Facility for High-Resolution Electron Cryo-Microscopy, Wolfson Bioimaging Facility, University of Bristol, Bristol, United Kingdom
- Tom Davies
- School of Chemistry, Cardiff University, Cardiff, United Kingdom
- Miriam Weckener
- The Rosalind Franklin Institute, Structural Biology, Harwell Science Campus, Didcot, United Kingdom
- Dan Clare
- Electron Bioimaging Centre, Diamond Light Source, Harwell, United Kingdom
- Tom Green
- Advanced Research Computing at Cardiff, Cardiff University, Cardiff, United Kingdom
- Pete Kille
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Mark T. Young
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
330
Jessen-Howard D, Pan Q, Ascher DB. Identifying the Molecular Drivers of Pathogenic Aldehyde Dehydrogenase Missense Mutations in Cancer and Non-Cancer Diseases. Int J Mol Sci 2023; 24:10157. [PMID: 37373306 DOI: 10.3390/ijms241210157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/30/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
Human aldehyde dehydrogenases (ALDHs), comprising 19 isoenzymes, play a vital role in both endogenous and exogenous aldehyde metabolism. This NAD(P)-dependent catalytic process relies on the intact structural and functional activity of the cofactor binding, substrate interaction, and the oligomerization of ALDHs. Disruptions to the activity of ALDHs, however, could result in the accumulation of cytotoxic aldehydes, which have been linked with a wide range of diseases, including cancers as well as neurological and developmental disorders. In our previous works, we have successfully characterised the structure-function relationships of the missense variants of other proteins. We, therefore, applied a similar analysis pipeline to identify potential molecular drivers of pathogenic ALDH missense mutations. Variant data were first carefully curated and labelled as cancer-risk, non-cancer diseases, and benign. We then leveraged various computational biophysical methods to describe the changes caused by missense mutations, informing a bias of detrimental mutations with destabilising effects. Building on these insights, several machine learning approaches were further utilised to investigate the combination of features, revealing the necessity of the conservation of ALDHs. Our work aims to provide important biological perspectives on pathogenic consequences of missense mutations of ALDHs, which could be an invaluable resource in the development of cancer treatment.
Affiliation(s)
- Dana Jessen-Howard
- School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Qisheng Pan
- School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- David B Ascher
- School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
331
Butkovic A, Dolja VV, Koonin EV, Krupovic M. Plant virus movement proteins originated from jelly-roll capsid proteins. PLoS Biol 2023; 21:e3002157. [PMID: 37319262 DOI: 10.1371/journal.pbio.3002157] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/30/2022] [Accepted: 05/11/2023] [Indexed: 06/17/2023] Open
Abstract
Numerous, diverse plant viruses encode movement proteins (MPs) that aid the virus movement through plasmodesmata, the plant intercellular channels. MPs are essential for virus spread and propagation in distal tissues, and several unrelated MPs have been identified. The 30K superfamily of MPs (named after the molecular mass of tobacco mosaic virus (TMV) MP, the classical model of plant virology) is the largest and most diverse MP variety, represented in 16 virus families, but its evolutionary origin remained obscure. Here, we show that the core structural domain of the 30K MPs is homologous to the jelly-roll domain of the capsid proteins (CPs) of small RNA and DNA viruses, in particular, those infecting plants. The closest similarity was observed between the 30K MPs and the CPs of the viruses in the families Bromoviridae and Geminiviridae. We hypothesize that the MPs evolved via duplication or horizontal acquisition of the CP gene in a virus that infected an ancestor of vascular plants, followed by neofunctionalization of one of the paralogous CPs, potentially through the acquisition of unique N- and C-terminal regions. During the subsequent coevolution of viruses with diversifying vascular plants, the 30K MP genes underwent explosive horizontal spread among emergent RNA and DNA viruses, likely permitting viruses of insects and fungi that coinfected plants to expand their host ranges, molding the contemporary plant virome.
Affiliation(s)
- Anamarija Butkovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
- Valerian V Dolja
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, United States of America
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland, United States of America
- Mart Krupovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
332
Bourganou MV, Kontopodis E, Tsangaris GT, Pierros V, Vasileiou NGC, Mavrogianni VS, Fthenakis GC, Katsafadou AI. Unique Peptides of Cathelicidin-1 in the Early Detection of Mastitis-In Silico Analysis. Int J Mol Sci 2023; 24:10160. [PMID: 37373309 DOI: 10.3390/ijms241210160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 05/31/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] Open
Abstract
Based on the results of previously performed clinical studies, cathelicidin-1 has been proposed as a potential biomarker for the early diagnosis of mastitis in ewes. It has been hypothesized that the detection of unique peptides (defined as a peptide, irrespective of its length, that exists in only one protein of a proteome of interest) and core unique peptides (CUPs) (representing the shortest peptide that is unique) of cathelicidin-1 may potentially improve its identification and consequently the diagnosis of sheep mastitis. Peptides of sizes larger than those of the size of CUPs, which include consecutive or over-lapping CUPs, have been defined as 'composite core unique peptides' (CCUPs). The primary objective of the present study was the investigation of the sequence of cathelicidin-1 detected in ewes' milk in order to identify its unique peptides and core unique peptides, which would reveal potential targets for accurate detection of the protein. An additional objective was the detection of unique sequences among the tryptic digest peptides of cathelicidin-1, which would improve accuracy of identification of the protein when performing targeted MS-based proteomics. The potential uniqueness of each peptide of cathelicidin-1 was investigated using a bioinformatics tool built on a big data algorithm. A set of CUPs was created and CCUPs were also searched. Further, the unique sequences in the tryptic digest peptides of cathelicidin-1 were also detected. Finally, the 3D structure of the protein was analyzed from predicted models of proteins. In total, 59 CUPs and four CCUPs were detected in cathelicidin-1 of sheep origin. Among tryptic digest peptides, there were six peptides that were unique in that protein. After 3D structure analysis of the protein, 35 CUPs were found on the core of cathelicidin-1 of sheep origin and among them, 29 were located on amino acids in regions of the protein with 'very high' or 'confident' estimates of confidence of the structure. 
Ultimately, the following six CUPs: QLNEQ, NEQS, EQSSE, QSSEP, EDPD, DPDS, are proposed as potential antigenic targets for cathelicidin-1 of sheep. Moreover, another six unique peptides were detected in tryptic digests and offer novel mass tags to facilitate the detection of cathelicidin-1 during MS-based diagnostics.
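The definitions in this abstract (a unique peptide occurs in only one protein of the proteome; a core unique peptide is the shortest such peptide) can be illustrated with a small sketch. This is not the big-data tool the authors used. It is a brute-force illustration that reads "core unique peptide" as a unique peptide containing no shorter unique peptide, and the function name, the `max_len` cut-off, and the three toy sequences are all assumptions made for the example.

```python
def core_unique_peptides(target_id, proteome, max_len=6):
    """List minimal unique substrings of one protein, shortest first.

    proteome -- dict mapping protein id -> amino-acid sequence (toy data here)
    A peptide is 'unique' if it occurs in no protein other than the target.
    It is 'core' if it contains no shorter unique peptide (an assumed reading
    of the definition quoted in the abstract).
    """
    target = proteome[target_id]
    others = [seq for pid, seq in proteome.items() if pid != target_id]
    cups = []
    for k in range(1, max_len + 1):              # shortest peptides first
        for i in range(len(target) - k + 1):
            peptide = target[i:i + k]
            if any(peptide in seq for seq in others):
                continue                         # shared with another protein
            if any(cup in peptide for cup in cups):
                continue                         # contains a shorter CUP, or is a repeat
            cups.append(peptide)
    return cups


# Hypothetical three-protein "proteome"; the sequences are made up.
toy_proteome = {
    "P1": "MKQLNEQS",
    "P2": "MKQLAAA",
    "P3": "LNEQAAA",
}
print(core_unique_peptides("P1", toy_proteome))  # ['S', 'QLN']
```

Scanning lengths in increasing order means every shorter core peptide is already known when a longer candidate is tested, which is what makes the containment check sufficient. A real proteome would need an indexed search rather than this quadratic scan.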
Affiliation(s)
- Maria V Bourganou
- Faculty of Public and One Health, University of Thessaly, 43100 Karditsa, Greece
- Proteomics Research Unit, Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece
- Evangelos Kontopodis
- Proteomics Research Unit, Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece
- George Th Tsangaris
- Proteomics Research Unit, Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece
- Vasileios Pierros
- Proteomics Research Unit, Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece
333
Jin S, Qian K, He L, Zhang Z. iORandLigandDB: A Website for Three-Dimensional Structure Prediction of Insect Odorant Receptors and Docking with Odorants. Insects 2023; 14:560. [PMID: 37367376 DOI: 10.3390/insects14060560] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/12/2023] [Revised: 05/28/2023] [Accepted: 06/09/2023] [Indexed: 06/28/2023]
Abstract
The use of insect-specific odorants to control the behavior of insects has always been a hot spot in research on "green" control strategies of insects. However, it is generally time-consuming and laborious to explore insect-specific odorants with traditional reverse chemical ecology methods. Here, an insect odorant receptor (OR) and ligand database website (iORandLigandDB) was developed for the specific exploration of insect-specific odorants by using deep learning algorithms. The website provides a range of specific odorants before molecular biology experiments as well as the properties of ORs in closely related insects. At present, the existing three-dimensional structures of ORs in insects and the docking data with related odorants can be retrieved from the database and further analyzed.
Affiliation(s)
- Shuo Jin
- College of Plant Protection, Southwest University, Chongqing 400716, China
- Kun Qian
- College of Plant Protection, Southwest University, Chongqing 400716, China
- Lin He
- College of Plant Protection, Southwest University, Chongqing 400716, China
- Zan Zhang
- College of Plant Protection, Southwest University, Chongqing 400716, China
334
Adiyaman R, Edmunds NS, Genc AG, Alharbi SMA, McGuffin LJ. Improvement of protein tertiary and quaternary structure predictions using the ReFOLD refinement method and the AlphaFold2 recycling process. Bioinformatics Advances 2023; 3:vbad078. [PMID: 37359722 PMCID: PMC10290552 DOI: 10.1093/bioadv/vbad078] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/24/2023] [Revised: 05/09/2023] [Accepted: 06/13/2023] [Indexed: 06/28/2023]
Abstract
Motivation: The accuracy gap between predicted and experimental structures has been significantly reduced following the development of AlphaFold2 (AF2). However, for many targets, AF2 models still have room for improvement. In previous CASP experiments, highly computationally intensive MD simulation-based methods have been widely used to improve the accuracy of single 3D models. Here, our ReFOLD pipeline was adapted to refine AF2 predictions while maintaining high model accuracy at a modest computational cost. Furthermore, the AF2 recycling process was utilized to improve 3D models by using them as custom template inputs for tertiary and quaternary structure predictions. Results: According to the Molprobity score, 94% of the generated 3D models by ReFOLD were improved. AF2 recycling showed an improvement rate of 87.5% (using MSAs) and 81.25% (using single sequences) for monomeric AF2 models and 100% (MSA) and 97.8% (single sequence) for monomeric non-AF2 models, as measured by the average change in lDDT. By the same measure, the recycling of multimeric models showed an improvement rate of as much as 80% for AF2-Multimer (AF2M) models and 94% for non-AF2M models. Availability and implementation: Refinement using AlphaFold2-Multimer recycling is available as part of the MultiFOLD docker package (https://hub.docker.com/r/mcguffin/multifold). The ReFOLD server is available at https://www.reading.ac.uk/bioinf/ReFOLD/ and the modified scripts can be downloaded from https://www.reading.ac.uk/bioinf/downloads/. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Recep Adiyaman
- School of Biological Sciences, University of Reading, Reading RG6 6EX, UK
| | - Nicholas S Edmunds
- School of Biological Sciences, University of Reading, Reading RG6 6EX, UK
| | - Ahmet G Genc
- School of Biological Sciences, University of Reading, Reading RG6 6EX, UK
| | - Shuaa M A Alharbi
- School of Biological Sciences, University of Reading, Reading RG6 6EX, UK
| | | |
|
335
|
Liu R, Chen X, Zhao F, Jiang Y, Lu Z, Ji H, Feng Y, Li J, Zhang H, Zheng J, Zhang J, Zhao Y. The COMPASS Complex Regulates Fungal Development and Virulence through Histone Crosstalk in the Fungal Pathogen Cryptococcus neoformans. J Fungi (Basel) 2023; 9:672. [PMID: 37367608 DOI: 10.3390/jof9060672] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/17/2023] [Revised: 06/07/2023] [Accepted: 06/10/2023] [Indexed: 06/28/2023] Open
Abstract
The Complex of Proteins Associated with Set1 (COMPASS) methylates lysine K4 on histone H3 (H3K4) and is conserved from yeast to humans. Its subunits and regulatory roles in the meningitis-causing fungal pathogen Cryptococcus neoformans remain unknown. Here we identified the core subunits of the COMPASS complex in C. neoformans and C. deneoformans and confirmed their conserved roles in H3K4 methylation. Through AlphaFold modeling, we found that Set1, Bre2, Swd1, and Swd3 form the catalytic core of the COMPASS complex and regulate the cryptococcal yeast-to-hypha transition, thermal tolerance, and virulence. The COMPASS complex-mediated histone H3K4 methylation requires H2B mono-ubiquitination by Rad6/Bre1 and the Paf1 complex in order to activate the expression of genes specific for the yeast-to-hypha transition in C. deneoformans. Taken together, our findings demonstrate that putative COMPASS subunits function as a unified complex, contributing to cryptococcal development and virulence.
Affiliation(s)
- Ruoyan Liu
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Xiaoyu Chen
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Fujie Zhao
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Yixuan Jiang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Zhenguo Lu
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Huining Ji
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yuanyuan Feng
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Junqiang Li
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Heng Zhang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Jianting Zheng
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Joint International Research Laboratory of Metabolic and Developmental Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Jing Zhang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| | - Youbao Zhao
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou 450046, China
| |
|
336
|
Pogozheva ID, Cherepanov S, Park SJ, Raghavan M, Im W, Lomize AL. Structural modeling of cytokine-receptor-JAK2 signaling complexes using AlphaFold Multimer. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.14.544971. [PMID: 37398331 PMCID: PMC10312770 DOI: 10.1101/2023.06.14.544971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Homodimeric class 1 cytokine receptors include the erythropoietin (EPOR), thrombopoietin (TPOR), granulocyte colony-stimulating factor 3 (CSF3R), growth hormone (GHR), and prolactin receptors (PRLR). They are cell-surface single-pass transmembrane (TM) glycoproteins that regulate cell growth, proliferation, and differentiation and induce oncogenesis. An active TM signaling complex consists of a receptor homodimer, one or two ligands bound to the receptor extracellular domains, and two molecules of Janus Kinase 2 (JAK2) constitutively associated with the receptor intracellular domains. Although crystal structures of soluble extracellular domains with ligands have been obtained for all the receptors except TPOR, little is known about the structure and dynamics of the complete TM complexes that activate the downstream JAK-STAT signaling pathway. Three-dimensional models of five human receptor complexes with cytokines and JAK2 were generated using AlphaFold Multimer. Given the large size of the complexes (from 3220 to 4074 residues), the modeling required a stepwise assembly from smaller parts with selection and validation of the models through comparisons with published experimental data. The modeling of active and inactive complexes supports a general activation mechanism that involves ligand binding to a monomeric receptor followed by receptor dimerization and rotational movement of the receptor TM α-helices causing proximity, dimerization, and activation of associated JAK2 subunits. The binding mode of two eltrombopag molecules to TM α-helices of the active TPOR dimer was proposed. The models also help elucidate the molecular basis of oncogenic mutations that may involve a non-canonical activation route. Models equilibrated in explicit lipids of the plasma membrane are publicly available.
Affiliation(s)
- Irina D. Pogozheva
- Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, United States
| | | | - Sang-Jun Park
- Departments of Biological Sciences and Chemistry, Lehigh University, Bethlehem, PA 18015, United States
| | - Malini Raghavan
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI 48109, United States
| | - Wonpil Im
- Departments of Biological Sciences and Chemistry, Lehigh University, Bethlehem, PA 18015, United States
| | - Andrei L. Lomize
- Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, United States
| |
|
337
|
Mezősi-Csaplár M, Szöőr Á, Vereb G. CD28 and 41BB Costimulatory Domains Alone or in Combination Differentially Influence Cell Surface Dynamics and Organization of Chimeric Antigen Receptors and Early Activation of CAR T Cells. Cancers (Basel) 2023; 15:3081. [PMID: 37370693 DOI: 10.3390/cancers15123081] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/23/2023] [Revised: 06/05/2023] [Accepted: 06/06/2023] [Indexed: 06/29/2023] Open
Abstract
Chimeric antigen receptor (CAR)-modified T cells brought a paradigm shift in the treatment of chemotherapy-resistant lymphomas. Conversely, clinical experience with CAR T cells targeting solid tumors has been disheartening, indicating the necessity of their molecular-level optimization. While incorporating CD28 or 41BB costimulatory domains into CARs in addition to the CD3z signaling domain improved the long-term efficacy of T cell products, their influence on early tumor engagement has yet to be elucidated. We studied the antigen-independent self-association and membrane diffusion kinetics of first- (.z), second- (CD28.z, 41BB.z), and third- (CD28.41BB.z) generation HER2-specific CARs in the resting T cell membrane using super-resolution AiryScan microscopy and fluorescence correlation spectroscopy, in correlation with RoseTTAFold-based structure prediction and assessment of oligomerization in native Western blot. While .z and CD28.z CARs formed large, high-density submicron clusters of dimers, 41BB-containing CARs formed higher oligomers that assembled into smaller but more numerous membrane clusters. The first-, second-, and third-generation CARs showed progressively increasing lateral diffusion as the distance of their CD3z domain from the membrane plane increased. Confocal microscopy analysis of immunological synapses showed that both small clusters of highly mobile CD28.41BB.z and large clusters of less mobile .z CAR induced more efficient CD3ζ and pLck phosphorylation than CD28.z or 41BB.z CARs of intermediate mobility. However, electric cell-substrate impedance sensing revealed that the CD28.41BB.z CAR performs worst in sequential short-term elimination of adherent tumor cells, while the .z CAR is superior to all others. We conclude that the molecular structure, membrane organization, and mobility of CARs are critical design parameters that can predict the development of an effective immune synapse. Therefore, they need to be taken into account alongside the long-term biological effects of costimulatory domains to achieve an optimal therapeutic effect.
Affiliation(s)
- Marianna Mezősi-Csaplár
- Department of Biophysics and Cell Biology, Faculty of Medicine, University of Debrecen, 4032 Debrecen, Hungary
| | - Árpád Szöőr
- Department of Biophysics and Cell Biology, Faculty of Medicine, University of Debrecen, 4032 Debrecen, Hungary
| | - György Vereb
- Department of Biophysics and Cell Biology, Faculty of Medicine, University of Debrecen, 4032 Debrecen, Hungary
- ELKH-DE Cell Biology and Signaling Research Group, Faculty of Medicine, University of Debrecen, 4032 Debrecen, Hungary
- Faculty of Pharmacy, University of Debrecen, 4032 Debrecen, Hungary
| |
|
338
|
Redl I, Fisicaro C, Dutton O, Hoffmann F, Henderson L, Owens BJ, Heberling M, Paci E, Tamiola K. ADOPT: intrinsic protein disorder prediction through deep bidirectional transformers. NAR Genom Bioinform 2023; 5:lqad041. [PMID: 37138579 PMCID: PMC10150328 DOI: 10.1093/nargab/lqad041] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/10/2022] [Revised: 02/07/2023] [Accepted: 04/17/2023] [Indexed: 05/05/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) are important for a broad range of biological functions and are involved in many diseases. An understanding of intrinsic disorder is key to developing compounds that target IDPs. Experimental characterization of IDPs is hindered by the very fact that they are highly dynamic. Computational methods that predict disorder from the amino acid sequence have been proposed. Here, we present ADOPT (Attention DisOrder PredicTor), a new predictor of protein disorder. ADOPT is composed of a self-supervised encoder and a supervised disorder predictor. The former is based on a deep bidirectional transformer, which extracts dense residue-level representations from Facebook's Evolutionary Scale Modeling library. The latter uses a database of nuclear magnetic resonance chemical shifts, constructed to ensure balanced amounts of disordered and ordered residues, as a training and a test dataset for protein disorder. ADOPT predicts whether a protein or a specific region is disordered with better performance than the best existing predictors and faster than most other proposed methods (a few seconds per sequence). We identify the features that are relevant for the prediction performance and show that good performance can already be gained with <100 features. ADOPT is available as a stand-alone package at https://github.com/PeptoneLtd/ADOPT and as a web server at https://adopt.peptone.io/.
Affiliation(s)
- Istvan Redl
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | | | - Oliver Dutton
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | - Falk Hoffmann
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | | | | | | | - Emanuele Paci
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
- Department of Physics and Astronomy ‘Augusto Righi’, University of Bologna, 40127 Bologna, Italy
| | - Kamil Tamiola
- To whom correspondence should be addressed. Tel: +41 79 609 7333;
| |
|
339
|
Haile ST, Rahman S, Fields JK, Orsburn BC, Bumpus NN, Wolberger C. The SAGA HAT module is tethered by its SWIRM domain and modulates activity of the SAGA DUB module. BIOCHIMICA ET BIOPHYSICA ACTA. GENE REGULATORY MECHANISMS 2023; 1866:194929. [PMID: 36965704 PMCID: PMC10226619 DOI: 10.1016/j.bbagrm.2023.194929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 03/12/2023] [Accepted: 03/19/2023] [Indexed: 03/27/2023]
Abstract
The SAGA (Spt-Ada-Gcn5 acetyltransferase) complex is a transcriptional co-activator that both acetylates and deubiquitinates histones. The histone acetyltransferase (HAT) subunit, Gcn5, is part of a subcomplex of SAGA called the HAT module. A minimal HAT module complex containing Gcn5 bound to Ada2 and Ada3 is required for full Gcn5 activity on nucleosomes. Deletion studies have suggested that the Ada2 SWIRM domain plays a role in tethering the HAT module to the remainder of SAGA. While recent cryo-EM studies have resolved the structure of the core of the SAGA complex, the HAT module subunits and molecular details of its interactions with the SAGA core could not be resolved. Here we show that the SWIRM domain is required for incorporation of the HAT module into the yeast SAGA complex, but not the ADA complex, a distinct six-protein acetyltransferase complex that includes the SAGA HAT module proteins. In the isolated Gcn5/Ada2/Ada3 HAT module, deletion of the SWIRM domain modestly increased activity but had negligible effect on nucleosome binding. Loss of the HAT module due to deletion of the SWIRM domain decreases the H2B deubiquitinating activity of SAGA, indicating a role for the HAT module in regulating SAGA DUB module activity. A model of the HAT module created with AlphaFold Multimer provides insights into the structural basis for our biochemical data, as well as prior deletion studies.
Affiliation(s)
- Sara T Haile
- Department of Biophysics and Biophysical Chemistry, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America
| | - Sanim Rahman
- Department of Biophysics and Biophysical Chemistry, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America
| | - James K Fields
- Department of Biophysics and Biophysical Chemistry, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America
| | - Benjamin C Orsburn
- Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America
| | - Namandjé N Bumpus
- Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America
| | - Cynthia Wolberger
- Department of Biophysics and Biophysical Chemistry, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, United States of America.
| |
|
340
|
Abbas U, Chen J, Shao Q. Assessing Fairness of AlphaFold2 Prediction of Protein 3D Structures. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.23.542006. [PMID: 37293014 PMCID: PMC10245900 DOI: 10.1101/2023.05.23.542006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
AlphaFold2 is reshaping biomedical research by enabling the prediction of a protein's 3D structure solely based on its amino acid sequence. This breakthrough reduces reliance on labor-intensive experimental methods traditionally used to obtain protein structures, thereby accelerating the pace of scientific discovery. Despite the bright future, it remains unclear whether AlphaFold2 can predict the wide spectrum of proteins equally well. Systematic investigation into the fairness and unbiased nature of its predictions has yet to be thoroughly carried out. In this paper, we conducted an in-depth analysis of AlphaFold2's fairness using data comprised of five million reported protein structures from its open-access repository. Specifically, we assessed the variability in the distribution of pLDDT scores, considering factors such as amino acid type, secondary structure, and sequence length. Our findings reveal a systematic discrepancy in AlphaFold2's predictive reliability, varying across different types of amino acids and secondary structures. Furthermore, we observed that the size of the protein exerts a notable impact on the credibility of the 3D structural prediction. AlphaFold2 demonstrates enhanced prediction power for proteins of medium size compared to those that are either smaller or larger. These systematic biases could potentially stem from inherent biases present in its training data and model architecture. These factors need to be taken into account when expanding the applicability of AlphaFold2.
Affiliation(s)
- Usman Abbas
- Chemical & Materials Engineering, University of Kentucky, Lexington, Kentucky, USA
| | - Jin Chen
- Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
| | - Qing Shao
- Chemical & Materials Engineering, University of Kentucky, Lexington, Kentucky, USA
| |
|
341
|
Butkovic A, Kraberger S, Smeele Z, Martin DP, Schmidlin K, Fontenele RS, Shero MR, Beltran RS, Kirkham AL, Aleamotu’a M, Burns JM, Koonin EV, Varsani A, Krupovic M. Evolution of anelloviruses from a circovirus-like ancestor through gradual augmentation of the jelly-roll capsid protein. Virus Evol 2023; 9:vead035. [PMID: 37325085 PMCID: PMC10266747 DOI: 10.1093/ve/vead035] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/02/2023] [Revised: 05/15/2023] [Accepted: 05/22/2023] [Indexed: 06/17/2023] Open
Abstract
Anelloviruses are highly prevalent in diverse mammals, including humans, but so far have not been linked to any disease and are considered to be part of the 'healthy virome'. These viruses have small circular single-stranded DNA (ssDNA) genomes and encode several proteins with no detectable sequence similarity to proteins of other known viruses. Thus, anelloviruses are the only family of eukaryotic ssDNA viruses currently not included in the realm Monodnaviria. To gain insights into the provenance of these enigmatic viruses, we sequenced more than 250 complete genomes of anelloviruses from nasal and vaginal swab samples of Weddell seal (Leptonychotes weddellii) from Antarctica and a fecal sample of grizzly bear (Ursus arctos horribilis) from the USA and performed a comprehensive family-wide analysis of the signature anellovirus protein ORF1. Using state-of-the-art remote sequence similarity detection approaches and structural modeling with AlphaFold2, we show that ORF1 orthologs from all Anelloviridae genera adopt a jelly-roll fold typical of viral capsid proteins (CPs), establishing an evolutionary link to other eukaryotic ssDNA viruses, specifically, circoviruses. However, unlike CPs of other ssDNA viruses, ORF1 encoded by anelloviruses from different genera display remarkable variation in size, due to insertions into the jelly-roll domain. In particular, the insertion between β-strands H and I forms a projection domain predicted to face away from the capsid surface and function at the interface of virus-host interactions. Consistent with this prediction and supported by recent experimental evidence, the outermost region of the projection domain is a mutational hotspot, where rapid evolution was likely precipitated by the host immune system. Collectively, our findings further expand the known diversity of anelloviruses and explain how anellovirus ORF1 proteins likely diverged from canonical jelly-roll CPs through gradual augmentation of the projection domain. We suggest assigning Anelloviridae to a new phylum, 'Commensaviricota', and including it into the kingdom Shotokuvirae (realm Monodnaviria), alongside Cressdnaviricota and Cossaviricota.
Affiliation(s)
- Anamarija Butkovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, 25 rue du Dr Roux, Paris 75015, France
| | - Simona Kraberger
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
| | - Zoe Smeele
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
| | - Darren P Martin
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
| | - Kara Schmidlin
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
| | - Rafaela S Fontenele
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
| | - Michelle R Shero
- Biology Department, Woods Hole Oceanographic Institution, 266 Woods Hole Rd, Woods Hole, MA 02543, USA
| | - Roxanne S Beltran
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, 130 McAllister Way, Santa Cruz, CA 95060, USA
| | - Amy L Kirkham
- U.S. Fish and Wildlife Service, Marine Mammals Management, 1011 E, Tudor Road, Anchorage, AK 99503, USA
| | - Maketalena Aleamotu’a
- School of Environmental and Life Sciences, The University of Newcastle, University Drive, Callaghan, NSW 2308, Australia
| | - Jennifer M Burns
- Department of Biological Sciences, Texas Tech University, 2500 Broadway, Lubbock, TX 79409, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Arvind Varsani
- The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, 1001 S. McAllister Ave, Tempe, AZ 85287, USA
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Observatory, 1 Anzio Road, Cape Town 7925, South Africa
| | - Mart Krupovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, 25 rue du Dr Roux, Paris 75015, France
| |
|
342
|
Faure G, Saito M, Benler S, Peng I, Wolf YI, Strecker J, Altae-Tran H, Neumann E, Li D, Makarova KS, Macrae RK, Koonin EV, Zhang F. Modularity and diversity of target selectors in Tn7 transposons. Mol Cell 2023:S1097-2765(23)00367-2. [PMID: 37267947 DOI: 10.1016/j.molcel.2023.05.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/23/2022] [Revised: 01/17/2023] [Accepted: 05/09/2023] [Indexed: 06/04/2023]
Abstract
To spread, transposons must integrate into target sites without disruption of essential genes while avoiding host defense systems. Tn7-like transposons employ multiple mechanisms for target-site selection, including protein-guided targeting and, in CRISPR-associated transposons (CASTs), RNA-guided targeting. Combining phylogenomic and structural analyses, we conducted a broad survey of target selectors, revealing diverse mechanisms used by Tn7 to recognize target sites, including previously uncharacterized target-selector proteins found in newly discovered transposable elements (TEs). We experimentally characterized a CAST I-D system and a Tn6022-like transposon that uses TnsF, which contains an inactivated tyrosine recombinase domain, to target the comM gene. Additionally, we identified a non-Tn7 transposon, Tsy, encoding a homolog of TnsF with an active tyrosine recombinase domain, which we show also inserts into comM. Our findings show that Tn7 transposons employ modular architecture and co-opt target selectors from various sources to optimize target selection and drive transposon spread.
Affiliation(s)
- Guilhem Faure
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Makoto Saito
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Sean Benler
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA
| | - Iris Peng
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA
| | - Jonathan Strecker
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Han Altae-Tran
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Edwin Neumann
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - David Li
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA
| | - Rhiannon K Macrae
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA.
| | - Feng Zhang
- Howard Hughes Medical Institute, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| |
|
343
|
Rigo GV, Cardoso FG, Pereira MM, Devereux M, McCann M, Santos ALS, Tasca T. Peptidases Are Potential Targets of Copper(II)-1,10-Phenanthroline-5,6-dione Complex, a Promising and Potent New Drug against Trichomonas vaginalis. Pathogens 2023; 12:pathogens12050745. [PMID: 37242415 DOI: 10.3390/pathogens12050745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 05/09/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023] Open
Abstract
Trichomonas vaginalis is responsible for 156 million new cases per year worldwide. When present asymptomatically, the parasite can lead to serious complications, such as development of cervical and prostate cancer. As infection increases the acquisition and transmission of HIV, the control of trichomoniasis represents an important niche for the discovery and development of new antiparasitic molecules. This urogenital parasite synthesizes several molecules that allow the establishment and pathogenesis of infection. Among them, peptidases occupy key roles as virulence factors, and the inhibition of these enzymes has become an important mechanism for modulating pathogenesis. Based on these premises, our group recently reported the potent anti-T. vaginalis action of the metal-based complex [Cu(phendione)₃](ClO₄)₂·4H₂O (Cu-phendione). In the present study, we evaluated the influence of Cu-phendione on the modulation of proteolytic activities produced by T. vaginalis by biochemical and molecular approaches. Cu-phendione showed strong inhibitory potential against T. vaginalis peptidases, especially cysteine- and metallo-type peptidases. The latter revealed a more prominent effect at both the post-transcriptional and post-translational levels. Molecular docking analysis confirmed the interaction of Cu-phendione, with high binding energy (-9.7 and -10.7 kcal·mol⁻¹, respectively) at the active site of both TvMP50 and TvGP63 metallopeptidases. In addition, Cu-phendione significantly reduced trophozoite-mediated cytolysis in human vaginal (HMVII) and monkey kidney (VERO) epithelial cell lineages. These results highlight the antiparasitic potential of Cu-phendione by interaction with important T. vaginalis virulence factors.
Affiliation(s)
- Graziela Vargas Rigo
- Faculdade de Farmácia and Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 90610-000, RS, Brazil
| | - Fernanda Gomes Cardoso
- Faculdade de Farmácia and Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 90610-000, RS, Brazil
| | - Matheus Mendonça Pereira
- CIEPQPF, Department of Chemical Engineering, University of Coimbra, Rua Sílvio Lima, Pólo II-Pinhal de Marrocos, 3030-790 Coimbra, Portugal
| | - Michael Devereux
- The Inorganic Pharmaceutical and Biomimetic Research Centre, Focas Research Institute, Dublin Institute of Technology, D08 CKP1 Dublin, Ireland
| | - Malachy McCann
- Chemistry Department, Maynooth University, National University of Ireland, W23 F2H6 Maynooth, Ireland
| | - André L S Santos
- Laboratório de Estudos Avançados de Microrganismos Emergentes e Resistentes (LEAMER), Departamento de Microbiologia Geral, Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-902, RJ, Brazil
| | - Tiana Tasca
- Faculdade de Farmácia and Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 90610-000, RS, Brazil
| |
|
344
|
Ljubič M, Prašnikar E, Perdih A, Borišek J. All-Atom Simulations Reveal the Intricacies of Signal Transduction upon Binding of the HLA-E Ligand to the Transmembrane Inhibitory CD94/NKG2A Receptor. J Chem Inf Model 2023. [PMID: 37207294 DOI: 10.1021/acs.jcim.3c00249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Natural killer (NK) cells play an important role in the innate immune response against tumors and various pathogens such as viruses and bacteria. Their function is controlled by a wide array of activating and inhibitory receptors, which are expressed on their cell surface. Among them is a dimeric NKG2A/CD94 inhibitory transmembrane (TM) receptor which specifically binds to the non-classical MHC I molecule HLA-E, which is often overexpressed on the surface of senescent and tumor cells. Using the AlphaFold 2 artificial intelligence system, we constructed the missing segments of the NKG2A/CD94 receptor and generated its complete 3D structure comprising extracellular (EC), TM, and intracellular regions, which served as a starting point for the multi-microsecond all-atom molecular dynamics simulations of the receptor with and without the bound HLA-E ligand and its nonameric peptide. The simulated models revealed that an intricate interplay of events is taking place between the EC and TM regions ultimately affecting the intracellular immunoreceptor tyrosine-based inhibition motif (ITIM) regions that host the point at which the signal is transmitted further down the inhibitory signaling cascade. Signal transduction through the lipid bilayer was also coupled with the changes in the relative orientation of the NKG2A/CD94 TM helices in response to linker reorganization, mediated by fine-tuned interactions in the EC region of the receptor, taking place after HLA-E binding. This research provides atomistic details of the cells' protection mechanism against NK cells and broadens the knowledge regarding the TM signaling of ITIM-bearing receptors.
Affiliation(s)
- Martin Ljubič
- National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Faculty of Pharmacy, University of Ljubljana, Aškerčeva 7, 1000 Ljubljana, Slovenia
| | - Eva Prašnikar
- National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Andrej Perdih
- National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Faculty of Pharmacy, University of Ljubljana, Aškerčeva 7, 1000 Ljubljana, Slovenia
| | - Jure Borišek
- National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
| |
|
345
|
Pohl GM, Göz M, Gaertner A, Brodehl A, Cimen T, Saguner AM, Schulze-Bahr E, Walhorn V, Anselmetti D, Milting H. Cardiomyopathy related desmocollin-2 prodomain variants affect the intracellular cadherin transport and processing. Front Cardiovasc Med 2023; 10:1127261. [PMID: 37273868 PMCID: PMC10235514 DOI: 10.3389/fcvm.2023.1127261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 05/02/2023] [Indexed: 06/06/2023] Open
Abstract
Background Arrhythmogenic cardiomyopathy can be caused by genetic variants in desmosomal cadherins. Since cardiac desmosomal cadherins are crucial for cell-cell adhesion, their correct localization at the plasma membrane is essential. Methods Nine desmocollin-2 variants at five positions from various public genetic databases (p.D30N, p.V52A/I, p.G77V/D/S, p.V79G, p.I96V/T) and three additional conserved positions (p.C32, p.C57, p.F71) within the prodomain were investigated in vitro using confocal microscopy. Model variants (p.C32A/S, p.V52G/L, p.C57A/S, p.F71Y/A/S, p.V79A/I/L, p.I96L/A) were generated to investigate the impact of specific amino acids. Results We revealed that all analyzed positions in the prodomain are critical for the intracellular transport. However, the variants p.D30N, p.V52A/I and p.I96V listed in genetic databases do not disturb the intracellular transport, revealing that the loss of these canonical sequences may be compensated. Conclusion As disease-related homozygous truncating desmocollin-2 variants lacking the transmembrane domain are not localized at the plasma membrane, we predict that some of the investigated prodomain variants may be relevant in the context of arrhythmogenic cardiomyopathy due to disturbed intracellular transport.
Affiliation(s)
- Greta Marie Pohl
- Erich & Hanna Klessmann-Institute for Cardiovascular Research and Development & Clinic for Thoracic and Cardiovascular Surgery, Heart- and Diabetes Center NRW, D-32545 Bad Oeynhausen, University Hospital of the Ruhr-University Bochum, Bad Oeynhausen, Germany
| | - Manuel Göz
- Experimental Biophysics and Applied Nanoscience, Faculty of Physics, University of Bielefeld, NRW, Bielefeld, Germany
| | - Anna Gaertner
- Erich & Hanna Klessmann-Institute for Cardiovascular Research and Development & Clinic for Thoracic and Cardiovascular Surgery, Heart- and Diabetes Center NRW, D-32545 Bad Oeynhausen, University Hospital of the Ruhr-University Bochum, Bad Oeynhausen, Germany
| | - Andreas Brodehl
- Erich & Hanna Klessmann-Institute for Cardiovascular Research and Development & Clinic for Thoracic and Cardiovascular Surgery, Heart- and Diabetes Center NRW, D-32545 Bad Oeynhausen, University Hospital of the Ruhr-University Bochum, Bad Oeynhausen, Germany
| | - Tolga Cimen
- Department of Cardiology, University Heart Center Zurich, University Hospital Zurich, Zürich, Switzerland
| | - Ardan M. Saguner
- Department of Cardiology, University Heart Center Zurich, University Hospital Zurich, Zürich, Switzerland
| | - Eric Schulze-Bahr
- Department of Cardiovascular Medicine, Institute for Genetics of Heart Diseases (IfGH), University Hospital Münster, Münster, Germany
| | - Volker Walhorn
- Experimental Biophysics and Applied Nanoscience, Faculty of Physics, University of Bielefeld, NRW, Bielefeld, Germany
| | - Dario Anselmetti
- Experimental Biophysics and Applied Nanoscience, Faculty of Physics, University of Bielefeld, NRW, Bielefeld, Germany
| | - Hendrik Milting
- Erich & Hanna Klessmann-Institute for Cardiovascular Research and Development & Clinic for Thoracic and Cardiovascular Surgery, Heart- and Diabetes Center NRW, D-32545 Bad Oeynhausen, University Hospital of the Ruhr-University Bochum, Bad Oeynhausen, Germany
| |
|
346
|
Aina A, Hsueh SCC, Plotkin SS. PROTHON: A Local Order Parameter-Based Method for Efficient Comparison of Protein Ensembles. J Chem Inf Model 2023. [PMID: 37178169 DOI: 10.1021/acs.jcim.3c00145] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/15/2023]
Abstract
The comparison of protein conformational ensembles is of central importance in structural biology. However, there are few computational methods for ensemble comparison, and those that are readily available, such as ENCORE, utilize methods that are sufficiently computationally expensive to be prohibitive for large ensembles. Here, a new method is presented for efficient representation and comparison of protein conformational ensembles. The method is based on the representation of a protein ensemble as a vector of probability distribution functions (pdfs), with each pdf representing the distribution of a local structural property such as the number of contacts between Cβ atoms. Dissimilarity between two conformational ensembles is quantified by the Jensen-Shannon distance between the corresponding set of probability distribution functions. The method is validated for conformational ensembles generated by molecular dynamics simulations of ubiquitin, as well as experimentally derived conformational ensembles of a 130 amino acid truncated form of human tau protein. In the ubiquitin ensemble data set, the method was up to 88 times faster than the existing ENCORE software, while simultaneously utilizing 48 times fewer computing cores. We make the method available as a Python package, called PROTHON, and provide a GitHub page with the Python source code at https://github.com/PlotkinLab/Prothon.
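The comparison described in this abstract can be illustrated with a minimal sketch. It is not the PROTHON implementation: the function names, the use of integer per-residue contact counts as the local property, and the averaging of per-residue distances into a single score are all assumptions made for illustration. Only the general idea (one probability distribution per local property, compared by Jensen-Shannon distance) comes from the abstract.

```python
import math


def contact_count_pdf(counts, max_count):
    """Discrete pdf of integer contact counts over the support 0..max_count."""
    if not counts:
        raise ValueError("need at least one conformation")
    hist = [0] * (max_count + 1)
    for c in counts:
        hist[c] += 1
    n = len(counts)
    return [h / n for h in hist]


def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logarithms, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    jsd = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, jsd))  # guard against tiny negative rounding


def ensemble_dissimilarity(ens_a, ens_b, max_count):
    """Mean per-residue JS distance between two ensembles.

    Each ensemble is a list of conformations; each conformation is a list of
    per-residue contact counts. Averaging over residues is an assumption of
    this sketch, not something stated in the abstract.
    """
    n_res = len(ens_a[0])
    dists = []
    for r in range(n_res):
        p = contact_count_pdf([conf[r] for conf in ens_a], max_count)
        q = contact_count_pdf([conf[r] for conf in ens_b], max_count)
        dists.append(js_distance(p, q))
    return sum(dists) / n_res


# Toy ensembles: three conformations of a two-residue chain.
ens_a = [[3, 5], [3, 6], [4, 5]]
ens_b = [[3, 5], [4, 6], [4, 6]]
score = ensemble_dissimilarity(ens_a, ens_b, max_count=8)
```

Because each ensemble is reduced to fixed-size histograms, the cost of a comparison in this sketch grows with the number of residues and bins rather than with the number of conformation pairs, which is one plausible reason a distribution-based comparison scales well to large ensembles.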
Affiliation(s)
- Adekunle Aina
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| | - Shawn C C Hsueh
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| | - Steven S Plotkin
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Genome Science and Technology Program, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| |
|
347
|
Zhao Y, Zheng Z, Zhang Z, Hillpot E, Lin YS, Zakusilo FT, Lu JY, Ablaeva J, Miller RA, Nevo E, Seluanov A, Gorbunova V. Evolution of High-Molecular-Mass Hyaluronic Acid is Associated with Subterranean Lifestyle. bioRxiv 2023:2023.05.08.539764. [PMID: 37215017 PMCID: PMC10197608 DOI: 10.1101/2023.05.08.539764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Hyaluronic acid (HA) is a major component of extracellular matrix (ECM) which plays an important role in development, cellular response to injury and inflammation, cell migration, and cancer. The naked mole-rat (NMR, Heterocephalus glaber) contains abundant high-molecular-mass HA (HMM-HA) in its tissues, which contributes to this species' cancer resistance and possibly longevity. Here we report that abundant HMM-HA is found in a wide range of subterranean mammalian species, but not in phylogenetically related aboveground species. These species accumulate abundant HMM-HA by regulating the expression of genes involved in HA degradation and synthesis and contain unique mutations in these genes. The abundant high molecular weight HA may benefit the adaptation to subterranean environment by increasing skin elasticity and protecting from oxidative stress due to hypoxic subterranean environment. HMM-HA may also be coopted to confer cancer resistance and longevity to subterranean mammals. Our work suggests that HMM-HA has evolved with subterranean lifestyle.
|
348
|
Filgueiras JL, Varela D, Santos J. Protein structure prediction with energy minimization and deep learning approaches. Natural Computing 2023:1-12. [PMID: 37363286 PMCID: PMC10165305 DOI: 10.1007/s11047-023-09943-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/12/2023] [Indexed: 06/28/2023]
Abstract
In this paper we discuss the advantages and problems of two alternatives for ab initio protein structure prediction. On one hand, recent approaches based on deep learning, which have significantly improved prediction results for a wide variety of proteins, are discussed. On the other hand, methods based on protein conformational energy minimization and with different search strategies are analyzed. In this latter case, our methods based on a memetic combination between differential evolution and the fragment replacement technique are included, incorporating also the possibility of niching in the evolutionary search. Different proteins have been used to analyze the pros and cons in both approaches, proposing possibilities of integration of both alternatives.
Affiliation(s)
- Juan Luis Filgueiras
- Department of Computer Science and Information Technologies, CITIC (Centre for Information and Communications Technology Research), University of A Coruña, A Coruña, Spain
| | - Daniel Varela
- Department of Computer Science and Information Technologies, CITIC (Centre for Information and Communications Technology Research), University of A Coruña, A Coruña, Spain
| | - José Santos
- Department of Computer Science and Information Technologies, CITIC (Centre for Information and Communications Technology Research), University of A Coruña, A Coruña, Spain
| |
|
349
|
Haczeyni F, Steensels S, Stein BD, Jordan JM, Li L, Dartigue V, Sarklioglu SS, Qiao J, Zhou XK, Dannenberg AJ, Iyengar NM, Yu H, Cantley LC, Ersoy BA. Submitochondrial Protein Translocation Upon Stress Inhibits Thermogenic Energy Expenditure. bioRxiv 2023:2023.05.04.539294. [PMID: 37205525 PMCID: PMC10187325 DOI: 10.1101/2023.05.04.539294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Mitochondria-rich brown adipocytes dissipate cellular fuel as heat by thermogenic energy expenditure (TEE). Prolonged nutrient excess or cold exposure impair TEE and contribute to the pathogenesis of obesity, but the mechanisms remain incompletely understood. Here we report that stress-induced proton leak into the matrix interface of mitochondrial inner membrane (IM) mobilizes a group of proteins from IM into matrix, which in turn alter mitochondrial bioenergetics. We further determine a smaller subset that correlates with obesity in human subcutaneous adipose tissue. We go on to show that the top factor on this short list, acyl-CoA thioesterase 9 (ACOT9), migrates from the IM into the matrix upon stress where it enzymatically deactivates and prevents the utilization of acetyl-CoA in TEE. The loss of ACOT9 protects mice against the complications of obesity by maintaining unobstructed TEE. Overall, our results introduce aberrant protein translocation as a strategy to identify pathogenic factors. One-Sentence Summary Thermogenic stress impairs mitochondrial energy utilization by forcing translocation of IM-bound proteins into the matrix.
|
350
|
Wu T, Guo Z, Cheng J. Atomic protein structure refinement using all-atom graph representations and SE(3)-equivariant graph transformer. Bioinformatics 2023; 39:btad298. [PMID: 37144951 PMCID: PMC10191610 DOI: 10.1093/bioinformatics/btad298] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/05/2022] [Revised: 03/18/2023] [Accepted: 04/27/2023] [Indexed: 05/06/2023] Open
Abstract
MOTIVATION State-of-the-art protein structure prediction methods such as AlphaFold are being widely used to predict structures of uncharacterized proteins in biomedical research. There is a significant need to further improve the quality and nativeness of the predicted structures to enhance their usability. In this work, we develop ATOMRefine, a deep learning-based, end-to-end, all-atom protein structural model refinement method. It uses an SE(3)-equivariant graph transformer network to directly refine protein atomic coordinates in a predicted tertiary structure represented as a molecular graph. RESULTS The method is first trained and tested on the structural models in AlphaFoldDB whose experimental structures are known, and then blindly tested on 69 CASP14 regular targets and 7 CASP14 refinement targets. ATOMRefine improves the quality of both backbone atoms and all-atom conformation of the initial structural models generated by AlphaFold. It also performs better than two state-of-the-art refinement methods in multiple evaluation metrics including an all-atom model quality score (the MolProbity score, based on the analysis of all-atom contacts, bond length, atom clashes, torsion angles, and side-chain rotamers). As ATOMRefine can refine a protein structure quickly, it provides a viable, fast solution for improving protein geometry and fixing structural errors of predicted structures through direct coordinate refinement. AVAILABILITY AND IMPLEMENTATION The source code of ATOMRefine is available in the GitHub repository (https://github.com/BioinfoMachineLearning/ATOMRefine). All the required data for training and testing are available at https://doi.org/10.5281/zenodo.6944368.
Affiliation(s)
- Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
| | - Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
| |
|