1
Rahman MM, Sulu R, Adediran B, Tu H, Salo AM, Murthy S, Myllyharju J, Wierenga RK, Koski MK. Binding Differences of the Peptide-Substrate-Binding Domain of Collagen Prolyl 4-Hydroxylases I and II for Proline- and Hydroxyproline-Rich Peptides. Proteins 2025. [PMID: 40386874 DOI: 10.1002/prot.26839] [Received: 02/11/2025] [Revised: 04/24/2025] [Accepted: 05/02/2025] [Indexed: 05/20/2025]
Abstract
Collagen prolyl 4-hydroxylase (C-P4H) catalyzes the 4-hydroxylation of Y-prolines of the XYG-repeat of procollagen. C-P4Hs are tetrameric α2β2 enzymes. The α-subunit provides the N-terminal dimerization domain, the middle peptide-substrate-binding (PSB) domain, and the C-terminal catalytic (CAT) domain. There are three isoforms of the α-subunit, complexed with a β-subunit that is protein disulfide isomerase, forming C-P4H I-III. The PSB domain of the α-subunit binds proline-rich peptides, but its function with respect to the prolyl hydroxylation mechanism is unknown. An extended mode of binding of proline-rich peptides (PPII, polyproline type-II, conformation) to the PSB-I domain has previously been reported for the PPG-PPG-PPG and P9 peptides. Crystal structures now show that peptides with the motif PxGP (PPG-PRG-PPG, PPG-PAG-PPG) (where x, at Y-position 5, is not a proline) bind to the PSB-I domain differently, more deeply, in the peptide-binding groove. The latter mode of binding has previously been reported for structures of the PSB-II domain complexed with these PxGP-peptides. In addition, it is shown here by crystallographic binding studies that the POG-PAG-POG peptide (with 4-hydroxyprolines at Y-positions 2 and 8) also adopts the PxGP mode of binding to PSB-I as well as to PSB-II. Calorimetric binding studies show that the affinities of these peptides are lower for PSB-I than for PSB-II, with, respectively, KD values of about 70 μM for PSB-I and 20 μM for PSB-II. The importance of these results for understanding the reaction mechanism of C-P4H, in particular concerning the function of the PSB domain, is discussed.
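To put the two reported affinities on a common energy scale, the KD values can be converted to standard binding free energies with ΔG° = RT ln(KD / 1 M). The sketch below is only an illustration: the temperature of 298.15 K and the 1 M standard state are assumptions not stated in the abstract, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Uses dG = R * T * ln(KD / 1 M). The 298.15 K default is an assumption;
    the abstract does not state the measurement temperature.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


# Approximate KD values quoted in the abstract.
kd_psb1 = 70e-6  # PSB-I, about 70 micromolar
kd_psb2 = 20e-6  # PSB-II, about 20 micromolar

dg1 = binding_free_energy(kd_psb1)
dg2 = binding_free_energy(kd_psb2)
ddg = dg1 - dg2  # positive value means weaker binding to PSB-I

print(f"dG(PSB-I)  = {dg1:.2f} kcal/mol")
print(f"dG(PSB-II) = {dg2:.2f} kcal/mol")
print(f"ddG        = {ddg:.2f} kcal/mol")
```

Under these assumptions the 3.5-fold difference in KD corresponds to well under 1 kcal/mol, which is a modest difference in binding energy.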
Affiliation(s)
- M Mubinur Rahman
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Ramita Sulu
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Bukunmi Adediran
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Hongmin Tu
  - Biocenter Oulu, University of Oulu, Oulu, Finland
- Antti M Salo
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Sudarshan Murthy
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Johanna Myllyharju
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Rik K Wierenga
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- M Kristian Koski
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
  - Biocenter Oulu, University of Oulu, Oulu, Finland
2
Choudhary P, Kunnakkattu IR, Nair S, Lawal DK, Pidruchna I, Afonso MQL, Fleming JR, Velankar S. PDBe tools for an in-depth analysis of small molecules in the Protein Data Bank. Protein Sci 2025; 34:e70084. [PMID: 40100137 PMCID: PMC11917123 DOI: 10.1002/pro.70084] [Received: 10/31/2024] [Revised: 01/27/2025] [Accepted: 02/12/2025] [Indexed: 03/20/2025]
Abstract
The Protein Data Bank (PDB) is the primary global repository for experimentally determined 3D structures of biological macromolecules and their complexes with ligands, proteins, and nucleic acids. PDB contains over 47,000 unique small molecules bound to the macromolecules. Despite the extensive data available, the complexity of small-molecule data in the PDB necessitates specialized tools for effective analysis and visualization. PDBe has developed a number of tools, including PDBe CCDUtils (https://github.com/PDBeurope/ccdutils) for accessing and enriching ligand data, PDBe Arpeggio (https://github.com/PDBeurope/arpeggio) for analyzing interactions between ligands and macromolecules, and PDBe RelLig (https://github.com/PDBeurope/rellig) for identifying the functional roles of ligands (such as reactants, cofactors, or drug-like molecules) within protein-ligand complexes. The enhanced ligand annotations and data generated by these tools are presented on the novel PDBe-KB ligand pages, offering a comprehensive overview of small molecules and providing valuable insights into their biological contexts (example page for Imatinib: https://pdbe.org/chem/sti). By improving the standardization of ligand identification, adding various annotations, and offering advanced visualization capabilities, these tools help researchers navigate the complexities of small molecules and their roles in biological systems, facilitating mechanistic understanding of biological functions. The ongoing enhancements to these resources are designed to support the scientific community in gaining valuable insights into ligands and their applications across various fields, including drug discovery, molecular biology, systems biology, structural biology, and pharmacology.
Affiliation(s)
- Preeti Choudhary
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Ibrahim Roshan Kunnakkattu
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Sreenath Nair
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Dare Kayode Lawal
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Ivanna Pidruchna
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Marcelo Querino Lima Afonso
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Jennifer R. Fleming
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
3
Bittrich S, Rose AS, Sehnal D, Duarte JM, Rose Y, Segura J, Piehl DW, Vallat B, Shao C, Bhikadiya C, Liang J, Ma M, Goodsell DS, Burley SK, Dutta S. Visualizing and analyzing 3D biomolecular structures using Mol* at RCSB.org: Influenza A H5N1 virus proteome case study. Protein Sci 2025; 34:e70093. [PMID: 40099807 PMCID: PMC11915458 DOI: 10.1002/pro.70093] [Received: 10/28/2024] [Revised: 01/29/2025] [Accepted: 02/21/2025] [Indexed: 03/20/2025]
Abstract
The easiest and often most useful way to work with experimentally determined or computationally predicted structures of biomolecules is by viewing their three-dimensional (3D) shapes using a molecular visualization tool. Mol* was collaboratively developed by RCSB Protein Data Bank (RCSB PDB, RCSB.org) and Protein Data Bank in Europe (PDBe, PDBe.org) as an open-source, web-based, 3D visualization software suite for examination and analyses of biostructures. It is capable of displaying atomic coordinates and related experimental data of biomolecular structures together with a variety of annotations, facilitating basic and applied research, training, education, and information dissemination. Across RCSB.org, the RCSB PDB research-focused web portal, Mol* has been implemented to support single-mouse-click atomic-level visualization of biomolecules (e.g., proteins, nucleic acids, carbohydrates) with bound cofactors, small-molecule ligands, ions, water molecules, or other macromolecules. RCSB.org Mol* can seamlessly display 3D structures from various sources, allowing structure interrogation, superimposition, and comparison. Using influenza A H5N1 virus as a topical case study of an important pathogen, we exemplify how Mol* has been embedded within various RCSB.org tools, allowing users to view polymer sequence and structure-based annotations integrated from trusted bioinformatics data resources, assess patterns and trends in groups of structures, and view structures of any size and compositional complexity. In addition to being linked to every experimentally determined biostructure and Computed Structure Model made available at RCSB.org, Standalone Mol* is freely available for visualizing any atomic-level or multi-scale biostructure at rcsb.org/3d-view.
Affiliation(s)
- Sebastian Bittrich
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- David Sehnal
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
- Jose M. Duarte
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- Yana Rose
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- Joan Segura
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- Dennis W. Piehl
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Brinda Vallat
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Rutgers Cancer Institute, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Chenghua Shao
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Charmi Bhikadiya
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- Jesse Liang
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- Mark Ma
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
- David S. Goodsell
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Rutgers Cancer Institute, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
- Stephen K. Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Rutgers Cancer Institute, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
  - Rutgers Artificial Intelligence and Data Science (RAD) Collaboratory, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Shuchismita Dutta
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Rutgers Cancer Institute, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
4
Prottay AAS, Emamuzzaman, Ripu TR, Sarwar MN, Rahman T, Ahmmed MS, Bappi MH, Emon M, Ansari SA, Coutinho HDM, Islam MT. Anxiogenic-like effects of coumarin, possibly through the GABAkine interaction pathway: Animal studies with in silico approaches. Behav Brain Res 2025; 480:115392. [PMID: 39667645 DOI: 10.1016/j.bbr.2024.115392] [Received: 07/06/2024] [Revised: 09/18/2024] [Accepted: 12/09/2024] [Indexed: 12/14/2024]
Abstract
BACKGROUND: Anxiety disorder is the most common mental illness and a major contributor to impairment. Thus, there is an urgent need to find novel lead compounds to mitigate anxiety. It is widely recognized that the neurobiology of anxiety-related behavior involves GABAergic systems.
OBJECTIVES: This research aimed to examine the anxiogenic action of coumarin (CMN), a natural benzopyrone derived from plants, and determine its underlying mechanism through in vivo and in silico investigations.
METHODS: This was accomplished by using a variety of behavioral procedures, including open field, swing, hole cross, and light-dark tests, on male and female Swiss albino mice that had been orally administered three experimental doses of CMN (1, 2, and 4 mg/kg). The CMN group was also examined with the GABAA receptor agonist diazepam (DZP, 2 mg/kg) and flumazenil antagonist (FLU, 0.1 mg/kg). Furthermore, CMN and standards were subjected to a molecular docking analysis to determine their binding affinities for the GABAA receptor subunits (α1, α4, β2, γ2, and δ). Several software programs were used to visualize the ligand-receptor interaction and analyze the pharmacokinetic profile.
RESULTS: Compared to typical treatments, our results show that CMN (1 mg/kg) significantly (p < 0.05) increases the locomotor activity of animals. Furthermore, CMN exerted the highest binding affinity (-6.5 kcal/mol) with the GABA-α1 receptor compared to conventional DZP. Along with FLU, CMN displayed several hydrophobic and hydrogen bonds with GABAA receptor subunits. The pharmacokinetic and drug-like properties of CMN are also remarkable. In animal studies, CMN worked synergistically with FLU to provide anxiogenic-like effects.
CONCLUSION: We conclude that, based on in vivo and in silico data, CMN, alone or in combination with FLU, may be employed in future neurological clinical studies. However, further research is needed to confirm this behavioral activity and elucidate the possible mechanism of action.
Affiliation(s)
- Abdullah Al Shamsh Prottay
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh; Bioinformatics and Drug Innovation Laboratory, BioLuster Research Center Ltd., Gopalganj 8100, Bangladesh
- Emamuzzaman
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh; Bioinformatics and Drug Innovation Laboratory, BioLuster Research Center Ltd., Gopalganj 8100, Bangladesh
- Tawfik Rakaiyat Ripu
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Md Nazim Sarwar
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Towfiqur Rahman
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Md Shakil Ahmmed
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Mehedi Hasan Bappi
  - School of Pharmacy, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Md Emon
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Siddique Akber Ansari
  - Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia
- Henrique D M Coutinho
  - Department of Biological Chemistry, Regional University of Cariri, Crato 63105-000, Brazil
- Muhammad Torequl Islam
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh; Bioinformatics and Drug Innovation Laboratory, BioLuster Research Center Ltd., Gopalganj 8100, Bangladesh; Pharmacy Discipline, Khulna University, Khulna 9208, Bangladesh
5
Pintilie G, Shao C, Wang Z, Hudson BP, Flatt JW, Schmid MF, Morris K, Burley SK, Chiu W. Q-score as a reliability measure for protein, nucleic acid, and small molecule atomic coordinate models derived from 3DEM density maps. bioRxiv 2025:2025.01.14.633006. [PMID: 39868161 PMCID: PMC11760781 DOI: 10.1101/2025.01.14.633006] [Indexed: 01/28/2025]
Abstract
Atomic coordinate models are important in the interpretation of 3D maps produced with cryoEM and sub-tomogram averaging in cryoET, or more generically, 3D electron microscopy (3DEM). In addition to visual inspection of such maps and models, quantitative metrics convey the reliability of the atomic coordinates, in particular how well the model is supported by the experimentally determined 3DEM map. A recently introduced metric, Q-score, was shown to correlate well with the reported resolution of the map for well-fitted models. Here we present new statistical analyses of Q-scores based on its application to ~10,000 maps and models archived in EMDB and PDB. Further, we introduce two new metrics based on Q-score: Q-relative-all and Q-relative-resolution, to compare a map and model to all entries in the EMDB and those with similar resolution, respectively. We also explore through illustrative examples of proteins, nucleic acids, and small molecules how Q-scores can indicate whether the atomic coordinates are well-fitted to 3DEM maps and whether some parts of a map may be poorly resolved due to factors such as molecular flexibility, radiation damage, and/or conformational heterogeneity. Lastly, we show examples of how Q-scores can effectively be converted to atomic B-factors. These analyses provide a basis for how Q-scores can be interpreted effectively to evaluate 3DEM maps and atomic coordinate models prior to publication and archiving.
Affiliation(s)
- Grigore Pintilie
  - Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, 94305, USA
- Chenghua Shao
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Zhe Wang
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Brian P Hudson
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Justin W Flatt
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Michael F Schmid
  - Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Kyle Morris
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Rutgers Cancer Institute, New Brunswick, NJ 08903, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Rutgers Artificial Intelligence and Data Science (RAD) Collaboratory, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Wah Chiu
  - Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, 94305, USA
  - Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
6
Martinez K, Agirre J, Akune Y, Aoki-Kinoshita KF, Arighi C, Axelsen KB, Bolton E, Bordeleau E, Edwards NJ, Fadda E, Feizi T, Hayes C, Ives CM, Joshi HJ, Krishna Prasad K, Kossida S, Lisacek F, Liu Y, Lütteke T, Ma J, Malik A, Martin M, Mehta AY, Neelamegham S, Panneerselvam K, Ranzinger R, Ricard-Blum S, Sanou G, Shanker V, Thomas PD, Tiemeyer M, Urban J, Vita R, Vora J, Yamamoto Y, Mazumder R. Functional implications of glycans and their curation: insights from the workshop held at the 16th Annual International Biocuration Conference in Padua, Italy. Database (Oxford) 2024; 2024:baae073. [PMID: 39137905 PMCID: PMC11321244 DOI: 10.1093/database/baae073] [Received: 04/01/2024] [Revised: 06/24/2024] [Accepted: 07/10/2024] [Indexed: 08/15/2024]
Abstract
Dynamic changes in protein glycosylation impact human health and disease progression. However, current resources that capture disease and phenotype information focus primarily on the macromolecules within the central dogma of molecular biology (DNA, RNA, proteins). To gain a better understanding of organisms, there is a need to capture the functional impact of glycans and glycosylation on biological processes. A workshop titled "Functional impact of glycans and their curation" was held in conjunction with the 16th Annual International Biocuration Conference to discuss ongoing worldwide activities related to glycan function curation. This workshop brought together subject matter experts, tool developers, and biocurators from over 20 projects and bioinformatics resources. Participants discussed four key topics for each of their resources: (i) how they curate glycan function-related data from publications and other sources, (ii) what type of data they would like to acquire, (iii) what data they currently have, and (iv) what standards they use. Their answers contributed input that provided a comprehensive overview of state-of-the-art glycan function curation and annotations. This report summarizes the outcome of discussions, including potential solutions and areas where curators, data wranglers, and text mining experts can collaborate to address current gaps in glycan and glycosylation annotations, leveraging each other's work to improve their respective resources and encourage impactful data sharing among resources. Database URL: https://wiki.glygen.org/Glycan_Function_Workshop_2023.
Affiliation(s)
- Karina Martinez
  - Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
- Jon Agirre
  - York Structural Biology Laboratory, Department of Chemistry, University of York, Wentworth Way, York YO10 5DD, United Kingdom
- Yukie Akune
  - The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
- Kiyoko F Aoki-Kinoshita
  - Glycan and Life Systems Integration Center (GaLSIC), Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
- Cecilia Arighi
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Ave, Newark, DE 19716, United States
- Kristian B Axelsen
  - Swiss-Prot Group, Swiss Institute of Bioinformatics (SIB), CMU, 1 rue Michel Servet, Geneva 4 1211, Switzerland
- Evan Bolton
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, United States
- Emily Bordeleau
  - Michael Smith Laboratories, The University of British Columbia, 2185 East Mall, Vancouver, British Columbia V6T 1Z4, Canada
- Nathan J Edwards
  - Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, 2115 Wisconsin Ave NW, Washington, DC 20007, United States
- Elisa Fadda
  - Department of Chemistry and Hamilton Institute, Maynooth University, Kilcock Road, Maynooth, Co. Kildare W23 AH3Y, Ireland
- Ten Feizi
  - The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
- Catherine Hayes
  - Proteome Informatics Group, Swiss Institute of Bioinformatics (SIB), route de Drize 7, Geneva CH-1227, Switzerland
- Callum M Ives
  - Department of Chemistry and Hamilton Institute, Maynooth University, Kilcock Road, Maynooth, Co. Kildare W23 AH3Y, Ireland
- Hiren J Joshi
  - Copenhagen Center for Glycomics, Department of Cellular and Molecular Medicine, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3, Copenhagen DK-2200, Denmark
- Khakurel Krishna Prasad
  - ELI Beamlines Facility, The Extreme Light Infrastructure ERIC, Za Radnicí 835, Dolní Břežany 25241, Czech Republic
- Sofia Kossida
  - IMGT, The International ImMunoGeneTics Information System, National Center for Scientific Research (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), 141 rue de la Cardonille, Montpellier 34 090, France
- Frederique Lisacek
  - Proteome Informatics Group, Swiss Institute of Bioinformatics (SIB), route de Drize 7, Geneva CH-1227, Switzerland
- Yan Liu
  - The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
- Thomas Lütteke
  - Institute of Veterinary Physiology and Biochemistry, Justus-Liebig-University Gießen, Frankfurter Str. 100, Gießen 35392, Germany
- Junfeng Ma
  - Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, 3900 Reservoir Road NW, Washington, DC 20007, United States
- Adnan Malik
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Maria Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Akul Y Mehta
  - Department of Surgery, Beth Israel Deaconess Medical Center, National Center for Functional Glycomics, Harvard Medical School, 330 Brookline Avenue, Boston, MA 02215, United States
- Sriram Neelamegham
  - Departments of Chemical & Biological Engineering, Biomedical Engineering and Medicine, University at Buffalo, State University of New York, 906 Furnas Hall, Buffalo, NY 14260, United States
- Kalpana Panneerselvam
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- René Ranzinger
  - Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd, Athens, GA 30602, United States
- Sylvie Ricard-Blum
  - Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, University Lyon 1, CNRS, 43 Boulevard du 11 novembre 1918, Villeurbanne cedex F-69622, France
- Gaoussou Sanou
  - IMGT, The International ImMunoGeneTics Information System, National Center for Scientific Research (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), 141 rue de la Cardonille, Montpellier 34 090, France
- Vijay Shanker
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Ave, Newark, DE 19716, United States
- Paul D Thomas
  - Department of Population and Public Health Sciences, University of Southern California, 2001 N Soto Street, Los Angeles, CA 90032, United States
- Michael Tiemeyer
  - Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd, Athens, GA 30602, United States
- James Urban
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7 B, Gothenburg 41390, Sweden
- Randi Vita
  - Immune Epitope Database and Analysis Project, La Jolla Institute for Allergy & Immunology, 9420 Athena Circle, La Jolla, CA 92037, United States
- Jeet Vora
  - Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
- Yasunori Yamamoto
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Raja Mazumder
  - Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
7
Dalwani S, Metz A, Huschmann FU, Weiss MS, Wierenga RK, Venkatesan R. Crystallographic fragment-binding studies of the Mycobacterium tuberculosis trifunctional enzyme suggest binding pockets for the tails of the acyl-CoA substrates at its active sites and a potential substrate-channeling path between them. Acta Crystallogr D Struct Biol 2024; 80:605-619. [PMID: 39012716 DOI: 10.1107/s2059798324006557] [Received: 01/12/2024] [Accepted: 07/03/2024] [Indexed: 07/18/2024]
Abstract
The Mycobacterium tuberculosis trifunctional enzyme (MtTFE) is an α2β2 tetrameric enzyme in which the α-chain harbors the 2E-enoyl-CoA hydratase (ECH) and 3S-hydroxyacyl-CoA dehydrogenase (HAD) active sites, and the β-chain provides the 3-ketoacyl-CoA thiolase (KAT) active site. Linear, medium-chain and long-chain 2E-enoyl-CoA molecules are the preferred substrates of MtTFE. Previous crystallographic binding and modeling studies identified binding sites for the acyl-CoA substrates at the three active sites, as well as the NAD binding pocket at the HAD active site. These studies also identified three additional CoA binding sites on the surface of MtTFE that are different from the active sites. It has been proposed that one of these additional sites could be of functional relevance for the substrate channeling (by surface crawling) of reaction intermediates between the three active sites. Here, 226 fragments were screened in a crystallographic fragment-binding study of MtTFE crystals, resulting in the structures of 16 MtTFE-fragment complexes. Analysis of the 121 fragment-binding events shows that the ECH active site is the `binding hotspot' for the tested fragments, with 41 binding events. The mode of binding of the fragments bound at the active sites provides additional insight into how the long-chain acyl moiety of the substrates can be accommodated at their proposed binding pockets. In addition, the 20 fragment-binding events between the active sites identify potential transient binding sites of reaction intermediates relevant to the possible channeling of substrates between these active sites. These results provide a basis for further studies to understand the functional relevance of the latter binding sites and to identify substrates for which channeling is crucial.
Affiliation(s)
- Subhadra Dalwani
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
| | - Alexander Metz
- Department of Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
| | - Franziska U Huschmann
- Department of Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
| | - Manfred S Weiss
- Macromolecular Crystallography, Helmholtz-Zentrum Berlin, Berlin, Germany
| | - Rik K Wierenga
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
| | - Rajaram Venkatesan
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
|
8
|
Lawson CL, Kryshtafovych A, Pintilie GD, Burley SK, Černý J, Chen VB, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant MG, Read RJ, Richardson JS, Rohou AL, Schneider B, Sellers BD, Shao C, Sourial E, Williams CI, Williams CJ, Yang Y, Abbaraju V, Afonine PV, Baker ML, Bond PS, Blundell TL, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan KD, DiMaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc CF, Hunte C, Igaev M, Joseph AP, Kao WC, Kihara D, Kumar D, Lang L, Lin S, Maddhuri Venkata Subramaniya SR, Mittal S, Mondal A, Moriarty NW, Muenks A, Murshudov GN, Nicholls RA, Olek M, Palmer CM, Perez A, Pohjolainen E, Pothula KR, Rowley CN, Sarkar D, Schäfer LU, Schlicksup CJ, Schröder GF, Shekhar M, Si D, Singharoy A, Sobolev OV, Terashi G, Vaiana AC, Vedithi SC, Verburgt J, Wang X, Warshamanage R, Winn MD, Weyand S, Yamashita K, Zhao M, Schmid MF, Berman HM, Chiu W. Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge. Nat Methods 2024; 21:1340-1348. [PMID: 38918604 PMCID: PMC11526832 DOI: 10.1038/s41592-024-02321-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 05/24/2024] [Indexed: 06/27/2024]
Abstract
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Affiliation(s)
- Catherine L Lawson
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.
| | | | - Grigore D Pintilie
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
| | - Stephen K Burley
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- RCSB Protein Data Bank and San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, USA
| | - Jiří Černý
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
| | - Vincent B Chen
- Department of Biochemistry, Duke University, Durham, NC, USA
| | - Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Alberto Gobbi
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- , Berlin, Germany
| | - Andrzej Joachimiak
- Structural Biology Center, X-ray Science Division, Argonne National Laboratory, Argonne, IL, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Sigrid Noreng
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
- Protein Science, Septerna, South San Francisco, CA, USA
| | | | - Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | | | - Alexis L Rohou
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
| | - Bohdan Schneider
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
| | - Benjamin D Sellers
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- Computational Chemistry, Vilya, South San Francisco, CA, USA
| | - Chenghua Shao
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | | | | | - Ying Yang
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
| | - Venkat Abbaraju
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Pavel V Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Matthew L Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Paul S Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Arthur Campbell
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | | | - K D Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Helmut Grubmüller
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, USA
| | - Corey F Hryc
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Carola Hunte
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
| | - Maxim Igaev
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Agnel P Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Wei-Chun Kao
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Trivedi School of Biosciences, Ashoka University, Sonipat, India
| | - Lijun Lang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- The Chinese University of Hong Kong, Hong Kong, China
| | - Sean Lin
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- National Renewable Energy Laboratory (NREL), Golden, CO, USA
| | - Nigel W Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Andrew Muenks
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | - Robert A Nicholls
- MRC Laboratory of Molecular Biology, Cambridge, UK
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, UK
| | - Colin M Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Emmi Pohjolainen
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Karunakar R Pothula
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | | | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- MSU-DOE Plant Research Laboratory, East Lansing, MI, USA
- School of Molecular Sciences, Arizona State University, Tempe, AZ, USA
| | - Luisa U Schäfer
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | - Christopher J Schlicksup
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Gunnar F Schröder
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Mrinal Shekhar
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Oleg V Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Andrea C Vaiana
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nature's Toolbox (NTx), Rio Rancho, NM, USA
| | | | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | | | - Martyn D Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Simone Weyand
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | | | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Michael F Schmid
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
| | - Helen M Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Wah Chiu
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA.
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA.
|
9
|
Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint validation of biomolecular structures determined by NMR in the Protein Data Bank. Structure 2024; 32:824-837.e1. [PMID: 38490206 PMCID: PMC11162339 DOI: 10.1016/j.str.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/13/2024] [Accepted: 02/19/2024] [Indexed: 03/17/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NEF and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB restraint violation report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA.
| | - Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK
| | - Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100 Burjassot, Valencia, Spain
| | - Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
| | - Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
| | - Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, La Jolla, CA, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
| | - Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
| | - Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK.
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
|
10
|
Lawson CL, Kryshtafovych A, Pintilie GD, Burley SK, Černý J, Chen VB, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant M, Read RJ, Richardson JS, Rohou AL, Schneider B, Sellers BD, Shao C, Sourial E, Williams CI, Williams CJ, Yang Y, Abbaraju V, Afonine PV, Baker ML, Bond PS, Blundell TL, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan KD, DiMaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc CF, Hunte C, Igaev M, Joseph AP, Kao WC, Kihara D, Kumar D, Lang L, Lin S, Maddhuri Venkata Subramaniya SR, Mittal S, Mondal A, Moriarty NW, Muenks A, Murshudov GN, Nicholls RA, Olek M, Palmer CM, Perez A, Pohjolainen E, Pothula KR, Rowley CN, Sarkar D, Schäfer LU, Schlicksup CJ, Schröder GF, Shekhar M, Si D, Singharoy A, Sobolev OV, Terashi G, Vaiana AC, Vedithi SC, Verburgt J, Wang X, Warshamanage R, Winn MD, Weyand S, Yamashita K, Zhao M, Schmid MF, Berman HM, Chiu W. Outcomes of the EMDataResource Cryo-EM Ligand Modeling Challenge. RESEARCH SQUARE 2024:rs.3.rs-3864137. [PMID: 38343795 PMCID: PMC10854310 DOI: 10.21203/rs.3.rs-3864137/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Abstract
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: E. coli beta-galactosidase with inhibitor, SARS-CoV-2 RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. We found that (1) the quality of submitted ligand models and surrounding atoms varied, as judged by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics, and contact scores, and (2) a composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Affiliation(s)
- Catherine L. Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | - Grigore D. Pintilie
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
| | - Stephen K. Burley
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ USA
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA USA
| | - Jiří Černý
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, CZ
| | | | - Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Alberto Gobbi
- Discovery Chemistry, Genentech Inc, South San Francisco, USA
| | - Andrzej Joachimiak
- Structural Biology Center, X-ray Science Division, Argonne National Laboratory, Argonne, IL, USA
| | - Sigrid Noreng
- Structural Biology, Genentech Inc, South San Francisco, USA
| | | | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | | | | | - Bohdan Schneider
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, CZ
| | | | - Chenghua Shao
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | | | | | - Ying Yang
- Structural Biology, Genentech Inc, South San Francisco, USA
| | - Venkat Abbaraju
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Matthew L. Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Paul S. Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Arthur Campbell
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | | | - Kevin D. Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Helmut Grubmüller
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, USA
| | - Corey F. Hryc
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Carola Hunte
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS - Centre for Integrative Biological Signalling Studies, University of Freiburg, 79104 Freiburg, Germany
| | - Maxim Igaev
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Agnel P. Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Wei-Chun Kao
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS - Centre for Integrative Biological Signalling Studies, University of Freiburg, 79104 Freiburg, Germany
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Lijun Lang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Sean Lin
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nigel W. Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Andrew Muenks
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | | | - Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, UK
| | - Colin M. Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Emmi Pohjolainen
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Karunakar R. Pothula
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | | | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Luisa U. Schäfer
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | - Christopher J. Schlicksup
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Gunnar F. Schröder
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Mrinal Shekhar
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Oleg V. Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Andrea C. Vaiana
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nature’s Toolbox (NTx), Rio Rancho, NM, USA
| | | | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Xiao Wang
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | | | - Martyn D. Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Simone Weyand
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | | | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Michael F. Schmid
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
| | - Helen M. Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Wah Chiu
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
|
11
|
Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint Validation of Biomolecular Structures Determined by NMR in the Protein Data Bank. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.15.575520. [PMID: 38328042 PMCID: PMC10849500 DOI: 10.1101/2024.01.15.575520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NMR exchange (NEF) and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB Restraint Violation Report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
| | - Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
| | - Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100-Burjassot, Valencia, Spain
| | - Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Theresa A Ramelot
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
| | - Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
| | - Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
| | - Gaetano T Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
12
|
Turner J, Abbott S, Fonseca N, Pye R, Carrijo L, Duraisamy AK, Salih O, Wang Z, Kleywegt GJ, Morris KL, Patwardhan A, Burley SK, Crichlow G, Feng Z, Flatt JW, Ghosh S, Hudson BP, Lawson CL, Liang Y, Peisach E, Persikova I, Sekharan M, Shao C, Young J, Velankar S, Armstrong D, Bage M, Bueno WM, Evans G, Gaborova R, Ganguly S, Gupta D, Harrus D, Tanweer A, Bansal M, Rangannan V, Kurisu G, Cho H, Ikegawa Y, Kengaku Y, Kim JY, Niwa S, Sato J, Takuwa A, Yu J, Hoch JC, Baskaran K, Xu W, Zhang W, Ma X. EMDB-the Electron Microscopy Data Bank. Nucleic Acids Res 2024; 52:D456-D465. [PMID: 37994703 PMCID: PMC10767987 DOI: 10.1093/nar/gkad1019] [Citation(s) in RCA: 44] [Impact Index Per Article: 44.0] [Received: 09/28/2023] [Revised: 10/18/2023] [Accepted: 10/20/2023] [Indexed: 11/24/2023]
Abstract
The Electron Microscopy Data Bank (EMDB) is the global public archive of three-dimensional electron microscopy (3DEM) maps of biological specimens derived from transmission electron microscopy experiments. As of 2021, EMDB is managed by the Worldwide Protein Data Bank consortium (wwPDB; wwpdb.org) as a wwPDB Core Archive, and the EMDB team is a core member of the consortium. Today, EMDB houses over 30 000 entries with maps containing macromolecules, complexes, viruses, organelles and cells. Herein, we provide an overview of the rapidly growing EMDB archive, including its current holdings, recent updates, and future plans.
|
13
|
Xu W, Velankar S, Patwardhan A, Hoch JC, Burley SK, Kurisu G. Announcing the launch of Protein Data Bank China as an Associate Member of the Worldwide Protein Data Bank Partnership. Acta Crystallogr D Struct Biol 2023; 79:792-795. [PMID: 37561405 PMCID: PMC10478634 DOI: 10.1107/s2059798323006381] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/01/2023] [Accepted: 07/21/2023] [Indexed: 08/11/2023]
Abstract
The Protein Data Bank (PDB) is the single global archive of atomic-level, three-dimensional structures of biological macromolecules experimentally determined by macromolecular crystallography, nuclear magnetic resonance spectroscopy or three-dimensional cryo-electron microscopy. The PDB is growing continuously, with a recent rapid increase in new structure depositions from Asia. In 2022, the Worldwide Protein Data Bank (wwPDB; https://www.wwpdb.org/) partners welcomed Protein Data Bank China (PDBc; https://www.pdbc.org.cn) to the organization as an Associate Member. PDBc is based in the National Facility for Protein Science in Shanghai which is associated with the Shanghai Advanced Research Institute of Chinese Academy of Sciences, the Shanghai Institute for Advanced Immunochemical Studies and the iHuman Institute of ShanghaiTech University. This letter describes the history of the wwPDB, recently established mechanisms for adding new wwPDB data centers and the processes developed to bring PDBc into the partnership.
Affiliation(s)
- Wenqing Xu
- Protein Data Bank China, ShanghaiTech University and National Facility for Protein Science in Shanghai, Shanghai, People’s Republic of China
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Ardan Patwardhan
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Jeffrey C. Hoch
- Biological Magnetic Resonance Data Bank, UConn Health, Farmington, CT 06030-3305, USA
| | - Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| |
|
14
|
Lisacek F, Tiemeyer M, Mazumder R, Aoki-Kinoshita KF. Worldwide Glycoscience Informatics Infrastructure: The GlySpace Alliance. JACS Au 2023; 3:4-12. [PMID: 36711080 PMCID: PMC9875223 DOI: 10.1021/jacsau.2c00477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 11/04/2022] [Accepted: 11/07/2022] [Indexed: 06/18/2023]
Abstract
The GlySpace Alliance was formed in 2018 among the principal investigators of three major glycoscience portals: Glyco@Expasy, GlyCosmos, and GlyGen, representing Europe, Asia, and the United States, respectively. While each of these portals has its unique user interface, the aim is to provide the same basic data set of glycan-related omics data. These portals will be introduced with the aim to enable users to find their target information in the most efficient manner, in particular, in terms of the chemical structures of glycans and their functions.
Affiliation(s)
- Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, University of Geneva, Geneva CH-1227, Switzerland
- Computer Science Department & Section of Biology, University of Geneva, Geneva CH-1227, Switzerland
| | - Michael Tiemeyer
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, United States
| | - Raja Mazumder
- George Washington University, Washington, District of Columbia 20037, United States
| | - Kiyoko F. Aoki-Kinoshita
- Glycan and Life Systems Integration Center (GaLSIC), Soka University, Hachioji, Tokyo 192-8577, Japan
| |
|
15
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan S, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 2023; 51:D488-D508. [PMID: 36420884 PMCID: PMC9825554 DOI: 10.1093/nar/gkac1077] [Citation(s) in RCA: 360] [Impact Index Per Article: 180.0] [Received: 09/30/2022] [Revised: 10/17/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Paul A Craig
- School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, NY 14623, USA
| | - Gregg V Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sai Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - David S Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Brian P Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ben Webb
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Arthur Zalevsky
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
16
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank: Tools for visualizing and understanding biological macromolecules in 3D. Protein Sci 2022; 31:e4482. [PMID: 36281733 PMCID: PMC9667899 DOI: 10.1002/pro.4482] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 07/26/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 12/14/2022]
Abstract
Now in its 52nd year of continuous operations, the Protein Data Bank (PDB) is the premiere open-access global archive housing three-dimensional (3D) biomolecular structure data. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) partnership. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) is funded by the National Science Foundation, National Institutes of Health, and US Department of Energy and serves as the US data center for the wwPDB. RCSB PDB is also responsible for the security of PDB data in its role as wwPDB-designated Archive Keeper. Every year, RCSB PDB serves tens of thousands of depositors of 3D macromolecular structure data (coming from macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction). The RCSB PDB research-focused web portal (RCSB.org) makes PDB data available at no charge and without usage restrictions to many millions of PDB data consumers around the world. The RCSB PDB training, outreach, and education web portal (PDB101.RCSB.org) serves nearly 700 K educators, students, and members of the public worldwide. This invited Tools Issue contribution describes how RCSB PDB (i) is organized; (ii) works with wwPDB partners to process new depositions; (iii) serves as the wwPDB-designated Archive Keeper; (iv) enables exploration and 3D visualization of PDB data via RCSB.org; and (v) supports training, outreach, and education via PDB101.RCSB.org. New tools and features at RCSB.org are presented using examples drawn from high-resolution structural studies of proteins relevant to treatment of human cancers by targeting immune checkpoints.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Cancer Institute of New Jersey, Rutgers, The State University of New JerseyNew BrunswickNew JerseyUSA
- Research Collaboratory for Structural Bioinformatics Protein Data BankSan Diego Supercomputer Center, University of CaliforniaLa JollaCaliforniaUSA
- Department of Chemistry and Chemical Biology, RutgersThe State University of New JerseyPiscatawayNew JerseyUSA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data BankSan Diego Supercomputer Center, University of CaliforniaLa JollaCaliforniaUSA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data BankSan Diego Supercomputer Center, University of CaliforniaLa JollaCaliforniaUSA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Paul A. Craig
- School of Chemistry and Materials ScienceRochester Institute of TechnologyRochesterNew YorkUSA
| | - Gregg V. Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data BankSan Diego Supercomputer Center, University of CaliforniaLa JollaCaliforniaUSA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New JerseyPiscatawayNew JerseyUSA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Sai J. Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Benjamin Webb
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Arthur Zalevsky
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| |
|
17
|
Shao C, Bittrich S, Wang S, Burley SK. Assessing PDB macromolecular crystal structure confidence at the individual amino acid residue level. Structure 2022; 30:1385-1394.e3. [PMID: 36049478 PMCID: PMC9547844 DOI: 10.1016/j.str.2022.08.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/16/2022] [Revised: 06/24/2022] [Accepted: 08/05/2022] [Indexed: 11/22/2022]
Abstract
Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT, predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Sijian Wang
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Statistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
18
|
Burley SK, Berman HM, Duarte JM, Feng Z, Flatt JW, Hudson BP, Lowe R, Peisach E, Piehl DW, Rose Y, Sali A, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Young JY, Zardecki C. Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students. Biomolecules 2022; 12:1425. [PMID: 36291635 PMCID: PMC9599165 DOI: 10.3390/biom12101425] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/30/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 11/18/2022]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
19
|
Macnar JM, Brzezinski D, Chruszcz M, Gront D. Analysis of protein structures containing HEPES and MES molecules. Protein Sci 2022. [PMCID: PMC9601878 DOI: 10.1002/pro.4415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
X‐ray crystallography is the main experimental method behind ligand–macromolecule complexes found in the Protein Data Bank (PDB). Applying bioinformatics methods to such structural data can fuel drug discovery, albeit under the condition that the information is correct. Regrettably, a small number of structures in the PDB are of suboptimal quality due to incorrectly identified and modeled ligands in protein–ligand complexes. In this paper, we combine a theoretical‐graph approach, nuclear density estimates, bioinformatics methods, and prior chemical knowledge to analyze two non‐physiological ligands, HEPES and MES, that are frequent components of crystallization and purification buffers. Our analysis includes quantum mechanics calculations and Cambridge Structural Database (CSD) queries to define the ideal conformation of these ligands, geometry analysis of PDB deposits regarding several quality factors, and a search for homologous structures to identify other small molecules that could bind in place of the parasitic ligand. Our results highlight the need for careful refinement of macromolecule–ligand complexes and better validation tools that integrate results from all relevant resources. PDB Code(s): 3K4L, 3PYI, 5T6L, 6BB0, 1PJX, 3O4P, 6WCF, 3DKE, 3E10, 6G38, 4E8R, 4Z91, 3E9F, 1MOS, 1MOQ, 2ESB, 1VHR, 4P66 and 6NNI.
Affiliation(s)
- Joanna Magdalena Macnar
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- College of Inter‐Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Warsaw, Poland
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
| | - Dariusz Brzezinski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Maksymilian Chruszcz
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina, USA
| | - Dominik Gront
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
| |
|
20
|
Exploring protein symmetry at the RCSB Protein Data Bank. Emerg Top Life Sci 2022; 6:231-243. [PMID: 35801924 PMCID: PMC9472815 DOI: 10.1042/etls20210267] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2022] [Revised: 06/15/2022] [Accepted: 06/20/2022] [Indexed: 11/17/2022]
Abstract
The symmetry of biological molecules has fascinated structural biologists ever since the structure of hemoglobin was determined. The Protein Data Bank (PDB) archive is the central global archive of three-dimensional (3D), atomic-level structures of biomolecules, providing open access to the results of structural biology research with no limitations on usage. Roughly 40% of the structures in the archive exhibit some type of symmetry, including formal global symmetry, local symmetry, or pseudosymmetry. The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (founding member of the Worldwide Protein Data Bank partnership that jointly manages, curates, and disseminates the archive) provides a variety of tools to assist users interested in exploring the symmetry of biological macromolecules. These tools include multiple modalities for searching and browsing the archive, turnkey methods for biomolecular visualization, documentation, and outreach materials for exploring functional biomolecular symmetry.
|
21
|
Rao RM, Dauchez M, Baud S. How molecular modelling can better broaden the understanding of glycosylations. Curr Opin Struct Biol 2022; 75:102393. [PMID: 35679802 DOI: 10.1016/j.sbi.2022.102393] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/24/2021] [Revised: 03/31/2022] [Accepted: 04/18/2022] [Indexed: 11/03/2022]
Abstract
Glycosylations are among the most ubiquitous post-translational modifications (PTMs) in proteins, and the effects of their perturbations are seen in various diseases such as cancers, diabetes and arthritis, to name a few. Yet they remain one of the most enigmatic aspects of protein structure and function. On the other hand, molecular modelling techniques have been rapidly bridging this knowledge gap over the last decade. In this review, we discuss how these techniques have proven to be indispensable for a better understanding of the role of glycosylations in glycoprotein structure and function.
Affiliation(s)
- Rajas M Rao
- Université de Reims Champagne Ardenne, CNRS UMR 7369, MEDyC, Reims, 51687, France
| | - Manuel Dauchez
- Université de Reims Champagne Ardenne, CNRS UMR 7369, MEDyC, Reims, 51687, France.
| | - Stéphanie Baud
- Université de Reims Champagne Ardenne, CNRS UMR 7369, MEDyC, Reims, 51687, France
| |
|
22
|
Scherbinina SI, Frank M, Toukach PV. Carbohydrate structure database (CSDB) oligosaccharide conformation tool. Glycobiology 2022; 32:460-468. [PMID: 35275211 DOI: 10.1093/glycob/cwac011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/11/2021] [Revised: 02/17/2022] [Accepted: 03/04/2022] [Indexed: 11/13/2022]
Abstract
Population analysis in terms of glycosidic torsion angles is frequently used to reveal preferred conformers of glycans. However, due to high structural diversity and flexibility of carbohydrates, conformational characterization of complex glycans can be a challenging task. Herein we present a conformation module of oligosaccharide fragments occurring in natural glycan structures developed on the platform of the Carbohydrate Structure Database (CSDB). Currently, this module deposits free energy surface and conformer abundance maps plotted as a function of glycosidic torsions for 194 inter-residue bonds. Data are automatically and continuously derived from explicit-solvent molecular dynamics (MD) simulations. The module was also supplemented with high-temperature MD data of saccharides (2403 maps) provided by GlycoMapsDB (hosted by GLYCOSCIENCES.de project). Conformational data defined by up to four torsional degrees of freedom can be freely explored using a web interface of the module available at http://csdb.glycoscience.ru/database/core/search_conf.html.
Affiliation(s)
- S I Scherbinina
- Higher Chemical College, D. Mendeleev University of Chemical Technology of Russia, Miusskaya Square 9, 125047 Moscow, Russia
| | - M Frank
- Biognos AB, Box 8963, 40274 Göteborg, Sweden
| | - P V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Science, Leninsky prospect 47, 119991 Moscow, Russia
| |
|
23
|
Radusky LG, Serrano L. pyFoldX: enabling biomolecular analysis and engineering along structural ensembles. Bioinformatics 2022; 38:2353-2355. [PMID: 35176149 PMCID: PMC9004634 DOI: 10.1093/bioinformatics/btac072] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/21/2021] [Revised: 12/19/2021] [Accepted: 02/09/2022] [Indexed: 02/03/2023]
Abstract
SUMMARY: Recent years have seen an increase in the number of structures available, not only for new proteins but also for the same protein crystallized with different molecules and proteins. While protein design software has proven to be successful in designing and modifying proteins, it can also be overly sensitive to small conformational differences between structures of the same protein. To cope with this, we introduce here pyFoldX, a Python library that allows the integrative analysis of structures of the same protein using FoldX, an established forcefield and modelling software. The library offers new functionalities for handling different structures of the same protein, an improved molecular parametrization module and an easy integration with the data analysis ecosystem of the Python programming language. AVAILABILITY AND IMPLEMENTATION: pyFoldX relies on the FoldX software for energy calculations and modelling, which can be downloaded upon registration at http://foldxsuite.crg.eu/ and its licence is free of charge for academics. The pyFoldX library is open-source. Full details on installation, tutorials covering the library functionality and the scripts used to generate the data and figures presented in this paper are available at https://github.com/leandroradusky/pyFoldX. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leandro G Radusky
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain
| | | |
|
24
|
Simplified quality assessment for small-molecule ligands in the Protein Data Bank. Structure 2022; 30:252-262.e4. [PMID: 35026162 PMCID: PMC8849442 DOI: 10.1016/j.str.2021.10.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/14/2021] [Revised: 09/14/2021] [Accepted: 10/06/2021] [Indexed: 02/05/2023]
Abstract
More than 70% of the experimentally determined macromolecular structures in the Protein Data Bank (PDB) contain small-molecule ligands. Quality indicators of ∼643,000 ligands present in ∼106,000 PDB X-ray crystal structures have been analyzed. Ligand quality varies greatly with regard to goodness of fit between ligand structure and experimental data, deviations in bond lengths and angles from known chemical structures, and inappropriate interatomic clashes between the ligand and its surroundings. Based on principal component analysis, correlated quality indicators of ligand structure have been aggregated into two largely orthogonal composite indicators measuring goodness of fit to experimental data and deviation from ideal chemical structure. Ranking of the composite quality indicators across the PDB archive enabled construction of a uniformly distributed composite ranking score. This score is implemented at RCSB.org to compare chemically identical ligands in distinct PDB structures with easy-to-interpret two-dimensional ligand quality plots, allowing PDB users to quickly assess ligand structure quality and select the best exemplars.
|
25
|
Joosten RP, Nicholls RA, Agirre J. Towards Consistency in Geometry Restraints for Carbohydrates in the Pyranose form: Modern Dictionary Generators Reviewed. Curr Med Chem 2022; 29:1193-1207. [PMID: 34477506 PMCID: PMC7612510 DOI: 10.2174/0929867328666210902140754] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/23/2021] [Revised: 08/03/2021] [Accepted: 08/07/2021] [Indexed: 11/23/2022]
Abstract
Macromolecular restrained refinement is nowadays the most used method for improving the agreement between an atomic structural model and experimental data. Restraint dictionaries, a key tool behind the success of the method, allow fine-tuning geometric properties such as distances and angles between atoms beyond simplistic expectations. Dictionary generators can provide restraint target estimates derived from different sources, from fully theoretical to experimental and any combination in between. Carbohydrates are stereochemically complex biomolecules and, in their pyranose form, have clear conformational preferences. As such, they pose unique problems to dictionary generators and, in the course of this study, require special attention from software developers. Functional differences between restraint generators will be discussed, as well as the process of achieving consistent results with different software designs. The study will conclude with a set of practical considerations, as well as recommendations for the generation of new restraint dictionaries, using the improved software alternatives discussed.
Affiliation(s)
| | | | - Jon Agirre
- Address correspondence to this author at the York Structural Biology Laboratory, Department of Chemistry, University of York, YO10 5DD, England; Tel: +44 (0) 1904 32 8252.
| |
|
26
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Goodsell DS, Ghosh S, Kramer Green R, Guranovic V, Henry J, Hudson BP, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Whetstone S, Young JY, Zardecki C. RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D. Protein Sci 2022; 31:187-208. [PMID: 34676613 PMCID: PMC8740825 DOI: 10.1002/pro.4213] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Received: 07/29/2021] [Revised: 10/12/2021] [Accepted: 10/12/2021] [Indexed: 01/03/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB-designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three-dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research-focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Gregg V. Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Sai J. Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
- Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
27
Aoki-Kinoshita KF. Functions of Glycosylation and Related Web Resources for Its Prediction. Methods Mol Biol 2022; 2499:135-144. [PMID: 35696078 DOI: 10.1007/978-1-0716-2317-6_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Glycosylation involves the attachment of carbohydrate sugar chains, or glycans, onto an amino acid residue of a protein. These glycans are often branched structures and serve to modulate the function of proteins. Glycans are synthesized through a complex process of enzymatic reactions that occur in the Golgi apparatus in mammalian systems. Because there is currently no sequencer for glycans, technologies such as mass spectrometry are used to characterize glycans in a biological sample to ascertain its glycome. This is a tedious process that requires high levels of expertise and equipment. Thus, the enzymes that work on glycans, called glycogenes or glycoenzymes, have been studied to better understand glycan function. With the development of glycan-related databases and a glycan repository, bioinformatics approaches have attempted to predict the glycosylation pathway and the glycosylation sites on proteins. This chapter introduces these methods and related Web resources for understanding glycan function.
28
Shao C, Feng Z, Westbrook JD, Peisach E, Berrisford J, Ikegawa Y, Kurisu G, Velankar S, Burley SK, Young JY. Modernized uniform representation of carbohydrate molecules in the Protein Data Bank. Glycobiology 2021; 31:1204-1218. [PMID: 33978738 PMCID: PMC8457362 DOI: 10.1093/glycob/cwab039] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/01/2021] [Revised: 04/05/2021] [Accepted: 04/25/2021] [Indexed: 12/12/2022] Open
Abstract
Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability-Accessibility-Interoperability-Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- John Berrisford
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Yasuyo Ikegawa
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Genji Kurisu
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, San Diego, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
29
'All That Glitters Is Not Gold': High-Resolution Crystal Structures of Ligand-Protein Complexes Need Not Always Represent Confident Binding Poses. Int J Mol Sci 2021; 22:ijms22136830. [PMID: 34202053 PMCID: PMC8268033 DOI: 10.3390/ijms22136830] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/19/2021] [Revised: 05/24/2021] [Accepted: 05/24/2021] [Indexed: 01/09/2023] Open
Abstract
Our understanding of the structure–function relationships of biomolecules, and the application of that understanding to drug discovery programs, depends substantially on the availability of structural information on ligand–protein complexes. However, the correct interpretation of the electron density of a small molecule bound to a crystal structure of a macromolecule is not trivial. Our analysis involving quality assessment of ~0.28 million small molecule–protein binding site pairs derived from crystal structures corresponding to ~66,000 PDB entries indicates that the majority (65%) of the pairs might need little (54%) or no (11%) attention. Out of the remaining 35% of pairs that need attention, 11% of the pairs (including structures with high/moderate resolution) pose serious concerns. Unfortunately, most users of crystal structures lack the training to evaluate the quality of a crystal structure against its experimental data and, in general, rely on the resolution as a ‘gold standard’ quality metric. Our work aims to make non-crystallographers aware that resolution, which is a global quality metric, need not be an accurate indicator of local structural quality. In this article, we demonstrate the use of several freely available tools that quantify local structural quality and are easy to use from a non-crystallographer’s perspective. We further propose a few solutions for consideration by the scientific community to promote quality research in structural biology and applied areas.
30
Burley SK, Berman HM. Open-access data: A cornerstone for artificial intelligence approaches to protein structure prediction. Structure 2021; 29:515-520. [PMID: 33984281 PMCID: PMC8178243 DOI: 10.1016/j.str.2021.04.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/16/2021] [Revised: 04/08/2021] [Accepted: 04/23/2021] [Indexed: 12/28/2022]
Abstract
The Protein Data Bank (PDB) was established in 1971 to archive three-dimensional (3D) structures of biological macromolecules as a public good. Fifty years later, the PDB is providing millions of data consumers around the world with open access to more than 175,000 experimentally determined structures of proteins and nucleic acids (DNA, RNA) and their complexes with one another and small-molecule ligands. PDB data users are working, teaching, and learning in fundamental biology, biomedicine, bioengineering, biotechnology, and energy sciences. They also represent the fields of agriculture, chemistry, physics and materials science, mathematics, statistics, computer science, and zoology, and even the social sciences. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches and deep or machine learning methods.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA.
- Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; The Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA 90089, USA.
31
Burley SK. Impact of structural biologists and the Protein Data Bank on small-molecule drug discovery and development. J Biol Chem 2021; 296:100559. [PMID: 33744282 PMCID: PMC8059052 DOI: 10.1016/j.jbc.2021.100559] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/02/2020] [Revised: 02/02/2021] [Accepted: 03/16/2021] [Indexed: 12/12/2022] Open
Abstract
The Protein Data Bank (PDB) is an international core data resource central to fundamental biology, biomedicine, bioenergy, and biotechnology/bioengineering. Now celebrating its 50th anniversary, the PDB houses >175,000 experimentally determined atomic structures of proteins, nucleic acids, and their complexes with one another and small molecules and drugs. The importance of three-dimensional (3D) biostructure information for research and education obtains from the intimate link between molecular form and function evident throughout biology. Among the most prolific consumers of PDB data are biomedical researchers, who rely on the open access resource as the authoritative source of well-validated, expertly curated biostructures. This review recounts how the PDB grew from just seven protein structures to contain more than 49,000 structures of human proteins that have proven critical for understanding their roles in human health and disease. It then describes how these structures are used in academe and industry to validate drug targets, assess target druggability, characterize how tool compounds and other small-molecules bind to drug targets, guide medicinal chemistry optimization of binding affinity and selectivity, and overcome challenges during preclinical drug development. Three case studies drawn from oncology exemplify how structural biologists and open access to PDB structures impacted recent regulatory approvals of antineoplastic drugs.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA.