1. Choudhary P, Feng Z, Berrisford J, Chao H, Ikegawa Y, Peisach E, Piehl DW, Smith J, Tanweer A, Varadi M, Westbrook JD, Young JY, Patwardhan A, Morris KL, Hoch JC, Kurisu G, Velankar S, Burley SK. PDB NextGen Archive: centralizing access to integrated annotations and enriched structural information by the Worldwide Protein Data Bank. Database (Oxford) 2024; 2024:baae041. [PMID: 38803272 DOI: 10.1093/database/baae041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/29/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
The Protein Data Bank (PDB) is the global repository for public-domain experimentally determined 3D biomolecular structural information. The archival nature of the PDB presents certain challenges pertaining to updating or adding associated annotations from trusted external biodata resources. While each Worldwide PDB (wwPDB) partner has made best efforts to provide up-to-date external annotations, accessing and integrating information from disparate wwPDB data centers can be an involved process. To address this issue, the wwPDB has established the PDB Next Generation (or NextGen) Archive, developed to centralize and streamline access to enriched structural annotations from wwPDB partners and trusted external sources. At present, the NextGen Archive provides mappings between experimentally determined 3D structures of proteins and UniProt amino acid sequences, domain annotations from Pfam, SCOP2 and CATH databases and intra-molecular connectivity information. Since launch, the PDB NextGen Archive has seen substantial user engagement with over 3.5 million data file downloads, ensuring researchers have access to accurate, up-to-date and easily accessible structural annotations. Database URL: http://www.wwpdb.org/ftp/pdb-nextgen-archive-site.
Affiliation(s)
- Preeti Choudhary
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- John Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Yasuyo Ikegawa
- Protein Data Bank Japan, Protein Research Foundation, 3-2, Yamadaoka, Minoh, Osaka 562-8686, Japan
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- James Smith
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Ahsan Tanweer
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Ardan Patwardhan
- The Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Kyle L Morris
- The Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, 263 Farmington Avenue, Farmington, CT 06030-3305, USA
- Genji Kurisu
- Protein Data Bank Japan, Protein Research Foundation, 3-2, Yamadaoka, Minoh, Osaka 562-8686, Japan
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Rd., Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, 195 Little Albany St., New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 123 Bevier Rd., Piscataway, NJ 08854, USA
2. Vijayanathan M, Vadakkepat AK, Mahendran KR, Sharaf A, Frandsen KEH, Bandyopadhyay D, Pillai MR, Soniya EV. Structural and mechanistic insights into Quinolone Synthase to address its functional promiscuity. Commun Biol 2024; 7:566. [PMID: 38745065 PMCID: PMC11093982 DOI: 10.1038/s42003-024-06152-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Accepted: 04/07/2024] [Indexed: 05/16/2024] Open
Abstract
Quinolone synthase from Aegle marmelos (AmQNS) is a type III polyketide synthase that yields therapeutically effective quinolone and acridone compounds. Addressing the structural and molecular underpinnings of AmQNS and its substrate interaction in terms of its high selectivity and specificity can aid in the development of numerous novel compounds. This paper presents a high-resolution AmQNS crystal structure and explains its mechanistic role in synthetic selectivity. Additionally, we provide a model framework to comprehend structural constraints on ketide insertion and postulate that AmQNS's steric and electrostatic selectivity plays a role in its ability to bind to various core substrates, resulting in its synthetic diversity. AmQNS prefers quinolone synthesis and can accommodate large substrates because of its wide active site entrance. However, our research suggests that acridone is exclusively synthesized in the presence of high malonyl-CoA concentrations. Potential implications of functionally relevant residue mutations were also investigated, which will assist in harnessing the benefits of mutations for targeted polyketide production. The pharmaceutical industry stands to gain from these findings as they expand the pool of potential drug candidates, and these methodologies can also be applied to additional promising enzymes.
Affiliation(s)
- Mallika Vijayanathan
- Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
- Department of Plant and Environment Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark
- Abhinav Koyamangalath Vadakkepat
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Department of Molecular and Cell Biology, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 7HB, UK
- Kozhinjampara R Mahendran
- Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
- Abdoallah Sharaf
- SequAna Core Facility, Department of Biology, University of Konstanz, Konstanz, Germany
- Genetic Department, Faculty of Agriculture, Ain Shams University, Cairo, 11241, Egypt
- Kristian E H Frandsen
- Department of Plant and Environment Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark
- Debashree Bandyopadhyay
- Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- M Radhakrishna Pillai
- Cancer Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
- Eppurath Vasudevan Soniya
- Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India.
3. Burley SK, Piehl DW, Vallat B, Zardecki C. RCSB Protein Data Bank: supporting research and education worldwide through explorations of experimentally determined and computationally predicted atomic level 3D biostructures. IUCrJ 2024; 11:279-286. [PMID: 38597878 PMCID: PMC11067742 DOI: 10.1107/s2052252524002604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine in 1971 with seven X-ray crystal structures of proteins. Today, the PDB houses >210 000 experimentally determined, atomic level, 3D structures of proteins and nucleic acids as well as their complexes with one another and small molecules (e.g. approved drugs, enzyme cofactors). These data provide insights into fundamental biology, biomedicine, bioenergy and biotechnology. They proved particularly important for understanding the SARS-CoV-2 global pandemic. The US-funded Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and other members of the Worldwide Protein Data Bank (wwPDB) partnership jointly manage the PDB archive and support >60 000 'data depositors' (structural biologists) around the world. wwPDB ensures the quality and integrity of the data in the ever-expanding PDB archive and supports global open access without limitations on data usage. The RCSB PDB research-focused web portal at https://www.rcsb.org/ (RCSB.org) supports millions of users worldwide, representing a broad range of expertise and interests. In addition to retrieving 3D structure data, PDB 'data consumers' access comparative data and external annotations, such as information about disease-causing point mutations and genetic variations. RCSB.org also provides access to >1 000 000 computed structure models (CSMs) generated using artificial intelligence/machine-learning methods. To avoid doubt, the provenance and reliability of experimentally determined PDB structures and CSMs are identified. Related training materials are available to support users in their RCSB.org explorations.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
4. Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint validation of biomolecular structures determined by NMR in the Protein Data Bank. Structure 2024:S0969-2126(24)00050-9. [PMID: 38490206 DOI: 10.1016/j.str.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/13/2024] [Accepted: 02/19/2024] [Indexed: 03/17/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NEF and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB restraint violation report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA.
- Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK
- Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100 Burjassot, Valencia, Spain
- Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
- Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
- Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, La Jolla, CA, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
- Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK.
- Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
5. Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint Validation of Biomolecular Structures Determined by NMR in the Protein Data Bank. bioRxiv 2024:2024.01.15.575520. [PMID: 38328042 PMCID: PMC10849500 DOI: 10.1101/2024.01.15.575520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NMR Exchange Format (NEF) and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB Restraint Violation Report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
- Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100-Burjassot, Valencia, Spain
- Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Theresa A Ramelot
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
- Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
- Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Gaetano T Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
- Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
6. Turner J, Abbott S, Fonseca N, Pye R, Carrijo L, Duraisamy AK, Salih O, Wang Z, Kleywegt GJ, Morris KL, Patwardhan A, Burley SK, Crichlow G, Feng Z, Flatt JW, Ghosh S, Hudson BP, Lawson CL, Liang Y, Peisach E, Persikova I, Sekharan M, Shao C, Young J, Velankar S, Armstrong D, Bage M, Bueno WM, Evans G, Gaborova R, Ganguly S, Gupta D, Harrus D, Tanweer A, Bansal M, Rangannan V, Kurisu G, Cho H, Ikegawa Y, Kengaku Y, Kim JY, Niwa S, Sato J, Takuwa A, Yu J, Hoch JC, Baskaran K, Xu W, Zhang W, Ma X. EMDB-the Electron Microscopy Data Bank. Nucleic Acids Res 2024; 52:D456-D465. [PMID: 37994703 PMCID: PMC10767987 DOI: 10.1093/nar/gkad1019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 10/18/2023] [Accepted: 10/20/2023] [Indexed: 11/24/2023] Open
Abstract
The Electron Microscopy Data Bank (EMDB) is the global public archive of three-dimensional electron microscopy (3DEM) maps of biological specimens derived from transmission electron microscopy experiments. As of 2021, EMDB is managed by the Worldwide Protein Data Bank consortium (wwPDB; wwpdb.org) as a wwPDB Core Archive, and the EMDB team is a core member of the consortium. Today, EMDB houses over 30 000 entries with maps containing macromolecules, complexes, viruses, organelles and cells. Herein, we provide an overview of the rapidly growing EMDB archive, including its current holdings, recent updates, and future plans.
7. Xu W, Velankar S, Patwardhan A, Hoch JC, Burley SK, Kurisu G. Announcing the launch of Protein Data Bank China as an Associate Member of the Worldwide Protein Data Bank Partnership. Acta Crystallogr D Struct Biol 2023; 79:792-795. [PMID: 37561405 PMCID: PMC10478634 DOI: 10.1107/s2059798323006381] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/01/2023] [Accepted: 07/21/2023] [Indexed: 08/11/2023] Open
Abstract
The Protein Data Bank (PDB) is the single global archive of atomic-level, three-dimensional structures of biological macromolecules experimentally determined by macromolecular crystallography, nuclear magnetic resonance spectroscopy or three-dimensional cryo-electron microscopy. The PDB is growing continuously, with a recent rapid increase in new structure depositions from Asia. In 2022, the Worldwide Protein Data Bank (wwPDB; https://www.wwpdb.org/) partners welcomed Protein Data Bank China (PDBc; https://www.pdbc.org.cn) to the organization as an Associate Member. PDBc is based in the National Facility for Protein Science in Shanghai which is associated with the Shanghai Advanced Research Institute of Chinese Academy of Sciences, the Shanghai Institute for Advanced Immunochemical Studies and the iHuman Institute of ShanghaiTech University. This letter describes the history of the wwPDB, recently established mechanisms for adding new wwPDB data centers and the processes developed to bring PDBc into the partnership.
Affiliation(s)
- Wenqing Xu
- Protein Data Bank China, ShanghaiTech University and National Facility for Protein Science in Shanghai, Shanghai, People’s Republic of China
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Ardan Patwardhan
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Jeffrey C. Hoch
- Biological Magnetic Resonance Data Bank, UConn Health, Farmington, CT 06030-3305, USA
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
8. Choudhary P, Anyango S, Berrisford J, Tolchard J, Varadi M, Velankar S. Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data. Sci Data 2023; 10:204. [PMID: 37045837 PMCID: PMC10097656 DOI: 10.1038/s41597-023-02101-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/23/2023] [Indexed: 04/14/2023] Open
Abstract
More than 61,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. SIFTS incorporates residue-level annotations from many other biological resources. SIFTS data is available in various formats, such as XML, CSV and TSV, and is also accessible via the PDBe REST API, but it has always been maintained separately from the structure data (PDBx/mmCIF file) in the PDB archive. Here, we extended the wwPDB PDBx/mmCIF data dictionary with additional categories to accommodate SIFTS data and added the UniProtKB, Pfam, SCOP2, and CATH residue-level annotations directly into the PDBx/mmCIF files from the PDB archive. With the integrated UniProtKB annotations, these files now provide consistent numbering of residues in different PDB entries, allowing easy comparison of structure models. The extended dictionary yields a more consistent, standardised metadata description without altering the core PDB information. This development enables up-to-date cross-reference information at the residue level, resulting in better data interoperability and supporting improved data analysis and visualisation.
Grants
- BB/V004247/1, PI:Sameer Velankar RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
- DBI-2019297, PI: S.K. Burley National Science Foundation (NSF)
- DBI-2019297, PI: S.K. Burley NSF | National Science Board (NSB)
Affiliation(s)
- Preeti Choudhary
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| | - Stephen Anyango
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - John Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- AstraZeneca, Biomedical Campus, 1 Francis Crick Ave, Trumpington, Cambridge, CB2 0AA, UK
| | - James Tolchard
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Claude Bernard University, Villeurbanne, Lyon, 69100, France
| | - Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
9
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan S, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 2023; 51:D488-D508. [PMID: 36420884 PMCID: PMC9825554 DOI: 10.1093/nar/gkac1077] [Citation(s) in RCA: 119] [Impact Index Per Article: 119.0] [Received: 09/30/2022] [Revised: 10/17/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Paul A Craig
- School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, NY 14623, USA
| | - Gregg V Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sai Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - David S Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Brian P Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ben Webb
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Arthur Zalevsky
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
10
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data bank: Tools for visualizing and understanding biological macromolecules in 3D. Protein Sci 2022; 31:e4482. [PMID: 36281733 PMCID: PMC9667899 DOI: 10.1002/pro.4482] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 07/26/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 12/14/2022]
Abstract
Now in its 52nd year of continuous operations, the Protein Data Bank (PDB) is the premiere open-access global archive housing three-dimensional (3D) biomolecular structure data. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) partnership. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) is funded by the National Science Foundation, National Institutes of Health, and US Department of Energy and serves as the US data center for the wwPDB. RCSB PDB is also responsible for the security of PDB data in its role as wwPDB-designated Archive Keeper. Every year, RCSB PDB serves tens of thousands of depositors of 3D macromolecular structure data (coming from macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction). The RCSB PDB research-focused web portal (RCSB.org) makes PDB data available at no charge and without usage restrictions to many millions of PDB data consumers around the world. The RCSB PDB training, outreach, and education web portal (PDB101.RCSB.org) serves nearly 700 K educators, students, and members of the public worldwide. This invited Tools Issue contribution describes how RCSB PDB (i) is organized; (ii) works with wwPDB partners to process new depositions; (iii) serves as the wwPDB-designated Archive Keeper; (iv) enables exploration and 3D visualization of PDB data via RCSB.org; and (v) supports training, outreach, and education via PDB101.RCSB.org. New tools and features at RCSB.org are presented using examples drawn from high-resolution structural studies of proteins relevant to treatment of human cancers by targeting immune checkpoints.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Paul A. Craig
- School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, New York, USA
| | - Gregg V. Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Sai J. Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Benjamin Webb
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Arthur Zalevsky
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
|
11
|
Shao C, Bittrich S, Wang S, Burley SK. Assessing PDB macromolecular crystal structure confidence at the individual amino acid residue level. Structure 2022; 30:1385-1394.e3. [PMID: 36049478 PMCID: PMC9547844 DOI: 10.1016/j.str.2022.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/24/2022] [Accepted: 08/05/2022] [Indexed: 11/22/2022]
Abstract
Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT-predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Sijian Wang
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Statistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
12
|
Burley SK, Berman HM, Duarte JM, Feng Z, Flatt JW, Hudson BP, Lowe R, Peisach E, Piehl DW, Rose Y, Sali A, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Young JY, Zardecki C. Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students. Biomolecules 2022; 12:1425. [PMID: 36291635 PMCID: PMC9599165 DOI: 10.3390/biom12101425] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/30/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 11/18/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
13
|
Shpilker P, Freeman J, McKelvie H, Ashey J, Fonticella JM, Putnam H, Greenberg J, Cowen L, Couch A, Daniels NM. MEDFORD: A human- and machine-readable metadata markup language. Database (Oxford) 2022; 2022:6670690. [PMID: 35976727 PMCID: PMC9384832 DOI: 10.1093/database/baac065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Revised: 06/16/2022] [Accepted: 08/09/2022] [Indexed: 11/23/2022]
Abstract
Reproducibility of research is essential for science. However, in the way modern computational biology research is done, it is easy to lose track of small, but extremely critical, details. Key details, such as the specific version of software used or the iteration of a genome, can easily be lost in the shuffle or perhaps not noted at all. Much work is being done on the database and storage side of things, ensuring that there exists a space to store experiment-specific details, but current mechanisms for recording details are cumbersome for scientists to use. We propose a new metadata description language, named MEtaData Format for Open Reef Data (MEDFORD), in which scientists can record all details relevant to their research. Being human-readable, easily editable and templatable, MEDFORD serves as a collection point for all notes that a researcher could find relevant to their research, be it for internal use or for future replication. MEDFORD has been applied to coral research, documenting research from RNA-seq analyses to photo collections.
Affiliation(s)
- Polina Shpilker
- Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA
| | - John Freeman
- Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA
| | - Hailey McKelvie
- Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA
| | - Jill Ashey
- Department of Biological Sciences, University of Rhode Island, 120 Flagg Rd, 02881, RI, USA
| | | | - Hollie Putnam
- Department of Biological Sciences, University of Rhode Island, 120 Flagg Rd, 02881, RI, USA
| | - Jane Greenberg
- Metadata Research Center, College of Computing & Informatics, Drexel University, 3675 Market Street, 19104, PA, USA
| | - Lenore Cowen
- Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA
| | - Alva Couch
- Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA
| | - Noah M Daniels
- Department of Computer Science and Statistics, University of Rhode Island, 9 Greenhouse Rd, 02881, RI, USA
| |
|
14
|
Exploring protein symmetry at the RCSB Protein Data Bank. Emerg Top Life Sci 2022; 6:231-243. [PMID: 35801924 PMCID: PMC9472815 DOI: 10.1042/etls20210267] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/25/2022] [Revised: 06/15/2022] [Accepted: 06/20/2022] [Indexed: 11/17/2022]
Abstract
The symmetry of biological molecules has fascinated structural biologists ever since the structure of hemoglobin was determined. The Protein Data Bank (PDB) archive is the central global archive of three-dimensional (3D), atomic-level structures of biomolecules, providing open access to the results of structural biology research with no limitations on usage. Roughly 40% of the structures in the archive exhibit some type of symmetry, including formal global symmetry, local symmetry, or pseudosymmetry. The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (founding member of the Worldwide Protein Data Bank partnership that jointly manages, curates, and disseminates the archive) provides a variety of tools to assist users interested in exploring the symmetry of biological macromolecules. These tools include multiple modalities for searching and browsing the archive, turnkey methods for biomolecular visualization, documentation, and outreach materials for exploring functional biomolecular symmetry.
|
15
|
Simplified quality assessment for small-molecule ligands in the Protein Data Bank. Structure 2022; 30:252-262.e4. [PMID: 35026162 PMCID: PMC8849442 DOI: 10.1016/j.str.2021.10.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/14/2021] [Revised: 09/14/2021] [Accepted: 10/06/2021] [Indexed: 02/05/2023]
Abstract
More than 70% of the experimentally determined macromolecular structures in the Protein Data Bank (PDB) contain small-molecule ligands. Quality indicators of ∼643,000 ligands present in ∼106,000 PDB X-ray crystal structures have been analyzed. Ligand quality varies greatly with regard to goodness of fit between ligand structure and experimental data, deviations in bond lengths and angles from known chemical structures, and inappropriate interatomic clashes between the ligand and its surroundings. Based on principal component analysis, correlated quality indicators of ligand structure have been aggregated into two largely orthogonal composite indicators measuring goodness of fit to experimental data and deviation from ideal chemical structure. Ranking of the composite quality indicators across the PDB archive enabled construction of a uniformly distributed composite ranking score. This score is implemented at RCSB.org to compare chemically identical ligands in distinct PDB structures with easy-to-interpret two-dimensional ligand quality plots, allowing PDB users to quickly assess ligand structure quality and select the best exemplars.
|
16
|
Baskaran K, Craft DL, Eghbalnia HR, Gryk MR, Hoch JC, Maciejewski MW, Schuyler AD, Wedell JR, Wilburn CW. Merging NMR Data and Computation Facilitates Data-Centered Research. Front Mol Biosci 2022; 8:817175. [PMID: 35111815 PMCID: PMC8802229 DOI: 10.3389/fmolb.2021.817175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Accepted: 12/23/2021] [Indexed: 12/01/2022] Open
Abstract
The Biological Magnetic Resonance Data Bank (BMRB) has served the NMR structural biology community for 40 years, and has been instrumental in the development of many widely-used tools. It fosters the reuse of data resources in structural biology by embodying the FAIR data principles (Findable, Accessible, Inter-operable, and Re-usable). NMRbox is less than a decade old, but complements BMRB by providing NMR software and high-performance computing resources, facilitating the reuse of software resources. BMRB and NMRbox both facilitate reproducible research. NMRbox also fosters the development and deployment of complex meta-software. Combining BMRB and NMRbox helps speed and simplify workflows that utilize BMRB, and enables facile federation of BMRB with other data repositories. Utilization of BMRB and NMRbox in tandem will enable additional advances, such as machine learning, that are poised to become increasingly powerful.
Affiliation(s)
| | | | - Hamid R. Eghbalnia
- Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, United States
| | | | - Jeffrey C. Hoch
- Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, United States
| | | | | | | | | |
|
17
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Goodsell DS, Ghosh S, Kramer Green R, Guranovic V, Henry J, Hudson BP, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Whetstone S, Young JY, Zardecki C. RCSB Protein Data Bank: Celebrating 50 years of the PDB with new tools for understanding and visualizing biological macromolecules in 3D. Protein Sci 2022; 31:187-208. [PMID: 34676613 PMCID: PMC8740825 DOI: 10.1002/pro.4213] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 07/29/2021] [Revised: 10/12/2021] [Accepted: 10/12/2021] [Indexed: 01/03/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB-designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three-dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research-focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Gregg V. Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Sai J. Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| |
|
18
|
Zardecki C, Dutta S, Goodsell DS, Lowe R, Voigt M, Burley SK. PDB-101: Educational resources supporting molecular explorations through biology and medicine. Protein Sci 2022; 31:129-140. [PMID: 34601771 PMCID: PMC8740840 DOI: 10.1002/pro.4200] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/30/2021] [Revised: 09/24/2021] [Accepted: 09/28/2021] [Indexed: 01/03/2023]
Abstract
The Protein Data Bank (PDB) archive is a rich source of information in the form of atomic-level three-dimensional (3D) structures of biomolecules experimentally determined using macromolecular crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy (3DEM). Originally established in 1971 as a resource for protein crystallographers to freely exchange data, today PDB data drive research and education across scientific disciplines. In 2011, the online portal PDB-101 was launched to support teachers, students, and the general public in PDB archive exploration (pdb101.rcsb.org). Maintained by the Research Collaboratory for Structural Bioinformatics PDB, PDB-101 aims to help train the next generation of PDB users and to promote the overall importance of structural biology and protein science to nonexperts. Regularly published features include the highly popular Molecule of the Month series, 3D model activities, molecular animation videos, and educational curricula. Materials are organized into various categories (Health and Disease, Molecules of Life, Biotech and Nanotech, and Structures and Structure Determination) and searchable by keyword. A biennial health focus frames new resource creation and provides topics for annual video challenges for high school students. Web analytics document that PDB-101 materials relating to fundamental topics (e.g., hemoglobin, catalase) are highly accessed year-on-year. In addition, PDB-101 materials created in response to topical health matters (e.g., Zika, measles, coronavirus) are well received. PDB-101 shows how learning about the diverse shapes and functions of PDB structures promotes understanding of all aspects of biology, from the central dogma of biology to health and disease to biological energy.
Affiliation(s)
- Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data BankRutgers, The State University of New JerseyPiscatawayNew JerseyUSA,Institute for Quantitative BiomedicineRutgers, The State University of New JerseyPiscatawayNew JerseyUSA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data BankRutgers, The State University of New JerseyPiscatawayNew JerseyUSA,Institute for Quantitative BiomedicineRutgers, The State University of New JerseyPiscatawayNew JerseyUSA,Rutgers Cancer Institute of New JerseyRutgers, The State University of New JerseyNew BrunswickNew JerseyUSA
| | - David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| |
|
19
|
Shao C, Feng Z, Westbrook JD, Peisach E, Berrisford J, Ikegawa Y, Kurisu G, Velankar S, Burley SK, Young JY. Modernized uniform representation of carbohydrate molecules in the Protein Data Bank. Glycobiology 2021; 31:1204-1218. [PMID: 33978738 PMCID: PMC8457362 DOI: 10.1093/glycob/cwab039] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/01/2021] [Revised: 04/05/2021] [Accepted: 04/25/2021] [Indexed: 12/12/2022] Open
Abstract
Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability-Accessibility-Interoperability-Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John Berrisford
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Yasuyo Ikegawa
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
| | - Genji Kurisu
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
| | - Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
20
|
Conformational changes of α-helical peptides with different hydrophobic residues induced by metal-ion binding. Biophys Chem 2021; 277:106661. [PMID: 34388679 DOI: 10.1016/j.bpc.2021.106661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Revised: 07/08/2021] [Accepted: 07/28/2021] [Indexed: 11/22/2022]
Abstract
We designed peptides that form helix bundle structures with a stable hydrophobic core upon binding of metal ions to His residues, in order to analyze how Ala, Val, Ile, and Leu residues located in the hydrophobic core together with His affect the conformational changes of the corresponding peptides, designated HA, HV, HI, and HL, respectively. Circular dichroism measurements showed that HV and HI changed from random coil to helix bundle structures upon Zn²⁺ binding, similar to that observed for HA, while HL existed in the helix bundle structure even in the absence of Zn²⁺. Electron spin resonance measurements showed that Cu²⁺ coordination of HI and HL was quite different from that of HA and HV, indicating that HA and HV fluctuated to a greater extent in solution, even though their apparent α-helical contents were similar to those of HI and HL. This was also supported by the results of thermal stability analyses. The change in structural fluctuation for each peptide upon Zn²⁺ binding was evaluated based on binding thermodynamics using isothermal titration calorimetry. The structural flexibility in the metal-ion-bound state was found to be in the order HA > HV > HI, and that in the metal-ion-unbound state was found to be greater for HI than for HL.
|
21
|
|
22
|
Kadir SR, Lilja A, Gunn N, Strong C, Hughes RT, Bailey BJ, Rae J, Parton RG, McGhee J. Nanoscape, a data-driven 3D real-time interactive virtual cell environment. eLife 2021; 10:64047. [PMID: 34191720 PMCID: PMC8245131 DOI: 10.7554/elife.64047] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/15/2020] [Accepted: 06/04/2021] [Indexed: 12/15/2022] Open
Abstract
Our understanding of cellular and structural biology has reached unprecedented levels of detail, and computer visualisation techniques can be used to create three-dimensional (3D) representations of cells and their environment that are useful in both teaching and research. However, extracting and integrating the relevant scientific data, and then presenting them in an effective way, can pose substantial computational and aesthetic challenges. Here we report how computer artists, experts in computer graphics and cell biologists have collaborated to produce a tool called Nanoscape that allows users to explore and interact with 3D representations of cells and their environment that are both scientifically accurate and visually appealing. We believe that using Nanoscape as an immersive learning application will lead to an improved understanding of the complexities of cellular scales, densities and interactions compared with traditional learning modalities.
Affiliation(s)
- Shereen R Kadir
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - Andrew Lilja
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - Nick Gunn
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - Campbell Strong
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - Rowan T Hughes
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - Benjamin J Bailey
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| | - James Rae
- Institute for Molecular Bioscience, ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and Centre for Microscopy and Microanalysis, University of Queensland, Brisbane, Australia
| | - Robert G Parton
- Institute for Molecular Bioscience, ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and Centre for Microscopy and Microanalysis, University of Queensland, Brisbane, Australia
| | - John McGhee
- 3D Visualisation Aesthetics Lab, School of Art and Design, and the ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of New South Wales, Sydney, Australia
| |
|
23
|
Burley SK, Berman HM. Open-access data: A cornerstone for artificial intelligence approaches to protein structure prediction. Structure 2021; 29:515-520. [PMID: 33984281 DOI: 10.1016/j.str.2021.04.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/16/2021] [Revised: 04/08/2021] [Accepted: 04/23/2021] [Indexed: 12/28/2022]
Abstract
The Protein Data Bank (PDB) was established in 1971 to archive three-dimensional (3D) structures of biological macromolecules as a public good. Fifty years later, the PDB is providing millions of data consumers around the world with open access to more than 175,000 experimentally determined structures of proteins and nucleic acids (DNA, RNA) and their complexes with one another and small-molecule ligands. PDB data users are working, teaching, and learning in fundamental biology, biomedicine, bioengineering, biotechnology, and energy sciences. They also represent the fields of agriculture, chemistry, physics and materials science, mathematics, statistics, computer science, and zoology, and even the social sciences. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches and deep or machine learning methods.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA.
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; The Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA 90089, USA.
| |
|
24
|
Breslauer KJ. The shaping of a molecular linguist: How a career studying DNA energetics revealed the language of molecular communication. J Biol Chem 2021; 296:100522. [PMID: 34237886 PMCID: PMC8058554 DOI: 10.1016/j.jbc.2021.100522] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/04/2021] [Accepted: 03/04/2021] [Indexed: 01/31/2023] Open
Abstract
My personal and professional journeys have been far from predictable based on my early childhood. Owing to a range of serendipitous influences, I miraculously transitioned from a rebellious, apathetic teenage street urchin who did poorly in school to a highly motivated, disciplined, and ambitious academic honors student. I was the proverbial “late bloomer.” Ultimately, I earned my PhD in biophysical chemistry at Yale, followed by a postdoc fellowship at Berkeley. These two meccas of thermodynamics, coupled with my deep fascination with biology, instilled in me a passion to pursue an academic career focused on mapping the energy landscapes of biological systems. I viewed differential energetics as the language of molecular communication that would dictate and control biological structures, as well as modulate the modes of action associated with biological functions. I wanted to be a “molecular linguist.” For the next 50 years, my group and I used a combination of spectroscopic and calorimetric techniques to characterize the energy profiles of the polymorphic conformational space of DNA molecules, their differential ligand-binding properties, and the energy landscapes associated with mutagenic DNA damage recognition, repair, and replication. As elaborated below, the resultant energy databases have enabled the development of quantitative molecular biology through the rational design of primers, probes, and arrays for diagnostic, therapeutic, and molecular-profiling protocols, which collectively have contributed to a myriad of biomedical assays. Such profiling is further justified by yielding unique energy-based insights that complement and expand elegant, structure-based understandings of biological processes.
Affiliation(s)
- Kenneth J Breslauer
- Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey, USA; The Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA.
| |
|
25
|
Feng Z, Westbrook JD, Sala R, Smart OS, Bricogne G, Matsubara M, Yamada I, Tsuchiya S, Aoki-Kinoshita KF, Hoch JC, Kurisu G, Velankar S, Burley SK, Young JY. Enhanced validation of small-molecule ligands and carbohydrates in the Protein Data Bank. Structure 2021; 29:393-400.e1. [PMID: 33657417 PMCID: PMC8026741 DOI: 10.1016/j.str.2021.02.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/15/2020] [Revised: 01/22/2021] [Accepted: 02/11/2021] [Indexed: 12/19/2022]
Abstract
The Worldwide Protein Data Bank (wwPDB) has provided validation reports based on recommendations from community Validation Task Forces for structures in the PDB since 2013. To further enhance validation of small molecules, as recommended by the 2016 Ligand Validation Workshop, the wwPDB, Global Phasing Ltd., and the Noguchi Institute recently formed a public/private partnership to incorporate some of their software tools into the wwPDB validation package. Augmented wwPDB validation report features include: two-dimensional (2D) diagrams of small-molecule ligands and carbohydrates, highlighting geometric validation outcomes; 2D topological diagrams of oligosaccharides present in branched entities generated using 2D Symbol Nomenclature for Glycan representation; and views of 3D electron density maps for ligands and carbohydrates, illustrating the goodness-of-fit between the atomic structure and experimental data (X-ray crystallographic structures only). These improvements will increase confidence in ligand conformation and ligand-macromolecular interactions, which will aid in understanding biochemical function and contribute to small-molecule drug discovery.
Affiliation(s)
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - Raul Sala
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - Oliver S Smart
- Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
| | - Gérard Bricogne
- Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
| | - Masaaki Matsubara
- The Noguchi Institute, 1-9-7, Kaga, Itabashi-ku, Tokyo 173-0003, Japan
| | - Issaku Yamada
- The Noguchi Institute, 1-9-7, Kaga, Itabashi-ku, Tokyo 173-0003, Japan
| | | | - Kiyoko F Aoki-Kinoshita
- Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji-shi, Tokyo 192-8577, Japan; Glycan & Life Systems Integration Center, Soka University, 1-236 Tangi-machi, Hachioji-shi, Tokyo 192-8577, Japan
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, UConn Health, 263 Farmington Avenue, Farmington, CT 06030-3305, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
26
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, Christie CH, Dalenberg K, Di Costanzo L, Duarte JM, Dutta S, Feng Z, Ganesan S, Goodsell DS, Ghosh S, Green RK, Guranović V, Guzenko D, Hudson BP, Lawson C, Liang Y, Lowe R, Namkoong H, Peisach E, Persikova I, Randle C, Rose A, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Tao YP, Voigt M, Westbrook J, Young JY, Zardecki C, Zhuravleva M. RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Res 2021; 49:D437-D451. [PMID: 33211854 PMCID: PMC7779003 DOI: 10.1093/nar/gkaa1038] [Citation(s) in RCA: 755] [Impact Index Per Article: 251.7] [Received: 09/15/2020] [Revised: 10/14/2020] [Accepted: 11/17/2020] [Indexed: 12/14/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), the US data center for the global PDB archive and a founding member of the Worldwide Protein Data Bank partnership, serves tens of thousands of data depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without restrictions to millions of RCSB.org users around the world, including >660 000 educators, students and members of the curious public using PDB101.RCSB.org. PDB data depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy, 3D electron microscopy and micro-electron diffraction. PDB data consumers accessing our web portals include researchers, educators and students studying fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. During the past 2 years, the research-focused RCSB PDB web portal (RCSB.org) has undergone a complete redesign, enabling improved searching with full Boolean operator logic and more facile access to PDB data integrated with >40 external biodata resources. New features and resources are described in detail using examples that showcase recently released structures of SARS-CoV-2 proteins and host cell proteins relevant to understanding and addressing the COVID-19 global pandemic.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Gregg V Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Cole H Christie
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Luigi Di Costanzo
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sai Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Biotherapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - David S Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Center for Computational Structural Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Vladimir Guranović
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dmytro Guzenko
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Brian P Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Harry Namkoong
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chris Randle
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Alexander Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Biotherapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yi-Ping Tao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Marina Zhuravleva
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
27
|
Abstract
The Protein Data Bank (PDB) is the single worldwide archive of experimentally determined macromolecular structure data. Established in 1971 as the first open access data resource in biology, the PDB archive is managed by the Worldwide Protein Data Bank (wwPDB) consortium, which has four partners: the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The PDB archive currently includes ~175,000 entries. The wwPDB has established a number of task forces and working groups that bring together experts from the community, who provide recommendations on data standards and data validation to improve data quality and integrity. The wwPDB members continue to develop the joint deposition, biocuration, and validation system (OneDep) to improve data quality and accommodate new data from emerging techniques such as 3DEM. Each PDB entry contains a coordinate model and associated metadata for all experimentally determined atomic structures, experimental data for the traditional structure determination techniques (X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy), validation reports, and additional information on quaternary structures. The wwPDB partners are committed to following the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles and have implemented a DOI resolution mechanism that provides access to all the relevant files for a given PDB entry. On average, >250 new entries are added to the archive every week and made available by each wwPDB partner via its FTP area. The wwPDB partner sites also develop data access and analysis tools and make these available via their websites. The wwPDB continues to work with experts in the community to establish a federation of archives for structures determined using integrative/hybrid methods, in which multiple experimental techniques are combined.
Affiliation(s)
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.,Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.,Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA.,Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
| | - Jeffrey C Hoch
- BioMagResBank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, USA
| | - John L Markley
- BioMagResBank, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
| |
|
28
|
Burley SK. Impact of structural biologists and the Protein Data Bank on small-molecule drug discovery and development. J Biol Chem 2021; 296:100559. [PMID: 33744282 PMCID: PMC8059052 DOI: 10.1016/j.jbc.2021.100559] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/02/2020] [Revised: 02/02/2021] [Accepted: 03/16/2021] [Indexed: 12/12/2022]
Abstract
The Protein Data Bank (PDB) is an international core data resource central to fundamental biology, biomedicine, bioenergy, and biotechnology/bioengineering. Now celebrating its 50th anniversary, the PDB houses >175,000 experimentally determined atomic structures of proteins, nucleic acids, and their complexes with one another and small molecules and drugs. The importance of three-dimensional (3D) biostructure information for research and education obtains from the intimate link between molecular form and function evident throughout biology. Among the most prolific consumers of PDB data are biomedical researchers, who rely on the open access resource as the authoritative source of well-validated, expertly curated biostructures. This review recounts how the PDB grew from just seven protein structures to contain more than 49,000 structures of human proteins that have proven critical for understanding their roles in human health and disease. It then describes how these structures are used in academe and industry to validate drug targets, assess target druggability, characterize how tool compounds and other small-molecules bind to drug targets, guide medicinal chemistry optimization of binding affinity and selectivity, and overcome challenges during preclinical drug development. Three case studies drawn from oncology exemplify how structural biologists and open access to PDB structures impacted recent regulatory approvals of antineoplastic drugs.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA.
| |
|
29
|
Guérin A, Sulaeman S, Coquet L, Ménard A, Barloy-Hubler F, Dé E, Tresse O. Membrane Proteocomplexome of Campylobacter jejuni Using 2-D Blue Native/SDS-PAGE Combined to Bioinformatics Analysis. Front Microbiol 2020; 11:530906. [PMID: 33329413 PMCID: PMC7717971 DOI: 10.3389/fmicb.2020.530906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/30/2020] [Accepted: 10/14/2020] [Indexed: 12/27/2022]
Abstract
Campylobacter is the leading cause of human bacterial foodborne infections in developed countries. The perception of cues from biotic or abiotic environments by the bacteria often depends on bacterial surface and membrane proteins, which mediate the cellular response that adapts Campylobacter jejuni to its environment. These proteins rarely function as single entities; they are often organized in functional complexes. In C. jejuni, these complexes are not fully identified and some of them remain unknown. To identify putative functional multi-subunit entities at the membrane subproteome level of C. jejuni, a holistic, non-a-priori approach was applied using two-dimensional blue native/sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis (PAGE) in C. jejuni strain 81-176. Combinations of acrylamide gradient and migration time, membrane detergent concentration, and hand-made strips were optimized to obtain reproducible extraction and separation of intact membrane protein complexes (MPCs). The MPCs were subsequently denatured using SDS-PAGE, and each spot from each MPC was identified by mass spectrometry. Altogether, 21 MPCs could be detected, including multi homo-oligomeric and multi hetero-oligomeric complexes distributed in both inner and outer membranes. The function, conservation, and regulation of the MPCs across C. jejuni strains were inspected by functional and genomic comparison analyses. In this study, relatedness between subunits of two efflux pumps, CmeABC and MacABputC, was observed. In addition, a consensus CosR-binding box sequence in the promoter regions of MacABputC was present in C. jejuni but not in Campylobacter coli. The MPCs identified in the C. jejuni 81-176 membrane are involved in protein folding, molecule trafficking, oxidative phosphorylation, membrane structuration, peptidoglycan biosynthesis, motility and chemotaxis, stress signaling, efflux pumps, and virulence.
Affiliation(s)
| | | | - Laurent Coquet
- UMR 6270 Laboratoire Polymères Biopolymères Surfaces, UNIROUEN, INSA Rouen, CNRS, Normandie Université, Rouen, France
- UNIROUEN, Plateforme PISSARO, IRIB, Normandie Université, Mont-Saint-Aignan, France
| | - Armelle Ménard
- INSERM, UMR 1053 Bordeaux Research in Translational Oncology, BaRITOn, Bordeaux, France
| | - Frédérique Barloy-Hubler
- UMR 6290, CNRS, Institut de Génétique et Développement de Rennes, University of Rennes, Rennes, France
| | - Emmanuelle Dé
- UMR 6270 Laboratoire Polymères Biopolymères Surfaces, UNIROUEN, INSA Rouen, CNRS, Normandie Université, Rouen, France
| | | |
|
30
|
Rose Y, Duarte JM, Lowe R, Segura J, Bi C, Bhikadiya C, Chen L, Rose AS, Bittrich S, Burley SK, Westbrook JD. RCSB Protein Data Bank: Architectural Advances Towards Integrated Searching and Efficient Access to Macromolecular Structure Data from the PDB Archive. J Mol Biol 2020; 433:166704. [PMID: 33186584 PMCID: PMC9093041 DOI: 10.1016/j.jmb.2020.11.003] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 09/30/2020] [Revised: 11/03/2020] [Accepted: 11/05/2020] [Indexed: 11/10/2022]
Abstract
The US Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves many millions of unique users worldwide by delivering experimentally-determined 3D structures of biomolecules integrated with >40 external data resources via RCSB.org, application programming interfaces (APIs), and FTP downloads. Herein, we present the architectural redesign of RCSB PDB data delivery services that build on existing PDBx/mmCIF data schemas. New data access APIs (data.rcsb.org) enable efficient delivery of all PDB archive data. A novel GraphQL-based API provides flexible, declarative data retrieval along with a simple-to-use REST API. A powerful new search system (search.rcsb.org) seamlessly integrates heterogeneous types of searches across the PDB archive. Searches may combine text attributes, protein or nucleic acid sequences, small-molecule chemical descriptors, 3D macromolecular shapes, and sequence motifs. The new RCSB.org architecture adheres to the FAIR Principles, empowering users to address a wide array of research problems in fundamental biology, biomedicine, biotechnology, bioengineering, and bioenergy.
Affiliation(s)
- Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Alexander S Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
| |
|
31
|
Goodsell DS, Burley SK. RCSB Protein Data Bank tools for 3D structure-guided cancer research: human papillomavirus (HPV) case study. Oncogene 2020; 39:6623-6632. [PMID: 32939013 PMCID: PMC7581513 DOI: 10.1038/s41388-020-01461-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/05/2020] [Revised: 07/30/2020] [Accepted: 09/04/2020] [Indexed: 11/21/2022]
Abstract
Atomic-level three-dimensional (3D) structure data for biological macromolecules often prove critical to dissecting and understanding the precise mechanisms of action of cancer-related proteins and their diverse roles in oncogenic transformation, proliferation, and metastasis. They are also used extensively to identify potentially druggable targets and facilitate discovery and development of both small-molecule and biologic drugs that are today benefiting individuals diagnosed with cancer around the world. 3D structures of biomolecules (including proteins, DNA, RNA, and their complexes with one another, drugs, and other small molecules) are freely distributed by the open-access Protein Data Bank (PDB). This global data repository is used by millions of scientists and educators working in the areas of drug discovery, vaccine design, and biomedical and biotechnology research. The US Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides an integrated portal to the PDB archive that streamlines access for millions of PDB data consumers worldwide. Herein, we review online resources made available free of charge by the RCSB PDB to basic and applied researchers, healthcare providers, educators and their students, patients and their families, and the curious public. We exemplify the value of understanding cancer-related proteins in 3D with a case study focused on human papillomavirus.
Affiliation(s)
- David S Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.,Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA.
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.,Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.,Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, and the Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA.,Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08903, USA.
| |
|
32
|
Burley SK, Berman HM, Bhikadiya C, Bi C, Chen L, Di Costanzo L, Christie C, Dalenberg K, Duarte JM, Dutta S, Feng Z, Ghosh S, Goodsell DS, Green RK, Guranovic V, Guzenko D, Hudson BP, Kalro T, Liang Y, Lowe R, Namkoong H, Peisach E, Periskova I, Prlic A, Randle C, Rose A, Rose P, Sala R, Sekharan M, Shao C, Tan L, Tao YP, Valasatava Y, Voigt M, Westbrook J, Woo J, Yang H, Young J, Zhuravleva M, Zardecki C. RCSB Protein Data Bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res 2019; 47:D464-D474. [PMID: 30357411 PMCID: PMC6324064 DOI: 10.1093/nar/gky1004] [Citation(s) in RCA: 717] [Impact Index Per Article: 179.3] [Received: 09/14/2018] [Accepted: 10/11/2018] [Indexed: 02/06/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, rcsb.org), the US data center for the global PDB archive, serves thousands of Data Depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without usage restrictions to more than 1 million rcsb.org Users worldwide and 600 000 pdb101.rcsb.org education-focused Users around the globe. PDB Data Depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy and 3D electron microscopy. PDB Data Consumers include researchers, educators and students studying Fundamental Biology, Biomedicine, Biotechnology and Energy. Recent reorganization of RCSB PDB activities into four integrated, interdependent services is described in detail, together with tools and resources added over the past 2 years to RCSB PDB web portals in support of a ‘Structural View of Biology.’
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA.,Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Luigi Di Costanzo
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Cole Christie
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Ken Dalenberg
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - David S Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Rachel K Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dmytro Guzenko
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Brian P Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Tara Kalro
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Harry Namkoong
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Periskova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Andreas Prlic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Chris Randle
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Alexander Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Peter Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Raul Sala
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Lihua Tan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yi-Ping Tao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Valasatava
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jesse Woo
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Huanwang Yang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jasmine Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Marina Zhuravleva
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.,Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
33
|
Nakamura H. Big data science at AMED-BINDS. Biophys Rev 2020; 12:221-224. [PMID: 32030637 DOI: 10.1007/s12551-020-00628-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/14/2020] [Accepted: 01/23/2020] [Indexed: 12/13/2022]
Affiliation(s)
- Haruki Nakamura
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka, 565-0871, Japan.
| |
|
34
|
Armstrong DR, Berrisford JM, Conroy MJ, Gutmanas A, Anyango S, Choudhary P, Clark AR, Dana JM, Deshpande M, Dunlop R, Gane P, Gáborová R, Gupta D, Haslam P, Koča J, Mak L, Mir S, Mukhopadhyay A, Nadzirin N, Nair S, Paysan-Lafosse T, Pravda L, Sehnal D, Salih O, Smart O, Tolchard J, Varadi M, Svobodova-Vařeková R, Zaki H, Kleywegt GJ, Velankar S. PDBe: improved findability of macromolecular structure data in the PDB. Nucleic Acids Res 2020; 48:D335-D343. [PMID: 31691821 PMCID: PMC7145656 DOI: 10.1093/nar/gkz990] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 09/17/2019] [Revised: 10/11/2019] [Accepted: 10/25/2019] [Indexed: 11/23/2022]
Abstract
The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.
Affiliation(s)
- David R Armstrong
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - John M Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthew J Conroy
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Aleksandras Gutmanas
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen Anyango
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Preeti Choudhary
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alice R Clark
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jose M Dana
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mandar Deshpande
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Roisin Dunlop
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Gane
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Romana Gáborová
- CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
| | - Deepti Gupta
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Pauline Haslam
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jaroslav Koča
- CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
| | - Lora Mak
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Saqib Mir
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Abhik Mukhopadhyay
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nurul Nadzirin
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sreenath Nair
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Typhaine Paysan-Lafosse
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- InterPro, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Lukas Pravda
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - David Sehnal
- CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
| | - Osman Salih
- Electron Microscopy Data Bank (EMDB), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Oliver Smart
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - James Tolchard
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Radka Svobodova-Vařeková
- CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
| | - Hossam Zaki
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gerard J Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Electron Microscopy Data Bank (EMDB), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
35
|
Mukhopadhyay A, Borkakoti N, Pravda L, Tyzack JD, Thornton JM, Velankar S. Finding enzyme cofactors in Protein Data Bank. Bioinformatics 2019; 35:3510-3511. [PMID: 30759194 PMCID: PMC6748742 DOI: 10.1093/bioinformatics/btz115] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/07/2018] [Revised: 01/30/2019] [Accepted: 02/12/2019] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Cofactors are essential for many enzyme reactions. The Protein Data Bank (PDB) contains >67 000 entries containing enzyme structures, many with bound cofactor or cofactor-like molecules. This work aims to identify and categorize these small molecules in the PDB and make it easier to find them. RESULTS: The Protein Data Bank in Europe (PDBe; pdbe.org) has implemented a pipeline to identify enzyme cofactor and cofactor-like molecules, which are now part of the PDBe weekly release process. AVAILABILITY AND IMPLEMENTATION: Information is made available on the individual PDBe entry pages at pdbe.org and programmatically through the PDBe REST API (pdbe.org/api). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Abhik Mukhopadhyay
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Neera Borkakoti
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Lukáš Pravda
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Jonathan D Tyzack
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Janet M Thornton
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| |
|
36
|
Davies JA, Ireland S, Harding S, Sharman JL, Southan C, Dominguez-Monedero A. Inverse pharmacology: Approaches and tools for introducing druggability into engineered proteins. Biotechnol Adv 2019; 37:107439. [PMID: 31494210 PMCID: PMC6891246 DOI: 10.1016/j.biotechadv.2019.107439] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/06/2019] [Revised: 07/24/2019] [Accepted: 08/20/2019] [Indexed: 01/08/2023]
Abstract
A major feature of twenty-first century medical research is the development of therapeutic strategies that use 'biologics' (large molecules, usually engineered proteins) and living cells instead of, or as well as, the small molecules that were the basis of pharmacology in earlier eras. The high power of these techniques can bring correspondingly high risk, and therefore the need for the potential for external control. One way of exerting control on therapeutic proteins is to make them responsive to small molecules; in a clinical context, these small molecules themselves have to be safe. Conventional pharmacology has resulted in thousands of small molecules licensed for use in humans, and detailed structural data on their binding to their protein targets. In principle, these data can be used to facilitate the engineering of drug-responsive modules, taken from natural proteins, into synthetic proteins. This has been done for some years (for example, Cre-ERT2) but usually in a painstaking manner. Recently, we have developed the bioinformatic tool SynPharm to facilitate the design of drug-responsive proteins. In this review, we outline the history of the field, the design and use of the Synpharm tool, and describe our own experiences in engineering druggability into the Cpf1 effector of CRISPR gene editing.
Affiliation(s)
- Jamie A Davies
- Deanery of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XB, UK.
| | - Sam Ireland
- Biomolecular Structure & Modelling Unit, Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London WC1E 6BT, UK
| | - Simon Harding
- Deanery of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XB, UK
| | - Joanna L Sharman
- Deanery of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XB, UK; Novo Nordisk Research Centre Oxford, Novo Nordisk Ltd, Innovation Building, Old Road Campus, Roosevelt Drive, Oxford OX3 7FZ, UK
| | | | - Alazne Dominguez-Monedero
- Deanery of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XB, UK
| |
|
37
|
Adams PD, Afonine PV, Baskaran K, Berman HM, Berrisford J, Bricogne G, Brown DG, Burley SK, Chen M, Feng Z, Flensburg C, Gutmanas A, Hoch JC, Ikegawa Y, Kengaku Y, Krissinel E, Kurisu G, Liang Y, Liebschner D, Mak L, Markley JL, Moriarty NW, Murshudov GN, Noble M, Peisach E, Persikova I, Poon BK, Sobolev OV, Ulrich EL, Velankar S, Vonrhein C, Westbrook J, Wojdyr M, Yokochi M, Young JY. Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB). Acta Crystallogr D Struct Biol 2019; 75:451-454. [PMID: 30988261 PMCID: PMC6465986 DOI: 10.1107/s2059798319004522] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 02/21/2019] [Accepted: 04/03/2019] [Indexed: 11/10/2022] Open
Abstract
This letter announces that PDBx/mmCIF format files will become mandatory for crystallographic depositions to the Protein Data Bank (PDB).
Affiliation(s)
- Paul D. Adams
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Department of Bioengineering, University of California, Berkeley, CA 94720, USA
| | - Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Kumaran Baskaran
- BioMagResBank (BMRB), University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John Berrisford
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Gerard Bricogne
- Global Phasing Limited, Sheraton House, Castle Park, Cambridge, CB3 0AX, UK
| | - David G. Brown
- School of Biosciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
| | - Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Minyu Chen
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka, 565-0871, Japan
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Claus Flensburg
- Global Phasing Limited, Sheraton House, Castle Park, Cambridge, CB3 0AX, UK
| | - Aleksandras Gutmanas
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Jeffrey C. Hoch
- BioMagResBank (BMRB), UConn Health, 263 Farmington Avenue, Farmington, CT 06030, USA
| | - Yasuyo Ikegawa
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka, 565-0871, Japan
| | - Yumiko Kengaku
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka, 565-0871, Japan
| | - Eugene Krissinel
- CCP4, Research Complex at Harwell (RCaH), Rutherford Appleton Laboratory, Didcot, Oxon OX11 0FA, UK
| | - Genji Kurisu
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka, 565-0871, Japan
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dorothee Liebschner
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Lora Mak
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - John L. Markley
- BioMagResBank (BMRB), University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Nigel W. Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Garib N. Murshudov
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0QH, UK
| | - Martin Noble
- Newcastle University, Framlington Place, Newcastle Upon Tyne, NE2 4HH, UK
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Billy K. Poon
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Oleg V. Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Eldon L. Ulrich
- BioMagResBank (BMRB), University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Clemens Vonrhein
- Global Phasing Limited, Sheraton House, Castle Park, Cambridge, CB3 0AX, UK
| | - John Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Marcin Wojdyr
- Global Phasing Limited, Sheraton House, Castle Park, Cambridge, CB3 0AX, UK
- CCP4, Research Complex at Harwell (RCaH), Rutherford Appleton Laboratory, Didcot, Oxon OX11 0FA, UK
| | - Masashi Yokochi
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka, 565-0871, Japan
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
38
|
Research update and opportunity of non-hormonal male contraception: Histone demethylase KDM5B-based targeting. Pharmacol Res 2019; 141:1-20. [DOI: 10.1016/j.phrs.2018.12.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/22/2018] [Revised: 11/29/2018] [Accepted: 12/09/2018] [Indexed: 12/28/2022]
|
39
|
Burley SK, Berman HM, Bhikadiya C, Bi C, Chen L, Costanzo LD, Christie C, Duarte JM, Dutta S, Feng Z, Ghosh S, Goodsell DS, Green RK, Guranovic V, Guzenko D, Hudson BP, Liang Y, Lowe R, Peisach E, Periskova I, Randle C, Rose A, Sekharan M, Shao C, Tao YP, Valasatava Y, Voigt M, Westbrook J, Young J, Zardecki C, Zhuravleva M, Kurisu G, Nakamura H, Kengaku Y, Cho H, Sato J, Kim JY, Ikegawa Y, Nakagawa A, Yamashita R, Kudou T, Bekker GJ, Suzuki H, Iwata T, Yokochi M, Kobayashi N, Fujiwara T, Velankar S, Kleywegt GJ, Anyango S, Armstrong DR, Berrisford JM, Conroy MJ, Dana JM, Deshpande M, Gane P, Gáborová R, Gupta D, Gutmanas A, Koča J, Mak L, Mir S, Mukhopadhyay A, Nadzirin N, Nair S, Patwardhan A, Paysan-Lafosse T, Pravda L, Salih O, Sehnal D, Varadi M, Vařeková R, Markley JL, Hoch JC, Romero PR, Baskaran K, Maziuk D, Ulrich EL, Wedell JR, Yao H, Livny M, Ioannidis YE. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res 2019; 47:D520-D528. [PMID: 30357364 PMCID: PMC6324056 DOI: 10.1093/nar/gky949] [Citation(s) in RCA: 505] [Impact Index Per Article: 101.0] [Received: 09/14/2018] [Revised: 09/28/2018] [Accepted: 10/05/2018] [Indexed: 01/10/2023] Open
Abstract
The Protein Data Bank (PDB) is the single global archive of experimentally determined three-dimensional (3D) structure data of biological macromolecules. Since 2003, the PDB has been managed by the Worldwide Protein Data Bank (wwPDB; wwpdb.org), an international consortium that collaboratively oversees deposition, validation, biocuration, and open access dissemination of 3D macromolecular structure data. The PDB Core Archive houses 3D atomic coordinates of more than 144 000 structural models of proteins, DNA/RNA, and their complexes with metals and small molecules and related experimental data and metadata. Structure and experimental data/metadata are also stored in the PDB Core Archive using the readily extensible wwPDB PDBx/mmCIF master data format, which will continue to evolve as data/metadata from new experimental techniques and structure determination methods are incorporated by the wwPDB. Impacts of the recently developed universal wwPDB OneDep deposition/validation/biocuration system and various methods-specific wwPDB Validation Task Forces on improving the quality of structures and data housed in the PDB Core Archive are described together with current challenges and future plans.
|
40
|
Capturing variation impact on molecular interactions in the IMEx Consortium mutations data set. Nat Commun 2019; 10:10. [PMID: 30602777 PMCID: PMC6315030 DOI: 10.1038/s41467-018-07709-6] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 06/22/2018] [Accepted: 11/15/2018] [Indexed: 01/26/2023] Open
Abstract
The current wealth of genomic variation data identified at nucleotide level presents the challenge of understanding by which mechanisms amino acid variation affects cellular processes. These effects may manifest as distinct phenotypic differences between individuals or result in the development of disease. Physical interactions between molecules are the linking steps underlying most, if not all, cellular processes. Understanding the effects that sequence variation has on a molecule’s interactions is a key step towards connecting mechanistic characterization of nonsynonymous variation to phenotype. We present an open access resource created over 14 years by IMEx database curators, featuring 28,000 annotations describing the effect of small sequence changes on physical protein interactions. We describe how this resource was built, the formats in which the data is provided and offer a descriptive analysis of the data set. The data set is publicly available through the IntAct website and is enhanced with every monthly release. Genetic variants might exert their functional effects via influencing molecular interaction. Here, the authors present a resource featuring almost 28,000 annotations describing the effect of small sequence changes on physical protein interactions, curated by IMEx Consortium curators.
|
41
|
Youkharibache P. Protodomains: Symmetry-Related Supersecondary Structures in Proteins and Self-Complementarity. Methods Mol Biol 2019; 1958:187-219. [PMID: 30945220 PMCID: PMC8323591 DOI: 10.1007/978-1-4939-9161-7_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Abstract
We will consider in this chapter supersecondary structures (SSS) as a set of secondary structure elements (SSEs) found in protein domains. Some SSS arrangements/topologies have been consistently observed within known tertiary structural domains. We use them in the context of repeating supersecondary structures that self-assemble in a symmetric arrangement to form a domain. We call them protodomains (or protofolds). Protodomains are some of the most interesting and insightful SSSs. Within a given 3D protein domain/fold, recognizing such sets may give insights into a possible evolutionary process of duplication, fusion, and coevolution of these protodomains, pointing to possible original protogenes. On protein folding itself, pseudosymmetric domains may point to a "directed" assembly of pseudosymmetric protodomains, directed by the only fact that they are tethered together in a protein chain. On function, tertiary functional sites often occur at protodomain interfaces, as they often occur at domain-domain interfaces in quaternary arrangements. First, we will briefly review some lessons learned from a previously published census of pseudosymmetry in protein domains (Myers-Turnbull, D. et al., J Mol Biol. 426:2255-2268, 2014) to introduce protodomains/protofolds. We will observe that the most abundant and diversified folds, or superfolds, in the currently known protein structure universe are indeed pseudosymmetric. Then, we will learn by example and select a few domain representatives of important pseudosymmetric folds and chief among them the immunoglobulin (Ig) fold and go over a pseudosymmetry supersecondary structure (protodomain) analysis in tertiary and quaternary structures. We will point to currently available software tools to help in identifying pseudosymmetry, delineating protodomains, and see how the study of pseudosymmetry and the underlying supersecondary structures can enrich a structural analysis. This should potentially help in protein engineering, especially in the development of biologics and immunoengineering.
|
42
|
How Structural Biologists and the Protein Data Bank Contributed to Recent FDA New Drug Approvals. Structure 2018; 27:211-217. [PMID: 30595456 DOI: 10.1016/j.str.2018.11.007] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 10/01/2018] [Revised: 11/09/2018] [Accepted: 11/15/2018] [Indexed: 01/01/2023]
Abstract
Discovery and development of 210 new molecular entities (NMEs; new drugs) approved by the US Food and Drug Administration 2010-2016 was facilitated by 3D structural information generated by structural biologists worldwide and distributed on an open-access basis by the PDB. The molecular targets for 94% of these NMEs are known. The PDB archive contains 5,914 structures containing one of the known targets and/or a new drug, providing structural coverage for 88% of the recently approved NMEs across all therapeutic areas. More than half of the 5,914 structures were published and made available by the PDB at no charge, with no restrictions on usage >10 years before drug approval. Citation analyses revealed that these 5,914 PDB structures significantly affected the very large body of publicly funded research reported in publications on the NME targets that motivated biopharmaceutical company investment in discovery and development programs that produced the NMEs.
|
43
|
Shao C, Liu Z, Yang H, Wang S, Burley SK. Outlier analyses of the Protein Data Bank archive using a probability-density-ranking approach. Sci Data 2018; 5:180293. [PMID: 30532050 PMCID: PMC6289109 DOI: 10.1038/sdata.2018.293] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/29/2018] [Accepted: 11/12/2018] [Indexed: 02/02/2023] Open
Abstract
Outlier analyses are central to scientific data assessments. Conventional outlier identification methods do not work effectively for Protein Data Bank (PDB) data, which are characterized by heavy skewness and the presence of bounds and/or long tails. We have developed a data-driven nonparametric method to identify outliers in PDB data based on kernel probability density estimation. Unlike conventional outlier analyses based on location and scale, Probability Density Ranking can be used for robust assessments of distance from other observations. Analyzing PDB data from the vantage points of probability and frequency enables proper outlier identification, which is important for quality control during deposition-validation-biocuration of new three-dimensional structure data. Ranking of Probability Density also permits use of Most Probable Range as a robust measure of data dispersion that is more compact than Interquartile Range. The Probability-Density-Ranking approach can be employed to analyze outliers and data-spread on any large data set with continuous distribution.
Affiliation(s)
- Chenghua Shao
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zonghong Liu
- Department of Statistics and Biostatistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Huanwang Yang
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sijian Wang
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Statistics and Biostatistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Stephen K. Burley
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
- RCSB Protein Data Bank, San Diego Supercomputer Center and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
| |
|
44
|
Kleywegt GJ, Velankar S, Patwardhan A. Structural biology data archiving - where we are and what lies ahead. FEBS Lett 2018; 592:2153-2167. [PMID: 29749603 PMCID: PMC6019198 DOI: 10.1002/1873-3468.13086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/28/2018] [Revised: 04/25/2018] [Accepted: 04/30/2018] [Indexed: 12/31/2022]
Abstract
For almost 50 years, structural biology has endeavoured to conserve and share its experimental data and their interpretations (usually, atomistic models) through global public archives such as the Protein Data Bank, Electron Microscopy Data Bank and Biological Magnetic Resonance Data Bank (BMRB). These archives are treasure troves of freely accessible data that document our quest for molecular or atomic understanding of biological function and processes in health and disease. They have prepared the field to tackle new archiving challenges as more and more (combinations of) techniques are being utilized to elucidate structure at ever increasing length scales. Furthermore, the field has made substantial efforts to develop validation methods that help users to assess the reliability of structures and to identify the most appropriate data for their needs. In this Review, we present an overview of public data archives in structural biology and discuss the importance of validation for users and producers of structural data. Finally, we sketch our efforts to integrate structural data with bioimaging data and with other sources of biological data. This will make relevant structural information available and more easily discoverable for a wide range of scientists.
Affiliation(s)
- Gerard J. Kleywegt
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Sameer Velankar
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Ardan Patwardhan
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
|
45
|
Burley SK, Berman HM, Christie C, Duarte JM, Feng Z, Westbrook J, Young J, Zardecki C. RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education. Protein Sci 2018; 27:316-330. [PMID: 29067736 PMCID: PMC5734314 DOI: 10.1002/pro.3331] [Citation(s) in RCA: 165] [Impact Index Per Article: 27.5] [Received: 07/31/2017] [Revised: 10/20/2017] [Accepted: 10/23/2017] [Indexed: 01/27/2023]
Abstract
The Protein Data Bank (PDB) is one of two archival resources for experimental data central to biomedical research and education worldwide (the other key Primary Data Archive in biology being the International Nucleotide Sequence Database Collaboration). The PDB currently houses >134,000 atomic level biomolecular structures determined by crystallography, NMR spectroscopy, and 3D electron microscopy. It was established in 1971 as the first open-access, digital-data resource in biology, and is managed by the Worldwide Protein Data Bank partnership (wwPDB; wwpdb.org). US PDB operations are conducted by the RCSB Protein Data Bank (RCSB PDB; RCSB.org; Rutgers University and UC San Diego) and funded by NSF, NIH, and DoE. The RCSB PDB serves as the global Archive Keeper for the wwPDB. During calendar 2016, >591 million structure data files were downloaded from the PDB by Data Consumers working in every sovereign nation recognized by the United Nations. During this same period, the RCSB PDB processed >5300 new atomic level biomolecular structures plus experimental data and metadata coming into the archive from Data Depositors working in the Americas and Oceania. In addition, RCSB PDB served >1 million RCSB.org users worldwide with PDB data integrated with ∼40 external data resources providing rich structural views of fundamental biology, biomedicine, and energy sciences, and >600,000 PDB101.rcsb.org educational website users around the globe. RCSB PDB resources are described in detail together with metrics documenting the impact of access to PDB data on basic and applied research, clinical medicine, education, and the economy.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey 08903
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California 92093
- Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
- Cole Christie
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California 92093
- Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California 92093
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
- John Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
- Jasmine Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
- Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
|