1
Duke R, Bhat V, Risko C. Data storage architectures to accelerate chemical discovery: data accessibility for individual laboratories and the community. Chem Sci 2022; 13:13646-13656. [PMID: 36544717 PMCID: PMC9710231 DOI: 10.1039/d2sc05142g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 11/06/2022] [Indexed: 11/11/2022]
Abstract
As buzzwords like "big data," "machine learning," and "high-throughput" expand through chemistry, chemists need to consider more than ever their data storage, data management, and data accessibility, whether in their own laboratories or with the broader community. While it is commonplace for chemists to use spreadsheets for data storage and analysis, a move towards database architectures ensures that the data can be more readily findable, accessible, interoperable, and reusable (FAIR). However, making this move has several challenges for those with limited-to-no knowledge of computer programming and databases. This Perspective presents basics of data management using databases with a focus on chemical data. We overview database fundamentals by exploring benefits of database use, introducing terminology, and establishing database design principles. We then detail the extract, transform, and load process for database construction, which includes an overview of data parsing and database architectures, spanning Structured Query Language (SQL) and NoSQL structures. We close by cataloging overarching challenges in database design. This Perspective is accompanied by an interactive demonstration available at https://github.com/D3TaLES/databases_demo. We do all of this within the context of chemical data with the aim of equipping chemists with the knowledge and skills to store, manage, and share their data while abiding by FAIR principles.
Affiliation(s)
- Rebekah Duke
  - Department of Chemistry & Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
- Vinayak Bhat
  - Department of Chemistry & Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
- Chad Risko
  - Department of Chemistry & Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
2
Lee MY, Geiger J, Ishchenko A, Han GW, Barty A, White TA, Gati C, Batyuk A, Hunter MS, Aquila A, Boutet S, Weierstall U, Cherezov V, Liu W. Harnessing the power of an X-ray laser for serial crystallography of membrane proteins crystallized in lipidic cubic phase. IUCrJ 2020; 7:976-984. [PMID: 33209312 PMCID: PMC7642783 DOI: 10.1107/s2052252520012701] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/06/2020] [Accepted: 09/17/2020] [Indexed: 05/31/2023]
Abstract
Serial femtosecond crystallography (SFX) with X-ray free-electron lasers (XFELs) has proven highly successful for structure determination of challenging membrane proteins crystallized in lipidic cubic phase; however, like most techniques, it has limitations. Here we attempt to address some of these limitations related to the use of a vacuum chamber and the need for attenuation of the XFEL beam, in order to further improve the efficiency of this method. Using an optimized SFX experimental setup in a helium atmosphere, the room-temperature structure of the adenosine A2A receptor (A2AAR) at 2.0 Å resolution is determined and compared with previous A2AAR structures determined in vacuum and/or at cryogenic temperatures. Specifically, the capability of utilizing high XFEL beam transmissions is demonstrated, in conjunction with a high dynamic range detector, to collect high-resolution SFX data while reducing crystalline material consumption and shortening the collection time required for a complete dataset. The experimental setup presented herein can be applied to future SFX applications for protein nanocrystal samples to aid in structure-based discovery efforts of therapeutic targets that are difficult to crystallize.
Affiliation(s)
- Ming-Yue Lee
  - Center for Applied Structural Discovery at the Biodesign Institute, Arizona State University, Tempe, AZ 85287-1604, USA
- James Geiger
  - Center for Applied Structural Discovery at the Biodesign Institute, Arizona State University, Tempe, AZ 85287-1604, USA
- Andrii Ishchenko
  - Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, 1002 W. Childs Way, Los Angeles, CA 90089, USA
- Gye Won Han
  - Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, 1002 W. Childs Way, Los Angeles, CA 90089, USA
- Anton Barty
  - Center for Free-Electron Laser Science, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
- Thomas A White
  - Center for Free-Electron Laser Science, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
- Cornelius Gati
  - LCLS, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Alexander Batyuk
  - LCLS, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Mark S Hunter
  - LCLS, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Andrew Aquila
  - LCLS, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Sébastien Boutet
  - LCLS, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Uwe Weierstall
  - Center for Applied Structural Discovery at the Biodesign Institute, Arizona State University, Tempe, AZ 85287-1604, USA
- Vadim Cherezov
  - Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, 1002 W. Childs Way, Los Angeles, CA 90089, USA
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Wei Liu
  - Center for Applied Structural Discovery at the Biodesign Institute, Arizona State University, Tempe, AZ 85287-1604, USA
  - School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA