1
Mendez D, Holton JM, Lyubimov AY, Hollatz S, Mathews II, Cichosz A, Martirosyan V, Zeng T, Stofer R, Liu R, Song J, McPhillips S, Soltis M, Cohen AE. Deep residual networks for crystallography trained on synthetic data. Acta Crystallogr D Struct Biol 2024; 80:26-43. [PMID: 38164955 PMCID: PMC10833344 DOI: 10.1107/s2059798323010586] [Received: 07/08/2023] [Accepted: 12/12/2023] [Indexed: 01/03/2024]
Abstract
The use of artificial intelligence to process diffraction images is challenged by the need to assemble large and precisely designed training data sets. To address this, a codebase called Resonet was developed for synthesizing diffraction data and training residual neural networks on these data. Here, two per-pattern capabilities of Resonet are demonstrated: (i) interpretation of crystal resolution and (ii) identification of overlapping lattices. Resonet was tested across a compilation of diffraction images from synchrotron experiments and X-ray free-electron laser experiments. Crucially, these models readily execute on graphics processing units and can thus significantly outperform conventional algorithms. While Resonet is currently utilized to provide real-time feedback for macromolecular crystallography users at the Stanford Synchrotron Radiation Lightsource, its simple Python-based interface makes it easy to embed in other processing frameworks. This work highlights the utility of physics-based simulation for training deep neural networks and lays the groundwork for the development of additional models to enhance diffraction collection and analysis.
Affiliation(s)
- Derek Mendez
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- James M. Holton
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Department of Biochemistry and Biophysics, UC San Francisco, San Francisco, CA 94158, USA
- Artem Y. Lyubimov
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Sabine Hollatz
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Irimpan I. Mathews
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Aleksander Cichosz
  - Department of Statistics and Applied Probability, UC Santa Barbara, Santa Barbara, CA 93106, USA
- Vardan Martirosyan
  - Department of Mathematics, UC Santa Barbara, Santa Barbara, CA 93106, USA
- Teo Zeng
  - Department of Statistics and Applied Probability, UC Santa Barbara, Santa Barbara, CA 93106, USA
- Ryan Stofer
  - Department of Statistics and Applied Probability, UC Santa Barbara, Santa Barbara, CA 93106, USA
- Ruobin Liu
  - Department of Statistics and Applied Probability, UC Santa Barbara, Santa Barbara, CA 93106, USA
- Jinhu Song
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Scott McPhillips
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Mike Soltis
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Aina E. Cohen
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
2
Young ID, Mendez D, Poon BK, Blaschke JP, Wittwer F, Wall ME, Sauter NK. Interpreting macromolecular diffraction through simulation. Methods Enzymol 2023; 688:195-222. [PMID: 37748827 DOI: 10.1016/bs.mie.2023.06.011] [Indexed: 09/27/2023]
Abstract
This chapter discusses the use of diffraction simulators to improve experimental outcomes in macromolecular crystallography, in particular for future experiments aimed at diffuse scattering. Consequential decisions for upcoming data collection include the selection of either a synchrotron or free electron laser X-ray source, rotation geometry or serial crystallography, and fiber-coupled area detector technology vs. pixel-array detectors. The hope is that simulators will provide insights to make these choices with greater confidence. Simulation software, especially those packages focused on physics-based calculation of the diffraction, can help to predict the location, size, shape, and profile of Bragg spots and diffuse patterns in terms of an underlying physical model, including assumptions about the crystal's mosaic structure, and therefore can point to potential issues with data analysis in the early planning stages. Also, once the data are collected, simulation may offer a pathway to improve the measurement of diffraction, especially with weak data, and might help to treat problematic cases such as overlapping patterns.
Affiliation(s)
- Iris D Young
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Derek Mendez
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States; Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA, United States
- Billy K Poon
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Johannes P Blaschke
  - National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Felix Wittwer
  - National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Michael E Wall
  - Computer, Computational and Statistical Sciences Division, Los Alamos, NM, United States
- Nicholas K Sauter
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
3
Ganapati V, Tchoń D, Brewster AS, Sauter NK. Self-Supervised Deep Learning for Model Correction in the Computational Crystallography Toolbox. ArXiv 2023:arXiv:2307.01901v1. [PMID: 37461412 PMCID: PMC10350105] [Indexed: 07/23/2023]
Abstract
The Computational Crystallography Toolbox (cctbx) is open-source software that allows for processing of crystallographic data, including from serial femtosecond crystallography (SFX), for macromolecular structure determination. We aim to use the modules in cctbx to determine the oxidation state of individual metal atoms in a macromolecule. Changes in oxidation state are reflected in small shifts of the atom's X-ray absorption edge. These energy shifts can be extracted from the diffraction images recorded in serial femtosecond crystallography, given knowledge of a forward physics model. However, as the diffraction changes only slightly due to the absorption edge shift, inaccuracies in the forward physics model make it extremely challenging to observe the oxidation state. In this work, we describe the potential impact of using self-supervised deep learning to correct the scientific model in cctbx and provide uncertainty quantification. We provide code for forward model simulation and data analysis, built from cctbx modules, at https://github.com/gigantocypris/SPREAD, which can be integrated with machine learning. We describe open questions in algorithm development to help spur advances through dialog between crystallographers and machine learning researchers. New methods could help elucidate charge transfer processes in many reactions, including key events in photosynthesis.
Affiliation(s)
- Vidya Ganapati
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Department of Engineering, Swarthmore College, Swarthmore, PA 19081, USA
- Daniel Tchoń
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Aaron S. Brewster
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Nicholas K. Sauter
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
4
Worrall JAR, Hough MA. Serial femtosecond crystallography approaches to understanding catalysis in iron enzymes. Curr Opin Struct Biol 2022; 77:102486. [PMID: 36274419 DOI: 10.1016/j.sbi.2022.102486] [Received: 07/16/2022] [Revised: 09/11/2022] [Accepted: 09/16/2022] [Indexed: 12/14/2022]
Abstract
Enzymes with iron-containing active sites play crucial roles in catalysing a myriad of oxidative reactions essential to aerobic life. Defining the three-dimensional structures of iron enzymes in resting, oxy-bound intermediate and substrate-bound states is particularly challenging, not least because of the extreme susceptibility of the Fe(III) and Fe(IV) redox states to radiation-induced chemistry caused by intense X-ray or electron beams. The availability of novel sources such as X-ray free electron lasers has enabled structures that are effectively free of the effects of radiation-induced chemistry and allows time-resolved structures to be determined. Important to both applications is the ability to obtain in crystallo spectroscopic data to identify the redox state of the iron in any particular structure or timepoint.
Affiliation(s)
- Jonathan A R Worrall
  - School of Life Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
- Michael A Hough
  - School of Life Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK; Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0DE, UK
5
Hough MA, Owen RL. Serial synchrotron and XFEL crystallography for studies of metalloprotein catalysis. Curr Opin Struct Biol 2021; 71:232-238. [PMID: 34455163 PMCID: PMC8667872 DOI: 10.1016/j.sbi.2021.07.007] [Received: 06/05/2021] [Revised: 07/15/2021] [Accepted: 07/17/2021] [Indexed: 11/24/2022]
Abstract
An estimated half of all proteins contain a metal, and these metals are essential for a tremendous variety of biological functions. X-ray crystallography is the major method for obtaining high-resolution structures of these metalloproteins, but obtaining intact structures is considerably challenging because of the effects of radiation damage. Serial crystallography offers the prospect of determining low-dose synchrotron or effectively damage-free XFEL structures at room temperature and enables time-resolved or dose-resolved approaches. Complementary spectroscopic data can validate redox and/or ligand states within metalloprotein crystals. In this opinion, we discuss developments in the application of serial crystallographic approaches to metalloproteins and comment on future directions.
Affiliation(s)
- Michael A Hough
  - School of Life Sciences, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
- Robin L Owen
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
6
Fuller FD, Loukianov A, Takanashi T, You D, Li Y, Ueda K, Fransson T, Yabashi M, Katayama T, Weng TC, Alonso-Mori R, Bergmann U, Kern J, Yachandra VK, Wernet P, Yano J. Resonant X-ray emission spectroscopy from broadband stochastic pulses at an X-ray free electron laser. Commun Chem 2021; 4:84. [PMID: 35291552 PMCID: PMC8920481 DOI: 10.1038/s42004-021-00512-3] [Received: 10/28/2020] [Accepted: 04/21/2021] [Indexed: 01/27/2023]
Abstract
Hard X-ray spectroscopy is an element-specific probe of electronic state, but signals are weak and require intense light to study low-concentration samples. Free electron laser facilities offer the highest intensity X-rays of any available light source. The light produced at such facilities is stochastic, with spiky, broadband spectra that change drastically from shot to shot. Here, using aqueous ferrocyanide, we show that the resonant X-ray emission (RXES) spectrum can be inferred by correlating, for each shot, the fluorescence intensity from the sample with spectra of the fluctuating, self-amplified spontaneous emission (SASE) source. We obtain resolved narrow and chemically rich information in core-to-valence transitions of the pre-edge region at the Fe K-edge. Our approach avoids monochromatization, provides higher photon flux to the sample, and allows non-resonant signals like elastic scattering to be simultaneously recorded. The spectra obtained match well with spectra measured using a monochromator. We also show that inaccurate measurements of the stochastic light spectra reduce the measurement efficiency of our approach.
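The per-shot correlation idea can be illustrated with a toy simulation. The sketch below is not the authors' analysis code. It assumes a purely linear, noise-free model in which each shot's fluorescence is the sum over energy bins of the incident spectrum times an unknown response, and it assumes the bins fluctuate independently, so that the response in each bin can be estimated as a covariance divided by a variance. All names (`simulate_shots`, `infer_response`, `true_response`) and values are hypothetical.

```python
import random


def simulate_shots(true_response, n_shots, seed=0):
    """Generate toy stochastic spectra and the fluorescence each one produces."""
    rng = random.Random(seed)
    spectra, fluorescence = [], []
    for _ in range(n_shots):
        # Assumption: every energy bin fluctuates independently from shot to shot.
        spectrum = [rng.random() for _ in true_response]
        spectra.append(spectrum)
        # Assumption: fluorescence is linear in the incident spectrum (no noise).
        fluorescence.append(sum(r * i for r, i in zip(true_response, spectrum)))
    return spectra, fluorescence


def infer_response(spectra, fluorescence):
    """Estimate the per-bin response as cov(fluorescence, bin) / var(bin)."""
    n_shots = len(spectra)
    n_bins = len(spectra[0])
    mean_f = sum(fluorescence) / n_shots
    estimate = []
    for b in range(n_bins):
        column = [s[b] for s in spectra]
        mean_i = sum(column) / n_shots
        cov = sum((f - mean_f) * (i - mean_i) for f, i in zip(fluorescence, column))
        var = sum((i - mean_i) ** 2 for i in column)
        estimate.append(cov / var)
    return estimate


true_response = [0.1, 0.8, 0.3, 0.05, 0.5]  # hypothetical response per energy bin
spectra, fluorescence = simulate_shots(true_response, n_shots=20000)
recovered = infer_response(spectra, fluorescence)
print([round(r, 2) for r in recovered])
```

If neighbouring energy bins were correlated, this bin-by-bin ratio would be biased, and a joint least-squares fit over all bins would be the more appropriate estimator.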
Affiliation(s)
- Yiwen Li
  - Tohoku University, Sendai, Miyagi, Japan
- Tetsuo Katayama
  - RIKEN SPring-8 Center, Sayo, Hyogo, Japan
  - Japan Synchrotron Radiation Research Institute, Sayo, Hyogo, Japan
- Tsu-Chien Weng
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Jan Kern
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Junko Yano
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
7
Aller P, Orville AM. Dynamic Structural Biology Experiments at XFEL or Synchrotron Sources. Methods Mol Biol 2021; 2305:203-228. [PMID: 33950392 DOI: 10.1007/978-1-0716-1406-8_11] [Indexed: 12/28/2022]
Abstract
Macromolecular crystallography (MX) leverages the methods of physics and the language of chemistry to reveal fundamental insights into biology. Often beautifully artistic images present MX results to support profound functional hypotheses that are vital to the entire life science research community. Over the past several decades, synchrotrons around the world have been the workhorses for X-ray diffraction data collection at many highly automated beamlines. The newest tools include X-ray free-electron lasers (XFELs) located at facilities in the USA, Japan, Korea, Switzerland, and Germany that deliver about nine orders of magnitude higher brightness in discrete femtosecond-long pulses. At each of these facilities, new serial femtosecond crystallography (SFX) strategies exploit slurries of micron-size crystals by rapidly delivering individual crystals into the XFEL X-ray interaction region, from which one diffraction pattern is collected per crystal before it is destroyed by the intense X-ray pulse. Relatively simple adaptations of SFX methods produce time-resolved data collection strategies wherein reactions are triggered by visible light illumination or by chemical diffusion/mixing. Thus, XFELs provide new opportunities for high temporal and spatial resolution studies of systems engaged in function at physiological temperature. In this chapter, we summarize various issues related to microcrystal slurry preparation, sample delivery into the X-ray interaction region, and some emerging strategies for time-resolved SFX data collection.
8
Mendez D, Bolotovsky R, Bhowmick A, Brewster AS, Kern J, Yano J, Holton JM, Sauter NK. Beyond integration: modeling every pixel to obtain better structure factors from stills. IUCrJ 2020; 7:1151-1167. [PMID: 33209326 PMCID: PMC7642780 DOI: 10.1107/s2052252520013007] [Received: 06/03/2020] [Accepted: 09/23/2020] [Indexed: 05/25/2023]
Abstract
Most crystallographic data processing methods use pixel integration. In serial femtosecond crystallography (SFX), the intricate interaction between the reciprocal lattice point and the Ewald sphere is integrated out by averaging symmetrically equivalent observations recorded across a large number (10^4 to 10^6) of exposures. Although sufficient for generating biological insights, this approach converges slowly, and using it to accurately measure anomalous differences has proved difficult. This report presents a novel approach for increasing the accuracy of structure factors obtained from SFX data. A physical model describing all observed pixels is defined to a degree of complexity such that it can decouple the various contributions to the pixel intensities. Model dependencies include lattice orientation, unit-cell dimensions, mosaic structure, incident photon spectra and structure factor amplitudes. Maximum likelihood estimation is used to optimize all model parameters. The application of prior knowledge that structure factor amplitudes are positive quantities is included in the form of a reparameterization. The method is tested using a synthesized SFX dataset of ytterbium(III) lysozyme, where each X-ray laser pulse energy is centered at 9034 eV. This energy is 100 eV above the Yb3+ L-III absorption edge, so the anomalous difference signal is stable at 10 electrons despite the inherent energy jitter of each femtosecond X-ray laser pulse. This work demonstrates that this approach allows the determination of anomalous structure factors with very high accuracy while requiring an order-of-magnitude fewer shots than conventional integration-based methods would require to achieve similar results.
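The positivity reparameterization mentioned in this abstract can be illustrated on a one-parameter toy problem. The abstract does not state the exact form used, so this sketch assumes an exponential substitution, A = exp(theta), and a Poisson likelihood for pixel counts with expected value A * g_i, where the g_i are known per-pixel factors. The function name and the data are hypothetical, and Newton's method stands in for whatever optimizer the paper actually uses.

```python
import math


def fit_positive_scale(counts, factors, n_iter=200):
    """Maximize a Poisson likelihood over theta, where the scale is A = exp(theta).

    Model (assumed): counts[i] ~ Poisson(A * factors[i]).
    Because A = exp(theta), A stays positive for any real theta.
    """
    total_counts = sum(counts)
    total_factor = sum(factors)
    theta = 0.0
    for _ in range(n_iter):
        scale = math.exp(theta)
        # Negative log-likelihood (up to a constant):
        #   total_factor * exp(theta) - total_counts * theta
        gradient = scale * total_factor - total_counts
        curvature = scale * total_factor
        theta -= gradient / curvature  # Newton step on the unconstrained theta
    return math.exp(theta)


counts = [12, 30, 7, 51, 20]          # hypothetical pixel counts
factors = [0.2, 0.6, 0.1, 1.0, 0.4]   # hypothetical known per-pixel factors
scale = fit_positive_scale(counts, factors)
print(scale)
```

For this toy model the maximum-likelihood scale has the closed form sum(counts) / sum(factors), which gives a simple check on the optimizer. The point of the substitution is that the optimizer works on an unconstrained variable while the physical quantity can never become negative.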
Affiliation(s)
- Derek Mendez
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Robert Bolotovsky
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Asmit Bhowmick
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Aaron S. Brewster
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jan Kern
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Junko Yano
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- James M. Holton
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
  - Department of Biochemistry and Biophysics, UC San Francisco, San Francisco, CA 94158, USA
- Nicholas K. Sauter
  - Molecular Biophysics and Integrated Bioimaging Division (MBIB), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
9
Nass K. Pixel modelling - a new age in SFX data analysis. IUCrJ 2020; 7:949-950. [PMID: 33209307 PMCID: PMC7642796 DOI: 10.1107/s2052252520014281] [Indexed: 06/11/2023]
Abstract
A new program, diffBragg, employs per-pixel maximum likelihood optimization of X-ray pulse and crystal parameters to improve the accuracy of structure factor amplitudes attainable in SFX experiments.
Affiliation(s)
- Karol Nass
  - SwissFEL, Paul Scherrer Institut, Forschungsstrasse 111, Villigen PSI, 5232, Switzerland
10
Abstract
Recent technical advances have dramatically increased the power and scope of structural biology. New developments in high-resolution cryo-electron microscopy, serial X-ray crystallography, and electron diffraction have been especially transformative. Here we highlight some of the latest advances and current challenges at the frontiers of atomic resolution methods for elucidating the structures and dynamical properties of macromolecules and their complexes.
Affiliation(s)
- Michael C. Thompson
  - Department of Chemistry and Chemical Biology, University of California, Merced, CA, USA
- Todd O. Yeates
  - Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, CA, USA
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA
- Jose A. Rodriguez
  - Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, CA, USA
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA