1
King ONF, Levik KE, Sandy J, Basham M. CHiMP: deep-learning tools trained on protein crystallization micrographs to enable automation of experiments. Acta Crystallogr D Struct Biol 2024; 80:744-764. [PMID: 39361357 PMCID: PMC11448919 DOI: 10.1107/s2059798324009276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Accepted: 09/22/2024] [Indexed: 10/05/2024] Open
Abstract
A group of three deep-learning tools, referred to collectively as CHiMP (Crystal Hits in My Plate), was created for analysis of micrographs of protein crystallization experiments at the Diamond Light Source (DLS) synchrotron, UK. The first tool, a classification network, assigns images into categories relating to experimental outcomes. The other two tools are networks that perform both object detection and instance segmentation, resulting in masks of individual crystals in the first case and masks of crystallization droplets in addition to crystals in the second case, allowing the positions and sizes of these entities to be recorded. The creation of these tools used transfer learning, where weights from a pre-trained deep-learning network were used as a starting point and repurposed by further training on a relatively small set of data. Two of the tools are now integrated at the VMXi macromolecular crystallography beamline at DLS, where they have the potential to remove the need for any user input, both for monitoring crystallization experiments and for triggering in situ data collections. The third is being integrated into the XChem fragment-based drug-discovery screening platform, also at DLS, to allow the automatic targeting of acoustic compound dispensing into crystallization droplets.
Affiliation(s)
- Oliver N F King
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Karl E Levik
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- James Sandy
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Mark Basham
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom

2
Wegner CH, Eming SM, Walla B, Bischoff D, Weuster-Botz D, Hubbuch J. Spectroscopic insights into multi-phase protein crystallization in complex lysate using Raman spectroscopy and a particle-free bypass. Front Bioeng Biotechnol 2024; 12:1397465. [PMID: 38812919 PMCID: PMC11133712 DOI: 10.3389/fbioe.2024.1397465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 04/23/2024] [Indexed: 05/31/2024] Open
Abstract
Compared with well-established chromatography processes, protein crystallization can reduce production costs while reaching a comparably high purity. However, monitoring crystallization processes remains a challenge as the produced crystals may interfere with analytical measurements. Especially for capturing proteins from complex feedstock containing various impurities, establishing reliable process analytical technology (PAT) to monitor protein crystallization processes can be complicated. In heterogeneous mixtures, important product characteristics can be found by multivariate analysis and chemometrics, thus contributing to the development of a thorough process understanding. In this project, an analytical set-up is established combining off-line analytics, on-line ultraviolet-visible (UV/Vis) spectroscopy, and in-line Raman spectroscopy to monitor a stirred-batch crystallization process with multiple phases and species being present. As an example process, the enzyme Lactobacillus kefir alcohol dehydrogenase (LkADH) was crystallized from clarified Escherichia coli (E. coli) lysate on a 300 mL scale in five distinct experiments, with the experimental conditions changing in terms of the initial lysate solution preparation method and precipitant concentration. Since UV/Vis spectroscopy is sensitive to particles, a cross-flow filtration (CFF)-based bypass enabled the on-line analysis of the liquid phase, providing information on the lysate composition regarding the nucleic acid to protein ratio. A principal component analysis (PCA) of in situ Raman spectra supported the identification of spectra and wavenumber ranges associated with product-specific information and revealed that the experiments followed a comparable spectral trend when crystals were present. Based on preprocessed Raman spectra, a partial least squares (PLS) regression model was optimized to monitor the target molecule concentration in real time. The off-line sample analysis provided information on the crystal number and crystal geometry by automated image analysis as well as the concentration of LkADH and host cell proteins (HCPs). In spite of a complex lysate suspension containing scattering crystals and various impurities, it was possible to monitor the target molecule concentration in a heterogeneous, multi-phase process using spectroscopic methods. With the presented analytical set-up of off-line, particle-sensitive on-line, and in-line analyzers, a crystallization capture process can be characterized better in terms of the geometry, yield, and purity of the crystals.
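The abstract mentions a PLS regression that maps preprocessed Raman spectra to the target molecule concentration. As a rough illustration of the idea only, the sketch below fits a PLS1 model with a single latent variable in plain Python. It is not the authors' model: the function names, the four-channel synthetic "spectra", the baseline, and the concentration values are all hypothetical, and a real application would use many channels, several latent variables, and spectral preprocessing.

```python
import math


def fit_pls1_one_component(spectra, concentrations):
    """Fit a one-latent-variable PLS1 model (illustrative, hypothetical helper)."""
    n = len(spectra)
    n_channels = len(spectra[0])
    # Mean-center the spectra (X) and the concentrations (y).
    x_mean = [sum(row[j] for row in spectra) / n for j in range(n_channels)]
    y_mean = sum(concentrations) / n
    xc = [[row[j] - x_mean[j] for j in range(n_channels)] for row in spectra]
    yc = [y - y_mean for y in concentrations]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(n_channels)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered spectrum onto the weight vector.
    t = [sum(xc[i][j] * w[j] for j in range(n_channels)) for i in range(n)]
    # Regress the centered concentrations on the scores.
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, b


def predict_concentration(model, spectrum):
    """Predict a concentration from one spectrum with the fitted model."""
    x_mean, y_mean, w, b = model
    score = sum((x - m) * wj for x, m, wj in zip(spectrum, x_mean, w))
    return y_mean + b * score


# Synthetic data (assumption): intensity = concentration * pure spectrum + baseline.
PURE = [0.1, 0.8, 0.3, 0.05]
BASELINE = [1.0, 1.2, 0.9, 1.1]


def make_spectrum(concentration):
    return [concentration * p + b for p, b in zip(PURE, BASELINE)]


train_conc = [1.0, 2.0, 3.0, 4.0, 5.0]
train_spectra = [make_spectrum(c) for c in train_conc]
model = fit_pls1_one_component(train_spectra, train_conc)
print(predict_concentration(model, make_spectrum(2.5)))  # close to 2.5 for this noise-free data
```

Because the synthetic spectra are exactly linear in concentration, one latent variable recovers the concentration; measured spectra with scattering crystals and impurities would not behave this cleanly.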
Affiliation(s)
- Christina Henriette Wegner
  - Institute of Process Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Sebastian Mathis Eming
  - Institute of Process Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Brigitte Walla
  - Institute of Biochemical Engineering, Technical University of Munich, Garching, Germany
- Daniel Bischoff
  - Institute of Biochemical Engineering, Technical University of Munich, Garching, Germany
- Dirk Weuster-Botz
  - Institute of Biochemical Engineering, Technical University of Munich, Garching, Germany
- Jürgen Hubbuch
  - Institute of Process Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

3
Bag S, Meinel MK, Müller-Plathe F. Synthetic Force-Field Database for Training Machine Learning Models to Predict Mobility-Preserving Coarse-Grained Molecular-Simulation Potentials. J Chem Theory Comput 2024; 20:3046-3060. [PMID: 38593205 DOI: 10.1021/acs.jctc.4c00242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Balancing accuracy and efficiency is a common problem in molecular simulation. This tradeoff is evident in coarse-grained molecular dynamics simulation, which prioritizes efficiency, and all-atom molecular simulation, which prioritizes accuracy. Despite continuous efforts, creating a coarse-grained model that accurately captures both the system's structure and dynamics remains elusive. In this article, we present a data-driven approach for constructing coarse-grained models that aim to describe both the structure and dynamics of the system equally well. While the development of machine learning models is well-received in the scientific community, the significance of dataset creation for these models is often overlooked. However, data-driven approaches cannot progress without a robust dataset. To address this, we construct a database of synthetic coarse-grained potentials generated from unphysical all-atom models. A neural network is trained with the generated database to predict the coarse-grained potentials of real liquids. We evaluate their quality by calculating the combined loss of structural and dynamical accuracy upon coarse-graining. When we compare our machine learning-based coarse-grained potential with the one from iterative Boltzmann inversion, the machine learning prediction turns out better for all eight hydrocarbon liquids we studied. As all-atom surfaces turn more nonspherical, both ways of coarse-graining degrade. Still, the neural network outperforms iterative Boltzmann inversion in constructing good quality coarse-grained models for such cases. The synthetic database and the developed machine learning models are freely available to the community, and we believe that our approach will generate interest in efficiently deriving accurate coarse-grained models for liquids.
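The abstract uses iterative Boltzmann inversion (IBI) as its baseline. For readers unfamiliar with it, the sketch below shows the standard IBI update on a tabulated pair potential, U_new(r) = U_old(r) + k_B T ln(g_current(r) / g_target(r)), where g is the radial distribution function. This is a generic illustration, not the authors' code: the function names, the `g_floor` guard, and the units (kJ/mol) are assumptions made here.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def boltzmann_inversion(g_target, temperature, g_floor=1e-8):
    """Initial guess for IBI: potential of mean force, -kT ln g_target(r)."""
    kT = K_B * temperature
    return [-kT * math.log(max(g, g_floor)) for g in g_target]


def ibi_update(potential, g_current, g_target, temperature, g_floor=1e-8):
    """Apply one IBI correction to a tabulated pair potential.

    Where the coarse-grained RDF is too high, the potential is raised
    (made more repulsive); where it is too low, the potential is lowered.
    Bins in which either RDF is (near) zero are left unchanged.
    """
    kT = K_B * temperature
    updated = []
    for u, g_i, g_t in zip(potential, g_current, g_target):
        if g_i < g_floor or g_t < g_floor:
            updated.append(u)
        else:
            updated.append(u + kT * math.log(g_i / g_t))
    return updated


# Hypothetical three-bin example at 300 K.
g_target = [0.0, 1.5, 1.0]
g_current = [0.0, 3.0, 1.0]
u0 = boltzmann_inversion(g_target, 300.0)
u1 = ibi_update(u0, g_current, g_target, 300.0)
print(u1[1] - u0[1])  # positive: the potential is raised where g_current exceeds g_target
```

In practice each update is followed by a new coarse-grained simulation to recompute `g_current`, and the loop repeats until the two distributions agree. IBI targets structure only, which is why the abstract compares it on a combined structural and dynamical loss.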
Affiliation(s)
- Saientan Bag
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany
- Melissa K Meinel
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany
- Florian Müller-Plathe
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany

4
Jiang C, Ma CY, Hazlehurst TA, Ilett TP, Jackson ASM, Hogg DC, Roberts KJ. Automated Growth Rate Measurement of the Facet Surfaces of Single Crystals of the β-Form of l-Glutamic Acid Using Machine Learning Image Processing. Crystal Growth & Design 2024; 24:3277-3288. [PMID: 38659658 PMCID: PMC11036364 DOI: 10.1021/acs.cgd.3c01548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/26/2024]
Abstract
Precision measurement of the growth rate of individual single crystal facets (hkl) represents an important component in the design of industrial crystallization processes. Current approaches for crystal growth measurement using optical microscopy are labor intensive and prone to error. An automated process using state-of-the-art computer vision and machine learning to segment and measure the crystal images is presented. The accuracies and efficiencies of the new crystal sizing approach are evaluated against existing manual and semi-automatic methods, demonstrating equivalent accuracy but over a much shorter time, thereby enabling a more complete kinematic analysis of the overall crystallization process. This is applied to measure in situ the crystal growth rates and thereby determine the associated kinetic mechanisms for the crystallization of β-form l-glutamic acid from the solution phase. Growth on the {101} capping faces is consistent with a Birth and Spread mechanism, in agreement with the literature, while the growth rate of the {021} prismatic faces, previously not available in the literature, is consistent with a Burton-Cabrera-Frank screw dislocation mechanism. At a typical supersaturation of σ = 0.78, the growth rate of the {101} capping faces (3.2 × 10⁻⁸ m s⁻¹) is found to be 17 times that of the {021} prismatic faces (1.9 × 10⁻⁹ m s⁻¹). Both capping and prismatic faces are found to have dead zones in their growth kinetic profiles, with the critical supersaturation of the capping faces (σ_c = 0.23) being about half that of the prismatic faces (σ_c = 0.46). The importance of this overall approach as an integral component of the digital design of industrial crystallization processes is highlighted.
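The two ratios quoted in the abstract can be checked directly from the reported numbers. The short sketch below only redoes that arithmetic; the variable names are chosen here for illustration, and the values are those stated in the abstract.

```python
# Growth rates at supersaturation sigma = 0.78, in m/s (values from the abstract).
capping_rate = 3.2e-8    # {101} capping faces
prismatic_rate = 1.9e-9  # {021} prismatic faces

# Dead-zone thresholds (critical supersaturations) from the abstract.
sigma_c_capping = 0.23
sigma_c_prismatic = 0.46

rate_ratio = capping_rate / prismatic_rate
dead_zone_ratio = sigma_c_capping / sigma_c_prismatic

print(f"capping / prismatic growth rate: {rate_ratio:.1f}")  # about 16.8, i.e. roughly 17
print(f"capping / prismatic dead zone:   {dead_zone_ratio:.2f}")  # 0.50, i.e. about half
```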
Affiliation(s)
- Chen Jiang
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
- Cai Y. Ma
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
- Thomas A. Hazlehurst
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
  - School of Computing, University of Leeds, Leeds LS2 9JT, U.K.
- Thomas P. Ilett
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
  - School of Computing, University of Leeds, Leeds LS2 9JT, U.K.
- Alexander S. M. Jackson
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
- David C. Hogg
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.
  - School of Computing, University of Leeds, Leeds LS2 9JT, U.K.
- Kevin J. Roberts
  - Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, U.K.

5
Matinyan S, Filipcik P, Abrahams JP. Deep learning applications in protein crystallography. Acta Crystallogr A Found Adv 2024; 80:1-17. [PMID: 38189437 PMCID: PMC10833361 DOI: 10.1107/s2053273323009300] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/29/2023] [Accepted: 10/24/2023] [Indexed: 01/09/2024] Open
Abstract
Deep learning techniques can recognize complex patterns in noisy, multidimensional data. In recent years, researchers have started to explore the potential of deep learning in the field of structural biology, including protein crystallography. This field has some significant challenges, in particular producing high-quality and well ordered protein crystals. Additionally, collecting diffraction data with high completeness and quality, and determining and refining protein structures can be problematic. Protein crystallographic data are often high-dimensional, noisy and incomplete. Deep learning algorithms can extract relevant features from these data and learn to recognize patterns, which can improve the success rate of crystallization and the quality of crystal structures. This paper reviews progress in this field.
Affiliation(s)
- Jan Pieter Abrahams
  - Biozentrum, Basel University, Basel, Switzerland
  - Paul Scherrer Institute, Villigen, Switzerland

6
Bijak V, Szczygiel M, Lenkiewicz J, Gucwa M, Cooper DR, Murzyn K, Minor W. The current role and evolution of X-ray crystallography in drug discovery and development. Expert Opin Drug Discov 2023; 18:1221-1230. [PMID: 37592849 PMCID: PMC10620067 DOI: 10.1080/17460441.2023.2246881] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 05/31/2023] [Accepted: 08/08/2023] [Indexed: 08/19/2023]
Abstract
INTRODUCTION: Macromolecular X-ray crystallography and cryo-EM are currently the primary techniques used to determine the three-dimensional structures of proteins, nucleic acids, and viruses. Structural information has been critical to drug discovery and structural bioinformatics. The integration of artificial intelligence (AI) into X-ray crystallography has shown great promise in automating and accelerating the analysis of complex structural data, further improving the efficiency and accuracy of structure determination. AREAS COVERED: This review explores the relationship between X-ray crystallography and other modern structural determination methods. It examines the integration of data acquired from diverse biochemical and biophysical techniques with those derived from structural biology. Additionally, the paper offers insights into the influence of AI on X-ray crystallography, emphasizing how integrating AI with experimental approaches can revolutionize our comprehension of biological processes and interactions. EXPERT OPINION: Investment in science is emphasized as crucial because of its significant role in drug discovery and advancements in healthcare. X-ray crystallography remains an essential source of structural biology data for drug discovery. Recent advances in biochemical, spectroscopic, and bioinformatic methods, along with the integration of AI techniques, hold the potential to revolutionize drug discovery when effectively combined with robust data management practices.
Affiliation(s)
- Vanessa Bijak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908
- Michal Szczygiel
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908
  - Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Joanna Lenkiewicz
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908
- Michal Gucwa
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908
  - Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland
- David R. Cooper
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908
- Krzysztof Murzyn
  - Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Wladek Minor
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville 22908

7
Walla B, Bischoff D, Franz S, Janowski R, Weuster-Botz D. Crystallization of Rationally Engineered Lactobacillus kefir Alcoholdehydrogenases Monitored by Machine-Learning Based Protein Crystal Detection. Chem Ing Tech 2022. [DOI: 10.1002/cite.202255156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- B. Walla
  - Technical University of Munich, Chair of Biochemical Engineering, Boltzmannstr. 15, 85748 Garching, Germany
- D. Bischoff
  - Technical University of Munich, Chair of Biochemical Engineering, Boltzmannstr. 15, 85748 Garching, Germany
- S. Franz
  - Technical University of Munich, Chair of Biochemical Engineering, Boltzmannstr. 15, 85748 Garching, Germany
- R. Janowski
  - Helmholtz Zentrum München, Institute of Structural Biology, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany
- D. Weuster-Botz
  - Technical University of Munich, Chair of Biochemical Engineering, Boltzmannstr. 15, 85748 Garching, Germany

8
Baeumner AJ, Gauglitz G, Mondello L, Bondi MCM, Szunerits S, Wang Q, Wise SA, Woolley AT. Sustainability in (bio-)analytical chemistry. Anal Bioanal Chem 2022; 414:6281-6284. [PMID: 35831536 DOI: 10.1007/s00216-022-04211-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/05/2022] [Indexed: 11/01/2022]
Affiliation(s)
- Antje J Baeumner
  - Institute for Analytical Chemistry, Bio- and Chemosensors, University Regensburg, Regensburg, Germany
- Günter Gauglitz
  - Institute for Theoretical and Physical Chemistry, Eberhard-Karls-University, Tübingen, Germany
- Luigi Mondello
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences, University of Messina, Messina, Italy
- María Cruz Moreno Bondi
  - Department of Analytical Chemistry, Faculty of Chemistry, Complutense University, Madrid, Spain
- Sabine Szunerits
  - Institut d'Electronique, de Microélectronique et de Nanotechnologie (IEMN, UMR CNRS 8520), Université de Lille, Villeneuve d'Ascq, France
- Qiuquan Wang
  - Department of Chemistry, MOE Key Lab of Analytical Sciences, Xiamen University, Xiamen, Fujian, China
- Stephen A Wise
  - Office of Dietary Supplements, National Institutes of Health, Bethesda, MD, USA
  - Scientist Emeritus, National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA
- Adam T Woolley
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT, USA