1
Maity D, Chakrabarty S. IceCoder: Identification of Ice Phases in Molecular Simulation Using Variational Autoencoder. J Chem Theory Comput 2025; 21:1916-1928. [PMID: 39933150 DOI: 10.1021/acs.jctc.4c01298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2025]
Abstract
The identification and classification of different phases of ice within molecular simulations are challenging tasks due to the complex and varied phase space of ice, which includes numerous crystalline and amorphous forms. Traditional order parameters often struggle to differentiate between these phases, especially under the conditions of thermal fluctuations. In this work, we present a novel machine learning-based framework, IceCoder, which combines a variational autoencoder (VAE) with the smooth overlap of atomic position (SOAP) descriptor to classify a large number of ice phases effectively. Our approach compresses high-dimensional SOAP vectors into a two-dimensional latent space using VAE, facilitating the visualization and distinction of various ice phases. We trained the model on a comprehensive data set generated through molecular dynamics simulations and demonstrated its ability to accurately detect various phases of crystalline ice as well as liquid water at the molecular level. IceCoder provides a robust and generalizable tool for tracking ice phase transitions in simulations, overcoming the limitations of traditional methods. This approach may also be generalized to detect polymorphs in other molecular crystals, leading to new insights into the microscopic mechanisms underlying nucleation, growth, and phase transitions while maintaining computational efficiency.
Affiliation(s)
- Dibyendu Maity
  - Department of Chemical and Biological Sciences, S. N. Bose National Centre for Basic Sciences, Kolkata 700106, India
- Suman Chakrabarty
  - Department of Chemical and Biological Sciences, S. N. Bose National Centre for Basic Sciences, Kolkata 700106, India
2
Sun H, Hamel S, Hsu T, Sadigh B, Lordi V, Zhou F. Ice Phase Classification Made Easy with Score-Based Denoising. J Chem Inf Model 2024; 64:6369-6376. [PMID: 39183596 DOI: 10.1021/acs.jcim.4c00822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Accurate identification of ice phases is essential for understanding various physicochemical phenomena. However, such classification for structures simulated with molecular dynamics is complicated by the complex symmetries of ice polymorphs and thermal fluctuations. For this purpose, both traditional order parameters and data-driven machine learning approaches have been employed, but they often rely on expert intuition, specific geometric information, or large training data sets. In this work, we present an unsupervised phase classification framework that combines a score-based denoiser model with a subsequent model-free classification method to accurately identify ice phases. The denoiser model is trained on perturbed synthetic data of ideal reference structures, eliminating the need for large data sets and labeling efforts. The classification step utilizes the smooth overlap of atomic position (SOAP) descriptors as the atomic fingerprint, ensuring Euclidean symmetries and transferability to various structural systems. Our approach achieves a remarkable 100% accuracy in distinguishing ice phases of test trajectories using only seven ideal reference structures of ice phases as model inputs. This demonstrates the generalizability of the score-based denoiser model in facilitating phase identification for complex molecular systems. The proposed classification strategy can be broadly applied to investigate structural evolution and phase identification for a wide range of materials, offering new insights into the fundamental understanding of water and other complex systems.
Affiliation(s)
- Hong Sun
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Sebastien Hamel
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Tim Hsu
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Babak Sadigh
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Vincenzo Lordi
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Fei Zhou
  - Lawrence Livermore National Laboratory, Livermore, California 94550, United States
3
Kuroshima D, Kilgour M, Tuckerman ME, Rogal J. Machine Learning Classification of Local Environments in Molecular Crystals. J Chem Theory Comput 2024; 20:6197-6206. [PMID: 38959410 PMCID: PMC11270820 DOI: 10.1021/acs.jctc.4c00418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 06/14/2024] [Accepted: 06/17/2024] [Indexed: 07/05/2024]
Abstract
Identifying local structural motifs and packing patterns of molecular solids is a challenging task for both simulation and experiment. We demonstrate two novel approaches to characterize local environments in different polymorphs of molecular crystals using learning models that employ either flexibly learned or handcrafted molecular representations. In the first case, we follow our earlier work on graph learning in molecular crystals, deploying an atomistic graph convolutional network combined with molecule-wise aggregation to enable per-molecule environmental classification. For the second model, we develop a new set of descriptors based on symmetry functions combined with a point-vector representation of the molecules, encoding information about the positions and relative orientations of the molecule. We demonstrate very high classification accuracy for both approaches on urea and nicotinamide crystal polymorphs and practical applications to the analysis of dynamical trajectory data for nanocrystals and solid-solid interfaces. Both architectures are applicable to a wide range of molecules and diverse topologies, providing an essential step in the exploration of complex condensed matter phenomena.
Affiliation(s)
- Daisuke Kuroshima
  - Department of Chemistry, New York University (NYU), New York, New York 10003, United States
- Michael Kilgour
  - Department of Chemistry, New York University (NYU), New York, New York 10003, United States
- Mark E. Tuckerman
  - Department of Chemistry, New York University (NYU), New York, New York 10003, United States
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Rd. North, Shanghai 200062, China
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
- Jutta Rogal
  - Department of Chemistry, New York University (NYU), New York, New York 10003, United States
  - Fachbereich Physik, Freie Universität Berlin, Berlin 14195, Germany
4
Lee SKA, Tsai ST, Glotzer SC. Classification of complex local environments in systems of particle shapes through shape symmetry-encoded data augmentation. J Chem Phys 2024; 160:154102. [PMID: 38624110 DOI: 10.1063/5.0194820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 03/29/2024] [Indexed: 04/17/2024] Open
Abstract
Detecting and analyzing the local environment is crucial for investigating the dynamical processes of crystal nucleation and shape colloidal particle self-assembly. Recent developments in machine learning provide a promising avenue for better order parameters in complex systems that are challenging to study using traditional approaches. However, the application of machine learning to self-assembly on systems of particle shapes is still underexplored. To address this gap, we propose a simple, physics-agnostic, yet powerful approach that involves training a multilayer perceptron (MLP) as a local environment classifier for systems of particle shapes, using input features such as particle distances and orientations. Our MLP classifier is trained in a supervised manner with a shape symmetry-encoded data augmentation technique without the need for any conventional roto-translations invariant symmetry functions. We evaluate the performance of our classifiers on four different scenarios involving self-assembly of cubic structures, two-dimensional and three-dimensional patchy particle shape systems, hexagonal bipyramids with varying aspect ratios, and truncated shapes with different degrees of truncation. The proposed training process and data augmentation technique are both straightforward and flexible, enabling easy application of the classifier to other processes involving particle orientations. Our work thus presents a valuable tool for investigating self-assembly processes on systems of particle shapes, with potential applications in structure identification of any particle-based or molecular system where orientations can be defined.
Affiliation(s)
- Shih-Kuang Alex Lee
  - Department of Material Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
- Sun-Ting Tsai
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
- Sharon C Glotzer
  - Department of Material Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Biointerfaces Institute, University of Michigan, Ann Arbor, Michigan 48109, USA
5
Zou Z, Tiwary P. Enhanced Sampling of Crystal Nucleation with Graph Representation Learnt Variables. J Phys Chem B 2024. [PMID: 38502931 DOI: 10.1021/acs.jpcb.4c00080] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 03/21/2024]
Abstract
In this study, we present a graph neural network (GNN)-based learning approach using an autoencoder setup to derive low-dimensional variables from features observed in experimental crystal structures. These variables are then biased in enhanced sampling to observe state-to-state transitions and reliable thermodynamic weights. In our approach, we used simple convolution and pooling methods. To verify the effectiveness of our protocol, we examined the nucleation of various allotropes and polymorphs of iron and glycine in their molten states. Our graph latent variables, when biased in well-tempered metadynamics, consistently show transitions between states and achieve accurate thermodynamic rankings in agreement with experiments, both of which are indicators of dependable sampling. This underscores the strength and promise of our GNN variables for improved sampling. The protocol shown here should be applicable for other systems and other sampling methods.
Affiliation(s)
- Ziyue Zou
  - Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
  - Institute for Physical Science and Technology, University of Maryland, College Park 20742, Maryland, United States
  - University of Maryland Institute for Health Computing, Rockville, Maryland 20852, United States
6
Matsumoto M, Yagasaki T, Tanaka H. GenIce-core: Efficient algorithm for generation of hydrogen-disordered ice structures. J Chem Phys 2024; 160:094101. [PMID: 38426513 DOI: 10.1063/5.0198056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 02/09/2024] [Indexed: 03/02/2024] Open
Abstract
Ice is different from ordinary crystals because it contains randomness, which means that statistical treatment based on ensemble averaging is essential. Ice structures are constrained by topological rules known as the ice rules, which give them unique anomalous properties. These properties become more apparent when the system size is large. For this reason, there is a need to produce a large number of sufficiently large crystals that are homogeneously random and satisfy the ice rules. We have developed an algorithm to quickly generate ice structures containing ions and defects. This algorithm is provided as an independent software module that can be incorporated into crystal structure generation software. By doing so, it becomes possible to simulate ice crystals on a previously impossible scale.
Affiliation(s)
- Masakazu Matsumoto
  - Research Institute for Interdisciplinary Science, Okayama University, Okayama 700-8530, Japan
- Takuma Yagasaki
  - Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
- Hideki Tanaka
  - Research Institute for Interdisciplinary Science, Okayama University, Okayama 700-8530, Japan
  - Toyota Physical and Chemical Research Institute, Nagakute 480-1192, Japan
7
Chen J, Zhu L, Wang J. Quantitative structure-property relationship modelling on autoignition temperature: evaluation and comparative analysis. SAR and QSAR in Environmental Research 2024; 35:199-218. [PMID: 38372083 DOI: 10.1080/1062936x.2024.2312527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 01/25/2024] [Indexed: 02/20/2024]
Abstract
The autoignition temperature (AIT) serves as a crucial indicator for assessing the potential hazards associated with a chemical substance. In order to gain deeper insights into model performance and facilitate the establishment of effective methodological practices for AIT predictions, this study conducts a benchmark investigation on Quantitative Structure-Property Relationship (QSPR) modelling for AIT. As novelties of this work, three significant advancements are implemented in the AIT modelling process, including explicit consideration of data quality, utilization of state-of-the-art feature engineering workflows, and the innovative application of graph-based deep learning techniques, which are employed for the first time in AIT prediction. Specifically, three traditional QSPR models (multi-linear regression, support vector regression, and artificial neural networks) are evaluated, alongside the assessment of a deep-learning model employing message passing neural network architecture supplemented by graph-data augmentation techniques.
Affiliation(s)
- J Chen
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
- L Zhu
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
- J Wang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
8
Ishiai S, Yasuda I, Endo K, Yasuoka K. Graph-Neural-Network-Based Unsupervised Learning of the Temporal Similarity of Structural Features Observed in Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:819-831. [PMID: 38190503 DOI: 10.1021/acs.jctc.3c00995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Classification of molecular structures is a crucial step in molecular dynamics (MD) simulations to detect various structures and phases within systems. Molecular structures, which are commonly identified using order parameters, have recently been identified using machine learning (ML); that is, the ML models acquire structural features from labeled crystals or phases via supervised learning. However, these approaches may not identify unlabeled or unknown structures, such as the imperfect crystal structures observed in nonequilibrium systems and interfaces. In this study, we propose a novel unsupervised learning framework, denoted temporal self-supervised learning (TSSL), to learn structural features and design their parameters. In TSSL, the ML models learn structural similarity via contrastive learning based on minor short-term variations caused by perturbations in MD simulations. This learning framework is applied to a sophisticated graph neural network architecture that uses bond angle and length data of the neighboring atoms. TSSL successfully classifies water and ice crystals based on high local ordering, and furthermore, it detects imperfect structures typical of interfaces such as the water-ice and ice-vapor interfaces.
Affiliation(s)
- Satoki Ishiai
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Ikki Yasuda
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Katsuhiro Endo
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
  - National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki 305-8568, Japan
- Kenji Yasuoka
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
9
Rogal J, Díaz Leines G. Controlling crystallization: what liquid structure and dynamics reveal about crystal nucleation mechanisms. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2023; 381:20220249. [PMID: 37211029 DOI: 10.1098/rsta.2022.0249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 12/06/2022] [Indexed: 05/23/2023]
Abstract
Over recent years, molecular simulations have provided invaluable insights into the microscopic processes governing the initial stages of crystal nucleation and growth. A key aspect that has been observed in many different systems is the formation of precursors in the supercooled liquid that precedes the emergence of crystalline nuclei. The structural and dynamical properties of these precursors determine to a large extent the nucleation probability as well as the formation of specific polymorphs. This novel microscopic view on nucleation mechanisms has further implications for our understanding of the nucleating ability and polymorph selectivity of nucleating agents, as these appear to be strongly linked to their ability in modifying structural and dynamical characteristics of the supercooled liquid, namely liquid heterogeneity. In this perspective, we highlight recent progress in exploring the connection between liquid heterogeneity and crystallization, including the effects of templates, and the potential impact for controlling crystallization processes. This article is part of a discussion meeting issue 'Supercomputing simulations of advanced materials'.
Affiliation(s)
- Jutta Rogal
  - Department of Chemistry, New York University, New York, NY 10003, USA
  - Fachbereich Physik, Freie Universität Berlin, 14195 Berlin, Germany
- Grisell Díaz Leines
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
10
Reiser P, Neubert M, Eberhard A, Torresi L, Zhou C, Shao C, Metni H, van Hoesel C, Schopmans H, Sommer T, Friederich P. Graph neural networks for materials science and chemistry. Communications Materials 2022; 3:93. [PMID: 36468086 PMCID: PMC9702700 DOI: 10.1038/s43246-022-00315-6] [Citation(s) in RCA: 134] [Impact Index Per Article: 44.7] [Received: 05/10/2022] [Accepted: 11/07/2022] [Indexed: 05/14/2023]
Abstract
Machine learning plays an increasingly important role in many areas of chemistry and materials science, being used to predict materials properties, accelerate simulations, design new structures, and predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing classes of machine learning models. They are of particular relevance for chemistry and materials science, as they directly work on a graph or structural representation of molecules and materials and therefore have full access to all relevant information required to characterize materials. In this Review, we provide an overview of the basic principles of GNNs, widely used datasets, and state-of-the-art architectures, followed by a discussion of a wide range of recent applications of GNNs in chemistry and materials science, and concluding with a road-map for the further development and application of GNNs.
Affiliation(s)
- Patrick Reiser
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Marlen Neubert
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- André Eberhard
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Luca Torresi
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Chen Zhou
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Chen Shao
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Present Address: Institute for Applied Informatics and Formal Description Systems, Karlsruhe Institute of Technology, Kaiserstr. 89, 76133 Karlsruhe, Germany
- Houssam Metni
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - ECPM, Université de Strasbourg, 25 Rue Becquerel, 67087 Strasbourg, France
- Clint van Hoesel
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Department of Applied Physics, Eindhoven University of Technology, Groene Loper 19, 5612 AP Eindhoven, The Netherlands
- Henrik Schopmans
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Timo Sommer
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute for Theory of Condensed Matter, Karlsruhe Institute of Technology, Wolfgang-Gaede-Str. 1, 76131 Karlsruhe, Germany
  - Present Address: School of Chemistry, Trinity College Dublin, College Green, Dublin 2, Ireland
- Pascal Friederich
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
11
Shi J, Fulford M, Li H, Marzook M, Reisjalali M, Salvalaglio M, Molteni C. Investigating the quasi-liquid layer on ice surfaces: a comparison of order parameters. Phys Chem Chem Phys 2022; 24:12476-12487. [PMID: 35576067 DOI: 10.1039/d2cp00752e] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Ice surfaces are characterized by pre-melted quasi-liquid layers (QLLs), which mediate both crystal growth processes and interactions with external agents. Understanding QLLs at the molecular level is necessary to unravel the mechanisms of ice crystal formation. Computational studies of the QLLs heavily rely on the accuracy of the methods employed for identifying the local molecular environment and arrangements, discriminating between solid-like and liquid-like water molecules. Here we compare the results obtained using different order parameters to characterize the QLLs on hexagonal ice (Ih) and cubic ice (Ic) model surfaces investigated with molecular dynamics (MD) simulations in a range of temperatures. For the classification task, in addition to the traditional Steinhardt order parameters in different flavours, we select an entropy fingerprint and a deep learning neural network approach (DeepIce), which are conceptually different methodologies. We find that all the analysis methods give qualitatively similar trends for the behaviours of the QLLs on ice surfaces with temperature, with some subtle differences in the classification sensitivity limited to the solid-liquid interface. The thickness of QLLs on the ice surface increases gradually as the temperature increases. The trends of the QLL size and of the values of the order parameters as a function of temperature for the different facets may be linked to surface growth rates which, in turn, affect crystal morphologies at lower vapour pressure. The choice of the order parameter can be therefore informed by computational convenience except in cases where a very accurate determination of the liquid-solid interface is important.
Affiliation(s)
- Jihong Shi
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
- Maxwell Fulford
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
- Hui Li
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
- Mariam Marzook
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
- Maryam Reisjalali
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
- Matteo Salvalaglio
  - Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE, UK
- Carla Molteni
  - Department of Physics, King's College London, Strand, London WC2R 2LS, UK
12
Blow KE, Quigley D, Sosso GC. The seven deadly sins: When computing crystal nucleation rates, the devil is in the details. J Chem Phys 2021; 155:040901. [PMID: 34340373 DOI: 10.1063/5.0055248] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/20/2022] Open
Abstract
The formation of crystals has proven to be one of the most challenging phase transformations to quantitatively model (let alone to actually understand), be it by means of the latest experimental technique or the full arsenal of enhanced sampling approaches at our disposal. One of the most crucial quantities involved with the crystallization process is the nucleation rate, a single elusive number that is supposed to quantify the average probability for a nucleus of critical size to occur within a certain volume and time span. A substantial amount of effort has been devoted to attempt a connection between the crystal nucleation rates computed by means of atomistic simulations and their experimentally measured counterparts. Sadly, this endeavor almost invariably fails to some extent, with the venerable classical nucleation theory typically blamed as the main culprit. Here, we review some of the recent advances in the field, focusing on a number of perhaps more subtle details that are sometimes overlooked when computing nucleation rates. We believe it is important for the community to be aware of the full impact of aspects, such as finite size effects and slow dynamics, that often introduce inconspicuous and yet non-negligible sources of uncertainty into our simulations. In fact, it is key to obtain robust and reproducible trends to be leveraged so as to shed new light on the kinetics of a process, that of crystal nucleation, which is involved in countless practical applications, from the formulation of pharmaceutical drugs to the manufacturing of nano-electronic devices.
Affiliation(s)
- Katarina E Blow
  - Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom
- David Quigley
  - Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom
- Gabriele C Sosso
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, United Kingdom
13
Kim QH, Ko JH, Kim S, Park N, Jhe W. Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction. Bioinformatics 2021; 37:3428-3435. [PMID: 33978713 PMCID: PMC8545317 DOI: 10.1093/bioinformatics/btab346] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/16/2020] [Revised: 04/26/2021] [Accepted: 05/05/2021] [Indexed: 11/25/2022] Open
Abstract
Motivation: Characterizing drug–protein interactions (DPIs) is crucial to high-throughput screening for drug discovery. Deep learning-based approaches have attracted attention because they can predict DPIs without human trial and error. However, because data labeling requires significant resources, the available protein data size is relatively small, which consequently decreases model performance. Here, we propose two methods to construct a deep learning framework that exhibits superior performance with a small labeled dataset.
Results: First, we use transfer learning in encoding protein sequences with a pretrained model, which learns general sequence representations in an unsupervised manner. Second, we use a Bayesian neural network to make a robust model by estimating the data uncertainty. Our resulting model performs better than the previous baselines at predicting interactions between molecules and proteins. We also show that the quantified uncertainty from the Bayesian inference is related to confidence and can be used for screening DPI data points.
Availability and implementation: The code is available at https://github.com/QHwan/PretrainDPI.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- QHwan Kim
  - Department of Physics and Astronomy, Institute of Applied Physics, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea
- Joon-Hyuk Ko
  - Department of Physics and Astronomy, Institute of Applied Physics, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea
- Sunghoon Kim
  - Department of Physics and Astronomy, Institute of Applied Physics, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea
- Nojun Park
  - Department of Physics and Astronomy, Institute of Applied Physics, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea
- Wonho Jhe
  - Department of Physics and Astronomy, Institute of Applied Physics, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea