1
Masella M, Léonforté F. The multi-scale polarizable pseudo-particle solvent coarse-grained approach: From NaCl salt solutions to polyelectrolyte hydration. J Chem Phys 2024; 160:204902. [PMID: 38780384 DOI: 10.1063/5.0194968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 04/22/2024] [Indexed: 05/25/2024] Open
Abstract
We discuss key parameters that affect the reliability of hybrid simulations in the aqueous phase based on an efficient multi-scale coarse-grained polarizable pseudo-particle approach, denoted as pppl, to model the solvent water, whereas solutes are modeled using an all atom polarizable force field. Among those parameters, the extension of the solvent domain (SD) at the solute vicinity (domain in which each solvent particle corresponds to a single water molecule) and the magnitude of solute/solvent short range polarization damping effects are shown to be pivotal to model NaCl salty aqueous solutions and the hydration of charged systems, such as the hydrophobic polyelectrolyte polymer that we have recently investigated [Masella et al., J. Chem. Phys. 155, 114903 (2021)]. Strong short range damping is pivotal to simulate aqueous salt NaCl solutions at moderate concentration (up to 1.0M). The SD extension (as well as short range damping) has a weak effect on the polymer conformation; however, it plays a pivotal role in computing accurate polymer/solvent interaction energies. As the pppl approach is up to two orders of magnitude computationally more efficient than all atom polarizable force field methods, our results show it to be an efficient alternative route to investigate the equilibrium properties of complex charged molecular systems in extended chemical environments.
Affiliation(s)
- Michel Masella
- Laboratoire de Biologie Structurale et Radiobiologie, Service de Bioénergétique, Biologie Structurale et Mécanismes, Institut de Biologie et de Technologies de Saclay, CEA Saclay, F-91191 Gif sur Yvette Cedex, France
- Fabien Léonforté
- L'Oréal Group, Research and Innovation, Aulnay-Sous-Bois, France
2
Karmakar T, Soares TA, Merz KM. Enhancing Coarse-Grained Models through Machine Learning. J Chem Inf Model 2024; 64:2931-2932. [PMID: 38644772 DOI: 10.1021/acs.jcim.4c00537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Affiliation(s)
- Tarak Karmakar
- Department of Chemistry, Indian Institute of Technology, Delhi, Hauz Khas, New Delhi 110016, India
- Thereza A Soares
- Department of Chemistry, FFCLRP, University of São Paulo, Ribeirão Preto 14040-901, Brazil
- Hylleraas Centre for Quantum Molecular Sciences, University of Oslo, Oslo 0315, Norway
- Kenneth M Merz
- Department of Chemistry, Michigan State University, Lansing 48824, Michigan, United States
3
Wu Z, Zhou T. Structural Coarse-Graining via Multiobjective Optimization with Differentiable Simulation. J Chem Theory Comput 2024; 20:2605-2617. [PMID: 38483262 DOI: 10.1021/acs.jctc.3c01348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
In the realm of multiscale molecular simulations, structure-based coarse-graining is a prominent approach for creating efficient coarse-grained (CG) representations of soft matter systems, such as polymers. This involves optimizing CG interactions by matching static correlation functions of the corresponding degrees of freedom in all-atom (AA) models. Here, we present a versatile method, namely, differentiable coarse-graining (DiffCG), which combines multiobjective optimization and differentiable simulation. The DiffCG approach is capable of constructing robust CG models by iteratively optimizing the effective potentials to simultaneously match multiple target properties. We demonstrate our approach by concurrently optimizing bonded and nonbonded potentials of a CG model of polystyrene (PS) melts. The resulting CG-PS model effectively reproduces both the structural characteristics, such as the equilibrium probability distribution of microscopic degrees of freedom and the thermodynamic pressure of the AA counterpart. More importantly, leveraging the multiobjective optimization capability, we develop a precise and efficient CG model for PS melts that is transferable across a wide range of temperatures, i.e., from 400 to 600 K. It is achieved via optimizing a pairwise potential with nonlinear temperature dependence in the CG model to simultaneously match target data from AA-MD simulations at multiple thermodynamic states. The temperature transferable CG-PS model demonstrates its ability to accurately predict the radial distribution functions and density at different temperatures, including those that are not included in the target thermodynamic states. Our work opens up a promising route for developing accurate and transferable CG models of complex soft-matter systems through multiobjective optimization with differentiable simulation.
Affiliation(s)
- Zhenghao Wu
- Department of Chemistry, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, P. R. China
- Tianhang Zhou
- College of Carbon Neutrality Future Technology, State Key Laboratory of Heavy Oil Processing, China University of Petroleum (Beijing), Beijing 102249, P. R. China
4
Izvekov S, Kroonblawd MP, Larentzos JP, Brennan JK, Rice BM. Maximum Entropy Theory of Multiscale Coarse-Graining via Matching Thermodynamic Forces: Application to a Molecular Crystal (TATB). J Phys Chem B 2024. [PMID: 38489758 DOI: 10.1021/acs.jpcb.3c07078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2024]
Abstract
The MSCG/FM (multiscale coarse-graining via force-matching) approach is an efficient supervised machine learning method to develop microscopically informed coarse-grained (CG) models. We present a theory based on the principle of maximum entropy (PME) enveloping the existing MSCG/FM approaches. This theory views the MSCG/FM method as a special case of matching the thermodynamic forces from the extended ensemble described by the set of thermodynamic (relevant) system coordinates. This set may include CG coordinates, the stress tensor, applied external fields, and so forth, and may be characterized by nonequilibrium conditions. Following the presentation of the theory, we discuss the consistent matching of both bonded and nonbonded interactions. The proposed PME formulation is used as a starting point to extend the MSCG/FM method to the constant strain ensemble, which together with the explicit matching of the bonded forces is better suited for coarse-graining anisotropic media at a submolecular resolution. The theory is demonstrated by performing the fine coarse-graining of crystalline 1,3,5-triamino-2,4,6-trinitrobenzene (TATB), a well-known insensitive molecular energetic material, which exhibits highly anisotropic mechanical properties.
Affiliation(s)
- Sergei Izvekov
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Matthew P Kroonblawd
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- James P Larentzos
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- John K Brennan
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Betsy M Rice
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
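The abstract of entry 4 describes MSCG/FM as a supervised fit of coarse-grained forces to reference forces. As a rough illustration only, the sketch below shows plain least-squares force matching for a pair force that is linear in its coefficients. It is not the maximum-entropy formulation of the paper: the inverse-power basis, the synthetic Lennard-Jones reference data, and the function names `solve_linear` and `force_match` are all assumptions made for this example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def force_match(distances, ref_forces, basis):
    """Fit f(r) = sum_k c_k * basis[k](r) to reference forces.

    Minimizes the sum of squared force residuals by building and solving
    the normal equations (Phi^T Phi) c = Phi^T f.
    """
    n = len(basis)
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for r, f in zip(distances, ref_forces):
        phi = [g(r) for g in basis]
        for i in range(n):
            b[i] += phi[i] * f
            for j in range(n):
                A[i][j] += phi[i] * phi[j]
    return solve_linear(A, b)


# Hypothetical basis: inverse powers that can represent a Lennard-Jones force exactly.
basis = [lambda r: r ** -13, lambda r: r ** -7]

# Synthetic "reference" data (an assumption): LJ force with epsilon = sigma = 1,
# F(r) = 48 / r^13 - 24 / r^7, sampled on a grid of pair distances.
distances = [0.95 + 0.01 * i for i in range(150)]
ref_forces = [48.0 * r ** -13 - 24.0 * r ** -7 for r in distances]

coeffs = force_match(distances, ref_forces, basis)
print(coeffs)  # close to [48.0, -24.0]
```

In practice the reference forces would come from atomistic trajectories mapped onto CG sites, and the basis would usually be splines or another flexible set rather than two inverse powers.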
5
Célerse F, Wodrich MD, Vela S, Gallarati S, Fabregat R, Juraskova V, Corminboeuf C. From Organic Fragments to Photoswitchable Catalysts: The OFF-ON Structural Repository for Transferable Kernel-Based Potentials. J Chem Inf Model 2024; 64:1201-1212. [PMID: 38319296 PMCID: PMC10900300 DOI: 10.1021/acs.jcim.3c01953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/07/2024]
Abstract
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversities. Here, we introduce the OFF-ON (organic fragments from organocatalysts that are non-modular) database, a repository of 7869 equilibrium and 67,457 nonequilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a local kernel regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF-ON data set offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
Affiliation(s)
- Frédéric Célerse
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Matthew D. Wodrich
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Sergi Vela
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Simone Gallarati
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Raimon Fabregat
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Veronika Juraskova
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Clémence Corminboeuf
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
6
Jin J, Reichman DR. Perturbative Expansion in Reciprocal Space: Bridging Microscopic and Mesoscopic Descriptions of Molecular Interactions. J Phys Chem B 2024; 128:1061-1078. [PMID: 38232134 DOI: 10.1021/acs.jpcb.3c06048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Determining the Fourier representation of various molecular interactions is important for constructing density-based field theories from a microscopic point of view, enabling a multiscale bridge between microscopic and mesoscopic descriptions. However, due to the strongly repulsive nature of short-ranged interactions, interparticle interactions cannot be formally defined in Fourier space, which renders coarse-grained (CG) approaches in k-space somewhat ambiguous. In this paper, we address this issue by designing a perturbative expansion of pair interactions in reciprocal space. Our perturbation theory, starting from reciprocal space, elucidates the microscopic origins underlying zeroth-order (long-range attractions) and divergent repulsive interactions from higher order contributions. We propose a systematic framework for constructing a faithful Fourier-space representation of molecular interactions, capturing key structural correlations in various systems, including simple model systems and molecular CG models of liquids. Building upon the Ornstein-Zernike equation, our approach can be combined with appropriate closure relations, and to further improve the closure approximations, we develop a bottom-up parameterization strategy for inferring the bridge function from microscopic statistics. By incorporating the bridge function into the Fourier representation, our findings suggest a systematic, bottom-up approach to performing coarse-graining in reciprocal space, leading to the systematic construction of a bottom-up classical field theory of complex aqueous systems.
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
- David R Reichman
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
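The abstract of entry 6 builds on the Ornstein-Zernike equation, closure relations, and the bridge function without writing them out. For orientation, the standard textbook forms for a homogeneous, isotropic one-component fluid are given below. The symbols (number density \rho, total correlation function h, direct correlation function c, pair potential u, bridge function b) follow common liquid-state notation and are not taken from the paper; the sign convention for b varies between references.

```latex
% Ornstein-Zernike (OZ) equation in real space
h(r) = c(r) + \rho \int c(|\mathbf{r} - \mathbf{r}'|)\, h(r')\, \mathrm{d}\mathbf{r}'

% The convolution becomes a product in Fourier space
\hat{h}(k) = \hat{c}(k) + \rho\, \hat{c}(k)\, \hat{h}(k)
\quad\Longrightarrow\quad
\hat{h}(k) = \frac{\hat{c}(k)}{1 - \rho\, \hat{c}(k)}

% Static structure factor
S(k) = 1 + \rho\, \hat{h}(k) = \frac{1}{1 - \rho\, \hat{c}(k)}

% Formally exact closure with bridge function b(r), where g(r) = h(r) + 1;
% setting b(r) = 0 gives the hypernetted-chain (HNC) approximation
g(r) = \exp\!\left[ -\beta u(r) + h(r) - c(r) + b(r) \right]
```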
7
Maier JC, Wang CI, Jackson NE. Distilling coarse-grained representations of molecular electronic structure with continuously gated message passing. J Chem Phys 2024; 160:024109. [PMID: 38193551 DOI: 10.1063/5.0179253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Accepted: 12/14/2023] [Indexed: 01/10/2024] Open
Abstract
Bottom-up methods for coarse-grained (CG) molecular modeling are critically needed to establish rigorous links between atomistic reference data and reduced molecular representations. For a target molecule, the ideal reduced CG representation is a function of both the conformational ensemble of the system and the target physical observable(s) to be reproduced at the CG resolution. However, there is an absence of algorithms for selecting CG representations of molecules from which complex properties, including molecular electronic structure, can be accurately modeled. We introduce continuously gated message passing (CGMP), a graph neural network (GNN) method for atomically decomposing molecular electronic structure sampled over conformational ensembles. CGMP integrates 3D-invariant GNNs and a novel gated message passing system to continuously reduce the atomic degrees of freedom accessible for electronic predictions, resulting in a one-shot importance ranking of atoms contributing to a target molecular property. Moreover, CGMP provides the first approach by which to quantify the degeneracy of "good" CG representations conditioned on specific prediction targets, facilitating the development of more transferable CG representations. We further show how CGMP can be used to highlight multiatom correlations, illuminating a path to developing CG electronic Hamiltonians in terms of interpretable collective variables for arbitrarily complex molecules.
Affiliation(s)
- J Charlie Maier
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Chun-I Wang
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Nicholas E Jackson
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
8
Giunta G, Campos-Villalobos G, Dijkstra M. Coarse-Grained Many-Body Potentials of Ligand-Stabilized Nanoparticles from Machine-Learned Mean Forces. ACS Nano 2023; 17:23391-23404. [PMID: 38011344 PMCID: PMC10722599 DOI: 10.1021/acsnano.3c04162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/07/2023] [Accepted: 11/08/2023] [Indexed: 11/29/2023]
Abstract
Colloidal nanoparticles self-assemble into a variety of superstructures with distinctive optical, structural, and electronic properties. These nanoparticles are usually stabilized by a capping layer of organic ligands to prevent aggregation in the solvent. When the ligands are sufficiently long compared to the dimensions of the nanocrystal cores, the effective coarse-grained forces between pairs of nanoparticles are largely affected by the presence of neighboring particles. In order to efficiently investigate the self-assembly behavior of these complex colloidal systems, we propose a machine-learning approach to construct effective coarse-grained many-body interaction potentials. The multiscale methodology presented in this work constitutes a general bottom-up coarse-graining strategy where the coarse-grained forces acting on coarse-grained sites are extracted from measuring the vectorial mean forces on these sites in reference fine-grained simulations. These effective coarse-grained forces, i.e., gradients of the potential of mean force or of the free-energy surface, are represented by a simple linear model in terms of gradients of structural descriptors, which are scalar functions that are rotationally invariant. In this way, we also directly obtain the free-energy surface of the coarse-grained model as a function of all coarse-grained coordinates. We expect that this simple yet accurate coarse-graining framework for the many-body potential of mean force will enable the characterization, understanding, and prediction of the structure and phase behavior of relevant soft-matter systems by direct simulations. The key advantage of this method is its generality, which allows it to be applicable to a broad range of systems. To demonstrate the generality of our method, we also apply it to a colloid-polymer model system, where coarse-grained many-body interactions are pronounced.
Affiliation(s)
- Gerardo Campos-Villalobos
- Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands
- Marjolein Dijkstra
- Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands
9
Nitta H, Ozawa T, Yasuoka K. Construction of full-atomistic polymer amorphous structures using reverse-mapping from Kremer-Grest models. J Chem Phys 2023; 159:194903. [PMID: 37982485 DOI: 10.1063/5.0159722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 10/30/2023] [Indexed: 11/21/2023] Open
Abstract
We propose a method to build full-atomistic (FA) amorphous polymer structures using reverse-mapping from coarse-grained (CG) models. In this method, three models with different resolutions are utilized, namely the CG1, CG2, and FA models. It is assumed that the CG1 model is more abstract than the CG2 model. The CG1 is utilized to equilibrate the system, and then sequential reverse-mapping procedures from the CG1 to the CG2 models and from the CG2 to the FA models are conducted. A mapping relation between the CG1 and the FA models is necessary to generate a polymer structure with a given density and radius of chains. Actually, we have used the Kremer-Grest (KG) model as the CG1 and the monomer-level CG model as the CG2 model. Utilizing the mapping relation, we have developed a scheme that constructs an FA polymer model from the KG model. In the scheme, the KG model, the monomer level CG model, and the FA model are successively constructed. The scheme is applied to polyethylene (PE), cis 1,4-polybutadiene (PB), and poly(methyl methacrylate) (PMMA). As a validation, the structures of PE and PB constructed by the scheme were carefully checked through comparison with those obtained using long-time FA molecular dynamics (MD) simulations. We found that both short- and long-range chain structures constructed by the scheme reproduced those obtained by the FA MD simulations. Then, as an interesting application, the scheme is applied to generate an entangled PMMA structure. The results showed that the scheme provides an efficient and easy way to construct amorphous structures of FA polymers.
Affiliation(s)
- Hiroya Nitta
- JSOL Corporation, KUDAN-KAIKAN TERRACE 1-6-5, Kudanminami, Chiyoda-ku, Tokyo 102-0074, Japan
- Taku Ozawa
- JSOL Corporation, KUDAN-KAIKAN TERRACE 1-6-5, Kudanminami, Chiyoda-ku, Tokyo 102-0074, Japan
- Kenji Yasuoka
- Department of Mechanical Engineering, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa 223-8522, Japan
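Entry 9 uses the Kremer-Grest (KG) bead-spring model as its coarsest level (CG1) without describing it. As a minimal sketch, the widely used textbook form of the KG interactions in reduced Lennard-Jones units is shown below: a purely repulsive WCA pair potential plus a FENE bond. The parameter values k = 30 and R0 = 1.5 are the commonly quoted defaults and are an assumption here; the abstract does not state which values the authors used.

```python
import math

WCA_CUTOFF = 2.0 ** (1.0 / 6.0)  # minimum of the LJ potential, in units of sigma


def wca(r, epsilon=1.0, sigma=1.0):
    """Weeks-Chandler-Andersen potential: LJ truncated at its minimum and shifted to zero."""
    if r >= WCA_CUTOFF * sigma:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6) + epsilon


def fene(r, k=30.0, r0=1.5):
    """FENE bond potential; diverges as the bond length approaches r0."""
    if r >= r0:
        return math.inf
    return -0.5 * k * r0 * r0 * math.log(1.0 - (r / r0) ** 2)


def bond_energy(r):
    """Total energy of a bonded pair: WCA repulsion plus FENE attraction."""
    return wca(r) + fene(r)


# Locate the bond-length minimum on a fine grid (0.80 to about 1.30 sigma).
grid = [0.80 + 0.0001 * i for i in range(5000)]
r_min = min(grid, key=bond_energy)
print(round(r_min, 3))  # close to 0.96 sigma
```

The short, stiff bond produced by this combination is what prevents chains from crossing each other, which is why the model is a common starting point for equilibrating entangled melts before reverse-mapping.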
10
Jin J, Hwang J, Voth GA. Gaussian representation of coarse-grained interactions of liquids: Theory, parametrization, and transferability. J Chem Phys 2023; 159:184105. [PMID: 37942867 DOI: 10.1063/5.0160567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 10/06/2023] [Indexed: 11/10/2023] Open
Abstract
Coarse-grained (CG) interactions determined via bottom-up methodologies can faithfully reproduce the structural correlations observed in fine-grained (atomistic resolution) systems, yet they can suffer from limited extensibility due to complex many-body correlations. As part of an ongoing effort to understand and improve the applicability of bottom-up CG models, we propose an alternative approach to address both accuracy and transferability. Our main idea draws from classical perturbation theory to partition the hard sphere repulsive term from effective CG interactions. We then introduce Gaussian basis functions corresponding to the system's characteristic length by linking these Gaussian sub-interactions to the local particle densities at each coordination shell. The remaining perturbative long-range interaction can be treated as a collective solvation interaction, which we show exhibits a Gaussian form derived from integral equation theories. By applying this numerical parametrization protocol to CG liquid systems, our microscopic theory elucidates the emergence of Gaussian interactions in common phenomenological CG models. To facilitate transferability for these reduced descriptions, we further infer equations of state to determine the sub-interaction parameter as a function of the system variables. The reduced models exhibit excellent transferability across the thermodynamic state points. Furthermore, we propose a new strategy to design the cross-interactions between distinct CG sites in liquid mixtures. This involves combining each Gaussian in the proper radial domain, yielding accurate CG potentials of mean force and structural correlations for multi-component systems. Overall, our findings establish a solid foundation for constructing transferable bottom-up CG models of liquids with enhanced extensibility.
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, 5735 S. Ellis Ave., Chicago, Illinois 60637, USA
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, USA
- Jisung Hwang
- Department of Statistics, The University of Chicago, 5747 S. Ellis Ave., Chicago, Illinois 60637, USA
- Gregory A Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, 5735 S. Ellis Ave., Chicago, Illinois 60637, USA
11
Lyu L, Lei H. Construction of Coarse-Grained Molecular Dynamics with Many-Body Non-Markovian Memory. Physical Review Letters 2023; 131:177301. [PMID: 37955502 DOI: 10.1103/physrevlett.131.177301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 09/19/2023] [Indexed: 11/14/2023]
Abstract
We introduce a machine-learning-based coarse-grained molecular dynamics model that faithfully retains the many-body nature of the intermolecular dissipative interactions. Unlike the common empirical coarse-grained models, the present model is constructed based on the Mori-Zwanzig formalism and naturally inherits the heterogeneous state-dependent memory term rather than matching the mean-field metrics such as the velocity autocorrelation function. Numerical results show that preserving the many-body nature of the memory term is crucial for predicting the collective transport and diffusion processes, where empirical forms generally show limitations.
Affiliation(s)
- Liyao Lyu
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA
- Huan Lei
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, Michigan 48824, USA
12
Faure Beaulieu Z, Nicholas TC, Gardner JLA, Goodwin AL, Deringer VL. Coarse-grained versus fully atomistic machine learning for zeolitic imidazolate frameworks. Chem Commun (Camb) 2023; 59:11405-11408. [PMID: 37668310 PMCID: PMC10513772 DOI: 10.1039/d3cc02265j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 08/22/2023] [Indexed: 09/06/2023]
Abstract
Zeolitic imidazolate frameworks are widely thought of as being analogous to inorganic AB2 phases. We test the validity of this assumption by comparing simplified and fully atomistic machine-learning models for local environments in ZIFs. Our work addresses the central question to what extent chemical information can be "coarse-grained" in hybrid framework materials.
Affiliation(s)
- Zoé Faure Beaulieu
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK.
- Thomas C Nicholas
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK.
- John L A Gardner
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK.
- Andrew L Goodwin
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK.
- Volker L Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK.
13
Zaporozhets I, Clementi C. Multibody Terms in Protein Coarse-Grained Models: A Top-Down Perspective. J Phys Chem B 2023; 127:6920-6927. [PMID: 37499123 DOI: 10.1021/acs.jpcb.3c04493] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/29/2023]
Abstract
Coarse-grained models allow computational investigation of biomolecular processes occurring on long time and length scales, intractable with atomistic simulation. Traditionally, many coarse-grained models rely mostly on pairwise interaction potentials. However, the decimation of degrees of freedom should, in principle, lead to a complex many-body effective interaction potential. In this work, we use experimental data on mutant stability to parametrize coarse-grained models for two proteins with and without many-body terms. We demonstrate that many-body terms are necessary to reproduce quantitatively the effects of point mutations on protein stability, particularly to implicitly take into account the effect of the solvent.
Affiliation(s)
- Iryna Zaporozhets
- Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
- Cecilia Clementi
- Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
14
Sahrmann P, Loose TD, Durumeric AEP, Voth GA. Utilizing Machine Learning to Greatly Expand the Range and Accuracy of Bottom-Up Coarse-Grained Models through Virtual Particles. J Chem Theory Comput 2023; 19:4402-4413. [PMID: 36802592 PMCID: PMC10373655 DOI: 10.1021/acs.jctc.2c01183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Indexed: 02/22/2023]
Abstract
Coarse-grained (CG) models parametrized using atomistic reference data, i.e., "bottom up" CG models, have proven useful in the study of biomolecules and other soft matter. However, the construction of highly accurate, low resolution CG models of biomolecules remains challenging. We demonstrate in this work how virtual particles, CG sites with no atomistic correspondence, can be incorporated into CG models within the context of relative entropy minimization (REM) as latent variables. The methodology presented, variational derivative relative entropy minimization (VD-REM), enables optimization of virtual particle interactions through a gradient descent algorithm aided by machine learning. We apply this methodology to the challenging case of a solvent-free CG model of a 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) lipid bilayer and demonstrate that introduction of virtual particles captures solvent-mediated behavior and higher-order correlations which REM alone cannot capture in a more standard CG model based only on the mapping of collections of atoms to the CG sites.
Affiliation(s)
- Patrick G. Sahrmann
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Aleksander E. P. Durumeric
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
15
Izvekov S, Rice BM. Hierarchical Machine Learning of Low-Resolution Coarse-Grained Free Energy Potentials. J Chem Theory Comput 2023. [PMID: 37256918 DOI: 10.1021/acs.jctc.3c00128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
A force-matching-based method for supervised machine learning (ML) of coarse-grained (CG) free energy (FE) potentials, known as multiscale coarse-graining via force-matching (MSCG/FM), is an efficient method to develop microscopically informed CG models that are thermodynamically and statistically equivalent to the reference microscopic models. For low-resolution models, when the coarse-graining is at supramolecular scales, objective-oriented clustering of nonbonded particles is required and the reduced description becomes a function of the clustering algorithm. In the present work, we explore the dependence of the ML of the CG Helmholtz FE potential on the clustering algorithm. We consider coarse-graining based on partitional (k-means, leading to Voronoi diagram) and hierarchical agglomerative (bottom-up) clustering algorithms common in unsupervised ML and develop theory connecting the MSCG/FM learned CG Helmholtz potential and the clustering statistics. By combining the agglomerative clustering and the MSCG/FM learning in a recursive manner, we propose an efficient ML methodology to develop the fine-to-low resolution hierarchies of the CG models. The methodology does not suffer from degrading accuracy or increased computational cost to construct larger hierarchies and as such does not impose an upper size limitation of the CG particles resulting from the extended hierarchies. The utility of the methodology is demonstrated by obtaining the bottom-up agglomerative hierarchy for liquid nitromethane from all-atom molecular dynamics (MD) simulations. For agglomerative hierarchies, we prove the existence of renormalization group transformations that indicate self-similarity and allow for learning the low-resolution MSCG/FM potentials at low computational cost by rescaling and renormalizing certain finer-resolution members of the hierarchy. The hierarchies of the CG models can be used to carry out simulations under constant-pressure conditions.
Affiliation(s)
- Sergei Izvekov
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Betsy M Rice
- U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
16
Krämer A, Durumeric AEP, Charron NE, Chen Y, Clementi C, Noé F. Statistically Optimal Force Aggregation for Coarse-Graining Molecular Dynamics. J Phys Chem Lett 2023; 14:3970-3979. [PMID: 37079800 DOI: 10.1021/acs.jpclett.3c00444] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 05/03/2023]
Abstract
Machine-learned coarse-grained (CG) models have the potential for simulating large molecular complexes beyond what is possible with atomistic molecular dynamics. However, training accurate CG models remains a challenge. A widely used methodology for learning bottom-up CG force fields maps forces from all-atom molecular dynamics to the CG representation and matches them with a CG force field on average. We show that there is flexibility in how to map all-atom forces to the CG representation and that the most commonly used mapping methods are statistically inefficient and potentially even incorrect in the presence of constraints in the all-atom simulation. We define an optimization statement for force mappings and demonstrate that substantially improved CG force fields can be learned from the same simulation data when using optimized force maps. The method is demonstrated on the miniproteins chignolin and tryptophan cage and published as open-source code.
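As an illustration of the force-matching idea summarized in this abstract, the minimal sketch below fits a CG force field to mapped reference forces by least squares. The harmonic form F(r) = -k (r - r0), the function name `fit_harmonic_force`, and the synthetic data are assumptions made for this example only; they are not taken from the paper or its open-source code, and the optimized force maps the authors propose are not implemented here.

```python
def fit_harmonic_force(distances, mapped_forces):
    """Least-squares fit of F(r) = -k * (r - r0) to mapped reference forces.

    Hypothetical helper for illustration: the model is linear in r,
    F(r) = slope * r + intercept with slope = -k and intercept = k * r0,
    so ordinary linear regression minimizes the mean squared force error.
    """
    n = len(distances)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_r = sum(distances) / n
    mean_f = sum(mapped_forces) / n
    cov_rf = sum((r - mean_r) * (f - mean_f)
                 for r, f in zip(distances, mapped_forces))
    var_r = sum((r - mean_r) ** 2 for r in distances)
    if var_r == 0.0:
        raise ValueError("distances must not all be equal")
    slope = cov_rf / var_r
    intercept = mean_f - slope * mean_r
    k = -slope
    if k == 0.0:
        raise ValueError("fitted force constant is zero; r0 is undefined")
    r0 = intercept / k
    return k, r0


# Synthetic, noise-free "mapped" forces generated from k = 50, r0 = 1.05
# (arbitrary units, chosen only for this example).
distances = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
mapped_forces = [-50.0 * (r - 1.05) for r in distances]

k, r0 = fit_harmonic_force(distances, mapped_forces)
print(f"fitted k = {k:.3f}, r0 = {r0:.3f}")
```

In this picture, a different choice of force mapping changes the `mapped_forces` input (and its statistical noise) while the fitting step stays the same, which is where the abstract locates the potential gain.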
Affiliation(s)
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Aleksander E P Durumeric
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Nicholas E Charron
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77251, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- International Max Planck Research School for Biology and Computation (IMPRS-BAC), Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77251, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Microsoft Research AI4Science, Karl-Liebknecht Straße 32, 10178 Berlin, Germany
17
Durumeric AEP, Charron NE, Templeton C, Musil F, Bonneau K, Pasos-Trejo AS, Chen Y, Kelkar A, Noé F, Clementi C. Machine learned coarse-grained protein force-fields: Are we there yet? Curr Opin Struct Biol 2023; 79:102533. [PMID: 36731338 PMCID: PMC10023382 DOI: 10.1016/j.sbi.2023.102533] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 11/01/2022] [Revised: 12/14/2022] [Accepted: 12/18/2022] [Indexed: 02/04/2023]
Abstract
The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.
Affiliation(s)
- Aleksander E P Durumeric
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Nicholas E Charron
- Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA
- Clark Templeton
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/pbrun03
- Félix Musil
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/FelixMusil
- Klara Bonneau
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Aldo S Pasos-Trejo
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/sayeg84
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/hello_yaoyi
- Atharva Kelkar
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Frank Noé
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, Berlin, 10178, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA. https://twitter.com/FrankNoeBerlin
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA.
18
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology, and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results; however, great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of Coarse Grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Niki Vergadou
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
19
Köhler J, Chen Y, Krämer A, Clementi C, Noé F. Flow-Matching: Efficient Coarse-Graining of Molecular Dynamics without Forces. J Chem Theory Comput 2023; 19:942-952. [PMID: 36668906 DOI: 10.1021/acs.jctc.3c00016] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/21/2023]
Abstract
Coarse-grained (CG) molecular simulations have become a standard tool to study molecular processes on time and length scales inaccessible to all-atom simulations. Parametrizing CG force fields to match all-atom simulations has mainly relied on force-matching or relative entropy minimization, which require many samples from costly simulations with all-atom or CG resolutions, respectively. Here we present flow-matching, a new training method for CG force fields that combines the advantages of both methods by leveraging normalizing flows, a generative deep learning method. Flow-matching first trains a normalizing flow to represent the CG probability density, which is equivalent to minimizing the relative entropy without requiring iterative CG simulations. Subsequently, the flow generates samples and forces according to the learned distribution in order to train the desired CG free energy model via force-matching. Even without requiring forces from the all-atom simulations, flow-matching outperforms classical force-matching by an order of magnitude in terms of data efficiency and produces CG models that can capture the folding and unfolding transitions of small proteins.
Affiliation(s)
- Jonas Köhler
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany; Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States; Department of Physics, Rice University, Houston, Texas 77005, United States; Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany; Department of Chemistry, Rice University, Houston, Texas 77005, United States; Microsoft Research AI4Science, Karl-Liebknecht Strasse 32, 10178 Berlin, Germany
20
Ge P, Zhang L, Lei H. Machine learning assisted coarse-grained molecular dynamics modeling of meso-scale interfacial fluids. J Chem Phys 2023; 158:064104. [PMID: 36792498 DOI: 10.1063/5.0131567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023] Open
Abstract
A hallmark of meso-scale interfacial fluids is the multi-faceted, scale-dependent interfacial energy, which often manifests different characteristics across the molecular and continuum scale. The multi-scale nature imposes a challenge to construct reliable coarse-grained (CG) models, where the CG potential function needs to faithfully encode the many-body interactions arising from the unresolved atomistic interactions and account for the heterogeneous density distributions across the interface. We construct the CG models of both single- and two-component polymeric fluid systems based on the recently developed deep coarse-grained potential [Zhang et al., J. Chem. Phys. 149, 034101 (2018)] scheme, where each polymer molecule is modeled as a CG particle. By only using the training samples of the instantaneous force under the thermal equilibrium state, the constructed CG models can accurately reproduce both the probability density function of the void formation in bulk and the spectrum of the capillary wave across the fluid interface. More importantly, the CG models accurately predict the volume-to-area scaling transition for the apolar solvation energy, illustrating the effectiveness to probe the meso-scale collective behaviors encoded with molecular-level fidelity.
Affiliation(s)
- Pei Ge
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA
- Huan Lei
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA
21
Bag S, Meinel MK, Müller-Plathe F. Toward a Mobility-Preserving Coarse-Grained Model: A Data-Driven Approach. J Chem Theory Comput 2022; 18:7108-7120. [PMID: 36449362 DOI: 10.1021/acs.jctc.2c00898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
Coarse-grained molecular dynamics (MD) simulation is a promising alternative to all-atom MD simulation for the fast calculation of system properties, which is imperative in designing materials with a specific target property. There have been several coarse-graining strategies developed over the past few years that provide accurate structural properties of the system. However, these coarse-grained models share a major drawback in that they introduce an artificial acceleration in molecular mobility. In this paper, we report a data-driven approach to generate coarse-grained models that preserve the all-atom molecular mobility. We designed a machine learning model in the form of an artificial neural network, which directly predicts the simulation-ready mobility-preserving coarse-grained potential as an output given the all-atom force field (FF) parameters as inputs. As a proof of principle, we took 2,3,4-trimethylpentane as a model system and described the development of machine learning models in detail. We quantify the artificial acceleration in molecular mobility by defining the acceleration factor as the ratio of the coarse-grained and the all-atom diffusion coefficient. The predicted coarse-grained potential generated by the best machine learning model can bring down the acceleration factor to a value of ∼2, which could be otherwise as large as 7 for a typical value of 3 × 10⁻⁹ m² s⁻¹ for the all-atom diffusion coefficient. We believe our method will be of interest in the community as a route to generating coarse-grained potentials with accurate dynamics.
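The acceleration factor defined in this abstract is a simple ratio, and the short sketch below works through the numbers quoted there. The function names are hypothetical and introduced only for this illustration; the inputs are the values stated in the abstract (acceleration factors of about 7 and about 2, and an all-atom diffusion coefficient of 3 × 10⁻⁹ m² s⁻¹), and the printed CG diffusion coefficients are what those numbers imply rather than results reported by the authors.

```python
def acceleration_factor(d_cg: float, d_aa: float) -> float:
    """Ratio of the coarse-grained to the all-atom diffusion coefficient."""
    if d_aa <= 0.0:
        raise ValueError("all-atom diffusion coefficient must be positive")
    return d_cg / d_aa


def implied_cg_diffusion(factor: float, d_aa: float) -> float:
    """CG diffusion coefficient implied by a given acceleration factor."""
    return factor * d_aa


D_AA = 3e-9  # all-atom diffusion coefficient in m^2/s, as quoted in the abstract

# Acceleration factors quoted in the abstract: up to 7 without the
# mobility-preserving potential, about 2 with the best ML model.
d_cg_uncorrected = implied_cg_diffusion(7.0, D_AA)
d_cg_ml = implied_cg_diffusion(2.0, D_AA)

print(f"implied D_CG at factor 7: {d_cg_uncorrected:.2e} m^2/s")
print(f"implied D_CG at factor 2: {d_cg_ml:.2e} m^2/s")
print(f"round trip: {acceleration_factor(d_cg_uncorrected, D_AA):.2f}")
```

A factor of 1 would mean the CG model reproduces the all-atom mobility exactly, so a value near 2 still corresponds to dynamics that are roughly twice as fast as the reference.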
Affiliation(s)
- Saientan Bag
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Melissa K Meinel
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Florian Müller-Plathe
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
22
Depta PN, Dosta M, Wenzel W, Kozlowska M, Heinrich S. Hierarchical Coarse-Grained Strategy for Macromolecular Self-Assembly: Application to Hepatitis B Virus-Like Particles. Int J Mol Sci 2022; 23:ijms232314699. [PMID: 36499027 PMCID: PMC9740473 DOI: 10.3390/ijms232314699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 11/01/2022] [Accepted: 11/14/2022] [Indexed: 11/27/2022] Open
Abstract
Macromolecular self-assembly is at the basis of many phenomena in material and life sciences that find diverse applications in technology. One example is the formation of virus-like particles (VLPs) that act as stable empty capsids used for drug delivery or vaccine fabrication. Similarly to the capsid of a virus, VLPs are protein assemblies, but their structural formation, stability, and properties are not fully understood, especially as a function of the protein modifications. In this work, we present a data-driven modeling approach for capturing macromolecular self-assembly on scales beyond traditional molecular dynamics (MD), while preserving the chemical specificity. Each macromolecule is abstracted as an anisotropic object and high-dimensional models are formulated to describe interactions between molecules and with the solvent. For this, data-driven protein-protein interaction potentials are derived using a Kriging-based strategy, built on high-throughput MD simulations. Semi-automatic supervised learning is employed in a high performance computing environment and the resulting specialized force-fields enable a significant speed-up to the micrometer and millisecond scale, while maintaining high intermolecular detail. The reported generic framework is applied for the first time to capture the formation of hepatitis B VLPs from the smallest building unit, i.e., the dimer of the core protein HBcAg. Assembly pathways and kinetics are analyzed and compared to the available experimental observations. We demonstrate that VLP self-assembly phenomena and dependencies can now be simulated. The method developed can be used for the parameterization of other macromolecules, enabling a molecular understanding of processes that cannot be attained with other theoretical models.
Affiliation(s)
- Philipp Nicolas Depta
- Institute of Solids Process Engineering and Particle Technology (SPE), Hamburg University of Technology, 21073 Hamburg, Germany
- Maksym Dosta
- Institute of Solids Process Engineering and Particle Technology (SPE), Hamburg University of Technology, 21073 Hamburg, Germany
- Boehringer Ingelheim Pharma GmbH & Co Kg., 88400 Biberach an der Riss, Germany
- Wolfgang Wenzel
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology, 76344 Eggenstein-Leopoldshafen, Germany
- Mariana Kozlowska
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology, 76344 Eggenstein-Leopoldshafen, Germany
- Stefan Heinrich
- Institute of Solids Process Engineering and Particle Technology (SPE), Hamburg University of Technology, 21073 Hamburg, Germany
23
Schmid F. Understanding and Modeling Polymers: The Challenge of Multiple Scales. ACS POLYMERS AU 2022. [DOI: 10.1021/acspolymersau.2c00049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Affiliation(s)
- Friederike Schmid
- Institut für Physik, Johannes Gutenberg-Universität Mainz, Staudingerweg 9, 55128 Mainz, Germany
24
Jin J, Pak AJ, Durumeric AEP, Loose TD, Voth GA. Bottom-up Coarse-Graining: Principles and Perspectives. J Chem Theory Comput 2022; 18:5759-5791. [PMID: 36070494 PMCID: PMC9558379 DOI: 10.1021/acs.jctc.2c00643] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 06/20/2022] [Indexed: 01/14/2023]
Abstract
Large-scale computational molecular models provide scientists a means to investigate the effect of microscopic details on emergent mesoscopic behavior. Elucidating the relationship between variations on the molecular scale and macroscopic observable properties facilitates an understanding of the molecular interactions driving the properties of real world materials and complex systems (e.g., those found in biology, chemistry, and materials science). As a result, discovering an explicit, systematic connection between microscopic nature and emergent mesoscopic behavior is a fundamental goal for this type of investigation. The molecular forces critical to driving the behavior of complex heterogeneous systems are often unclear. More problematically, simulations of representative model systems are often prohibitively expensive from both spatial and temporal perspectives, impeding straightforward investigations over possible hypotheses characterizing molecular behavior. While the reduction in resolution of a study, such as moving from an atomistic simulation to that of the resolution of large coarse-grained (CG) groups of atoms, can partially ameliorate the cost of individual simulations, the relationship between the proposed microscopic details and this intermediate resolution is nontrivial and presents new obstacles to study. Small portions of these complex systems can be realistically simulated. Alone, these smaller simulations likely do not provide insight into collectively emergent behavior. However, by proposing that the driving forces in both smaller and larger systems (containing many related copies of the smaller system) have an explicit connection, systematic bottom-up CG techniques can be used to transfer CG hypotheses discovered using a smaller scale system to a larger system of primary interest. 
The proposed connection between different CG systems is prescribed by (i) the CG representation (mapping) and (ii) the functional form and parameters used to represent the CG energetics, which approximate potentials of mean force (PMFs). As a result, the design of CG methods that facilitate a variety of physically relevant representations, approximations, and force fields is critical to moving the frontier of systematic CG forward. Crucially, the proposed connection between the system used for parametrization and the system of interest is orthogonal to the optimization used to approximate the potential of mean force present in all systematic CG methods. The empirical efficacy of machine learning techniques on a variety of tasks provides strong motivation to consider these approaches for approximating the PMF and analyzing these approximations.
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Alexander J. Pak
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Aleksander E. P. Durumeric
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
25
Bull-Vulpe EF, Riera M, Bore SL, Paesani F. Data-Driven Many-Body Potential Energy Functions for Generic Molecules: Linear Alkanes as a Proof-of-Concept Application. J Chem Theory Comput 2022. [PMID: 36113028 DOI: 10.1021/acs.jctc.2c00645] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/28/2022]
Abstract
We present a generalization of the many-body energy (MB-nrg) theoretical/computational framework that enables the development of data-driven potential energy functions (PEFs) for generic covalently bonded molecules, with arbitrary quantum mechanical accuracy. The "nearsightedness of electronic matter" is exploited to define monomers as "natural building blocks" on the basis of their distinct chemical identity. The energy of generic molecules is then expressed as a sum of individual many-body energies of incrementally larger subsystems. The MB-nrg PEFs represent the low-order n-body energies, with n = 1-4, using permutationally invariant polynomials derived from electronic structure data carried out at an arbitrary quantum mechanical level of theory, while all higher-order n-body terms (n > 4) are represented by a classical many-body polarization term. As a proof-of-concept application of the general MB-nrg framework, we present MB-nrg PEFs for linear alkanes. The MB-nrg PEFs are shown to accurately reproduce reference energies, harmonic frequencies, and potential energy scans of alkanes, independently of their length. Since, by construction, the MB-nrg framework introduced here can be applied to generic covalently bonded molecules, we envision future computer simulations of complex molecular systems using data-driven MB-nrg PEFs, with arbitrary quantum mechanical accuracy.
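For readers unfamiliar with the many-body expansion that this abstract builds on, the decomposition of the energy of a system of N monomers can be written as below. The notation (the symbols E_N and ε^{nB}) is chosen here for illustration and is not taken from the paper.

```latex
E_N(1,\dots,N) = \sum_{i=1}^{N} \varepsilon^{1\mathrm{B}}(i)
  + \sum_{i<j}^{N} \varepsilon^{2\mathrm{B}}(i,j)
  + \sum_{i<j<k}^{N} \varepsilon^{3\mathrm{B}}(i,j,k)
  + \dots
  + \varepsilon^{N\mathrm{B}}(1,\dots,N)
```

In the scheme the abstract describes, the terms with n = 1-4 are represented by permutationally invariant polynomials fitted to electronic structure data, while all terms with n > 4 are folded into a classical many-body polarization term.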
Affiliation(s)
- Ethan F. Bull-Vulpe
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Marc Riera
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Sigbjørn L. Bore
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Materials Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, United States
26
Kanekal KH, Rudzinski JF, Bereau T. Broad chemical transferability in structure-based coarse-graining. J Chem Phys 2022; 157:104102. [DOI: 10.1063/5.0104914] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
Compared to top-down coarse-grained (CG) models, bottom-up approaches are capable of offering higher structural fidelity. This fidelity results from the tight link to a higher-resolution reference, making the CG model chemically specific. Unfortunately, chemical specificity can be at odds with compound-screening strategies, which call for transferable parametrizations. Here we present an approach to reconcile bottom-up, structure-preserving CG models with chemical transferability. We consider the bottom-up CG parametrization of 3,441 C₇O₂ small-molecule isomers. Our approach combines atomic representations, unsupervised learning, and a large-scale extended-ensemble force-matching parametrization. We first identify a subset of 19 representative molecules, which maximally encode the local environment of all gas-phase conformers. Reference interactions between the 19 representative molecules were obtained from both homogeneous bulk liquids and various binary mixtures. An extended-ensemble parametrization over all 703 state points leads to a CG model that is both structure-based and chemically transferable. Remarkably, the resulting force field is on average more structurally accurate than single-state-point equivalents. Averaging over the extended ensemble acts as a mean-force regularizer, smoothing out both force and structural correlations that are overly specific to a single state point. Our approach aims at transferability through a set of CG bead types that can be used to easily construct new molecules, while retaining the benefits of a structure-based parametrization.
Affiliation(s)
- Kiran H. Kanekal
- AK Kremer - Theory Group, Max Planck Institute for Polymer Research, Germany
- Tristan Bereau
- Van 't Hoff Institute for Molecular Sciences and Informatics Institute, University of Amsterdam, Netherlands
27
Nguyen HTL, Huang DM. Systematic bottom-up molecular coarse-graining via force and torque matching using anisotropic particles. J Chem Phys 2022; 156:184118. [DOI: 10.1063/5.0085006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
We derive a systematic and general method for parametrizing coarse-grained molecular models consisting of anisotropic particles from fine-grained (e.g. all-atom) models for condensed-phase molecular dynamics simulations. The method, which we call anisotropic force-matching coarse-graining (AFM-CG), is based on rigorous statistical mechanical principles, enforcing consistency between the coarse-grained and fine-grained phase-space distributions to derive equations for the coarse-grained forces, torques, masses, and moments of inertia in terms of properties of a condensed-phase fine-grained system. We verify the accuracy and efficiency of the method by coarse-graining liquid-state systems of two different anisotropic organic molecules, benzene and perylene, and show that the parametrized coarse-grained models more accurately describe properties of these systems than previous anisotropic coarse-grained models parametrized using other methods that do not account for finite-temperature and many-body effects on the condensed-phase coarse-grained interactions. The AFM-CG method will be useful for developing accurate and efficient dynamical simulation models of condensed-phase systems of molecules consisting of large, rigid, anisotropic fragments, such as liquid crystals, organic semiconductors, and nucleic acids.

28
DeLyser MR, Noid WG. Coarse-grained models for local density gradients. J Chem Phys 2022; 156:034106. [DOI: 10.1063/5.0075291] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023]
Affiliation(s)
- Michael R. DeLyser
  - Department of Chemistry, Penn State University, University Park, Pennsylvania 16802, USA
- W. G. Noid
  - Department of Chemistry, Penn State University, University Park, Pennsylvania 16802, USA

29
Sivaraman G, Jackson NE. Coarse-Grained Density Functional Theory Predictions via Deep Kernel Learning. J Chem Theory Comput 2022; 18:1129-1141. [PMID: 35020388 DOI: 10.1021/acs.jctc.1c01001] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/19/2022]
Abstract
Scalable electronic predictions are critical for soft materials design. Recently, the Electronic Coarse-Graining (ECG) method was introduced to renormalize all-atom quantum chemical (QC) predictions to coarse-grained (CG) resolutions using deep neural networks (DNNs). While DNNs can learn complex representations that prove challenging for kernel-based methods, they are susceptible to overfitting and the overconfidence of uncertainty estimations. Here, we develop ECG within a GPU-accelerated Deep Kernel Learning (DKL) framework to enable CG QC predictions using range-separated hybrid density functional theory (DFT), obtaining a 10^7 speedup relative to naive all-atom QC. By treating the predicted electronic properties as random Gaussian Processes, DKL incorporates CG mapping degeneracy by learning the distribution of electronic energies as a function of CG configuration. DKL-ECG accurately reproduces molecular orbital energies from range-separated DFT while facilitating efficient training via active learning using the uncertainties provided by DKL. We show that while active learning algorithms enable efficient sampling of a more diverse configurational space relative to random sampling, all explored query methods exhibit comparable performance for the examined system. We attribute this result to the significant overlap of the feature space and output property distributions across multiple temperatures.
Affiliation(s)
- Ganesh Sivaraman
  - Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 505 South Mathews Avenue, Urbana, Illinois 61801, United States

30
Xu P, Mou X, Guo Q, Fu T, Ren H, Wang G, Li Y, Li G. Coarse-grained molecular dynamics study based on TorchMD. Chin J Chem Phys 2021. [DOI: 10.1063/1674-0068/cjcp2110218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Peijun Xu
  - Liaoning Normal University, Dalian 116029, China
- Xiaohong Mou
  - Liaoning Normal University, Dalian 116029, China
- Qiuhan Guo
  - Liaoning Normal University, Dalian 116029, China
- Ting Fu
  - Pharmacy Department of Affiliated Zhongshan Hospital of Dalian University, Dalian 116001, China
- Hong Ren
  - Department of Ophthalmology Aerospace Center Hospital, Beijing 100049, China
- Guiyan Wang
  - Dalian Ocean University, Dalian 116029, China
- Yan Li
  - Dalian Institute of Chemical Physics, State Key Laboratory of Molecular Reaction Dynamics, Dalian 116023, China
- Guohui Li
  - Dalian Institute of Chemical Physics, State Key Laboratory of Molecular Reaction Dynamics, Dalian 116023, China

31
Badin M, Martoňák R. Nucleating a Different Coordination in a Crystal under Pressure: A Study of the B1-B2 Transition in NaCl by Metadynamics. Phys Rev Lett 2021; 127:105701. [PMID: 34533357 DOI: 10.1103/physrevlett.127.105701] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/05/2021] [Accepted: 07/21/2021] [Indexed: 06/13/2023]
Abstract
Here we propose an NPT metadynamics simulation scheme for pressure-induced structural phase transitions, using coordination number and volume as collective variables, and apply it to the reconstructive structural transformation B1-B2 in NaCl. By studying systems with size up to 64 000 atoms we reach a regime beyond collective mechanism and observe transformations proceeding via nucleation and growth. We also reveal the crossover of the transition mechanism from Buerger-like for smaller systems to Watanabe-Tolédano for larger ones. The scheme is likely to be applicable to a broader class of pressure-induced structural transitions, allowing study of complex nucleation effects and bringing simulations closer to realistic conditions.
Affiliation(s)
- Matej Badin
  - SISSA - Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, 34136 Trieste, Italy
  - Department of Experimental Physics, Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Mlynská Dolina F2, 842 48 Bratislava, Slovakia
- Roman Martoňák
  - Department of Experimental Physics, Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Mlynská Dolina F2, 842 48 Bratislava, Slovakia

32
Deringer VL, Bartók AP, Bernstein N, Wilkins DM, Ceriotti M, Csányi G. Gaussian Process Regression for Materials and Molecules. Chem Rev 2021; 121:10073-10141. [PMID: 34398616 PMCID: PMC8391963 DOI: 10.1021/acs.chemrev.1c00022] [Citation(s) in RCA: 197] [Impact Index Per Article: 65.7] [Received: 01/08/2021] [Indexed: 12/18/2022]
Abstract
We provide an introduction to Gaussian process regression (GPR) machine-learning methods in computational materials science and chemistry. The focus of the present review is on the regression of atomistic properties: in particular, on the construction of interatomic potentials, or force fields, in the Gaussian Approximation Potential (GAP) framework; beyond this, we also discuss the fitting of arbitrary scalar, vectorial, and tensorial quantities. Methodological aspects of reference data generation, representation, and regression, as well as the question of how a data-driven model may be validated, are reviewed and critically discussed. A survey of applications to a variety of research questions in chemistry and materials science illustrates the rapid growth in the field. A vision is outlined for the development of the methodology in the years to come.
Affiliation(s)
- Volker L. Deringer
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
- Albert P. Bartók
  - Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Noam Bernstein
  - Center for Computational Materials Science, U.S. Naval Research Laboratory, Washington D.C. 20375, United States
- David M. Wilkins
  - Atomistic Simulation Centre, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom

33
Keith JA, Vassilev-Galindo V, Cheng B, Chmiela S, Gastegger M, Müller KR, Tkatchenko A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem Rev 2021; 121:9816-9872. [PMID: 34232033 PMCID: PMC8391798 DOI: 10.1021/acs.chemrev.1c00107] [Citation(s) in RCA: 186] [Impact Index Per Article: 62.0] [Received: 02/04/2021] [Indexed: 12/23/2022]
Abstract
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Affiliation(s)
- John A. Keith
  - Department of Chemical and Petroleum Engineering Swanson School of Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Valentin Vassilev-Galindo
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Bingqing Cheng
  - Accelerate Programme for Scientific Discovery, Department of Computer Science and Technology, 15 J. J. Thomson Avenue, Cambridge CB3 0FD, United Kingdom
- Stefan Chmiela
  - Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
- Michael Gastegger
  - Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
  - Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
  - Google Research, Brain Team, 10117 Berlin, Germany
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg

34
Abstract
Chemical compound space (CCS), the set of all theoretically conceivable combinations of chemical elements and (meta-)stable geometries that make up matter, is colossal. The first-principles based virtual sampling of this space, for example, in search of novel molecules or materials which exhibit desirable properties, is therefore prohibitive for all but the smallest subsets and simplest properties. We review studies aimed at tackling this challenge using modern machine learning techniques based on (i) synthetic data, typically generated using quantum mechanics based methods, and (ii) model architectures inspired by quantum mechanics. Such Quantum mechanics based Machine Learning (QML) approaches combine the numerical efficiency of statistical surrogate models with an ab initio view on matter. They rigorously reflect the underlying physics in order to reach universality and transferability across CCS. While state-of-the-art approximations to quantum problems impose severe computational bottlenecks, recent QML based developments indicate the possibility of substantial acceleration without sacrificing the predictive power of quantum mechanics.
Affiliation(s)
- Bing Huang
  - Faculty of Physics, University of Vienna, 1090 Vienna, Austria
- O. Anatole von Lilienfeld
  - Faculty of Physics, University of Vienna, 1090 Vienna, Austria
  - Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, 4056 Basel, Switzerland

35
Wang J, Charron N, Husic B, Olsson S, Noé F, Clementi C. Multi-body effects in a coarse-grained protein force field. J Chem Phys 2021; 154:164113. [PMID: 33940848 DOI: 10.1063/5.0041022] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/23/2023]
Abstract
The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Nicholas Charron
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Brooke Husic
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Simon Olsson
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA

36
Miksch AM, Morawietz T, Kästner J, Urban A, Artrith N. Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abfd96] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning are also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.

37
Ma Z, Wang S, Kim M, Liu K, Chen CL, Pan W. Transfer learning of memory kernels for transferable coarse-graining of polymer dynamics. Soft Matter 2021; 17:5864-5877. [PMID: 34096961 DOI: 10.1039/d1sm00364j] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/12/2023]
Abstract
The present work concerns the transferability of coarse-grained (CG) modeling in reproducing the dynamic properties of the reference atomistic systems across a range of parameters. In particular, we focus on implicit-solvent CG modeling of polymer solutions. The CG model is based on the generalized Langevin equation, where the memory kernel plays the critical role in determining the dynamics in all time scales. Thus, we propose methods for transfer learning of memory kernels. The key ingredient of our methods is Gaussian process regression. By integration with the model order reduction via proper orthogonal decomposition and the active learning technique, the transfer learning can be practically efficient and requires minimum training data. Through two example polymer solution systems, we demonstrate the accuracy and efficiency of the proposed transfer learning methods in the construction of transferable memory kernels. The transferability allows for out-of-sample predictions, even in the extrapolated domain of parameters. Built on the transferable memory kernels, the CG models can reproduce the dynamic properties of polymers in all time scales at different thermodynamic conditions (such as temperature and solvent viscosity) and for different systems with varying concentrations and lengths of polymers.
Affiliation(s)
- Zhan Ma
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
- Shu Wang
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
- Minhee Kim
  - Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
- Kaibo Liu
  - Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
- Chun-Long Chen
  - Physical Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Wenxiao Pan
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA

38
Terao T. Semi-supervised learning for the study of structural formation in colloidal systems via image recognition. J Phys Condens Matter 2021; 33:325901. [PMID: 33962403 DOI: 10.1088/1361-648x/abfee4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2021] [Accepted: 05/07/2021] [Indexed: 06/12/2023]
Abstract
The analysis of the structural formation of colloidal systems using machine learning techniques has recently attracted much attention. In many of these studies, local bond-order parameters (LBOPs) were employed as descriptors, where such LBOPs are suitable mainly for the detection of crystal structures. On the other hand, image-based convolutional neural networks (CNNs) are quite effective in detecting not only crystals but also random structures, and the author demonstrated their efficiency in a previous paper. However, in supervised learning, it is difficult to obtain a correct result when there is an unexpected new phase that was unknown when training the CNN. In this paper, we propose a hybrid scheme that consists of supervised and unsupervised learning techniques, employing two different approaches: image-based CNN and generalized LBOP. The proposed method was applied to two-dimensional colloidal systems, and its efficiency was demonstrated.
Affiliation(s)
- Takamichi Terao
  - Department of Electrical, Electronic and Computer Engineering, Gifu University, Gifu, Japan

39
Westermayr J, Gastegger M, Schütt KT, Maurer RJ. Perspective on integrating machine learning into computational chemistry and materials science. J Chem Phys 2021; 154:230903. [PMID: 34241249 DOI: 10.1063/5.0047760] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Indexed: 01/12/2023]
Abstract
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties, be they observable, such as molecular polarizabilities, or not, such as atomic charges. As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.
Affiliation(s)
- Julia Westermayr
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Kristof T Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Reinhard J Maurer
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom

40
Lach D, Zhdan U, Smolinski A, Polanski J. Functional and Material Properties in Nanocatalyst Design: A Data Handling and Sharing Problem. Int J Mol Sci 2021; 22:5176. [PMID: 34068386 PMCID: PMC8153597 DOI: 10.3390/ijms22105176] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/03/2021] [Revised: 05/06/2021] [Accepted: 05/11/2021] [Indexed: 11/16/2022]
Abstract
(1) Background: Properties and descriptors are two forms of molecular in silico representations. Properties can be further divided into functional, e.g., catalyst or drug activity, and material, e.g., X-ray crystal data. Millions of real measured functional property records are available for drugs or drug candidates in online databases. In contrast, there is not a single database that registers a real conversion, TON or TOF data for catalysts. All of the data are molecular descriptors or material properties, which are mainly of a calculation origin. (2) Results: Here, we explain the reason for this. We reviewed the data handling and sharing problems in the design and discovery of catalyst candidates particularly, material informatics and catalyst design, structural coding, data collection and validation, infrastructure for catalyst design and the online databases for catalyst design. (3) Conclusions: Material design requires a property prediction step. This can only be achieved based on the registered real property measurement. In reality, in catalyst design and discovery, we can observe either a severe functional property deficit or even property famine.
Affiliation(s)
- Daniel Lach
  - Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland
- Uladzislau Zhdan
  - Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland
- Adam Smolinski
  - Central Mining Institute, Plac Gwarkow 1, 40-166 Katowice, Poland
- Jaroslaw Polanski
  - Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland
  - Correspondence: Tel.: +48-32-259-9978

41
Rudzinski JF, Kloth S, Wörner S, Pal T, Kremer K, Bereau T, Vogel M. Dynamical properties across different coarse-grained models for ionic liquids. J Phys Condens Matter 2021; 33:224001. [PMID: 33592598 DOI: 10.1088/1361-648x/abe6e1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/30/2020] [Accepted: 02/16/2021] [Indexed: 06/12/2023]
Abstract
Room-temperature ionic liquids (RTILs) stand out among molecular liquids for their rich physicochemical characteristics, including structural and dynamic heterogeneity. The significance of electrostatic interactions in RTILs results in long characteristic length- and timescales, and has motivated the development of a number of coarse-grained (CG) simulation models. In this study, we aim to better understand the connection between certain CG parameterization strategies and the dynamical properties and transferability of the resulting models. We systematically compare five CG models: a model largely parameterized from experimental thermodynamic observables; a refinement of this model to increase its structural accuracy; and three models that reproduce a given set of structural distribution functions by construction, with varying intramolecular parameterizations and reference temperatures. All five CG models display limited structural transferability over temperature, and also result in various effective dynamical speedup factors, relative to a reference atomistic model. On the other hand, the structure-based CG models tend to result in more consistent cation-anion relative diffusion than the thermodynamic-based models, for a single thermodynamic state point. By linking short- and long-timescale dynamical behaviors, we demonstrate that the varying dynamical properties of the different CG models can be largely collapsed onto a single curve, which provides evidence for a route to constructing dynamically-consistent CG models of RTILs.
Affiliation(s)
- Sebastian Kloth
  - Institute of Condensed Matter Physics, Technische Universität Darmstadt, Hochschulstr. 6, 64289 Darmstadt, Germany
- Svenja Wörner
  - Max Planck Institute for Polymer Research, 55128 Mainz, Germany
- Tamisra Pal
  - Institute of Condensed Matter Physics, Technische Universität Darmstadt, Hochschulstr. 6, 64289 Darmstadt, Germany
- Kurt Kremer
  - Max Planck Institute for Polymer Research, 55128 Mainz, Germany
- Tristan Bereau
  - Max Planck Institute for Polymer Research, 55128 Mainz, Germany
  - Van 't Hoff Institute for Molecular Sciences and Informatics Institute, University of Amsterdam, Amsterdam 1098 XH, The Netherlands
- Michael Vogel
  - Institute of Condensed Matter Physics, Technische Universität Darmstadt, Hochschulstr. 6, 64289 Darmstadt, Germany

42
Zhang Y, Shen C, Long T, Zhang H. Thermal conductivity of h-BN monolayers using machine learning interatomic potential. J Phys Condens Matter 2021; 33:105903. [PMID: 33260161 DOI: 10.1088/1361-648x/abcf61] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Thermal management materials are of critical importance for engineering miniaturized electronic devices, where theoretical design of such materials demands the evaluation of thermal conductivities which are numerically expensive. In this work, we applied the recently developed machine learning interatomic potential (MLIP) to evaluate the thermal conductivity of hexagonal boron nitride monolayers. The MLIP is obtained using the Gaussian approximation potential method, and the resulting lattice dynamical properties and thermal conductivity are compared with those obtained from explicit frozen phonon calculations. It is observed that accurate thermal conductivity can be obtained based on MLIP constructed with about 30% representative configurations, and the high-order force constants provide a more reliable benchmark on the quality of MLIP than the harmonic approximation.
Affiliation(s)
- Yixuan Zhang
  - Institute of Materials Science, Technical University of Darmstadt, Darmstadt 64287, Germany
- Chen Shen
  - Institute of Materials Science, Technical University of Darmstadt, Darmstadt 64287, Germany
- Teng Long
  - Institute of Materials Science, Technical University of Darmstadt, Darmstadt 64287, Germany
- Hongbin Zhang
  - Institute of Materials Science, Technical University of Darmstadt, Darmstadt 64287, Germany

43
Ye H, Xian W, Li Y. Machine Learning of Coarse-Grained Models for Organic Molecules and Polymers: Progress, Opportunities, and Challenges. ACS Omega 2021; 6:1758-1772. [PMID: 33521417 PMCID: PMC7841771 DOI: 10.1021/acsomega.0c05321] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/31/2020] [Accepted: 01/04/2021] [Indexed: 05/02/2023]
Abstract
Machine learning (ML) has emerged as one of the most powerful tools transforming all areas of science and engineering. The nature of molecular dynamics (MD) simulations, complex and time-consuming calculations, makes them particularly suitable for ML research. This review article focuses on recent advancements in developing efficient and accurate coarse-grained (CG) models using various ML methods, in terms of regulating the coarse-graining process, constructing adequate descriptors/features, generating representative training data sets, and optimization of the loss function. Two classes of the CG models are introduced: bottom-up and top-down CG methods. To illustrate these methods and demonstrate the open methodological questions, we survey several important principles in constructing CG models and how these are incorporated into ML methods and improved with specific learning techniques. Finally, we discuss some key aspects of developing machine-learned CG models with high accuracy and efficiency. Besides, we describe how these aspects are tackled in state-of-the-art methods and which remain to be addressed in the near future. We expect that these machine-learned CG models can address thermodynamic consistent, transferable, and representative issues in classical CG models.
Affiliation(s)
- Huilin Ye
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Weikang Xian
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, Connecticut 06269, United States
- Phone: +1 860 4867110. Fax: +1 860 4865088
44
Susanty M, Rajab TE, Hertadi R. A Review of Protein Structure Prediction using Deep Learning. BIO Web of Conferences 2021. [DOI: 10.1051/bioconf/20214104003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Proteins are macromolecules composed of 20 types of amino acids in a specific order. Understanding how proteins fold is vital because its 3-dimensional structure determines the function of a protein. Prediction of protein structure based on amino acid strands and evolutionary information becomes the basis for other studies such as predicting the function, property or behaviour of a protein and modifying or designing new proteins to perform certain desired functions. Machine learning advances, particularly deep learning, are igniting a paradigm shift in scientific study. In this review, we summarize recent work in applying deep learning techniques to tackle problems in protein structural prediction. We discuss various deep learning approaches used to predict protein structure and future achievements and challenges. This review is expected to help provide perspectives on problems in biochemistry that can take advantage of the deep learning approach. Some of the unanswered challenges with current computational approaches are predicting the location and precision orientation of protein side chains, predicting protein interactions with DNA, RNA and other small molecules and predicting the structure of protein complexes.
45
Rudzinski JF, Bereau T. Coarse-grained conformational surface hopping: Methodology and transferability. J Chem Phys 2020; 153:214110. [DOI: 10.1063/5.0031249] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Affiliation(s)
- Tristan Bereau
- Max Planck Institute for Polymer Research, 55128 Mainz, Germany
- Van ’t Hoff Institute for Molecular Sciences and Informatics Institute, University of Amsterdam, Amsterdam 1098 XH, The Netherlands
46
Husic BE, Charron NE, Lemm D, Wang J, Pérez A, Majewski M, Krämer A, Chen Y, Olsson S, de Fabritiis G, Noé F, Clementi C. Coarse graining molecular dynamics with graph neural networks. J Chem Phys 2020; 153:194101. [PMID: 33218238 PMCID: PMC7671749 DOI: 10.1063/5.0026133] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 08/21/2020] [Accepted: 10/27/2020] [Indexed: 11/14/2022]
Abstract
Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at an atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proved that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learn their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
Affiliation(s)
- Dominik Lemm
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
47
Li W, Ando Y, Watanabe S. Effects of density and composition on the properties of amorphous alumina: A high-dimensional neural network potential study. J Chem Phys 2020; 153:164119. [PMID: 33138388 DOI: 10.1063/5.0026289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Amorphous alumina (a-AlOₓ), which plays important roles in several technological fields, shows a wide variation of density and composition. However, their influences on the properties of a-AlOₓ have rarely been investigated from a theoretical perspective. In this study, high-dimensional neural network potentials were constructed to generate a series of atomic structures of a-AlOₓ with different densities (2.6 g/cm³-3.3 g/cm³) and O/Al ratios (1.0-1.75). The structural, vibrational, mechanical, and thermal properties of the a-AlOₓ models were investigated, as well as the Li and Cu diffusion behavior in the models. The results showed that density and composition had different degrees of effects on the different properties. The structural and vibrational properties were strongly affected by composition, whereas the mechanical properties were mainly determined by density. The thermal conductivity was affected by both the density and composition of a-AlOₓ. However, the effects on the Li and Cu diffusion behavior were relatively unclear.
Affiliation(s)
- Wenwen Li
- Research Center for Computational Design of Advanced Functional Materials, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8568, Japan
- Yasunobu Ando
- Research Center for Computational Design of Advanced Functional Materials, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8568, Japan
- Satoshi Watanabe
- Department of Materials Engineering, The University of Tokyo, Bunkyo, Tokyo 113-8656, Japan
48
Nicholas TC, Goodwin AL, Deringer VL. Understanding the geometric diversity of inorganic and hybrid frameworks through structural coarse-graining. Chem Sci 2020; 11:12580-12587. [PMID: 34123235 PMCID: PMC8162807 DOI: 10.1039/d0sc03287e] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/12/2020] [Accepted: 10/16/2020] [Indexed: 12/13/2022]
Abstract
Much of our understanding of complex structures is based on simplification: for example, metal-organic frameworks are often discussed in the context of "nodes" and "linkers", allowing for a qualitative comparison with simpler inorganic structures. Here we show how such an understanding can be obtained in a systematic and quantitative framework, combining atom-density based similarity (kernel) functions and unsupervised machine learning with the long-standing idea of "coarse-graining" atomic structure. We demonstrate how the latter enables a comparison of vastly different chemical systems, and we use it to create a unified, two-dimensional structure map of experimentally known tetrahedral AB₂ networks, including clathrate hydrates, zeolitic imidazolate frameworks (ZIFs), and diverse inorganic phases. The structural relationships that emerge can then be linked to microscopic properties of interest, which we exemplify for structural heterogeneity and tetrahedral density.
Affiliation(s)
- Thomas C Nicholas
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Andrew L Goodwin
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Volker L Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
49
Bogojeski M, Vogt-Maranto L, Tuckerman ME, Müller KR, Burke K. Quantum chemical accuracy from density functional approximations via machine learning. Nat Commun 2020; 11:5223. [PMID: 33067479 PMCID: PMC7567867 DOI: 10.1038/s41467-020-19093-1] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Received: 06/01/2020] [Accepted: 09/24/2020] [Indexed: 12/21/2022]
Abstract
Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal·mol⁻¹ with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal·mol⁻¹) on test data. Moreover, density-based Δ-learning (learning only the correction to a standard DFT calculation, termed Δ-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of Δ-DFT is highlighted by correcting "on the fly" DFT-based molecular dynamics (MD) simulations of resorcinol (C₆H₄(OH)₂) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that Δ-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.
Affiliation(s)
- Mihail Bogojeski
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany
- Mark E Tuckerman
- Department of Chemistry, New York University, New York, NY, 10003, USA
- Courant Institute of Mathematical Science, New York University, New York, NY, 10012, USA
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai, 200062, China
- Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
- Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany
- Kieron Burke
- Department of Physics and Astronomy, University of California, Irvine, CA, 92697, USA
- Department of Chemistry, University of California, Irvine, CA, 92697, USA
50
Joshi SY, Deshmukh SA. A review of advancements in coarse-grained molecular dynamics simulations. Molecular Simulation 2020. [DOI: 10.1080/08927022.2020.1828583] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/13/2022]
Affiliation(s)
- Soumil Y. Joshi
- Department of Chemical Engineering, Virginia Tech, Blacksburg, VA, USA