1
Yu Q, Ma R, Qu C, Conte R, Nandi A, Pandey P, Houston PL, Zhang DH, Bowman JM. Extending atomic decomposition and many-body representation with a chemistry-motivated approach to machine learning potentials. Nature Computational Science 2025:10.1038/s43588-025-00790-0. [PMID: 40229410 DOI: 10.1038/s43588-025-00790-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Accepted: 03/13/2025] [Indexed: 04/16/2025]
Abstract
Most widely used machine learning potentials for condensed-phase applications rely on many-body permutationally invariant polynomial or atom-centered neural networks. However, these approaches face challenges in achieving chemical interpretability in atomistic energy decomposition and fully matching the computational efficiency of traditional force fields. Here we present a method that combines aspects of both approaches and balances accuracy and force-field-level speed. This method utilizes a monomer-centered representation, where the potential energy is decomposed into the sum of chemically meaningful monomeric energies. The structural descriptors of monomers are described by one-body and two-body effective interactions, enforced by appropriate sets of permutationally invariant polynomials as inputs to the feed-forward neural networks. Systematic assessments of models for gas-phase water trimer, liquid water, methane-water cluster and liquid carbon dioxide are performed. The improved accuracy, efficiency and flexibility of this method have promise for constructing accurate machine learning potentials and enabling large-scale quantum and classical simulations for complex molecular systems.
Affiliation(s)
- Qi Yu
  - Department of Chemistry, Fudan University, Shanghai, China
  - Shanghai Innovation Institute, Shanghai, China
- Ruitao Ma
  - Department of Chemistry, Fudan University, Shanghai, China
- Chen Qu
  - Independent Researcher, Toronto, Ontario, Canada
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, Milan, Italy
- Apurba Nandi
  - Department of Physics and Materials Science, University of Luxembourg, Luxembourg City, Luxembourg
- Priyanka Pandey
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, GA, USA
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Dong H Zhang
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, GA, USA
2
Drehwald MS, Jamali A, Vargas-Hernández RA. MOLPIPx: An end-to-end differentiable package for permutationally invariant polynomials in Python and Rust. J Chem Phys 2025; 162:084115. [PMID: 40019201 DOI: 10.1063/5.0250837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2024] [Accepted: 01/31/2025] [Indexed: 03/01/2025] Open
Abstract
In this work, we present MOLPIPx, a versatile library designed to seamlessly integrate permutationally invariant polynomials with modern machine learning frameworks, enabling the efficient development of linear models, neural networks, and Gaussian process models. These methodologies are widely employed for parameterizing potential energy surfaces across diverse molecular systems. MOLPIPx leverages two powerful automatic differentiation engines (JAX and EnzymeAD-Rust) to facilitate the efficient computation of energy gradients and higher-order derivatives, which are essential for tasks such as force field development and dynamic simulations. MOLPIPx is available at https://github.com/ChemAI-Lab/molpipx.
Affiliation(s)
- Manuel S Drehwald
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S 4L8, Canada
- Asma Jamali
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S 4L8, Canada
  - School of Computational Science and Engineering, McMaster University, Hamilton, Ontario L8S 4K1, Canada
- Rodrigo A Vargas-Hernández
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S 4L8, Canada
  - School of Computational Science and Engineering, McMaster University, Hamilton, Ontario L8S 4K1, Canada
  - Brockhouse Institute for Materials Research, McMaster University, Hamilton, Ontario L8S 4M1, Canada
3
Hao Y, Lu X, Fu B, Zhang DH. New Algorithms to Generate Permutationally Invariant Polynomials and Fundamental Invariants for Potential Energy Surface Fitting. J Chem Theory Comput 2025; 21:1046-1053. [PMID: 39841118 DOI: 10.1021/acs.jctc.4c01447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2025]
Abstract
Symmetric functions, such as Permutationally Invariant Polynomials (PIPs) and Fundamental Invariants (FIs), are effective and concise descriptors for incorporating permutation symmetry into neural network (NN) potential energy surface (PES) fitting. The traditional algorithm for generating such symmetric polynomials has a factorial time complexity of N!, where N is the number of identical atoms, posing a significant challenge to applying symmetric polynomials as descriptors of NN PESs for larger systems, particularly those with more than 10 atoms. Herein, we report a new algorithm whose time complexity is only linear in the number of identical atoms. It can tremendously accelerate the generation of symmetric polynomials for molecular systems. The proposed algorithm is based on graph connectivity analysis following the action of the generating set of the molecular permutation group. For instance, in the case of calculating the invariant polynomials for a 15-atom molecule, such as tropolone, our algorithm is approximately 2 million times faster than the previous method. The efficiency advantage of the new algorithm grows with increasing molecular size and number of identical atoms, making the FI-NN approach feasible for systems with over 10 atoms and high symmetry demands.
Affiliation(s)
- Yiping Hao
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, People's Republic of China
  - School of Chemical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoxiao Lu
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, People's Republic of China
- Bina Fu
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, People's Republic of China
  - School of Chemical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - Hefei National Laboratory, Hefei 230088, China
- Dong H Zhang
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, People's Republic of China
  - School of Chemical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - Hefei National Laboratory, Hefei 230088, China
4
Qu C, Houston PL, Allison T, Schneider BI, Bowman JM. DFT-Based Permutationally Invariant Polynomial Potentials Capture the Twists and Turns of C14H30. J Chem Theory Comput 2024; 20:9339-9353. [PMID: 39431711 PMCID: PMC11562071 DOI: 10.1021/acs.jctc.4c00932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 09/14/2024] [Accepted: 10/01/2024] [Indexed: 10/22/2024]
Abstract
Hydrocarbons are ubiquitous as fuels, solvents, lubricants, and as the principal components of plastics and fibers, yet our ability to predict their dynamical properties is limited to force-field mechanics. Here, we report two machine-learned potential energy surfaces (PESs) for the linear 44-atom hydrocarbon C14H30 using an extensive data set of roughly 250,000 density functional theory (DFT) (B3LYP) energies for a large variety of configurations, obtained using MM3 direct-dynamics calculations at 500, 1000, and 2500 K. The surfaces, based on Permutationally Invariant Polynomials (PIPs) and using both a many-body expansion approach and a fragmented-basis approach, produce precise fits for energies and forces and also produce excellent out-of-sample agreement with direct DFT calculations for torsional and dihedral angle potentials. Going beyond precision, the PESs are used in molecular dynamics calculations that demonstrate the robustness of the PESs for a large range of conformations. The many-body PIPs PES, although more compute intensive than the fragmented-basis one, is directly transferable for other linear hydrocarbons.
Affiliation(s)
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Paul L. Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Thomas Allison
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Barry I. Schneider
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Joel M. Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
5
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
  - Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
  - Department of Chemistry, University of Florida, Gainesville, Florida
6
Houston PL, Qu C, Yu Q, Pandey P, Conte R, Nandi A, Bowman JM. No Headache for PIPs: A PIP Potential for Aspirin Runs Much Faster and with Similar Precision Than Other Machine-Learned Potentials. J Chem Theory Comput 2024; 20:3008-3018. [PMID: 38593438 DOI: 10.1021/acs.jctc.4c00054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Assessments of machine-learning (ML) potentials are an important aspect of the rapid development of this field. We recently reported an assessment of the linear-regression permutationally invariant polynomial (PIP) method for ethanol, using the widely used (revised) rMD17 data set. We demonstrated that the PIP approach outperformed numerous other methods, e.g., ANI, PhysNet, sGDML, and p-KRR, with respect to precision and notably with respect to speed [Houston et al., J. Chem. Phys. 2022, 156, 044120]. Here, we extend this assessment to the 21-atom aspirin molecule, using the rMD17 data set, with a focus on the speed of evaluation. Both energies and forces are used for training, and the precision of several PIPs is examined for both. Normal mode frequencies, the methyl torsional potential, and 1d vibrational energies for an OH stretch are presented. We show that the PIP approach achieves the level of precision obtained from other ML methods, e.g., atom-centered neural network methods, linear regression ACE, and kernel methods, as reported by Kovács et al. in J. Chem. Theory Comput. 2021, 17, 7696-7711. More significantly, we show that the PIP PESs run much faster than all other ML methods, whose timings were evaluated in that paper. We also show that the PIP PES extrapolates well enough to describe several internal motions of aspirin, including an OH stretch.
Affiliation(s)
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Qi Yu
  - Department of Chemistry, Fudan University, Shanghai 200438, P. R. China
- Priyanka Pandey
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
  - Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Joel M Bowman
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
7
Fu B, Zhang DH. Accurate fundamental invariant-neural network representation of ab initio potential energy surfaces. Natl Sci Rev 2023; 10:nwad321. [PMID: 38274241 PMCID: PMC10808953 DOI: 10.1093/nsr/nwad321] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/01/2023] [Revised: 11/01/2023] [Accepted: 11/02/2023] [Indexed: 01/27/2024] Open
Abstract
Highly accurate potential energy surfaces are critically important for chemical reaction dynamics. The large number of degrees of freedom and the intricate symmetry adaption pose a big challenge to accurately representing potential energy surfaces (PESs) for polyatomic reactions. Recently, our group has made substantial progress in this direction by developing the fundamental invariant-neural network (FI-NN) approach. Here, we review these advances, demonstrating that the FI-NN approach can represent highly accurate, global, full-dimensional PESs for reactive systems with even more than 10 atoms. These multi-channel reactions typically involve many intermediates, transition states, and products. The complexity and ruggedness of this potential energy landscape present even greater challenges for full-dimensional PES representation. These PESs exhibit a high level of complexity, molecular size, and accuracy of fit. Dynamics simulations based on these PESs have unveiled intriguing and novel reaction mechanisms, providing deep insights into the intricate dynamics involved in combustion, atmospheric, and organic chemistry.
Affiliation(s)
- Bina Fu
  - State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
  - Hefei National Laboratory, Hefei 230088, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Dong H Zhang
  - State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
  - Hefei National Laboratory, Hefei 230088, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
8
Riera M, Knight C, Bull-Vulpe EF, Zhu X, Agnew H, Smith DGA, Simmonett AC, Paesani F. MBX: A many-body energy and force calculator for data-driven many-body simulations. J Chem Phys 2023; 159:054802. [PMID: 37526156 PMCID: PMC10550339 DOI: 10.1063/5.0156036] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 04/25/2023] [Accepted: 07/11/2023] [Indexed: 08/02/2023] Open
Abstract
Many-Body eXpansion (MBX) is a C++ library that implements many-body potential energy functions (PEFs) within the "many-body energy" (MB-nrg) formalism. MB-nrg PEFs integrate an underlying polarizable model with explicit machine-learned representations of many-body interactions to achieve chemical accuracy from the gas to the condensed phases. MBX can be employed either as a stand-alone package or as an energy/force engine that can be integrated with generic software for molecular dynamics and Monte Carlo simulations. MBX is parallelized internally using Open Multi-Processing and can utilize Message Passing Interface when available in interfaced molecular simulation software. MBX enables classical and quantum molecular simulations with MB-nrg PEFs, as well as hybrid simulations that combine conventional force fields and MB-nrg PEFs, for diverse systems ranging from small gas-phase clusters to aqueous solutions and molecular fluids to biomolecular systems and metal-organic frameworks.
Affiliation(s)
- Marc Riera
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Christopher Knight
  - Argonne National Laboratory, Computational Science Division, Lemont, Illinois 60439, USA
- Ethan F. Bull-Vulpe
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Xuanyu Zhu
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Henry Agnew
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Andrew C. Simmonett
  - Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
9
Feng C, Xi J, Zhang Y, Jiang B, Zhou Y. Accurate and Interpretable Dipole Interaction Model-Based Machine Learning for Molecular Polarizability. J Chem Theory Comput 2023; 19:1207-1217. [PMID: 36753749 DOI: 10.1021/acs.jctc.2c01094] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 02/10/2023]
Abstract
Polarizabilities play significant roles in describing dispersive and inductive interactions of atomic and molecular systems. However, an accurate prediction of molecular polarizabilities from first principles is computationally prohibitive. Although physical models and statistical machine learning models have been proposed, their practical applications have been limited either by the lack of an accurate description of local chemical environments or by the need for a large number of training samples. In this study, we combine a physically inspired dipole interaction model and an accurate neural network method for predicting the polarizability tensors of molecules. With the local chemical environment precisely described and the requirement of rotational covariance naturally fulfilled, this hybrid model is proven to give an accurate molecular polarizability prediction, essentially reducing the number of training samples. The atomic polarizabilities are physically interpretable and transferable to larger molecules unseen in the training set. This promising method may find a wide range of applications, such as spectroscopic simulations and the construction of polarizable force fields.
Affiliation(s)
- Chaoqiang Feng
  - Anhui Key Laboratory of Optoelectric Materials Science and Technology, Department of Physics, Anhui Normal University, Wuhu, Anhui 241000, China
  - Hefei National Research Center for Physical Sciences at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
- Jin Xi
  - Anhui Key Laboratory of Optoelectric Materials Science and Technology, Department of Physics, Anhui Normal University, Wuhu, Anhui 241000, China
- Yaolong Zhang
  - Hefei National Research Center for Physical Sciences at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
- Bin Jiang
  - Hefei National Research Center for Physical Sciences at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
- Yong Zhou
  - Anhui Key Laboratory of Optoelectric Materials Science and Technology, Department of Physics, Anhui Normal University, Wuhu, Anhui 241000, China
10
Houston PL, Qu C, Yu Q, Conte R, Nandi A, Li JK, Bowman JM. PESPIP: Software to fit complex molecular and many-body potential energy surfaces with permutationally invariant polynomials. J Chem Phys 2023; 158:044109. [PMID: 36725524 DOI: 10.1063/5.0134442] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 01/05/2023] Open
Abstract
We wish to describe a potential energy surface by using a basis of permutationally invariant polynomials whose coefficients will be determined by numerical regression so as to smoothly fit a dataset of electronic energies as well as, perhaps, gradients. The polynomials will be powers of transformed internuclear distances, usually either Morse variables, exp(-r_{i,j}/λ), where λ is a constant range hyperparameter, or reciprocals of the distances, 1/r_{i,j}. The question we address is how to create the most efficient basis, including (a) which polynomials to keep or discard, (b) how many polynomials will be needed, (c) how to make sure the polynomials correctly reproduce the zero interaction at a large distance, (d) how to ensure special symmetries, and (e) how to calculate gradients efficiently. This article discusses how these questions can be answered by using a set of programs to choose and manipulate the polynomials as well as to write efficient Fortran programs for the calculation of energies and gradients. A user-friendly interface for access to monomial symmetrization approach results is also described. The software for these programs is now publicly available.
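As an illustration of the ingredients named in this abstract, the following is a minimal Python sketch of Morse variables and of a single permutationally invariant polynomial built by brute-force monomial symmetrization. It is not the PESPIP implementation (which the abstract says generates Fortran code); the function names `morse_variables` and `symmetrized_monomial`, the value λ = 2.0, and the water-like geometry are all assumptions chosen for the example.

```python
import itertools
import math

def morse_variables(coords, lam=2.0):
    """Return {(i, j): exp(-r_ij / lam)} for every atom pair i < j.

    lam is the range hyperparameter; 2.0 is an arbitrary illustrative value.
    """
    y = {}
    for i, j in itertools.combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        y[(i, j)] = math.exp(-r / lam)
    return y

def symmetrized_monomial(y, exponents, groups):
    """Sum one monomial over all permutations of identical atoms.

    exponents: {(i, j): power} defines the seed monomial.
    groups: lists of atom indices that may be exchanged with each other.
    Brute force: the cost grows factorially with the size of each group.
    """
    n_atoms = 1 + max(max(pair) for pair in y)
    per_group = [list(itertools.permutations(g)) for g in groups]
    total = 0.0
    for choice in itertools.product(*per_group):
        # Build the full permutation; atoms outside every group stay fixed.
        perm = list(range(n_atoms))
        for group, image in zip(groups, choice):
            for a, b in zip(group, image):
                perm[a] = b
        term = 1.0
        for (i, j), power in exponents.items():
            pi, pj = sorted((perm[i], perm[j]))
            term *= y[(pi, pj)] ** power
        total += term
    return total

# Hypothetical water-like geometry: atoms 0 and 1 are the two H atoms, atom 2 is O.
geom = [(0.9, 0.6, 0.0), (-0.7, 0.5, 0.1), (0.0, 0.0, 0.0)]
swapped = [geom[1], geom[0], geom[2]]  # exchange the two identical H atoms
exponents = {(0, 2): 2, (0, 1): 1}     # seed monomial y_02^2 * y_01

p = symmetrized_monomial(morse_variables(geom), exponents, groups=[[0, 1]])
p_swapped = symmetrized_monomial(morse_variables(swapped), exponents, groups=[[0, 1]])
print(abs(p - p_swapped) < 1e-12)  # the symmetrized polynomial is unchanged by the swap
```

A potential energy surface in this spirit would be a linear combination of many such polynomials, with coefficients obtained by regression; the sketch covers only the construction of one basis function.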
Affiliation(s)
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, USA
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Jeffrey K Li
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
11
Bowman JM, Qu C, Conte R, Nandi A, Houston PL, Yu Q. Δ-Machine Learned Potential Energy Surfaces and Force Fields. J Chem Theory Comput 2023; 19:1-17. [PMID: 36527383 DOI: 10.1021/acs.jctc.2c01034] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Indexed: 12/23/2022]
Abstract
There has been great progress in developing machine-learned potential energy surfaces (PESs) for molecules and clusters with more than 10 atoms. Unfortunately, this number of atoms generally limits the level of electronic structure theory to less than the "gold standard" CCSD(T) level. Indeed, for the well-known MD17 dataset for molecules with 9-20 atoms, all of the energies and forces were obtained with DFT calculations (PBE). This Perspective is focused on a Δ-machine learning method that we recently proposed and applied to bring DFT-based PESs to close to CCSD(T) accuracy. This is demonstrated for hydronium, N-methylacetamide, acetyl acetone, and ethanol. For 15-atom tropolone, it appears that special approaches (e.g., molecular tailoring, local CCSD(T)) are needed to obtain the CCSD(T) energies. A new aspect of this approach is the extension of Δ-machine learning to force fields. The approach is based on many-body corrections to polarizable force field potentials. This is examined in detail using the TTM2.1 water potential. The corrections make use of our recent CCSD(T) datasets for 2-b, 3-b, and 4-b interactions for water. These datasets were used to develop a new fully ab initio potential for water, termed q-AQUA.
Affiliation(s)
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Chen Qu
  - Independent Researcher, Toronto, Canada 66777
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
12
Shao L, Ma J, Prelesnik JL, Zhou Y, Nguyen M, Zhao M, Jenekhe SA, Kalinin SV, Ferguson AL, Pfaendtner J, Mundy CJ, De Yoreo JJ, Baneyx F, Chen CL. Hierarchical Materials from High Information Content Macromolecular Building Blocks: Construction, Dynamic Interventions, and Prediction. Chem Rev 2022; 122:17397-17478. [PMID: 36260695 DOI: 10.1021/acs.chemrev.2c00220] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 01/25/2023]
Abstract
Hierarchical materials that exhibit order over multiple length scales are ubiquitous in nature. Because hierarchy gives rise to unique properties and functions, many have sought inspiration from nature when designing and fabricating hierarchical matter. More and more, however, nature's own high-information content building blocks, proteins, peptides, and peptidomimetics, are being coopted to build hierarchy because the information that determines structure, function, and interfacial interactions can be readily encoded in these versatile macromolecules. Here, we take stock of recent progress in the rational design and characterization of hierarchical materials produced from high-information content blocks with a focus on stimuli-responsive and "smart" architectures. We also review advances in the use of computational simulations and data-driven predictions to shed light on how the side chain chemistry and conformational flexibility of macromolecular blocks drive the emergence of order and the acquisition of hierarchy and also on how ionic, solvent, and surface effects influence the outcomes of assembly. Continued progress in the above areas will ultimately usher in an era where an understanding of designed interactions, surface effects, and solution conditions can be harnessed to achieve predictive materials synthesis across scale and drive emergent phenomena in the self-assembly and reconfiguration of high-information content building blocks.
Collapse
Affiliation(s)
- Li Shao
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
| | - Jinrong Ma
- Molecular Engineering and Sciences Institute, University of Washington, Seattle, Washington 98195, United States
| | - Jesse L Prelesnik
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Yicheng Zhou
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
| | - Mary Nguyen
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States.,Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Mingfei Zhao
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Samson A Jenekhe
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States.,Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Sergei V Kalinin
- Department of Materials Science and Engineering, University of Tennessee, Knoxville, Tennessee 37996, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Jim Pfaendtner
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States.,Materials Science and Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Christopher J Mundy
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States.,Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - James J De Yoreo
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States.,Materials Science and Engineering, University of Washington, Seattle, Washington 98195, United States
| | - François Baneyx
- Molecular Engineering and Sciences Institute, University of Washington, Seattle, Washington 98195, United States.,Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Chun-Long Chen
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States.,Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| |
|
13
|
Conte R, Nandi A, Qu C, Yu Q, Houston PL, Bowman JM. Semiclassical and VSCF/VCI Calculations of the Vibrational Energies of trans- and gauche-Ethanol Using a CCSD(T) Potential Energy Surface. J Phys Chem A 2022; 126:7709-7718. [PMID: 36240438 DOI: 10.1021/acs.jpca.2c06322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
A recent full-dimensional Δ-machine learning potential energy surface (PES) for ethanol is employed in semiclassical and vibrational self-consistent field (VSCF) and virtual-state configuration interaction (VCI) calculations, using MULTIMODE, to determine the anharmonic vibrational frequencies of both the trans and gauche conformers of ethanol. Both semiclassical and VSCF/VCI energies agree well with the experimental data. We find significant mixing between the VSCF basis states due to Fermi resonances between bending and stretching modes. The same effects are also accurately described by the full-dimensional semiclassical calculations. These are the first high-level anharmonic calculations using a PES, in particular a "gold-standard" CCSD(T) one.
Affiliation(s)
- Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Chen Qu
- Independent Researcher, Toronto, Ontario M9B0E3, Canada
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
| | - Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States.,Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Joel M Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
14
|
Bull-Vulpe EF, Riera M, Bore SL, Paesani F. Data-Driven Many-Body Potential Energy Functions for Generic Molecules: Linear Alkanes as a Proof-of-Concept Application. J Chem Theory Comput 2022. [PMID: 36113028 DOI: 10.1021/acs.jctc.2c00645] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 11/28/2022]
Abstract
We present a generalization of the many-body energy (MB-nrg) theoretical/computational framework that enables the development of data-driven potential energy functions (PEFs) for generic covalently bonded molecules, with arbitrary quantum mechanical accuracy. The "nearsightedness of electronic matter" is exploited to define monomers as "natural building blocks" on the basis of their distinct chemical identity. The energy of generic molecules is then expressed as a sum of individual many-body energies of incrementally larger subsystems. The MB-nrg PEFs represent the low-order n-body energies, with n = 1-4, using permutationally invariant polynomials derived from electronic structure data carried out at an arbitrary quantum mechanical level of theory, while all higher-order n-body terms (n > 4) are represented by a classical many-body polarization term. As a proof-of-concept application of the general MB-nrg framework, we present MB-nrg PEFs for linear alkanes. The MB-nrg PEFs are shown to accurately reproduce reference energies, harmonic frequencies, and potential energy scans of alkanes, independently of their length. Since, by construction, the MB-nrg framework introduced here can be applied to generic covalently bonded molecules, we envision future computer simulations of complex molecular systems using data-driven MB-nrg PEFs, with arbitrary quantum mechanical accuracy.
Affiliation(s)
- Ethan F. Bull-Vulpe
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Marc Riera
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Sigbjørn L. Bore
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Materials Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, United States
| |
|
15
|
Houston PL, Qu C, Nandi A, Conte R, Yu Q, Bowman JM. Permutationally invariant polynomial regression for energies and gradients, using reverse differentiation, achieves orders of magnitude speed-up with high precision compared to other machine learning methods. J Chem Phys 2022; 156:044120. [DOI: 10.1063/5.0080506] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/28/2022] Open
Affiliation(s)
- Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06511, USA
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
16
|
Nandi A, Qu C, Houston PL, Conte R, Yu Q, Bowman JM. A CCSD(T)-Based 4-Body Potential for Water. J Phys Chem Lett 2021; 12:10318-10324. [PMID: 34662138 DOI: 10.1021/acs.jpclett.1c03152] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 06/13/2023]
Abstract
High-level, ab initio calculations find that the 4-body (4-b) interaction is needed to account for near-100% of the total interaction energy for water clusters as large as the 21-mer. Motivated by this, we report a permutationally invariant polynomial potential energy surface (PES) for the 4-body interaction. This machine-learned PES is a fit to 2119 symmetry-unique, CCSD(T)-F12a/haTZ 4-b interaction energies. Configurations for these come from tetramer direct-dynamics calculations, fragments from an MD water simulation at 300 K, and tetramer fragments in a variety of water clusters. The PIP basis is purified to ensure that the PES goes rigorously to zero in monomer+trimer and dimer+dimer dissociations. The 4-b energies of isomers of the hexamer calculated with the new PES are shown to be in better agreement with benchmark CCSD(T) results than those from the MB-pol potential. Tests on larger clusters further validate the high fidelity of the PES. The PES is shown to be fast to evaluate, taking 2.4 s for 10⁵ evaluations on a single core of a 2.4 GHz Intel Xeon processor, and significantly faster using a parallel version of the PES.
Affiliation(s)
- Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Chen Qu
- Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
| | - Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
| | - Joel M Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
17
|
Moberg DR, Jasper AW, Davis MJ. Parsimonious Potential Energy Surface Expansions Using Dictionary Learning with Multipass Greedy Selection. J Phys Chem Lett 2021; 12:9169-9174. [PMID: 34525799 DOI: 10.1021/acs.jpclett.1c02721] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Potential energy surfaces fit with basis set expansions have been shown to provide accurate representations of electronic energies and have enabled a variety of high-accuracy dynamics, kinetics, and spectroscopy applications. The number of terms in these expansions scales poorly with system size, a drawback that challenges their use for systems with more than ∼10 atoms. A solution is presented here using dictionary learning. Subsets of the full set of conventional basis functions are optimized using a newly developed multipass greedy regression method inspired by forward and backward selection methods from the statistics, signal processing, and machine learning literatures. The optimized representations have accuracies comparable to the full set but are 1 or more orders of magnitude smaller, and notably, the number of terms in the optimized multipass greedy expansions scales approximately linearly with the number of atoms.
Affiliation(s)
- Daniel R Moberg
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Ahren W Jasper
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Michael J Davis
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| |
|
18
|
Qu C, Houston PL, Conte R, Nandi A, Bowman JM. Breaking the Coupled Cluster Barrier for Machine-Learned Potentials of Large Molecules: The Case of 15-Atom Acetylacetone. J Phys Chem Lett 2021; 12:4902-4909. [PMID: 34006096 PMCID: PMC8279733 DOI: 10.1021/acs.jpclett.1c01142] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Indexed: 05/05/2023]
Abstract
Machine-learned potential energy surfaces (PESs) for molecules with more than 10 atoms are typically forced to use lower-level electronic structure methods such as density functional theory (DFT) and second-order Møller-Plesset perturbation theory (MP2). While these are efficient and realistic, they fall short of the accuracy of the "gold standard" coupled-cluster method, especially with respect to reaction and isomerization barriers. We report a major step forward in applying a Δ-machine learning method to the challenging case of acetylacetone, whose MP2 barrier height for H-atom transfer is low by roughly 1.1 kcal/mol relative to the benchmark CCSD(T) barrier of 3.2 kcal/mol. From a database of 2151 local CCSD(T) energies and training with as few as 430 energies, we obtain a new PES with a barrier of 3.5 kcal/mol in agreement with the LCCSD(T) barrier of 3.5 kcal/mol and close to the benchmark value. Tunneling splittings due to H-atom transfer are calculated using this new PES, providing improved estimates over previous ones obtained using an MP2-based PES.
Affiliation(s)
- Chen Qu
- Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
19
|
Cao X, Tian P. "Dividing and Conquering" and "Caching" in Molecular Modeling. Int J Mol Sci 2021; 22:5053. [PMID: 34068835 PMCID: PMC8126232 DOI: 10.3390/ijms22095053] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 11/17/2022] Open
Abstract
Molecular modeling is widely utilized in subjects including but not limited to physics, chemistry, biology, materials science and engineering. Impressive progress has been made in the development of theories, algorithms and software packages. Dividing and conquering and caching intermediate results have been long-standing principles in the development of algorithms. Not surprisingly, the most important methodological advancements in more than half a century of molecular modeling are various implementations of these two fundamental principles. In the mainstream classical computational molecular science, tremendous efforts have been invested in two lines of algorithm development. The first is coarse graining, which is to represent multiple basic particles in higher resolution modeling as a single larger and softer particle in a lower-resolution counterpart, with resulting force fields of partial transferability at the expense of some information loss. The second is enhanced sampling, which realizes "dividing and conquering" and/or "caching" in configurational space with a focus either on reaction coordinates and collective variables as in metadynamics and related algorithms, or on the transition matrix and state discretization as in Markov state models. For this line of algorithms, spatial resolution is maintained but results are not transferable. Deep learning has been utilized to realize more efficient and accurate ways of "dividing and conquering" and "caching" along these two lines of algorithmic research. We proposed and demonstrated the local free energy landscape approach, a new framework for classical computational molecular science. This framework is based on a third class of algorithm that facilitates molecular modeling through partially transferable in resolution "caching" of distributions for local clusters of molecular degrees of freedom. Differences, connections and potential interactions among these three algorithmic directions are discussed, with the hope of stimulating development of more elegant, efficient and reliable formulations and algorithms for "dividing and conquering" and "caching" in complex molecular systems.
Affiliation(s)
- Xiaoyong Cao
- School of Life Sciences, Jilin University, Changchun 130012, China
| | - Pu Tian
- School of Life Sciences, Jilin University, Changchun 130012, China
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| |
|
20
|
Allen AEA, Dusson G, Ortner C, Csányi G. Atomic permutationally invariant polynomials for fitting molecular force fields. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abd51e] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022]
|
21
|
Nandi A, Qu C, Houston PL, Conte R, Bowman JM. Δ-machine learning for potential energy surfaces: A PIP approach to bring a DFT-based PES to CCSD(T) level of theory. J Chem Phys 2021; 154:051102. [DOI: 10.1063/5.0038301] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 01/09/2023] Open
Affiliation(s)
- Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
22
|
Qu C, Conte R, Houston PL, Bowman JM. Full-dimensional potential energy surface for acetylacetone and tunneling splittings. Phys Chem Chem Phys 2021; 23:7758-7767. [DOI: 10.1039/d0cp04221h] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/21/2022]
Abstract
New, full-dimensional potential energy surface for acetylacetone allows for description of H-tunneling dynamics and characterization of stationary points.
Affiliation(s)
- Chen Qu
- Department of Chemistry & Biochemistry, University of Maryland, College Park, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, 20133 Milano, Italy
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, USA
- Department of Chemistry and Biochemistry
| | - Joel M. Bowman
- Cherry L. Emerson Center for Scientific Computations and Department of Chemistry, Atlanta, USA
| |
|
23
|
Conte R, Houston PL, Qu C, Li J, Bowman JM. Full-dimensional, ab initio potential energy surface for glycine with characterization of stationary points and zero-point energy calculations by means of diffusion Monte Carlo and semiclassical dynamics. J Chem Phys 2020; 153:244301. [DOI: 10.1063/5.0037175] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/21/2023] Open
Affiliation(s)
- Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Jeffrey Li
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
24
|
Gandolfi M, Rognoni A, Aieta C, Conte R, Ceotto M. Machine learning for vibrational spectroscopy via divide-and-conquer semiclassical initial value representation molecular dynamics with application to N-methylacetamide. J Chem Phys 2020; 153:204104. [DOI: 10.1063/5.0031892] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/15/2022] Open
Affiliation(s)
- Michele Gandolfi
- Dipartimento di Chimica, Università degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Alessandro Rognoni
- Dipartimento di Chimica, Università degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Chiara Aieta
- Dipartimento di Chimica, Università degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Michele Ceotto
- Dipartimento di Chimica, Università degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| |
|
25
|
Li J, Zhao B, Xie D, Guo H. Advances and New Challenges to Bimolecular Reaction Dynamics Theory. J Phys Chem Lett 2020; 11:8844-8860. [PMID: 32970441 DOI: 10.1021/acs.jpclett.0c02501] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 06/11/2023]
Abstract
Dynamics of bimolecular reactions in the gas phase are of foundational importance in combustion, atmospheric chemistry, interstellar chemistry, and plasma chemistry. These collision-induced chemical transformations are a sensitive probe of the underlying potential energy surface(s). Despite tremendous progress in past decades, our understanding is still not complete. In this Perspective, we survey the recent advances in theoretical characterization of bimolecular reaction dynamics, stimulated by new experimental observations, and identify key new challenges.
Affiliation(s)
- Jun Li
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Theoretical and Computational Chemistry, Chongqing University, Chongqing 401331, China
| | - Bin Zhao
- Theoretische Chemie, Fakultät für Chemie, Universität Bielefeld, Universitätsstraße 25, D-33615 Bielefeld, Germany
| | - Daiqian Xie
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
| | - Hua Guo
- Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, New Mexico 87131, United States
| |
|
26
|
Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T. Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems. J Chem Theory Comput 2020; 16:4757-4775. [PMID: 32559068 PMCID: PMC8312194 DOI: 10.1021/acs.jctc.0c00355] [Citation(s) in RCA: 96] [Impact Index Per Article: 19.2] [Indexed: 12/21/2022]
Abstract
Machine learning encompasses tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Affiliation(s)
- Paraskevi Gkeka
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
| | - Gabriel Stoltz
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
| | | | - Zineb Belkacemi
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
| | - Michele Ceriotti
- Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
| | - John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Aaron R Dinner
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
| | | | - Hervé Minoux
- Integrated Drug Discovery, Sanofi R&D, 94403 Vitry-sur-Seine, France
| | | | - Fabio Pietrucci
- UMR CNRS 7590, MNHN, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, 75005 Paris, France
| | - Ana Silveira
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Zofia Trstanova
- School of Mathematics, The University of Edinburgh, Edinburgh EH9 3FD, U.K
| | - Rafal Wiewiora
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Tony Lelièvre
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
| |
|
27
|
Koner D, Meuwly M. Permutationally Invariant, Reproducing Kernel-Based Potential Energy Surfaces for Polyatomic Molecules: From Formaldehyde to Acetone. J Chem Theory Comput 2020; 16:5474-5484. [DOI: 10.1021/acs.jctc.0c00535] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/27/2023]
Affiliation(s)
- Debasish Koner
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
| | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
| |
|
28
|
Houston P, Conte R, Qu C, Bowman JM. Permutationally invariant polynomial potential energy surfaces for tropolone and H and D atom tunneling dynamics. J Chem Phys 2020; 153:024107. [DOI: 10.1063/5.0011973] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/21/2022] Open
Affiliation(s)
- Paul Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
29
|
Yanes-Rodríguez R, Arismendi-Arrieta DJ, Prosmiti R. He Inclusion in Ice-like and Clathrate-like Frameworks: A Benchmark Quantum Chemistry Study of Guest-Host Interactions. J Chem Inf Model 2020; 60:3043-3056. [PMID: 32469514 DOI: 10.1021/acs.jcim.0c00349] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]
Abstract
Energetics and structural properties of selected type and size He@hydrate frameworks, e.g., from regular structured ice channels to clathrate-like cages, are presented from first-principles quantum chemistry methods. The scarcity of information on He@hydrates makes such complexes challenging targets, while their computational study entails an interesting and arduous task. Some of them have been synthesized in the laboratory, which motivates further investigations on their stability. Hence, the main focus is to examine the performance and accuracy of different wave function-based electronic structure methods, such as MP2, CCSD(T), their explicitly correlated (F12) and domain-based local pair-natural orbital (DLPNO) analogs, as well as modern and conventional density functional theory (DFT) approaches, and analytical model potentials available. Different structures are considered, starting from the "simplest system" formed by a noble gas atom (such as He) and one water molecule, followed by the study of the "fundamental units" present in all ice-like and clathrate-like frameworks (such as pentamers and hexamers) and finally the description of interactions in the "building blocks" of three-dimensional (3D) ice channels (e.g., horizontal and perpendicular ice II and Ih) and clathrate-like cages, such as the 5¹² present in the most common sI, sII, and sH clathrate-hydrate structures. The idea is to provide well-converged DLPNO-CCSD(T) and DFMP2/CBS reference datasets that in turn are used to validate how DFT functionals (in total, 29 approaches from generalized-gradient approximation (GGA), meta-GGA, to hybrid and range-separated functionals, including dispersion correction treatments, were checked) and analytical semiempirical/ab initio-based potentials perform compared with high-level alternatives. Within all tested approaches, those best-performing were identified and classified. Most of the DFT/DFT-D functionals, as well as available analytical pairwise model potentials, face difficulties in describing both hydrogen-bonded water frameworks and dispersion-bound He-water interactions. Including dispersion corrections yields an overall well-balanced performance for LCωPBE-D3BJ and PBE0-D4 functionals. Such benchmark datasets can benefit research into the development of new cheminformatics models, as they can serve to guide and cross-check methodologies, lending increased predictive power to future molecular simulations for investigating the role of structures and phase transitions from nanoscale clusters to macroscopic crystalline structures.
Affiliation(s)
| | - Daniel J Arismendi-Arrieta
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain.,Donostia International Physics Center (DIPC), Paseo Manuel de Lardizabal 4, Gipuzkoa, 20018 Donostia-San Sebastián, Spain
| | - Rita Prosmiti
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain
| |
|
30
|
Koner D, Bemish RJ, Meuwly M. Dynamics on Multiple Potential Energy Surfaces: Quantitative Studies of Elementary Processes Relevant to Hypersonics. J Phys Chem A 2020; 124:6255-6269. [DOI: 10.1021/acs.jpca.0c01870] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Affiliation(s)
- Debasish Koner
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
| | - Raymond J. Bemish
- Air Force Research Laboratory, Space Vehicles Directorate, Kirtland AFB, New Mexico 87117, United States
| | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
| |
|
31
|
Chen R, Shao K, Fu B, Zhang DH. Fitting potential energy surfaces with fundamental invariant neural network. II. Generating fundamental invariants for molecular systems with up to ten atoms. J Chem Phys 2020; 152:204307. [PMID: 32486688 DOI: 10.1063/5.0010104] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Indexed: 01/22/2023]
Abstract
Symmetry adaptation is crucial in representing a permutationally invariant potential energy surface (PES). Due to the rapid increase in computational time with respect to the molecular size, as well as the reliance on algebra software, the previous neural network (NN) fitting with inputs of fundamental invariants (FIs) has practical limits. Here, we report an improved and efficient generation scheme of FIs based on computational invariant theory and a parallel program, which can be readily used as the input vector of NNs in fitting high-dimensional PESs with permutation symmetry. The newly developed method significantly reduces the evaluation time of FIs, thereby extending the FI-NN method for constructing highly accurate PESs to larger systems beyond five atoms. Because of the minimum size of invariants used in the inputs of the NN, the NN structure can be very flexible for FI-NN, which leads to small fitting errors. The resulting FI-NN PES is much faster to evaluate than the corresponding permutationally invariant polynomial-NN PES.
Affiliation(s)
- Rongjun Chen
- State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, People's Republic of China
| | - Kejie Shao
- State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, People's Republic of China
| | - Bina Fu
- State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, People's Republic of China
| | - Dong H Zhang
- State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical and Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, People's Republic of China
| |
|
32
|
Conte R, Qu C, Houston PL, Bowman JM. Efficient Generation of Permutationally Invariant Potential Energy Surfaces for Large Molecules. J Chem Theory Comput 2020; 16:3264-3272. [PMID: 32212729 PMCID: PMC7997398 DOI: 10.1021/acs.jctc.0c00001] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 11/29/2022]
Abstract
An efficient method is described for generating a fragmented, permutationally invariant polynomial basis to fit electronic energies and, if available, gradients for large molecules. The method presented rests on the fragmentation of a large molecule into any number of fragments while maintaining the permutational invariance and uniqueness of the polynomials. The new approach improves on a previous one reported by Qu and Bowman by avoiding repetition of polynomials in the fitting basis set and speeding up gradient evaluations while keeping the accuracy of the PES. The method is demonstrated for CH3–NH–CO–CH3 (N-methylacetamide) and NH2–CH2–COOH (glycine).
Affiliation(s)
- Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Chen Qu
- Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
| | - Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States; Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Joel M Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
33
|
Riera M, Yeh EP, Paesani F. Data-Driven Many-Body Models for Molecular Fluids: CO2/H2O Mixtures as a Case Study. J Chem Theory Comput 2020; 16:2246-2257. [DOI: 10.1021/acs.jctc.9b01175] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/25/2022]
Affiliation(s)
- Marc Riera
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Eric P. Yeh
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Materials Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, United States
| |
|
34
|
Unke OT, Koner D, Patra S, Käser S, Meuwly M. High-dimensional potential energy surfaces for molecular simulations: from empiricism to machine learning. Mach Learn Sci Technol 2020. [DOI: 10.1088/2632-2153/ab5922] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/26/2022]
|
35
|
Brown SE. From ab initio data to high-dimensional potential energy surfaces: A critical overview and assessment of the development of permutationally invariant polynomial potential energy surfaces for single molecules. J Chem Phys 2019; 151:194111. [PMID: 31757150 DOI: 10.1063/1.5123999] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
The representation of high-dimensional potential energy surfaces by way of the many-body expansion and permutationally invariant polynomials has become a well-established tool for improving the resolution and extending the scope of molecular simulations. The high level of accuracy that can be attained by these potential energy functions (PEFs) is due in large part to their specificity: for each term in the many-body expansion, a species-specific training set must be generated at the desired level of theory and a number of fits attempted in order to obtain a robust and reliable PEF. In this work, we attempt to characterize the numerical aspects of the fitting problem, addressing questions which are of simultaneous practical and fundamental importance. These include concrete illustrations of the nonconvexity of the problem, the ill-conditionedness of the linear system to be solved and possible need for regularization, the sensitivity of the solutions to the characteristics of the training set, and limitations of the approach with respect to accuracy and the types of molecules that can be treated. In addition, we introduce a general approach to the generation of training set configurations based on the familiar harmonic approximation and evaluate the possible benefits to the use of quasirandom sequences for sampling configuration space in this context. Using sulfate as a case study, the findings are largely generalizable and expected to ultimately facilitate the efficient development of PIP-based many-body PEFs for general systems via automation.
Affiliation(s)
- Sandra E Brown
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| |
|
36
|
Nandi A, Qu C, Bowman JM. Full and fragmented permutationally invariant polynomial potential energy surfaces for trans and cis N-methyl acetamide and isomerization saddle points. J Chem Phys 2019; 151:084306. [DOI: 10.1063/1.5119348] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 11/14/2022]
Affiliation(s)
- Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|