1
Solov'yov AV, Verkhovtsev AV, Mason NJ, Amos RA, Bald I, Baldacchino G, Dromey B, Falk M, Fedor J, Gerhards L, Hausmann M, Hildenbrand G, Hrabovský M, Kadlec S, Kočišek J, Lépine F, Ming S, Nisbet A, Ricketts K, Sala L, Schlathölter T, Wheatley AEH, Solov'yov IA. Condensed Matter Systems Exposed to Radiation: Multiscale Theory, Simulations, and Experiment. Chem Rev 2024. [PMID: 38842266 DOI: 10.1021/acs.chemrev.3c00902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
This roadmap reviews the new, highly interdisciplinary research field studying the behavior of condensed matter systems exposed to radiation. The Review highlights several recent advances in the field and provides a roadmap for the development of the field over the next decade. Condensed matter systems exposed to radiation can be inorganic, organic, or biological, finite or infinite, composed of different molecular species or materials, exist in different phases, and operate under different thermodynamic conditions. Many of the key phenomena related to the behavior of irradiated systems are very similar and can be understood based on the same fundamental theoretical principles and computational approaches. The multiscale nature of such phenomena requires the quantitative description of the radiation-induced effects occurring at different spatial and temporal scales, ranging from the atomic to the macroscopic, and the interlinks between such descriptions. The multiscale nature of the effects and the similarity of their manifestation in systems of different origins necessarily bring together different disciplines, such as physics, chemistry, biology, materials science, nanoscience, and biomedical research, demonstrating the numerous interlinks and commonalities between them. This research field is highly relevant to many novel and emerging technologies and medical applications.
Affiliation(s)
- Andrey V Solov'yov
- MBN Research Center, Altenhöferallee 3, 60438 Frankfurt am Main, Germany
- Nigel J Mason
- School of Physics and Astronomy, University of Kent, Canterbury CT2 7NH, United Kingdom
- Richard A Amos
- Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, U.K.
- Ilko Bald
- Institute of Chemistry, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany
- Gérard Baldacchino
- Université Paris-Saclay, CEA, LIDYL, 91191 Gif-sur-Yvette, France
- CY Cergy Paris Université, CEA, LIDYL, 91191 Gif-sur-Yvette, France
- Brendan Dromey
- Centre for Light Matter Interactions, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, United Kingdom
- Martin Falk
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 61200 Brno, Czech Republic
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
- Juraj Fedor
- J. Heyrovský Institute of Physical Chemistry, Czech Academy of Sciences, Dolejškova 3, 18223 Prague, Czech Republic
- Luca Gerhards
- Institute of Physics, Carl von Ossietzky University, Carl-von-Ossietzky-Str. 9-11, 26129 Oldenburg, Germany
- Michael Hausmann
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
- Georg Hildenbrand
- Kirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
- Faculty of Engineering, University of Applied Sciences Aschaffenburg, Würzburger Str. 45, 63743 Aschaffenburg, Germany
- Stanislav Kadlec
- Eaton European Innovation Center, Bořivojova 2380, 25263 Roztoky, Czech Republic
- Jaroslav Kočišek
- J. Heyrovský Institute of Physical Chemistry, Czech Academy of Sciences, Dolejškova 3, 18223 Prague, Czech Republic
- Franck Lépine
- Université Claude Bernard Lyon 1, CNRS, Institut Lumière Matière, F-69622, Villeurbanne, France
- Siyi Ming
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Andrew Nisbet
- Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, U.K.
- Kate Ricketts
- Department of Targeted Intervention, University College London, Gower Street, London WC1E 6BT, United Kingdom
- Leo Sala
- J. Heyrovský Institute of Physical Chemistry, Czech Academy of Sciences, Dolejškova 3, 18223 Prague, Czech Republic
- Thomas Schlathölter
- Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
- University College Groningen, University of Groningen, Hoendiepskade 23/24, 9718 BG Groningen, The Netherlands
- Andrew E H Wheatley
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Ilia A Solov'yov
- Institute of Physics, Carl von Ossietzky University, Carl-von-Ossietzky-Str. 9-11, 26129 Oldenburg, Germany
2
Finkbeiner J, Tovey S, Holm C. Generating Minimal Training Sets for Machine Learned Potentials. PHYSICAL REVIEW LETTERS 2024; 132:167301. [PMID: 38701485 DOI: 10.1103/physrevlett.132.167301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 09/11/2023] [Accepted: 03/19/2024] [Indexed: 05/05/2024]
Abstract
This Letter presents a novel approach for identifying uncorrelated atomic configurations from extensive datasets with a nonstandard neural network workflow known as random network distillation (RND) for training machine-learned interatomic potentials (MLPs). This method is coupled with a DFT workflow wherein initial data are generated with cheaper classical methods before only the minimal subset is passed to a more computationally expensive ab initio calculation. This benefits training not only by reducing the number of expensive DFT calculations required but also by providing a pathway to the use of more accurate quantum mechanical calculations. The method's efficacy is demonstrated by constructing machine-learned interatomic potentials for the molten salts KCl and NaCl. Our RND method allows accurate models to be fit on minimal datasets, as small as 32 configurations, reducing the required structures by at least 1 order of magnitude compared to alternative methods. This reduction in dataset sizes not only substantially reduces computational overhead for training data generation but also provides a more comprehensive starting point for active-learning procedures.
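The selection idea in this abstract can be illustrated with a minimal sketch of an RND-style greedy picker. This is not the authors' implementation: the function names, the network sizes, the greedy "pick the largest error" rule, and the shortcut of refitting only the predictor's output layer by least squares are all assumptions made for illustration.

```python
import numpy as np

def random_mlp(rng, n_in, n_hidden, n_out):
    """Weights of a small, randomly initialised one-hidden-layer network."""
    w1 = rng.normal(size=(n_in, n_hidden)) / np.sqrt(n_in)
    w2 = rng.normal(size=(n_hidden, n_out)) / np.sqrt(n_hidden)
    return w1, w2

def forward(w1, w2, x):
    """Evaluate the network on a batch of descriptor vectors."""
    return np.tanh(x @ w1) @ w2

def rnd_select(descriptors, n_select, n_hidden=32, n_out=8, seed=0):
    """Greedily pick configurations the predictor network represents worst.

    Hypothetical helper: the target network stays fixed and random, while the
    predictor is refit on the configurations chosen so far.
    """
    x = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    t1, t2 = random_mlp(rng, x.shape[1], n_hidden, n_out)  # fixed target
    p1, p2 = random_mlp(rng, x.shape[1], n_hidden, n_out)  # predictor
    target_out = forward(t1, t2, x)
    chosen = np.zeros(len(x), dtype=bool)
    selected = []
    for _ in range(n_select):
        # Novelty score: squared distance between target and predictor outputs.
        err = np.sum((target_out - forward(p1, p2, x)) ** 2, axis=1)
        err[chosen] = -np.inf  # never pick the same configuration twice
        idx = int(np.argmax(err))
        chosen[idx] = True
        selected.append(idx)
        # Simplification (assumption): refit only the predictor's output layer
        # on the selected set, instead of full gradient-based training.
        hidden = np.tanh(x[selected] @ p1)
        p2, *_ = np.linalg.lstsq(hidden, target_out[selected], rcond=None)
    return selected
```

In a workflow like the one described, the returned indices would mark the small subset of cheaply generated configurations that is passed on to the expensive ab initio calculation.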
Affiliation(s)
- Jan Finkbeiner
- Peter Grünberg Institute Forschungszentrum Jülich GmbH Wilhelm-Johnen-Straße, 52428 Jülich, Germany
- Samuel Tovey
- Institute for Computational Physics University of Stuttgart Allmandring 3, 70569 Stuttgart, Germany
- Christian Holm
- Institute for Computational Physics University of Stuttgart Allmandring 3, 70569 Stuttgart, Germany
3
Liu W, Lin S, Li X, Li W, Deng H, Fang H, Li W. Analysis of dissolved oxygen influencing factors and concentration prediction using input variable selection technique: A hybrid machine learning approach. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 357:120777. [PMID: 38581893 DOI: 10.1016/j.jenvman.2024.120777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 02/29/2024] [Accepted: 03/26/2024] [Indexed: 04/08/2024]
Abstract
Accurate quantification of dissolved oxygen (DO) is critically important for the protection and management of aquatic ecosystems. Successful applications have utilized mechanistic and data-driven models to simulate DO content in aquatic ecosystems. However, mechanistic models present challenges due to their complex and difficult-to-solve conditions, making them less portable. Additionally, data-driven model predictions are hindered by the challenge of numerous input variables, impacting both the running speed and prediction performance of the model. To address these challenges, water quality data and meteorological data of the Tanjiang River were obtained. The maximum information coefficient (MIC) input variable selection technique was employed to identify primary environmental factors influencing DO changes. Furthermore, coupled with support vector regression (SVR), two models (SVR and MIC-SVR) were employed to estimate the DO concentration of the Tanjiang River, and the optimal model was established. The results indicated a shift in the primary pollution factor from ammonia nitrogen to total phosphorus after recent treatment in the Tanjiang River. In comparison with the SVR model, the root mean square error (RMSE) of the MIC-SVR model was reduced by 4.46%, and the Nash-efficiency coefficient (NSE) was improved by 45.85%. In addition, study of kernel function selection revealed that considering as many kernel functions as possible is necessary for improving the performance of the SVR model. Conclusively, the proposed MIC-SVR model serves as an effective tool to analyze the relationship between DO and environmental factors, identifying the primary causes of low DO, and accurately predict the DO concentration in the Tanjiang River (especially in its middle and lower reaches), thus providing a reference for governmental decision-making on water environmental protection and water resource management.
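The two evaluation metrics quoted in this abstract follow standard definitions, sketched below. The function names are illustrative, and "NSE" is assumed to mean the usual Nash-Sutcliffe efficiency, i.e. one minus the ratio of the squared model error to the variance of the observations about their mean.

```python
import numpy as np

def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    performance of always predicting the observed mean."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    spread = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / spread)
```

A lower RMSE and an NSE closer to 1 both indicate a better model, which is the direction of the improvements reported for MIC-SVR over SVR.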
Affiliation(s)
- Wei Liu
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Shu Lin
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Xiaobao Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Wenjing Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Hong Deng
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Huaiyang Fang
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Weijie Li
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China; The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China.
4
Kim SS, Rhee YM. Potential energy interpolation with target-customized weighting coordinates: application to excited-state dynamics of photoactive yellow protein chromophore in water. Phys Chem Chem Phys 2024; 26:9021-9036. [PMID: 38440829 DOI: 10.1039/d3cp05643k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2024]
Abstract
Interpolation of potential energy surfaces (PESs) can provide a practical route to performing molecular dynamics simulations with a reliability matching a high-level quantum chemical calculation. An obstacle to its widespread use is perhaps the lack of general and optimal interpolation settings that can be applied in a black-box manner for any given molecular system. How to set up the weights for interpolation is one such task, and we still need to diversify the approaches in order to treat various systems. Here, we develop a new interpolation weighting scheme, which allows us to choose the weighting coordinates in a system-specific manner, by amplifying the contribution from specific internal coordinates. The new weighting scheme with an appropriate selection of coordinates is proved to be effective in reducing the interpolation error along the reaction pathway. As a demonstration, we consider the photoactive yellow protein chromophore system, as it constitutes itself as an interesting target that bears long-standing questions related to excited-state dynamics inside protein environments. We build its two-state diabatic interpolated PES with the new weighting scheme. We indeed see the utility of our scheme by conducting nonadiabatic molecular dynamics simulations with the required semi-global PES based on a limited number of data points.
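As a rough illustration of what "choosing the weighting coordinates" can mean, the sketch below uses a generic Shepard-type (inverse-distance) interpolation in which selected coordinates are amplified by a per-coordinate scale factor before distances are computed. It is not the scheme from the paper: the function name, the `coord_scale` parameter, the exponent, and the use of bare energies instead of local Taylor expansions at each data point are all simplifying assumptions.

```python
import numpy as np

def shepard_energy(x, points, energies, coord_scale=None, power=4, eps=1e-12):
    """Inverse-distance-weighted energy at geometry x (hypothetical helper).

    points      : (n_data, n_coord) internal coordinates of the data points
    energies    : (n_data,) reference energies at those points
    coord_scale : optional (n_coord,) factors; values > 1 amplify a coordinate's
                  contribution to the weighting distance
    """
    x = np.asarray(x, dtype=float)
    points = np.asarray(points, dtype=float)
    energies = np.asarray(energies, dtype=float)
    if coord_scale is None:
        scale = np.ones(points.shape[1])
    else:
        scale = np.asarray(coord_scale, dtype=float)
    # Squared distance in the scaled ("weighting") coordinate space.
    d2 = np.sum((scale * (points - x)) ** 2, axis=1)
    # Unnormalised weights fall off as distance**(-power); eps avoids 1/0.
    raw = 1.0 / (d2 ** (power / 2) + eps)
    weights = raw / raw.sum()
    return float(weights @ energies)
```

Raising the scale factor of a coordinate makes data points that differ along it count as farther away, so the interpolation becomes more selective along that coordinate, for example along a reaction pathway.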
Affiliation(s)
- Seung Soo Kim
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea.
- Young Min Rhee
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea.
5
Célerse F, Wodrich MD, Vela S, Gallarati S, Fabregat R, Juraskova V, Corminboeuf C. From Organic Fragments to Photoswitchable Catalysts: The OFF-ON Structural Repository for Transferable Kernel-Based Potentials. J Chem Inf Model 2024; 64:1201-1212. [PMID: 38319296 PMCID: PMC10900300 DOI: 10.1021/acs.jcim.3c01953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/07/2024]
Abstract
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversities. Here, we introduce the OFF-ON (organic fragments from organocatalysts that are non-modular) database, a repository of 7869 equilibrium and 67,457 nonequilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a local kernel regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF-ON data set offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
Affiliation(s)
- Frédéric Célerse
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Matthew D. Wodrich
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Sergi Vela
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Simone Gallarati
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Raimon Fabregat
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Veronika Juraskova
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Clémence Corminboeuf
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
6
Ju CW, Shen Y, French EJ, Yi J, Bi H, Tian A, Lin Z. Accurate Electronic and Optical Properties of Organic Doublet Radicals Using Machine Learned Range-Separated Functionals. J Phys Chem A 2024. [PMID: 38382058 DOI: 10.1021/acs.jpca.3c07437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Luminescent organic semiconducting doublet-spin radicals are unique and emergent optical materials because their fluorescent quantum yields (Φfl) are not compromised by the spin-flipping intersystem crossing (ISC) into a dark high-spin state. The multiconfigurational nature of these radicals challenges their electronic structure calculations in the framework of single-reference density functional theory (DFT) and introduces room for method improvement. In the present study, we extended our earlier development of ML-ωPBE [J. Phys. Chem. Lett., 2021, 12, 9516-9524], a range-separated hybrid (RSH) exchange-correlation (XC) functional constructed using the stacked ensemble machine learning (SEML) algorithm, from closed-shell organic semiconducting molecules to doublet-spin organic semiconducting radicals. We assessed its performance for a new test set of 64 doublet-spin radicals from five categories while placing all previously compiled 3926 closed-shell molecules in the new training set. Interestingly, ML-ωPBE agrees with the nonempirical OT-ωPBE functional regarding the prediction of the molecule-dependent range-separation parameter (ω), with a small mean absolute error (MAE) of 0.0197 a_0^{-1}, but saves the computational cost by 2.46 orders of magnitude. This result demonstrates an outstanding domain adaptation capacity of ML-ωPBE for diverse organic semiconducting species. To further assess the predictive power of ML-ωPBE in experimental observables, we also applied it to evaluate absorption and fluorescence energies (Eabs and Efl) using linear-response time-dependent DFT (TDDFT), and we compared its behavior with nine popular XC functionals. For most radicals, ML-ωPBE reproduces experimental measurements of Eabs and Efl with small MAEs of 0.299 and 0.254 eV, only marginally different from those of OT-ωPBE.
Our work illustrates a successful extension of the SEML framework from closed-shell molecules to doublet-spin radicals and will open the venue for calculating optical properties for organic semiconductors using single-reference TDDFT.
Affiliation(s)
- Cheng-Wei Ju
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Pritzker School of Molecular Engineering, The University of Chicago, Chicago, Illinois 60637, United States
- Yili Shen
- Manning College of Information and Computer Sciences, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Ethan J French
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Department of Mathematics and Statistics, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Charlestown, Massachusetts 02129, United States
- Jun Yi
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Department of Chemistry, Wake Forest University, Winston-Salem, North Carolina 27109, United States
- Hongshan Bi
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Aaron Tian
- Manning College of Information and Computer Sciences, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Department of Mathematics and Statistics, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Zhou Lin
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
7
Morales AW, Du J, Warren DJ, Fernández-Jover E, Martinez-Navarrete G, Bouteiller JMC, McCreery DC, Lazzi G. Machine learning enables non-Gaussian investigation of changes to peripheral nerves related to electrical stimulation. Sci Rep 2024; 14:2795. [PMID: 38307915 PMCID: PMC10837107 DOI: 10.1038/s41598-024-53284-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 01/30/2024] [Indexed: 02/04/2024] Open
Abstract
Electrical stimulation of the peripheral nervous system (PNS) is becoming increasingly important for the therapeutic treatment of numerous disorders. Thus, as peripheral nerves are increasingly the target of electrical stimulation, it is critical to determine how, and when, electrical stimulation results in anatomical changes in neural tissue. We introduce here a convolutional neural network and support vector machines for cell segmentation and analysis of histological samples of the sciatic nerve of rats stimulated with varying current intensities. We describe the methodologies and present results that highlight the validity of the approach: machine learning enabled highly efficient nerve measurement collection, while multivariate analysis revealed notable changes to nerves' anatomy, even when subjected to levels of stimulation thought to be safe according to the Shannon current limits.
Affiliation(s)
- Andres W Morales
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, 90089, USA.
- Jinze Du
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089, USA
- David J Warren
- Department of Biomedical Engineering, University of Utah, Salt Lake City, UT, 84112, USA
- Jean-Marie C Bouteiller
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, 90089, USA
- Institute for Technology and Medical Systems (ITEMS), Keck School of Medicine, University of Southern California, Los Angeles, CA, 90089, USA
- Gianluca Lazzi
- Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, 90089, USA
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089, USA
- Department of Ophthalmology, University of Southern California, Los Angeles, CA, 90089, USA
- Institute for Technology and Medical Systems (ITEMS), Keck School of Medicine, University of Southern California, Los Angeles, CA, 90089, USA
8
Liu KL, Xiao RL, Ruan Y, Wei B. Active learning prediction and experimental confirmation of atomic structure and thermophysical properties for liquid Hf_{76}W_{24} refractory alloy. Phys Rev E 2023; 108:055310. [PMID: 38115461 DOI: 10.1103/physreve.108.055310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 10/18/2023] [Indexed: 12/21/2023]
Abstract
The determination of liquid atomic structure and thermophysical properties is essential for investigating the physical characteristics and phase transitions of refractory alloys. However, due to the stringent experimental requirements and underdeveloped interatomic potentials, acquiring such information through experimentation or simulation remains challenging. Here, an active learning method incorporating a deep neural network was established to generate the interatomic potential of the Hf_{76}W_{24} refractory alloy. Then the achieved potential was applied to investigate the liquid atomic structure and thermophysical properties of this alloy over a wide temperature range. The simulation results revealed the distinctive bonding preferences among atoms, that is, Hf atoms exhibited a strong tendency for conspecific bonding, while W atoms preferred to form an interspecific bonding. The analysis of short-range order (SRO) in the liquid alloy revealed a significant proportion of icosahedral (ICO) and distorted ICO structures, which even exceeded 30% in the undercooled state. As temperature decreased, SRO structures demonstrated an increase in larger coordination number (CN) clusters and a decrease in smaller CNs. The alterations of the atomic structure indicated that the liquid alloy becomes more ordered, densely packed, and energetically favorable with decreasing temperature, consistent with the obtained fact: Both density and surface tension increase linearly. The simulated thermophysical properties were close to experimental values with minor deviations of 2.8% for density and 3.4% for surface tension. The consistency of the thermophysical properties further attested to the accuracy and reliability of active learning simulation.
Affiliation(s)
- K L Liu
- MOE Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, School of Physical Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
- R L Xiao
- MOE Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, School of Physical Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
- Y Ruan
- MOE Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, School of Physical Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
- B Wei
- MOE Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, School of Physical Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
9
Muniz MC, Car R, Panagiotopoulos AZ. Neural Network Water Model Based on the MB-Pol Many-Body Potential. J Phys Chem B 2023; 127:9165-9171. [PMID: 37824703 DOI: 10.1021/acs.jpcb.3c04629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2023]
Abstract
The MB-pol many-body potential accurately predicts many properties of water, including cluster, liquid phase, and vapor-liquid equilibrium properties, but its high computational cost can make applying it in large-scale simulations quite challenging. In order to address this limitation, we developed a "deep potential" neural network (DPMD) model based on the MB-pol potential for water. We find that a DPMD model trained on mostly liquid configurations yields a good description of the bulk liquid phase but severely underpredicts vapor-liquid coexistence densities. By contrast, adding cluster configurations to the neural network training set leads to a good agreement for the vapor coexistence densities. Liquid phase densities under supercooled conditions are also represented well, even though they were not included in the training set. These results confirm that neural network models can combine accuracy and transferability if sufficient attention is given to the construction of a representative training set for the target system.
Affiliation(s)
- Maria Carolina Muniz
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Roberto Car
- Department of Chemistry, Department of Physics, Program in Applied and Computational Mathematics, and Princeton Materials Institute, Princeton University, Princeton, New Jersey 08544, United States
10
Kývala L, Dellago C. Optimizing the architecture of Behler-Parrinello neural network potentials. J Chem Phys 2023; 159:094105. [PMID: 37655764 DOI: 10.1063/5.0167260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 08/10/2023] [Indexed: 09/02/2023] Open
Abstract
The architecture of neural network potentials is typically optimized at the beginning of the training process and remains unchanged throughout. Here, we investigate the accuracy of Behler-Parrinello neural network potentials for varying training set sizes. Using the QM9 and 3BPA datasets, we show that adjusting the network architecture according to the training set size improves the accuracy significantly. We demonstrate that both an insufficient and an excessive number of fitting parameters can have a detrimental impact on the accuracy of the neural network potential. Furthermore, we investigate the influences of descriptor complexity, neural network depth, and activation function on the model's performance. We find that for the neural network potentials studied here, two hidden layers yield the best accuracy and that unbounded activation functions outperform bounded ones.
Affiliation(s)
- Lukáš Kývala
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
- Vienna Doctoral School in Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Christoph Dellago
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
11
Lakhouit A, Shaban M, Alatawi A, Abbas SYH, Asiri E, Al Juhni T, Elsawy M. Machine-learning approaches in geo-environmental engineering: Exploring smart solid waste management. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2023; 330:117174. [PMID: 36586367 DOI: 10.1016/j.jenvman.2022.117174] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/07/2022] [Revised: 12/19/2022] [Accepted: 12/28/2022] [Indexed: 06/17/2023]
Abstract
Over the past few decades, increased attention has been paid to domestic waste (DW) generation. DW comprises a large percentage of municipal solid waste (MSW), and its handling and processing involves serious technical issues while also consuming a major portion of municipal budgets. The accurate estimation, prediction, and characterization of DW is an ongoing challenge for many cities, municipalities, and local governments as they strive to implement sustainable strategies for MSW. The main objective of the present study is to estimate and correctly predict DW quantities using machine-learning (ML) algorithms. Several different ML algorithms are used in the research, including linear regression, regression trees, Gaussian process regression, support vector machine, and autoregressive integrated moving average methods for time series analysis. Two case studies are presented in this paper. In the first, domestic waste data covering the period from 2010 to 2021 were collected from the Saudi and Bahraini authorities, and in the second, the domestic waste-generating behavior of a family of eleven members was followed for one month. The results show that the biodegradable and non-biodegradable wastes generated by the family were in the range of 1.7-7.9 kg and 0.0-2.0 kg, respectively, and promising outcomes were obtained using an appropriate selection of input predictors in conjunction with time series analysis. The trained models are validated and tested using several types of evaluation metrics, including calculated residuals, mean square error, root mean square error, and the coefficient of determination (R² score). The latter values are in the range of 0.67-0.85 for the training and testing datasets for many of the predicted waste quantities. The results obtained from the study show that these algorithms can be used to reduce the environmental, economic, and societal impacts of waste by designing a smart waste management engineering system.
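The evaluation metrics named in this abstract (residuals, mean square error, root mean square error, and the coefficient of determination) can be computed as in the following minimal sketch. The function name and the returned dictionary keys are assumptions made for illustration; nothing here reproduces the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    # Returns residuals, MSE, RMSE and R^2 for paired observed/predicted values.
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mse = ss_res / len(residuals)
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return {
        "residuals": residuals,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1.0 - ss_res / ss_tot,
    }
```

A model that always predicts the mean of the observations scores R² = 0 under this definition, which gives a reference point for reading the 0.67-0.85 range quoted above.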
Affiliation(s)
- Abderrahim Lakhouit
- Department of Civil Engineering, Faculty of Engineering, University of Tabuk, Tabuk 71421, Saudi Arabia.
- Mahmoud Shaban
- Department of Electrical Engineering, Faculty of Engineering, Aswan University, Aswan 81542, Egypt; Department of Electrical Engineering, College of Engineering, Qassim University, Unaizah 56452, Saudi Arabia
- Aishah Alatawi
- Department of Biology, Faculty of Science, University of Tabuk, Tabuk 71421, Saudi Arabia
- Sumaya Y H Abbas
- Department of Natural Resources and Environment, College of Graduate Studies, Arabian Gulf University, Bahrain
- Emad Asiri
- Department of Civil Engineering, Faculty of Engineering, University of Tabuk, Tabuk 71421, Saudi Arabia
- Tareq Al Juhni
- Department of Civil Engineering, Faculty of Engineering, University of Tabuk, Tabuk 71421, Saudi Arabia
- Mohamed Elsawy
- Department of Civil Engineering, Faculty of Engineering, University of Tabuk, Tabuk 71421, Saudi Arabia; Geotechnical and Foundations Engineering, Department of Civil Engineering, Faculty of Engineering, Aswan University, 81542, Egypt
|
12
|
Huang Z, Wang Q, Liu X, Liu X. First-principles based deep neural network force field for molecular dynamics simulation of N-Ga-Al semiconductors. Phys Chem Chem Phys 2023; 25:2349-2358. [PMID: 36598036 DOI: 10.1039/d2cp04697k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
Accurate interatomic force fields are of paramount importance for molecular dynamics simulations to explore the thermal transport at the GaN/AlN heterogeneous interface, which is a key factor hindering heat dissipation and limiting the performance of GaN power electronic devices. In this work, an interatomic potential (force field) based on a deep neural network technique and first-principles calculations is developed for N-Ga-Al semiconductors to predict the elastic and thermodynamic properties. Using our deep neural network potential (NNP), the precise structural features, elastic constants, and thermal conductivities of GaN, AlN, and their alloy are obtained, which agree well with those from experiments and first-principles calculations. The interfacial thermal conductance of GaN/AlN heterostructures with different interfacial morphologies is further studied using molecular dynamics simulations with the NNP. It is found that atomic interdiffusion and disorder at the interfaces dramatically reduce the interfacial thermal conductance. The developed NNP exhibits a larger effective dimension with respect to classical empirical potentials and reaches competitive performances, thus pointing towards attractive advantages in the study of GaN heterostructures and devices with the NNP.
Affiliation(s)
- Zixuan Huang
- Institute of Micro/Nano Electromechanical System, College of Mechanical Engineering, State Key Laboratory for Modification of Chemical Fibers and Polymer Materials, Donghua University, Shanghai, China.
- Quanjie Wang
- Institute of Micro/Nano Electromechanical System, College of Mechanical Engineering, Donghua University, Shanghai, China
- Xinyu Liu
- Institute of Micro/Nano Electromechanical System, College of Mechanical Engineering, Donghua University, Shanghai, China
- Xiangjun Liu
- Institute of Micro/Nano Electromechanical System, College of Mechanical Engineering, State Key Laboratory for Modification of Chemical Fibers and Polymer Materials, Donghua University, Shanghai, China.
|
13
|
Cameron AR, Proud AJ, Pearson JK. Machine Learned Composite Methods for Electronic Structure Theory. J Chem Theory Comput 2023; 19:51-60. [PMID: 36507875 DOI: 10.1021/acs.jctc.2c00564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Because of the prohibitive scaling of ab initio techniques for modeling chemical species with high accuracy, they are not generally tractable for large systems. It is therefore of considerable interest to develop high-accuracy computational models with low computational cost that can afford predictions of electronic structure and properties of macromolecular species. Composite methods, as first introduced by Pople [Pople, J. A.; Head-Gordon, M.; Fox, D. J.; Raghavachari, K.; Curtiss, L. A. J. Chem. Phys. 1989, 90, 5622.], are an intuitive solution to this problem as they seek to systematically increase accuracy in model chemistries by taking advantage of favorable error cancellation among reasonably low-cost models. By linearly combining a series of carefully chosen model chemistries, the result of a prohibitive-scaling correlated model chemistry with a large basis set may be approximated with relatively good fidelity. However, the full extent to which the choice of low-cost models dictates the predictive accuracy of composite methods is not known, and a full exploration of all model chemistries would be advantageous for the design and validation of a generalizable composite method for widespread application. Here, we show that remarkable accuracy can be generally achieved with composite methods that are more judiciously constructed, leading to increased accuracy with significantly reduced computational cost. By designing a systematic procedure for the automated generation and assessment of over 10 billion unique composite methods, we have extensively explored the space of modern model chemistries to elucidate important design principles in the construction of reliable composite procedures. We anticipate our work to be the starting point in the pursuit of creative approaches to modeling large chemical systems with high accuracy by using novel combinatorial modeling.
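The idea of linearly combining model chemistries can be sketched as a weighted sum of energies from cheaper calculations. The example below assumes the familiar additivity pattern E(high level, large basis) ≈ E(high level, small basis) + E(low level, large basis) − E(low level, small basis); the labels, coefficients, and energy values are hypothetical and are not taken from the paper, whose own procedure for generating and assessing composites is not reproduced here.

```python
def composite_energy(energies, coefficients):
    # Linear combination of component energies: sum_i c_i * E_i.
    # Both arguments are dicts keyed by a model-chemistry label.
    missing = set(coefficients) - set(energies)
    if missing:
        raise KeyError(f"no energy supplied for: {sorted(missing)}")
    return sum(c * energies[label] for label, c in coefficients.items())


# Hypothetical additivity scheme (labels are illustrative only).
ADDITIVITY = {
    "high/small": 1.0,
    "low/large": 1.0,
    "low/small": -1.0,
}

# Made-up energies in arbitrary units, for demonstration only.
example_energies = {
    "high/small": -1.0,
    "low/large": -2.0,
    "low/small": -1.5,
}

estimate = composite_energy(example_energies, ADDITIVITY)
```

Treating the coefficients as free parameters, rather than fixing them at ±1, is one way such a scheme could be generalized, though that choice is an assumption of this sketch.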
Affiliation(s)
- Andrew R Cameron
- Institute for Quantum Computing, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada; Department of Physics & Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Adam J Proud
- Department of Chemistry, University of Prince Edward Island, 550 University Avenue, Charlottetown, Prince Edward Island C1A 4P3, Canada
- Jason K Pearson
- Department of Chemistry, University of Prince Edward Island, 550 University Avenue, Charlottetown, Prince Edward Island C1A 4P3, Canada
|
14
|
Nguyen TH, Nguyen LH, Truong TN. Application of Machine Learning in Developing Quantitative Structure-Property Relationship for Electronic Properties of Polyaromatic Compounds. ACS Omega 2022; 7:22879-22888. [PMID: 35811887 PMCID: PMC9261278 DOI: 10.1021/acsomega.2c02650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 05/25/2022] [Indexed: 06/15/2023]
Abstract
The degree of π orbital overlap (DPO) model has been demonstrated to be an excellent quantitative structure-property relationship (QSPR) that can map two-dimensional structural information of polycyclic aromatic hydrocarbons (PAHs) and thienoacenes to their electronic properties, namely, band gaps, electron affinities, and ionization potentials. However, the model suffers from significant limitations that narrow its applications due to inefficient manual procedures in parameter optimization and descriptor formulation. In this work, we developed a machine learning (ML)-based method for efficiently optimizing DPO parameters and proposed a truncated DPO descriptor, which is simple enough to be extracted automatically from simplified molecular-input line-entry system strings of PAHs and thienoacenes. Compared with the results of our previous studies, the ML-based methodology can optimize DPO parameters with four times fewer data, while achieving the same level of accuracy in predictions of the mentioned electronic properties, to within 0.1 eV. The truncated DPO model also has similar accuracy to the full DPO model. Consequently, the ML-based DPO approach coupled with the truncated DPO model enables new possibilities for developing automatic pipelines for high-throughput screening and investigating new QSPR for new chemical classes.
Affiliation(s)
- Tuan H Nguyen
- Institute for Computational Science and Technology, Ho Chi Minh City 700000, Vietnam
- Lam H Nguyen
- Institute for Computational Science and Technology, Ho Chi Minh City 700000, Vietnam
- Thanh N Truong
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
|
15
|
Sullivan J, Mirhashemi A, Lee J. Deep learning based analysis of microstructured materials for thermal radiation control. Sci Rep 2022; 12:9785. [PMID: 35697745 PMCID: PMC9192759 DOI: 10.1038/s41598-022-13832-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/12/2022] [Accepted: 05/30/2022] [Indexed: 12/21/2022]
Abstract
Microstructured materials that can selectively control the optical properties are crucial for the development of thermal management systems in aerospace and space applications. However, because the design space for microstructures is vast, spanning the materials, wavelengths, and temperature conditions relevant to thermal radiation, microstructure design optimization becomes a very time-intensive process whose results apply only to specific and limited conditions. Here, we develop a deep neural network to emulate the outputs of finite-difference time-domain (FDTD) simulations. The network we show is the foundation of a machine learning based approach to microstructure design optimization for thermal radiation control. Our neural network differentiates materials using discrete inputs derived from the materials’ complex refractive index, enabling the model to build relationships between the microtexture’s geometry, wavelength, and material. Thus, material selection does not constrain our network and it is capable of accurately extrapolating optical properties for microstructures of materials not included in the training process. Our surrogate deep neural network can synthetically simulate over 1,000,000 distinct combinations of geometry, wavelength, temperature, and material in less than a minute, representing a speed increase of over 8 orders of magnitude compared to typical FDTD simulations. This speed enables us to perform sweeping thermal-optical optimizations rapidly to design advanced passive cooling or heating systems. The deep learning-based approach enables complex thermal and optical studies that would be impossible with conventional simulations and our network design can be used to effectively replace optical simulations for other microstructures.
Affiliation(s)
- Jonathan Sullivan
- Department of Mechanical and Aerospace Engineering, University of California, Irvine, USA
- Jaeho Lee
- Department of Mechanical and Aerospace Engineering, University of California, Irvine, USA.
|
16
|
Oren E, Kartoon D, Makov G. Machine Learning-Based Modeling of High-Pressure phase diagrams: Anomalous melting of Rb. J Chem Phys 2022; 157:014502. [DOI: 10.1063/5.0088089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Modeling of phase diagrams and in particular the anomalous reentrant melting curves of the alkali metals is an open challenge for interatomic potentials. Machine learning-based interatomic potentials have shown promise in overcoming this challenge, unlike earlier embedded atom-based approaches. We introduce a relatively simple and cheap approach to develop, train, and validate a neural network-based, wide-ranging interatomic potential transferable across both temperature and pressure. This approach is based on training the potential at high pressures only in the liquid phase and on validating its transferability on the relatively easy-to-calculate cold compression curve. Our approach is demonstrated on the phase diagram of Rb for which we reproduce the cold compression curve over the Rb-I (BCC), Rb-II (FCC), and Rb-V (tI4) phases, followed by the high-pressure melting curve including the re-entry after the maximum and then the minimum at the triple liquid-FCC-BCC point. Furthermore, our potential is able to partially capture even the very recently reported liquid-liquid transition in Rb, indicating the utility of machine learning-based potentials.
Affiliation(s)
- Eyal Oren
- Ben-Gurion University of the Negev, Israel
- Guy Makov
- Materials Engineering, Ben-Gurion University of the Negev, Israel
|
17
|
Packwood D, Nguyen LTH, Cesana P, Zhang G, Staykov A, Fukumoto Y, Nguyen DH. Machine Learning in Materials Chemistry: An Invitation. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2022.100265] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/05/2023]
|
18
|
Comprehensive Modeling in Predicting Liquid Density of the Refrigerant Systems Using Least-Squares Support Vector Machine Approach. International Journal of Chemical Engineering 2022. [DOI: 10.1155/2022/8356321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/27/2023]
Abstract
A robust machine learning algorithm known as the least-squares support vector machine (LSSVM) model was used to predict the liquid densities of 48 different refrigerant systems. Hence, a massive dataset was gathered using the reports published previously. The proposed model was evaluated via various analyses. Based on the statistical analysis results, the actual values predicted by this model have high accuracy, and the calculated values of RMSE, MRE, STD, and R2 were 0.0116, 0.158, 0.1070, and 0.999, respectively. Moreover, sensitivity analysis was done on the efficient input parameters, and it was found that CF2H2 has the most positive effect on the output parameter (with a relevancy factor of +50.19). Furthermore, for checking the real data accuracy, the technique of leverage was considered, the results of which revealed that most of the considered data are reliable. The power and accuracy of this simple model in predicting liquid densities of different refrigerant systems are high; therefore, it is an appropriate alternative for laboratory data.
|
19
|
Herbold M, Behler J. A Hessian-based assessment of atomic forces for training machine learning interatomic potentials. J Chem Phys 2022; 156:114106. [DOI: 10.1063/5.0082952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
In recent years, many types of machine learning potentials (MLPs) have been introduced, which are able to represent high-dimensional potential-energy surfaces (PESs) with close to first-principles accuracy. Most current MLPs rely on atomic energy contributions given as a function of the local chemical environments. Frequently, in addition to total energies, atomic forces are also used to construct the potentials, as they provide detailed local information about the PES. Since many systems are too large for electronic structure calculations, obtaining reliable reference forces from smaller subsystems, such as molecular fragments or clusters, can substantially simplify the construction of the training sets. Here, we propose a method to determine structurally converged molecular fragments, providing reliable atomic forces based on an analysis of the Hessian. The method, which serves as a locality test and allows us to estimate the importance of long-range interactions, is illustrated for a series of molecular model systems and the metal–organic framework MOF-5 as an example for a complex organic–inorganic hybrid material.
Affiliation(s)
- Marius Herbold
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
- Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
|
20
|
Fabregat R, Fabrizio A, Engel EA, Meyer B, Juraskova V, Ceriotti M, Corminboeuf C. Local Kernel Regression and Neural Network Approaches to the Conformational Landscapes of Oligopeptides. J Chem Theory Comput 2022; 18:1467-1479. [PMID: 35179897 PMCID: PMC8908737 DOI: 10.1021/acs.jctc.1c00813] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The application of machine learning to theoretical chemistry has made it possible to combine the accuracy of quantum chemical energetics with the thorough sampling of finite-temperature fluctuations. To reach this goal, a diverse set of methods has been proposed, ranging from simple linear models to kernel regression and highly nonlinear neural networks. Here we apply two widely different approaches to the same, challenging problem: the sampling of the conformational landscape of polypeptides at finite temperature. We develop a local kernel regression (LKR) coupled with a supervised sparsity method and compare it with a more established approach based on Behler-Parrinello type neural networks. In the context of the LKR, we discuss how the supervised selection of the reference pool of environments is crucial to achieve accurate potential energy surfaces at a competitive computational cost and leverage the locality of the model to infer which chemical environments are poorly described by the DFTB baseline. We then discuss the relative merits of the two frameworks and perform Hamiltonian-reservoir replica-exchange Monte Carlo sampling and metadynamics simulations, respectively, to demonstrate that both frameworks can achieve converged and transferable sampling of the conformational landscape of complex and flexible biomolecules with comparable accuracy and computational cost.
Affiliation(s)
- Edgar A Engel
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
21
|
Xia D, Chen J, Fu Z, Xu T, Wang Z, Liu W, Xie HB, Peijnenburg WJGM. Potential Application of Machine-Learning-Based Quantum Chemical Methods in Environmental Chemistry. Environmental Science & Technology 2022; 56:2115-2123. [PMID: 35084191 DOI: 10.1021/acs.est.1c05970] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
It is an important topic in environmental sciences to understand the behavior and toxicology of chemical pollutants. Quantum chemical methodologies have served as useful tools for probing behavior and toxicology of chemical pollutants in recent decades. In recent years, machine learning (ML) techniques have brought revolutionary developments to the field of quantum chemistry, which may be beneficial for investigating environmental behavior and toxicology of chemical pollutants. However, the ML-based quantum chemical methods (ML-QCMs) have only scarcely been used in environmental chemical studies so far. To promote applications of the promising methods, this Perspective summarizes recent progress in the ML-QCMs and focuses on their potential applications in environmental chemical studies that could hardly be achieved by the conventional quantum chemical methods. Potential applications and challenges of the ML-QCMs in predicting degradation networks of chemical pollutants, searching global minima for atmospheric nanoclusters, discovering heterogeneous or photochemical transformation pathways of pollutants, as well as predicting environmentally relevant end points with wave functions as descriptors are introduced and discussed.
Affiliation(s)
- Deming Xia
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhiqiang Fu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Tong Xu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhongyu Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Wenjia Liu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Hong-Bin Xie
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Willie J G M Peijnenburg
- Institute of Environmental Sciences (CML), Leiden University, Leiden 2300 RA, The Netherlands
- Centre for Safety of Substances and Products, National Institute of Public Health and the Environment (RIVM), Bilthoven 3720 BA, The Netherlands
|
22
|
Velez C, Acevedo O. Simulation of deep eutectic solvents: Progress to promises. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1598] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/16/2022]
Affiliation(s)
- Caroline Velez
- Department of Chemistry University of Miami Coral Gables Florida USA
- Orlando Acevedo
- Department of Chemistry University of Miami Coral Gables Florida USA
|
23
|
Abstract
In the past two decades, machine learning potentials (MLPs) have reached a level of maturity that now enables applications to large-scale atomistic simulations of a wide range of systems in chemistry, physics, and materials science. Different machine learning algorithms have been used with great success in the construction of these MLPs. In this review, we discuss an important group of MLPs relying on artificial neural networks to establish a mapping from the atomic structure to the potential energy. In spite of this common feature, there are important conceptual differences among MLPs, which concern the dimensionality of the systems, the inclusion of long-range electrostatic interactions, global phenomena like nonlocal charge transfer, and the type of descriptor used to represent the atomic structure, which can be either predefined or learnable. A concise overview is given along with a discussion of the open challenges in the field. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Emir Kocer
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
- Tsz Wai Ko
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
- Jörg Behler
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
|
24
|
Farrar EHE, Grayson MN. Machine learning and semi-empirical calculations: a synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction. Chem Sci 2022; 13:7594-7603. [PMID: 35872815 PMCID: PMC9242013 DOI: 10.1039/d2sc02925a] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/25/2022] [Accepted: 06/08/2022] [Indexed: 11/21/2022]
Abstract
A synergistic approach that combines machine learning with semi-empirical methods enables the fast and accurate prediction of DFT-quality reaction barriers, with mechanistic insights available from semi-empirical transition state geometries.
Affiliation(s)
- Elliot H. E. Farrar
- Department of Chemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Matthew N. Grayson
- Department of Chemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
|
25
|
Xue LY, Guo F, Wen YS, Feng SQ, Huang XN, Guo L, Li HS, Cui SX, Zhang GQ, Wang QL. ReaxFF-MPNN machine learning potential: a combination of reactive force field and message passing neural networks. Phys Chem Chem Phys 2021; 23:19457-19464. [PMID: 34524283 DOI: 10.1039/d1cp01656c] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/21/2022]
Abstract
Reactive force field (ReaxFF) is a powerful computational tool for exploring material properties. In this work, we proposed an enhanced reactive force field model, which uses message passing neural networks (MPNN) to compute the bond order and bond energies. MPNN are a variation of graph neural networks (GNN), which are derived from graph theory. In MPNN or GNN, molecular structures are treated as a graph and atoms and chemical bonds are represented by nodes and edges. The edge states correspond to the bond order in ReaxFF and are updated by message functions according to the message passing algorithms. The results are very encouraging: predictions of the potential energy surface, reaction energies, and equation of state are greatly improved by this simple modification. The new potential model, called reactive force field with message passing neural networks (ReaxFF-MPNN), is provided as an interface to the Atomic Simulation Environment (ASE), so that both the original ReaxFF and the ReaxFF-MPNN potential models can be used for MD simulations and geometry optimizations within ASE. Furthermore, machine learning, based on an active learning algorithm and gradient optimizer, is designed to train the model. We found that the active learning machine not only saves the manual work to collect the training data but is also much more effective than the general optimizer.
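The graph picture described here, with atoms as nodes, bonds as edges, and edge states updated by message functions, can be illustrated with a single toy message-passing round. This is a minimal sketch under stated assumptions: scalar node and edge states, a sigmoid message function, and the weights `w_edge` and `w_node` are all hypothetical choices, and the code is not the ReaxFF-MPNN model or its ASE interface.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def message_passing_step(node_states, edge_states, w_edge=1.0, w_node=0.5):
    # node_states: list of scalar atom states, indexed by atom.
    # edge_states: dict mapping an atom pair (i, j) to a scalar, bond-order-like state.
    # Step 1: update every edge from its old state and the states of its two atoms.
    new_edges = {}
    for (i, j), state in edge_states.items():
        message = w_edge * state + w_node * (node_states[i] + node_states[j])
        new_edges[(i, j)] = sigmoid(message)  # squashed into the interval (0, 1)
    # Step 2: each atom aggregates its neighbours' old states, weighted by the new edges.
    new_nodes = list(node_states)
    for (i, j), state in new_edges.items():
        new_nodes[i] += state * node_states[j]
        new_nodes[j] += state * node_states[i]
    return new_nodes, new_edges
```

Repeating the step several times lets information travel beyond nearest neighbours, which is the general motivation for stacking message-passing rounds in graph neural networks.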
Affiliation(s)
- Li-Yuan Xue
- Shandong Provincial Key Laboratory of Optical Communication Science and Technology, Liaocheng, 252000, China.
- Feng Guo
- Shandong Provincial Key Laboratory of Optical Communication Science and Technology, Liaocheng, 252000, China; School of Physical Science and Information Technology, Liaocheng University, Liaocheng, 252000, China
- Yu-Shi Wen
- Institute of Chemical Materials, China Academy of Engineering Physics (CAEP), Mianyang, Sichuan, 621900, China.
- Shi-Quan Feng
- School of Physics and Electronic Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Xiao-Na Huang
- School of Power and Mechanical Engineering, Wuhan University, Wuhan, Hubei, 430072, China
- Lei Guo
- School of Business, Shandong Normal University, Jinan, 250014, China
- Heng-Shuai Li
- Shandong Provincial Key Laboratory of Optical Communication Science and Technology, Liaocheng, 252000, China.
- Shou-Xin Cui
- School of Physical Science and Information Technology, Liaocheng University, Liaocheng, 252000, China
- Gui-Qing Zhang
- School of Physical Science and Information Technology, Liaocheng University, Liaocheng, 252000, China
- Qing-Lin Wang
- Shandong Provincial Key Laboratory of Optical Communication Science and Technology, Liaocheng, 252000, China; School of Physical Science and Information Technology, Liaocheng University, Liaocheng, 252000, China
|
26
|
Guan S, Shang C, Liu Z. Structure and Dynamics of Energy Materials from Machine Learning Simulations: A Topical Review. Chin J Chem 2021. [DOI: 10.1002/cjoc.202100299] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Affiliation(s)
- Shu‐Hui Guan
- Shanghai Academy of Agricultural Sciences, Shanghai 201403, China
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
| | - Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
| | - Zhi‐Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
27
|
Miksch AM, Morawietz T, Kästner J, Urban A, Artrith N. Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abfd96] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning are also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.
28
|
Koutsoukos S, Philippi F, Malaret F, Welton T. A review on machine learning algorithms for the ionic liquid chemical space. Chem Sci 2021; 12:6820-6843. [PMID: 34123314 PMCID: PMC8153233 DOI: 10.1039/d1sc01000j] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 02/19/2021] [Accepted: 04/28/2021] [Indexed: 01/05/2023]
Abstract
There are thousands of papers published every year investigating the properties and possible applications of ionic liquids. Industrial use of these exceptional fluids requires adequate understanding of their physical properties, in order to create the ionic liquid that will optimally suit the application. Computational property prediction arose from the urgent need to minimise the time and cost that would be required to experimentally test different combinations of ions. This review discusses the use of machine learning algorithms as property prediction tools for ionic liquids (either as standalone methods or in conjunction with molecular dynamics simulations), presents common problems of training datasets and proposes ways that could lead to more accurate and efficient models.
Affiliation(s)
- Spyridon Koutsoukos
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, London W12 0BZ, UK
| | - Frederik Philippi
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, London W12 0BZ, UK
| | - Francisco Malaret
- Department of Chemical Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Tom Welton
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, London W12 0BZ, UK
29
|
Butowska K, Woziwodzka A, Borowik A, Piosik J. Polymeric Nanocarriers: A Transformation in Doxorubicin Therapies. Materials (Basel, Switzerland) 2021; 14:2135. [PMID: 33922291 PMCID: PMC8122860 DOI: 10.3390/ma14092135] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/01/2021] [Revised: 04/15/2021] [Accepted: 04/20/2021] [Indexed: 02/06/2023]
Abstract
Doxorubicin, a member of the anthracycline family, is a common anticancer agent often used as a first line treatment for the wide spectrum of cancers. Doxorubicin-based chemotherapy, although effective, is associated with serious side effects, such as irreversible cardiotoxicity or nephrotoxicity. Those often life-threatening adverse risks, responsible for the elongation of the patients' recuperation period and increasing medical expenses, have prompted the need for creating novel and safer drug delivery systems. Among many proposed concepts, polymeric nanocarriers are shown to be a promising approach, allowing for controlled and selective drug delivery, simultaneously enhancing its activity towards cancerous cells and reducing toxic effects on healthy tissues. This article is a chronological examination of the history of the work progress on polymeric nanostructures, designed as efficient doxorubicin nanocarriers, with the emphasis on the main achievements of 2010-2020. Numerous publications have been reviewed to provide an essential summation of the nanopolymer types and their essential properties, mechanisms towards efficient drug delivery, as well as active targeting stimuli-responsive strategies that are currently utilized in the doxorubicin transportation field.
Affiliation(s)
- Kamila Butowska
- Laboratory of Biophysics, Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Abrahama 58, 80-307 Gdańsk, Poland
| | - Anna Woziwodzka
- Laboratory of Biophysics, Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Abrahama 58, 80-307 Gdańsk, Poland
| | - Agnieszka Borowik
- Laboratory of Biophysics, Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Abrahama 58, 80-307 Gdańsk, Poland
- Aging and Metabolism Research Program, Oklahoma Medical Research Foundation (OMRF), Oklahoma City, OK 73104, USA
| | - Jacek Piosik
- Laboratory of Biophysics, Intercollegiate Faculty of Biotechnology, University of Gdansk and Medical University of Gdansk, Abrahama 58, 80-307 Gdańsk, Poland
30
|
Paleico ML, Behler J. A bin and hash method for analyzing reference data and descriptors in machine learning potentials. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abe663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022]
Abstract
In recent years the development of machine learning potentials (MLPs) has become a very active field of research. Numerous approaches have been proposed, which allow one to perform extended simulations of large systems at a small fraction of the computational costs of electronic structure calculations. The key to the success of modern MLPs is the close-to first principles quality description of the atomic interactions. This accuracy is reached by using very flexible functional forms in combination with high-level reference data from electronic structure calculations. These data sets can include up to hundreds of thousands of structures covering millions of atomic environments to ensure that all relevant features of the potential energy surface are well represented. The handling of such large data sets is nowadays becoming one of the main challenges in the construction of MLPs. In this paper we present a method, the bin-and-hash (BAH) algorithm, to overcome this problem by enabling the efficient identification and comparison of large numbers of multidimensional vectors. Such vectors emerge in multiple contexts in the construction of MLPs. Examples are the comparison of local atomic environments to identify and avoid unnecessary redundant information in the reference data sets that is costly in terms of both the electronic structure calculations as well as the training process, the assessment of the quality of the descriptors used as structural fingerprints in many types of MLPs, and the detection of possibly unreliable data points. The BAH algorithm is illustrated for the example of high-dimensional neural network potentials using atom-centered symmetry functions for the geometrical description of the atomic environments, but the method is general and can be combined with any current type of MLP.
31
|
Cordero JA, He K, Janya K, Echigo S, Itoh S. Predicting formation of haloacetic acids by chlorination of organic compounds using machine-learning-assisted quantitative structure-activity relationships. Journal of Hazardous Materials 2021; 408:124466. [PMID: 33191030 DOI: 10.1016/j.jhazmat.2020.124466] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/24/2020] [Revised: 10/30/2020] [Accepted: 10/31/2020] [Indexed: 06/11/2023]
Abstract
The presence of disinfection byproducts (DBPs) in drinking water is a major public health concern, and an effective strategy to limit the formation of these DBPs is to prevent their precursors. In silico prediction from chemical structure would allow rapid identification of precursors and could be used as a prescreening tool to prioritize testing. We present models using machine learning algorithms (i.e., support vector regressor, random forest regressor, and multilayer perceptron regressor) and chemical descriptors as features to predict the formation of haloacetic acids (HAAs). A robust model with good predictivity (i.e., leave-one-out cross-validated Q² > 0.5) to predict the formation of trichloroacetic acid (TCAA) was developed using a random forest regressor. The number of aromatic bonds, hydrophilicity, and electrotopological descriptors related to electrostatic interactions and the atomic distribution of electronegativity were identified as important predictors of TCAA formation potentials (FPs). However, the prediction of dichloroacetic acid was less accurate, which is congruent with the presence of different types of precursors exhibiting distinct mechanisms. This study demonstrates that nonlinear combinations of general chemical descriptors can adequately estimate HAAFPs, and we hope that our study can be used to predict precursors of other disinfection byproducts based on chemical structures using a similar workflow.
Affiliation(s)
- José Andrés Cordero
- Department of Environmental Engineering, Graduate School of Engineering, Kyoto University, Nishikyo, Kyoto 6158540, Japan
| | - Kai He
- Research Center for Environmental Quality Management, Kyoto University, 1-2 Yumihama, Otsu, Shiga 5200811, Japan.
| | - Kanjira Janya
- Department of Chemical Engineering, Faculty of Engineering, Mahidol University, Nakorn Pathom 73170, Thailand
| | - Shinya Echigo
- Department of Environmental Engineering, Graduate School of Engineering, Kyoto University, Nishikyo, Kyoto 6158540, Japan
| | - Sadahiko Itoh
- Department of Environmental Engineering, Graduate School of Engineering, Kyoto University, Nishikyo, Kyoto 6158540, Japan
32
|
Zubatiuk T, Isayev O. Development of Multimodal Machine Learning Potentials: Toward a Physics-Aware Artificial Intelligence. Acc Chem Res 2021; 54:1575-1585. [PMID: 33715355 DOI: 10.1021/acs.accounts.0c00868] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Indexed: 01/24/2023]
Abstract
Machine learning interatomic potentials (MLIPs) are widely used for describing molecular energy and continue bridging the speed and accuracy gap between quantum mechanical (QM) and classical approaches like force fields. In this Account, we focus on the out-of-the-box approaches to developing transferable MLIPs for diverse chemical tasks. First, we introduce the "Accurate Neural Network engine for Molecular Energies," ANAKIN-ME, method (or ANI for short). The ANI model utilizes Justin Smith Symmetry Functions (JSSFs) and realizes training for vast data sets. The training data set of several orders of magnitude larger than before has become the key factor of the knowledge transferability and flexibility of MLIPs. As the quantity, quality, and types of interactions included in the training data set will dictate the accuracy of MLIPs, the task of proper data selection and model training could be assisted with advanced methods like active learning (AL), transfer learning (TL), and multitask learning (MTL). Next, we describe the AIMNet "Atoms-in-Molecules Network" that was inspired by the quantum theory of atoms in molecules. The AIMNet architecture lifts multiple limitations in MLIPs. It encodes long-range interactions and learnable representations of chemical elements. We also discuss the AIMNet-ME model that expands the applicability domain of AIMNet from neutral molecules toward open-shell systems. The AIMNet-ME encompasses a dependence of the potential on molecular charge and spin. It brings ML and physical models one step closer, ensuring the correct molecular energy behavior over the total molecular charge. We finally describe perhaps the simplest possible physics-aware model, which combines ML and the extended Hückel method. In ML-EHM, "Hierarchically Interacting Particle Neural Network," HIP-NN generates the set of a molecule- and environment-dependent Hamiltonian elements αμμ and K‡. As a test example, we show how in contrast to traditional Hückel theory, ML-EHM correctly describes orbital crossing with bond rotations. Hence it learns the underlying physics, highlighting that the inclusion of proper physical constraints and symmetries could significantly improve ML model generalization.
Affiliation(s)
- Tetiana Zubatiuk
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
33
|
Real-time release testing of dissolution based on surrogate models developed by machine learning algorithms using NIR spectra, compression force and particle size distribution as input data. Int J Pharm 2021; 597:120338. [DOI: 10.1016/j.ijpharm.2021.120338] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/28/2020] [Revised: 01/26/2021] [Accepted: 01/30/2021] [Indexed: 12/28/2022]
34
|
Yue S, Muniz MC, Calegari Andrade MF, Zhang L, Car R, Panagiotopoulos AZ. When do short-range atomistic machine-learning models fall short? J Chem Phys 2021; 154:034111. [DOI: 10.1063/5.0031215] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 12/31/2022]
Affiliation(s)
- Shuwen Yue
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
| | - Maria Carolina Muniz
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
| | | | - Linfeng Zhang
- Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544, USA
| | - Roberto Car
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
35
|
Ko TW, Finkler JA, Goedecker S, Behler J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat Commun 2021; 12:398. [PMID: 33452239 PMCID: PMC7811002 DOI: 10.1038/s41467-020-20427-2] [Citation(s) in RCA: 140] [Impact Index Per Article: 46.7] [Received: 09/11/2020] [Accepted: 11/18/2020] [Indexed: 11/16/2022]
Abstract
Machine learning potentials have become an important tool for atomistic simulations in many fields, from chemistry via molecular biology to materials science. Most of the established methods, however, rely on local properties and are thus unable to take global changes in the electronic structure into account, which result from long-range charge transfer or different charge states. In this work we overcome this limitation by introducing a fourth-generation high-dimensional neural network potential that combines a charge equilibration scheme employing environment-dependent atomic electronegativities with accurate atomic energies. The method, which is able to correctly describe global charge distributions in arbitrary systems, yields much improved energies and substantially extends the applicability of modern machine learning potentials. This is demonstrated for a series of systems representing typical scenarios in chemistry and materials science that are incorrectly described by current methods, while the fourth-generation neural network potential is in excellent agreement with electronic structure calculations. Machine learning potentials do not account for long-range charge transfer. Here the authors introduce a fourth-generation high-dimensional neural network potential including non-local information of charge populations that is able to provide forces, charges and energies in excellent agreement with DFT data.
Affiliation(s)
- Tsz Wai Ko
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077, Göttingen, Germany.
| | - Jonas A Finkler
- Department of Physics, Universität Basel, Klingelbergstrasse 82, 4056, Basel, Switzerland.
| | - Stefan Goedecker
- Department of Physics, Universität Basel, Klingelbergstrasse 82, 4056, Basel, Switzerland
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077, Göttingen, Germany
36
|
Bedolla E, Padierna LC, Castañeda-Priego R. Machine learning for condensed matter physics. Journal of Physics. Condensed Matter: An Institute of Physics Journal 2020; 33:053001. [PMID: 32932243 DOI: 10.1088/1361-648x/abb895] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/30/2020] [Accepted: 09/15/2020] [Indexed: 06/11/2023]
Abstract
Condensed matter physics (CMP) seeks to understand the microscopic interactions of matter at the quantum and atomistic levels, and describes how these interactions result in both mesoscopic and macroscopic properties. CMP overlaps with many other important branches of science, such as chemistry, materials science, statistical physics, and high-performance computing. With the advancements in modern machine learning (ML) technology, a keen interest in applying these algorithms to further CMP research has created a compelling new area of research at the intersection of both fields. In this review, we aim to explore the main areas within CMP, which have successfully applied ML techniques to further research, such as the description and use of ML schemes for potential energy surfaces, the characterization of topological phases of matter in lattice systems, the prediction of phase transitions in off-lattice and atomistic simulations, the interpretation of ML theories with physics-inspired frameworks and the enhancement of simulation methods with ML algorithms. We also discuss in detail the main challenges and drawbacks of using ML methods on CMP problems, as well as some perspectives for future developments.
Affiliation(s)
- Edwin Bedolla
- División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
| | - Luis Carlos Padierna
- División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
| | - Ramón Castañeda-Priego
- División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
37
|
Kang PL, Shang C, Liu ZP. Large-Scale Atomic Simulation via Machine Learning Potentials Constructed by Global Potential Energy Surface Exploration. Acc Chem Res 2020; 53:2119-2129. [PMID: 32940999 DOI: 10.1021/acs.accounts.0c00472] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 02/08/2023]
Abstract
Atomic simulations based on quantum mechanics (QM) calculations have entered into the tool box of chemists over the past few decades, facilitating an understanding of a wide range of chemistry problems, from structure characterization to reactivity determination. Due to the poor scaling and high computational cost intrinsic to QM calculations, one has to either sacrifice accuracy or time when performing large-scale atomic simulations. The battle to find a better compromise between accuracy and speed has been central to the development of new theoretical methods. The recent advances of machine-learning (ML)-based large-scale atomic simulations has shown great promise to the benefit of many branches of chemistry. Instead of solving the Schrödinger equation directly, ML-based simulations rely on a large data set of accurate potential energy surfaces (PESs) and complex numerical models to predict the total energy. These simulations feature both a high speed and a high accuracy for computing large systems. Due to the lack of a physical foundation in numerical models, ML models are often frustrated in their predictivity and robustness, which are key to applications. Focusing on these concerns, here we overview the recent advances in ML methodologies for atomic simulations on three key aspects. Namely, the generation of a representative data set, the extensity of ML models, and the continuity of data representation. While global optimization methods are the natural choice for building a representative data set, the stochastic surface walking method is shown to provide the desired PES sampling for both minima and transition regions on the PES. The current ML models generally utilize local geometrical descriptors as an input and consider the total energy as the sum of atomic energies. There are many flavors of data descriptors and ML models, but the applications for material and reaction predictions are still limited, not least because of the difficulty to train the associated vast global data sets. We show that our recently designed power-type structure descriptors together with a feed-forward neural network (NN) model are compatible with highly complex global PES data, which has led to a large family of global NN (G-NN) potentials. Two recent applications of G-NN potentials in material and reaction simulations are selected to illustrate how ML-based atomic simulations can help the discovery of new materials and reactions.
Affiliation(s)
- Pei-Lin Kang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
| | - Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
| | - Zhi-Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
38
|
Lindsey RK, Fried LE, Goldman N, Bastea S. Active learning for robust, high-complexity reactive atomistic simulations. J Chem Phys 2020; 153:134117. [DOI: 10.1063/5.0021965] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 11/14/2022]
Affiliation(s)
- Rebecca K. Lindsey
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| | - Laurence E. Fried
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| | - Nir Goldman
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Department of Chemical Engineering, University of California, Davis, California 95616, USA
| | - Sorin Bastea
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
39
|
Muraro C, Polato M, Bortoli M, Aiolli F, Orian L. Radical scavenging activity of natural antioxidants and drugs: Development of a combined machine learning and quantum chemistry protocol. J Chem Phys 2020; 153:114117. [DOI: 10.1063/5.0013278] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Affiliation(s)
- Cecilia Muraro
- Dipartimento di Scienze Chimiche, Università degli Studi di Padova, Via Marzolo 1, 35131 Padova, Italy
| | - Mirko Polato
- Dipartimento di Matematica “Tullio Levi-Civita,” Università degli Studi di Padova, Via Trieste 63, 35121 Padova, Italy
| | - Marco Bortoli
- Dipartimento di Scienze Chimiche, Università degli Studi di Padova, Via Marzolo 1, 35131 Padova, Italy
| | - Fabio Aiolli
- Dipartimento di Matematica “Tullio Levi-Civita,” Università degli Studi di Padova, Via Trieste 63, 35121 Padova, Italy
| | - Laura Orian
- Dipartimento di Scienze Chimiche, Università degli Studi di Padova, Via Marzolo 1, 35131 Padova, Italy
40
|
Li W, Fang H, Qin G, Tan X, Huang Z, Zeng F, Du H, Li S. Concentration estimation of dissolved oxygen in Pearl River Basin using input variable selection and machine learning techniques. The Science of the Total Environment 2020; 731:139099. [PMID: 32434098 DOI: 10.1016/j.scitotenv.2020.139099] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/02/2020] [Revised: 04/27/2020] [Accepted: 04/27/2020] [Indexed: 06/11/2023]
Abstract
Dissolved oxygen (DO) concentration is an essential index for water environment assessment. Here, we present a modeling approach to estimate DO concentrations using input variable selection and data-driven models. Specifically, the input variable selection technique, the maximal information coefficient (MIC), was used to identify and screen the primary environmental factors driving variation in DO. The data-driven model, support vector regression (SVR), was then used to construct a robust model to estimate DO concentration. The approach was illustrated through a case study of the Pearl River Basin in China. We show that the MIC technique can effectively screen major local environmental factors affecting DO concentrations. MIC value tended to stabilize when the sample size >3000 and EC had the highest score with an MIC >0.3 at both of the stations. The variable-reduced datasets improved the performance of the SVR model by a reduction of 28.65% in RMSE, and increase of 22.16%, 56.27% in R², NSE, respectively, relative to complete candidate sets. The MIC-SVR model constructed at the tidal river network performed better than nontidal river network by a reduction of approximately 63.01% in RMSE, an increase of 62.36% in NSE, and R² >0.9. Overall, the proposed technique was able to handle nonlinearity among environmental factors and accurately estimate DO concentrations in tidal river network regions.
Affiliation(s)
- Wenjing Li
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Huaiyang Fang
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Guangxiong Qin
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Xiuqin Tan
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Zhiwei Huang
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Fantang Zeng
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China
| | - Hongwei Du
- National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China.
| | - Shuping Li
- School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China.
41
|
Dral PO, Owens A, Dral A, Csányi G. Hierarchical machine learning of potential energy surfaces. J Chem Phys 2020; 152:204110. [DOI: 10.1063/5.0006498] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/20/2022]
Affiliation(s)
- Pavlo O. Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Alec Owens
- Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT London, United Kingdom
- Alexey Dral
- BigData Team, 1A Tormoznoye Shosse Off 17, Yaroslavl, Yaroslavl 150022, Russian Federation
- Gábor Csányi
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
|
42
|
Abstract
As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
43
|
Mueller T, Hernandez A, Wang C. Machine learning for interatomic potential models. J Chem Phys 2020; 152:050902. [PMID: 32035452 DOI: 10.1063/1.5126336] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Indexed: 12/13/2022]
Abstract
The use of supervised machine learning to develop fast and accurate interatomic potential models is transforming molecular and materials research by greatly accelerating atomic-scale simulations with little loss of accuracy. Three years ago, Jörg Behler published a perspective in this journal providing an overview of some of the leading methods in this field. In this perspective, we provide an updated discussion of recent developments, emerging trends, and promising areas for future research in this field. We include in this discussion an overview of three emerging approaches to developing machine-learned interatomic potential models that have not been extensively discussed in existing reviews: moment tensor potentials, message-passing networks, and symbolic regression.
Affiliation(s)
- Tim Mueller
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Alberto Hernandez
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Chuhong Wang
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
44
|
Tkachev V, Sorokin M, Borisov C, Garazha A, Buzdin A, Borisov N. Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology. Int J Mol Sci 2020; 21:713. [PMID: 31979006 PMCID: PMC7037338 DOI: 10.3390/ijms21030713] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/23/2019] [Revised: 01/16/2020] [Accepted: 01/17/2020] [Indexed: 12/21/2022]
Abstract
(1) Background: Machine learning (ML) methods are rarely used for omics-based prescription of cancer drugs, owing to a shortage of case histories with clinical outcomes supplemented by high-throughput molecular data. This causes overtraining and high vulnerability of most ML methods. Recently, we proposed a hybrid global-local approach to ML termed floating window projective separator (FloWPS) that avoids extrapolation in the feature space. Its core property is data trimming, i.e., sample-specific removal of irrelevant features. (2) Methods: Here, we applied FloWPS to seven popular ML methods: linear SVM, k nearest neighbors (kNN), random forest (RF), Tikhonov (ridge) regression (RR), binomial naïve Bayes (BNB), adaptive boosting (ADA) and multi-layer perceptron (MLP). (3) Results: We performed computational experiments on 21 high-throughput gene expression datasets (41–235 samples per dataset) representing a total of 1778 cancer patients with known responses to chemotherapy treatments. FloWPS substantially improved the classifier quality for all global ML methods (SVM, RF, BNB, ADA, MLP), where the area under the receiver operating characteristic curve (ROC AUC) for the treatment response classifiers increased from the 0.61–0.88 range to 0.70–0.94. We tested FloWPS-empowered methods for overtraining by interrogating the importance of different features for different ML methods in the same model datasets. (4) Conclusions: We showed that FloWPS increases the correlation of feature importance between the different ML methods, which indicates its robustness to overtraining. For all the datasets tested, the best performance of FloWPS data trimming was observed for the BNB method, which can be valuable for further building of ML classifiers in personalized oncology.
Affiliation(s)
- Victor Tkachev
- OmicsWayCorp, Walnut, CA 91788, USA
- Maxim Sorokin
- OmicsWayCorp, Walnut, CA 91788, USA
- Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Constantin Borisov
- National Research University—Higher School of Economics, 101000 Moscow, Russia
- Andrew Garazha
- OmicsWayCorp, Walnut, CA 91788, USA
- Anton Buzdin
- OmicsWayCorp, Walnut, CA 91788, USA
- Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Moscow Institute of Physics and Technology, 141701 Moscow Oblast, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 117997 Moscow, Russia
- Nicolas Borisov
- OmicsWayCorp, Walnut, CA 91788, USA
- Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, 119991 Moscow, Russia
- Moscow Institute of Physics and Technology, 141701 Moscow Oblast, Russia
- Correspondence: Tel.: +7-903-218-7261
|
45
|
Soroush E, Mesbah M, Zendehboudi S. An efficient tool to determine physical properties of ternary mixtures containing 1-alkyl-3-methylimidazolium based ILs and molecular solvents. Chem Eng Res Des 2019. [DOI: 10.1016/j.cherd.2019.07.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
46
|
Interpolation of Instantaneous Air Temperature Using Geographical and MODIS Derived Variables with Machine Learning Techniques. ISPRS International Journal of Geo-Information 2019. [DOI: 10.3390/ijgi8090382] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
Abstract
Several methods have been tried to estimate air temperature using satellite imagery. In this paper, the results of two machine learning algorithms, Support Vector Machines and Random Forest, are compared with Multiple Linear Regression and ordinary kriging. Several geographic, remote sensing and time variables are used as predictors. The validation is carried out using two different approaches, a leave-one-out cross-validation in the spatial domain and a spatio-temporal k-block cross-validation, and four different statistics on a daily basis, allowing the use of ANOVA to compare the results. The main conclusion is that Random Forest produces the best results (R² = 0.888 ± 0.026, root mean square error = 3.01 ± 0.325 using k-block cross-validation). Regression methods (Support Vector Machine, Random Forest and Multiple Linear Regression) are calibrated with MODIS data and several predictors easily calculated from a Digital Elevation Model. The most important variables in the Random Forest model were satellite temperature, potential irradiation and cdayt, a cosine transformation of the Julian day.
|
47
|
Sun G, Sautet P. Toward Fast and Reliable Potential Energy Surfaces for Metallic Pt Clusters by Hierarchical Delta Neural Networks. J Chem Theory Comput 2019; 15:5614-5627. [DOI: 10.1021/acs.jctc.9b00465] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/19/2022]
Affiliation(s)
- Geng Sun
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, California 90095, United States
- Philippe Sautet
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, California 90095, United States
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
|
48
|
Han D, Tan J, Men J, Li C, Zhang X. Quantitative Structure Activity/Pharmacokinetics Relationship Studies of HIV-1 Protease Inhibitors Using Three Modelling Methods. Med Chem 2019; 17:396-406. [PMID: 31448716 DOI: 10.2174/1573406415666190826154505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/10/2019] [Revised: 07/21/2019] [Accepted: 08/05/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND: HIV-1 protease inhibitors (PIs) are a good choice for AIDS patients. Nevertheless, PIs have several drawbacks in clinical application, such as drug resistance, large doses, and high costs; among these, poor pharmacokinetic properties are one of the important reasons for their clinical failure. OBJECTIVE: We aimed to build computational models for studying the relationship between the structure of PIs and their pharmacological activities. METHODS: We collected experimental values of koff/Ki and structures of 50 PIs through a careful literature and database search. Quantitative structure activity/pharmacokinetics relationship (QSAR/QSPR) models were constructed by support vector machine (SVM), partial least-squares regression (PLSR) and back-propagation neural network (BPNN). RESULTS: For QSAR models, SVM, PLSR and BPNN all generated reliable prediction models with r² of 0.688, 0.768 and 0.787, respectively, and r²pred of 0.748, 0.696 and 0.640, respectively. For QSPR models, the optimum models of SVM, PLSR and BPNN obtained r² of 0.952, 0.869 and 0.960, respectively, and r²pred of 0.852, 0.628 and 0.814, respectively. CONCLUSION: Among these three modelling methods, SVM outperformed PLSR and BPNN in both QSAR and QSPR modelling of PIs; thus, we suspect that SVM is more suitable for predicting the activities of PIs. In addition, 3D-MoRSE descriptors may have a close relationship with the Ki values of PIs, and the GETAWAY descriptors have a significant influence on both koff and Ki in the PLSR equations.
Affiliation(s)
- Dan Han
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
- Jianjun Tan
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
- Jingrui Men
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
- Xiaoyi Zhang
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
|
49
|
A CFD Based Application of Support Vector Regression to Determine the Optimum Smooth Twist for Wind Turbine Blades. Sustainability 2019. [DOI: 10.3390/su11164502] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Computational fluid dynamics (CFD) is a powerful tool for accurately estimating the aerodynamic loads on wind turbine blades, at the expense of high requirements such as long computation times. Such requirements grow in the case of blade shape optimization, in which several analyses are needed. A fast and reliable way to mimic the CFD solutions is to use surrogate models. In this study, a machine learning technique, the support vector regression (SVR) method based on a set of CFD solutions, is used as the surrogate model. CFD solutions are calculated by solving the Reynolds-averaged Navier–Stokes equations with the k-epsilon turbulence model using a commercial solver. The support vector regression model is then trained to give a functional relationship between the spanwise twist distribution and the generated torque. The smooth twist distribution is defined using a three-node cubic spline with four parameters in total. The optimum twist is determined for two baseline blade cases: the National Renewable Energy Laboratory (NREL) Phase II and Phase VI rotor blades. In the optimization process, extremum points that give the maximum torque are easily determined since the SVR gives an analytical model. Results show that it is possible to increase the torque generated by the NREL VI blade by more than 10% just by redistributing the spanwise twist, without carrying out a full geometry optimization of the blade shape with many shape-defining parameters. The increase in torque for the NREL II case is much higher.
|
50
|
Bhavsar R, Ramakrishnan R. Machine learning modeling of Wigner intracule functionals for two electrons in one-dimension. J Chem Phys 2019; 150:144114. [PMID: 30981252 DOI: 10.1063/1.5089597] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
In principle, many-electron correlation energy can be precisely computed from a reduced Wigner distribution function (W), thanks to a universal functional transformation (F), whose formal existence is akin to that of the exchange-correlation functional in density functional theory. While the exact dependence of F on W is unknown, a few approximate parametric models have been proposed in the past. Here, for a dataset of 923 one-dimensional external potentials with two interacting electrons, we apply machine learning to model F within the kernel Ansatz. We deal with over-fitting of the kernel to a specific region of phase-space by a one-step regularization not depending on any hyperparameters. Reference correlation energies have been computed by performing exact and Hartree-Fock calculations using discrete variable representation. The resulting models require W calculated at the Hartree-Fock level as input while yielding monotonous decay in the predicted correlation energies of new molecules reaching sub-chemical accuracy with training.
Affiliation(s)
- Rutvij Bhavsar
- Department of Physics, Indian Institute of Technology Kanpur, Kanpur 208016, India
- Raghunathan Ramakrishnan
- Centre for Interdisciplinary Sciences, Tata Institute of Fundamental Research, Hyderabad 500107, India
|