1
Kong L, Bryce RA. Discriminating High from Low Energy Conformers of Druglike Molecules: An Assessment of Machine Learning Potentials and Quantum Chemical Methods. Chemphyschem 2025; 26:e202400992. [PMID: 40017058 PMCID: PMC12005129 DOI: 10.1002/cphc.202400992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Revised: 01/16/2025] [Indexed: 03/01/2025]
Abstract
Accurate and efficient prediction of high energy ligand conformations is important in structure-based drug discovery for the exclusion of unrealistic structures in docking-based virtual screening and de novo design approaches. In this work, we constructed a database of 140 solution conformers from 20 druglike molecules of varying size and chemical complexity, with energetics evaluated at the DLPNO-CCSD(T)/complete basis set (CBS) level. We then assessed a selection of machine learning potentials and semiempirical quantum mechanical models for their ability to predict conformational energetics. The GFN2-xTB tight binding density functional method correlates with reference conformer energies, yielding a Kendall's τ of 0.63 and mean absolute error of 2.2 kcal/mol. As putative internal energy filters for screening, we find that the GFN2-xTB, ANI-2x and MACE-OFF23(L) models perform well in identifying low energy conformer geometries, with sensitivities of 95 %, 89 % and 95 % respectively, but display a reduced ability to exclude high energy conformers, with respective specificities of 80 %, 61 % and 63 %. The GFN2-xTB method therefore exhibited the best overall performance and appears currently the most suitable of the three methods to act as an internal energy filter for integration into drug discovery workflows. Enrichment of high energy conformers in the training of machine learning potentials could improve their performance as conformational filters.
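The metrics quoted in this abstract (Kendall's τ, mean absolute error, sensitivity and specificity) can be illustrated with a minimal sketch. The function names, the tie-free tau-a variant and the energy cutoff separating "low" from "high" energy conformers are assumptions made for illustration; the abstract does not state the exact definitions used in the paper.

```python
from itertools import combinations


def kendall_tau_a(ref, pred):
    """Kendall's tau-a: (concordant - discordant) / number of pairs (ties count as 0)."""
    pairs = list(combinations(range(len(ref)), 2))
    score = 0
    for i, j in pairs:
        s = (ref[i] - ref[j]) * (pred[i] - pred[j])
        score += (s > 0) - (s < 0)
    return score / len(pairs)


def mean_absolute_error(ref, pred):
    """Average absolute deviation between reference and predicted energies."""
    return sum(abs(r - p) for r, p in zip(ref, pred)) / len(ref)


def sensitivity_specificity(ref, pred, cutoff):
    """Treat 'low energy' (energy <= cutoff) as the positive class.

    The cutoff is a hypothetical parameter, in the same units as the energies.
    """
    tp = sum(r <= cutoff and p <= cutoff for r, p in zip(ref, pred))
    fn = sum(r <= cutoff and p > cutoff for r, p in zip(ref, pred))
    tn = sum(r > cutoff and p > cutoff for r, p in zip(ref, pred))
    fp = sum(r > cutoff and p <= cutoff for r, p in zip(ref, pred))
    return tp / (tp + fn), tn / (tn + fp)
```

In this framing, a high sensitivity with a lower specificity means the filter keeps nearly all genuinely low energy conformers but also lets some high energy ones through, which is the pattern the abstract reports.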
Affiliation(s)
- Linghan Kong
- Division of Pharmacy and Optometry, School of Health Sciences, Manchester Academic Health Sciences Centre, University of Manchester, Oxford Road, Manchester M13 9PT, UK
- Richard A. Bryce
- Division of Pharmacy and Optometry, School of Health Sciences, Manchester Academic Health Sciences Centre, University of Manchester, Oxford Road, Manchester M13 9PT, UK
2
Ghukasyan T, Altunyan V, Bughdaryan A, Aghajanyan T, Smbatyan K, Papoian GA, Petrosyan G. Smart distributed data factory volunteer computing platform for active learning-driven molecular data acquisition. Sci Rep 2025; 15:7122. [PMID: 40016468 PMCID: PMC11868574 DOI: 10.1038/s41598-025-90981-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Accepted: 02/17/2025] [Indexed: 03/01/2025]
Abstract
This paper presents the smart distributed data factory (SDDF), an AI-driven distributed computing platform designed to address challenges in drug discovery by creating comprehensive datasets of molecular conformations and their properties. SDDF uses volunteer computing, leveraging the processing power of personal computers worldwide to accelerate quantum chemistry (DFT) calculations. To tackle the vast chemical space and limited high-quality data, SDDF employs an ensemble of machine learning (ML) models to predict molecular properties and selectively choose the most challenging data points for further DFT calculations. The platform also generates new molecular conformations using molecular dynamics with the forces derived from these models. SDDF makes several contributions: the volunteer computing platform for DFT calculations; an active learning framework for constructing a dataset of molecular conformations; a large public dataset of diverse ENAMINE molecules with calculated energies; an ensemble of ML models for accurate energy prediction. The energy dataset was generated to validate the SDDF approach of reducing the need for extensive calculations. With its strict scaffold split, the dataset can be used for training and benchmarking energy models. By combining active learning, distributed computing, and quantum chemistry, SDDF offers a scalable, cost-effective solution for developing accurate molecular models and ultimately accelerating drug discovery.
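One common way to "selectively choose the most challenging data points" with an ensemble is to rank candidates by how much the ensemble members disagree. The sketch below assumes that criterion (standard deviation of the predicted energies); the abstract does not specify SDDF's actual selection rule, and `select_for_dft` and its arguments are hypothetical names.

```python
from statistics import pstdev


def select_for_dft(ensemble_predictions, budget):
    """Pick the conformers on which the ensemble disagrees most.

    ensemble_predictions: dict mapping a conformer id to a list of predicted
    energies, one per model in the ensemble (hypothetical data layout).
    budget: number of conformers to send on for reference DFT calculations.
    """
    ranked = sorted(
        ensemble_predictions,
        key=lambda cid: pstdev(ensemble_predictions[cid]),
        reverse=True,  # largest disagreement first
    )
    return ranked[:budget]
```

Conformers with near-identical predictions across models are skipped, which is how such a scheme can reduce the number of expensive calculations.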
3
Nováček M, Řezáč J. PM6-ML: The Synergy of Semiempirical Quantum Chemistry and Machine Learning Transformed into a Practical Computational Method. J Chem Theory Comput 2025; 21:678-690. [PMID: 39752295 PMCID: PMC11780751 DOI: 10.1021/acs.jctc.4c01330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Revised: 12/04/2024] [Accepted: 12/19/2024] [Indexed: 01/29/2025]
Abstract
Machine learning (ML) methods offer a promising route to the construction of universal molecular potentials with high accuracy and low computational cost. It is becoming evident that integrating physical principles into these models, or utilizing them in a Δ-ML scheme, significantly enhances their robustness and transferability. This paper introduces PM6-ML, a Δ-ML method that synergizes the semiempirical quantum-mechanical (SQM) method PM6 with a state-of-the-art ML potential applied as a universal correction. The method demonstrates superior performance over standalone SQM and ML approaches and covers a broader chemical space than its predecessors. It is scalable to systems with thousands of atoms, which makes it applicable to large biomolecular systems. Extensive benchmarking confirms PM6-ML's accuracy and robustness. Its practical application is facilitated by a direct interface to MOPAC. The code and parameters are available at https://github.com/Honza-R/mopac-ml.
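The generic form of a Δ-ML scheme can be written as follows. This is the textbook structure of such a correction, not necessarily the exact PM6-ML expression, which may contain further terms; the symbols are chosen here for illustration.

```latex
\[
E_{\Delta\text{-ML}}(\mathbf{R}) = E_{\mathrm{SQM}}(\mathbf{R}) + \Delta E_{\mathrm{ML}}(\mathbf{R}),
\qquad
\Delta E_{\mathrm{ML}}(\mathbf{R}) \approx E_{\mathrm{ref}}(\mathbf{R}) - E_{\mathrm{SQM}}(\mathbf{R})
\]
```

Here $E_{\mathrm{SQM}}$ is the semiempirical energy (PM6 in this paper), $E_{\mathrm{ref}}$ is the higher-level reference energy, and the ML model is trained on their difference rather than on the total energy.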
Affiliation(s)
- Martin Nováček
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 160 00 Prague, Czech Republic
- Jan Řezáč
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 160 00 Prague, Czech Republic
4
Pracht P, Pillai Y, Kapil V, Csányi G, Gönnheimer N, Vondrák M, Margraf JT, Wales DJ. Efficient Composite Infrared Spectroscopy: Combining the Double-Harmonic Approximation with Machine Learning Potentials. J Chem Theory Comput 2024; 20:10986-11004. [PMID: 39665618 DOI: 10.1021/acs.jctc.4c01157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2024]
Abstract
Vibrational spectroscopy is a cornerstone technique for molecular characterization and offers an ideal target for the computational investigation of molecular materials. Building on previous comprehensive assessments of efficient methods for infrared (IR) spectroscopy, this study investigates the predictive accuracy and computational efficiency of gas-phase IR spectra calculations, accessible through a combination of modern semiempirical quantum mechanical and transferable machine learning potentials. A composite approach for IR spectra prediction based on the double-harmonic approximation, utilizing harmonic vibrational frequencies in combination with squared derivatives of the molecular dipole moment, is employed. This approach allows for methodical flexibility in the calculation of IR intensities from molecular dipoles and the corresponding vibrational modes. Various methods are systematically tested to suggest a suitable protocol with an emphasis on computational efficiency. Among these methods, semiempirical extended tight-binding (xTB) models, classical charge equilibrium models, and machine learning potentials trained for dipole moment prediction are assessed across a diverse data set of organic molecules. We particularly focus on the recently reported foundational machine learning potential MACE-OFF23 to address the accuracy limitations of conventional low-cost quantum mechanical and force-field methods. This study aims to establish a standard for the efficient computational prediction of IR spectra, facilitating the rapid and reliable identification of unknown compounds and advancing automated high-throughput analytical workflows in chemistry.
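The double-harmonic approximation mentioned in this abstract has a standard textbook form, sketched below with notation chosen here for illustration; the prefactor and unit conventions are omitted.

```latex
\[
I_k \;\propto\; \left| \frac{\partial \boldsymbol{\mu}}{\partial Q_k} \right|^{2}
\]
```

Here $I_k$ is the IR intensity of normal mode $k$, $Q_k$ is the corresponding normal coordinate obtained from the harmonic (Hessian) analysis, and $\boldsymbol{\mu}$ is the molecular dipole moment. In a composite scheme of the kind described, the frequencies and modes may come from one method while the dipole derivatives come from another.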
Affiliation(s)
- Philipp Pracht
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Yuthika Pillai
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Department of Physics and Astronomy, University College London, 17-19 Gordon Street, London WC1H 0AH, U.K.
- Thomas Young Centre & London Centre for Nanotechnology, 19 Gordon Street, London WC1H 0AH, U.K.
- Gábor Csányi
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
- Nils Gönnheimer
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
- Martin Vondrák
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
- Johannes T Margraf
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
- David J Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
5
Ryzhako AS, Tuma AA, Otlyotov AA, Minenkov Y. An influence of electronic structure theory method, thermodynamic and implicit solvation corrections on the organic carbonates conformational and binding energies. J Comput Chem 2024; 45:3004-3016. [PMID: 39286905 DOI: 10.1002/jcc.27471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/28/2024] [Accepted: 07/24/2024] [Indexed: 09/19/2024]
Abstract
The impact of the electronic structure or force field method, gas-phase thermodynamic correction, and continuum solvation model on the conformational and binding energies of organic carbonate clusters (S)n is explored. None of the tested force fields (GFN-FF, GAFF, MMFF94) and standard semiempirical methods (PM3, AM1, RM1, PM6, PM6-D3, PM6-D3H4, PM7) can reproduce reference RI-SCS-MP2 conformational energies. Tight-binding GFNn-xTB methods provide more realistic conformational energies, which are accurate enough to discard the least stable conformers. The effect of thermodynamic correction is moderate and can be ignored if gas-phase conformational stability ranking is the goal. The influence of continuum solvation is stronger, especially if reinforced with the Gibbs free energy thermodynamic correction, and results in a reduced spread of conformational energies. The cluster formation binding energies strongly depend on the particular approach to vibrational thermochemistry, with the difference between traditional harmonic and modified scaled rigid - harmonic oscillator approximations reaching 10 kcal mol⁻¹.
Affiliation(s)
- Alexander S Ryzhako
- N.N. Semenov Federal Research Center for Chemical Physics RAS, Moscow, Russian Federation
- The Faculty of Natural Sciences, Dmitry Mendeleev University of Chemical Technology of Russia, Moscow, Russian Federation
- Anna A Tuma
- N.N. Semenov Federal Research Center for Chemical Physics RAS, Moscow, Russian Federation
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russian Federation
- Arseniy A Otlyotov
- N.N. Semenov Federal Research Center for Chemical Physics RAS, Moscow, Russian Federation
- Yury Minenkov
- N.N. Semenov Federal Research Center for Chemical Physics RAS, Moscow, Russian Federation
6
Hölzer C, Oerder R, Grimme S, Hamaekers J. ConfRank: Improving GFN-FF Conformer Ranking with Pairwise Training. J Chem Inf Model 2024; 64:8909-8925. [PMID: 39565928 DOI: 10.1021/acs.jcim.4c01524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2024]
Abstract
Conformer ranking is a crucial task for drug discovery, with methods for generating conformers often based on molecular (meta)dynamics or sophisticated sampling techniques. These methods are constrained by the underlying force computation regarding runtime and energy ranking accuracy, limiting their effectiveness for large-scale screening applications. To address these ranking limitations, we introduce ConfRank, a machine learning-based approach that enhances conformer ranking using pairwise training. We demonstrate its performance using GFN-FF-generated conformer ensembles, leveraging the DimeNet++ architecture trained on pairs of 159 760 uncharged organic compounds from the GEOM data set with r2SCAN-3c reference level. Instead of predicting only on single molecules, this approach captures relative energy differences between conformers, leading to a significant improvement of the overall conformational ranking, outperforming GFN-FF and GFN2-xTB. Thereby, the pairwise RMSD of the relative energy difference of two conformers can be reduced from 5.65 to 0.71 kcal mol⁻¹ on the test data set, allowing up to 81% of all lowest-lying conformers to be identified correctly (GFN-FF: 10%, GFN2-xTB: 47%). The ConfRank approach is cost-effective, allowing for scalable deployment on both CPU and GPU, achieving runtime accelerations by up to 2 orders of magnitude compared to GFN2-xTB. Out-of-sample investigations on CREST-generated conformer ensembles from the QM9 data set and conformers taken from an extended GMTKN55 data set show promising results for the robustness of this approach. Thereby, ranking correlation coefficients such as Spearman's can be improved to 0.90 (GFN-FF: 0.39, GFN2-xTB: 0.84), reducing the probability of an incorrect sign flip in pairwise energy comparison from 32 to 7%. On the extended GMTKN55 subsets the pairwise MAD (RMSD) could be reduced on almost all subsets by up to 62% (58%) with an average improvement of 30% (29%).
Moreover, an exemplary case study on vancomycin shows similar performance, indicating applicability to larger (bio)molecular structures. Furthermore, we motivate the usage of the pairwise training approach from a theoretical perspective, highlighting that while pairwise training can lead to a decline in single sample prediction of absolute energies for ML models, it significantly enhances conformer ranking performance. The data and models used in this study are available at https://github.com/grimme-lab/confrank.
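The pairwise quantities quoted in this abstract (pairwise RMSD of relative energy differences, probability of a sign flip) can be sketched as below. The function names and the convention of averaging over all unordered conformer pairs of one molecule are assumptions; the paper's exact definitions may differ.

```python
from itertools import combinations
from math import sqrt


def pairwise_rmsd(ref, pred):
    """RMSD between predicted and reference energy differences over all conformer pairs."""
    pairs = list(combinations(range(len(ref)), 2))
    squared = [((pred[i] - pred[j]) - (ref[i] - ref[j])) ** 2 for i, j in pairs]
    return sqrt(sum(squared) / len(squared))


def sign_flip_fraction(ref, pred):
    """Fraction of pairs whose predicted ordering is opposite to the reference ordering."""
    pairs = list(combinations(range(len(ref)), 2))
    flips = sum((ref[i] - ref[j]) * (pred[i] - pred[j]) < 0 for i, j in pairs)
    return flips / len(pairs)
```

Both measures depend only on energy differences, so a constant offset in the predicted absolute energies leaves them unchanged. This is consistent with the abstract's remark that ranking can improve even if absolute energy prediction degrades.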
Affiliation(s)
- Christian Hölzer
- Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Rick Oerder
- Institute for Numerical Simulation, Friedrich-Hirzebruch-Allee 7, 53115 Bonn, Germany
- Fraunhofer Institute for Algorithms and Scientific Computing SCAI, Schloss Birlinghoven 1, 53757 Sankt Augustin, Germany
- Stefan Grimme
- Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Jan Hamaekers
- Fraunhofer Institute for Algorithms and Scientific Computing SCAI, Schloss Birlinghoven 1, 53757 Sankt Augustin, Germany
7
Plett C, Grimme S, Hansen A. Toward Reliable Conformational Energies of Amino Acids and Dipeptides─The DipCONFS Benchmark and DipCONL Datasets. J Chem Theory Comput 2024. [PMID: 39259679 DOI: 10.1021/acs.jctc.4c00801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/13/2024]
Abstract
Simulating peptides and proteins is becoming increasingly important, leading to a growing need for efficient computational methods. These are typically semiempirical quantum mechanical (SQM) methods, force fields (FFs), or machine-learned interatomic potentials (MLIPs), all of which require a large amount of accurate data for robust training and evaluation. To assess potential reference methods and complement the available data, we introduce two sets, DipCONFL and DipCONFS, which cover large parts of the conformational space of 17 amino acids and their 289 possible dipeptides in aqueous solution. The conformers were selected from the exhaustive PeptideCS dataset by Andris et al. [J. Phys. Chem. B 2022, 126, 5949-5958]. The structures, originally generated with GFN2-xTB, were reoptimized using the accurate r2SCAN-3c density functional theory (DFT) composite method including the implicit CPCM water solvation model. The DipCONFS benchmark set contains 918 conformers and is one of the largest sets with highly accurate coupled cluster conformational energies so far. It is employed to evaluate various DFT and wave function theory (WFT) methods, especially regarding whether they are accurate enough to be used as reliable reference methods for larger datasets intended for training and testing more approximated SQM, FF, and MLIP methods. The results reveal that the originally provided BP86-D3(BJ)/DGauss-DZVP conformational energies are not sufficiently accurate. Among the DFT methods tested as an alternative reference level, the revDSD-PBEP86-D4 double hybrid performs best with a mean absolute error (MAD) of 0.2 kcal mol⁻¹ compared with the PNO-LCCSD(T)-F12b reference. The very efficient r2SCAN-3c composite method also shows excellent results, with an MAD of 0.3 kcal mol⁻¹, similar to the best-tested hybrid ωB97M-D4.
With these findings, we compiled the large DipCONFL set, which includes over 29,000 realistic conformers in solution with reasonably accurate r2SCAN-3c reference conformational energies, gradients, and further properties potentially relevant for training MLIP methods. This set, also in comparison to DipCONFS, is used to assess the performance of various SQM, FF, and MLIP methods robustly and can complement training sets for those.
Affiliation(s)
- Christoph Plett
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
- Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
- Andreas Hansen
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
8
Mészáros B, Kubicskó K, Németh DD, Daru J. Emerging Conformational-Analysis Protocols from the RTCONF55-16K Reaction Thermochemistry Conformational Benchmark Set. J Chem Theory Comput 2024; 20:7385-7392. [PMID: 38899777 PMCID: PMC11498139 DOI: 10.1021/acs.jctc.4c00565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 06/07/2024] [Accepted: 06/07/2024] [Indexed: 06/21/2024]
Abstract
RTCONF55-16K is a new, reactive conformational data set based on cost-efficient methods to assess different conformational analysis protocols. Our reference calculations underpinned the accuracy of the CENSO (Grimme et al. J. Phys. Chem. A, 2021, 125, 4039) procedure and resulted in alternative recipes with different cost-accuracy compromises. Our general-purpose and economical protocols (CENSO-light and zero, respectively) were found to be 10-30 times faster than the original algorithm, adding only 0.4-0.7 kcal/mol absolute error to the relative free energy estimates.
Affiliation(s)
- Bence Balázs Mészáros
- Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Károly Kubicskó
- Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Dávid Dorián Németh
- Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- János Daru
- Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
9
Nagy PR. State-of-the-art local correlation methods enable affordable gold standard quantum chemistry for up to hundreds of atoms. Chem Sci 2024:d4sc04755a. [PMID: 39246365 PMCID: PMC11376132 DOI: 10.1039/d4sc04755a] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/27/2024] [Accepted: 07/30/2024] [Indexed: 09/10/2024]
Abstract
In this feature, we review the current capabilities of local electron correlation methods up to the coupled cluster model with single, double, and perturbative triple excitations [CCSD(T)], which is a gold standard in quantum chemistry. The main computational aspects of the local method types are assessed from the perspective of applications, but the focus is kept on how to achieve chemical accuracy (i.e., <1 kcal mol⁻¹ uncertainty), as well as on the broad scope of chemical problems made accessible. The performance of state-of-the-art methods is also compared, including the most employed DLPNO and, in particular, our local natural orbital (LNO) CCSD(T) approach. The high accuracy and efficiency of the LNO method make chemically accurate CCSD(T) computations accessible for molecules of hundreds of atoms with resources affordable to a broad computational community (days on a single CPU and 10-100 GB of memory). Recent developments in LNO-CCSD(T) enable systematic convergence and robust error estimates even for systems of complicated electronic structure or larger size (up to 1000 atoms). The predictive power of current local CCSD(T) methods, usually at about 1-2 orders of magnitude higher cost than hybrid density functional theory (DFT), has become outstanding on the palette of computational chemistry applicable for molecules of practical interest. We also review more than 50 LNO-based and other advanced local-CCSD(T) applications for realistic, large systems across molecular interactions as well as main group, transition metal, bio-, and surface chemistry. The examples show that properly executed local-CCSD(T) can contribute to binding, reaction equilibrium, rate constants, etc. which are able to match measurements within the error estimates.
These applications demonstrate that modern, open-access, and broadly affordable local methods, such as LNO-CCSD(T), already enable predictive computations and atomistic insight for complicated, real-life molecular processes in realistic environments.
Affiliation(s)
- Péter R Nagy
- Department of Physical Chemistry and Materials Science, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
- HUN-REN-BME Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
- MTA-BME Lendület Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
10
Behara PK, Jang H, Horton JT, Gokey T, Dotson DL, Boothroyd S, Bayly CI, Cole DJ, Wang LP, Mobley DL. Benchmarking Quantum Mechanical Levels of Theory for Valence Parametrization in Force Fields. J Phys Chem B 2024; 128:7888-7902. [PMID: 39087913 PMCID: PMC11331531 DOI: 10.1021/acs.jpcb.4c03167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 07/09/2024] [Accepted: 07/15/2024] [Indexed: 08/02/2024]
Abstract
A wide range of density functional methods and basis sets are available to derive the electronic structure and properties of molecules. Quantum mechanical calculations are too computationally intensive for routine simulation of molecules in the condensed phase, prompting the development of computationally efficient force fields based on quantum mechanical data. Parametrizing general force fields, which cover a vast chemical space, necessitates the generation of sizable quantum mechanical data sets with optimized geometries and torsion scans. To achieve this efficiently, choosing a quantum mechanical method that balances computational cost and accuracy is crucial. In this study, we seek to assess the accuracy of quantum mechanical theory for specific properties such as conformer energies and torsion energetics. To comprehensively evaluate various methods, we focus on a representative set of 59 diverse small molecules, comparing approximately 25 combinations of functional and basis sets against the reference level coupled cluster calculations at the complete basis set limit.
Affiliation(s)
- Pavan Kumar Behara
- Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Hyesu Jang
- Chemistry Department, University of California at Davis, Davis, California 95616, United States
- OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Joshua T. Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Trevor Gokey
- Department of Chemistry, University of California, Irvine, California 92697, United States
- David L. Dotson
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Datryllic LLC, Phoenix, Arizona 85003, United States
- Daniel J. Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Lee-Ping Wang
- Chemistry Department, University of California at Davis, Davis, California 95616, United States
- David L. Mobley
- Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Department of Chemistry, University of California, Irvine, California 92697, United States
11
Xu R, Jiang Z, Yang Q, Bloino J, Biczysko M. Harmonic and anharmonic vibrational computations for biomolecular building blocks: Benchmarking DFT and basis sets by theoretical and experimental IR spectrum of glycine conformers. J Comput Chem 2024; 45:1846-1869. [PMID: 38682874 DOI: 10.1002/jcc.27377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 05/01/2024]
Abstract
Advanced vibrational spectroscopic experiments have reached a level of sophistication that can only be matched by numerical simulations in order to provide an unequivocal analysis, a crucial step to understand the structure-function relationship of biomolecules. While density functional theory (DFT) has become the standard method when targeting medium-size or larger systems, the problems with its reliability and accuracy are well known and have been abundantly documented. To establish a reliable computational protocol, especially when accuracy is critical, a tailored benchmark is usually required. This is generally done over a short list of known candidates, with the basis set often fixed a priori. In this work, we present a systematic study of the performance of DFT-based hybrid and double-hybrid functionals in the prediction of vibrational energies and infrared intensities at the harmonic level and beyond, considering anharmonic effects through vibrational perturbation theory at the second order. The study is performed for the six lowest-energy glycine conformers, utilizing available "state-of-the-art" accurate theoretical and experimental data as reference. Focusing on the most intense fundamental vibrations in the mid-infrared range of glycine conformers, the role of the basis sets is also investigated considering the balance between computational cost and accuracy. Targeting larger systems, a broad range of hybrid schemes with different computational costs is also tested.
Affiliation(s)
- Ruiqin Xu
- Department of Physics, College of Sciences, Shanghai University, Shanghai, China
- Qin Yang
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Science, Prague, Czechia
- Julien Bloino
- Classe di Scienze, Scuola Normale Superiore, Pisa, Italy
- Malgorzata Biczysko
- Department of Physics, College of Sciences, Shanghai University, Shanghai, China
12
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Collapse
Affiliation(s)
- Lily Wang
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Pavan Kumar Behara
- Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
| | - Matthew W Thompson
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Trevor Gokey
- Department of Chemistry, University of California, Irvine, California 92697, United States
| | - Yuanqing Wang
- Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
| | - Jeffrey R Wagner
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Daniel J Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
| | - David L Mobley
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
| |
|
13
|
Hancock AC, Giudici E, Goerigk L. How do spin-scaled double hybrids designed for excitation energies perform for noncovalent excited-state interactions? An investigation on aromatic excimer models. J Comput Chem 2024; 45:1667-1681. [PMID: 38553847 DOI: 10.1002/jcc.27351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 03/07/2024] [Accepted: 03/12/2024] [Indexed: 06/04/2024]
Abstract
Time-dependent double hybrids with spin-component or spin-opposite scaling to their second-order perturbative correlation correction have demonstrated competitive robustness in the computation of electronic excitation energies. Some of the most robust are those recently published by our group (M. Casanova-Páez, L. Goerigk, J. Chem. Theory Comput. 2021, 20, 5165). So far, the implementation of these functionals has not allowed their ground-state total energies to be calculated correctly. Herein, we define their correct spin-scaled ground-state energy expressions, which enables us to test our methods on the noncovalent excited-state interaction energies of four aromatic excimers. A range of 22 double hybrids with and without spin scaling are compared to the reasonably accurate wavefunction reference from our previous work (A. C. Hancock, L. Goerigk, RSC Adv. 2023, 13, 35964). The impact of spin scaling is highly dependent on the underlying functional expression; however, the smallest overall errors belong to spin-scaled functionals with range separation: SCS- and SOS-ωPBEPP86, and SCS-RSX-QIDH. We additionally determine parameters for DFT-D3(BJ)/D4 ground-state dispersion corrections of these functionals, which reduce errors in most cases. We highlight the necessity of dispersion corrections for even the most robust TD-DFT methods but also point out that ground-state based corrections are insufficient to completely capture dispersion effects for excited-state interaction energies.
Affiliation(s)
- Amy C Hancock
- School of Chemistry, The University of Melbourne, Parkville, Victoria, Australia
| | - Erica Giudici
- School of Chemistry, The University of Melbourne, Parkville, Victoria, Australia
| | - Lars Goerigk
- School of Chemistry, The University of Melbourne, Parkville, Victoria, Australia
| |
|
14
|
Mun H, Lorpaiboon W, Ho J. In Search of the Best Low-Cost Methods for Efficient Screening of Conformers. J Phys Chem A 2024; 128:4391-4400. [PMID: 38754085 DOI: 10.1021/acs.jpca.4c01407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
Locating the lowest energy conformer is crucial for the accurate computation of equilibrium properties of molecular systems. This paper examines the performance of efficient low-cost methods in terms of the alignment and relative energies of their energy minima against the benchmark revDSD-PBEP86-D4/def2-TZVPP//MP2/cc-pVTZ potential energy surface. The low-cost methods considered include GFN-FF, GFN2-xTB, DFTB3, HF-3c, B97-3c, PBEh-3c, and r2SCAN-3c composite methods against a diverse test set of 20 compounds including alkanes, perfluoroalkyl molecules, peptides, open-shell radicals, and Zn(II) complexes of varying sizes. The "3c" composite methods are generally more accurate, but are at least 2-3 orders of magnitude more expensive than tight-binding methods which have energy minima that align well with the benchmark potential energy surface. The findings of this paper were further exploited to introduce a simple strategy involving Grimme's CENSO energy-sorting algorithm that resulted in up to an order of magnitude reduction in computational time for locating the lowest energy conformer on the revDSD-PBEP86-D4/def2-TZVPP//MP2/cc-pVTZ surface.
Affiliation(s)
- Haedam Mun
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Wanutcha Lorpaiboon
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Junming Ho
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| |
|
15
|
Ai WJ, Li J, Cao D, Liu S, Yuan YY, Li Y, Tan GS, Xu KP, Yu X, Kang F, Zou ZX, Wang WX. A Very Deep Graph Convolutional Network for 13C NMR Chemical Shift Calculations with Density Functional Theory Level Performance for Structure Assignment. J Nat Prod 2024; 87:743-752. [PMID: 38359467 DOI: 10.1021/acs.jnatprod.3c00862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
Nuclear magnetic resonance (NMR) chemical shift calculations are powerful tools for structure elucidation and have been extensively employed in both natural product and synthetic chemistry. However, density functional theory (DFT) NMR chemical shift calculations are usually time-consuming, while fast data-driven methods often lack reliability, making it challenging to apply them to computationally intensive tasks with a high requirement on quality. Herein, we have constructed a 54-layer-deep graph convolutional network for 13C NMR chemical shift calculations, which achieved high accuracy with low time-cost and performed competitively with DFT NMR chemical shift calculations on structure assignment benchmarks. Our model utilizes a semiempirical method, GFN2-xTB, and is compatible with a broad variety of organic systems, including those composed of hundreds of atoms or elements ranging from H to Rn. We used this model to resolve the controversial J/K ring junction problem of maitotoxin, which is the largest whole molecule assigned by NMR calculations to date. This model has been developed into user-friendly software, providing a useful tool for routine rapid structure validation and assignation as well as a new approach to elucidate the large structures that were previously unsuitable for NMR calculations.
Affiliation(s)
- Wen-Jing Ai
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Jing Li
- Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Shao Liu
- Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Yi-Yun Yuan
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Yan Li
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Gui-Shan Tan
- Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Kang-Ping Xu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Xia Yu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Fenghua Kang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Zhen-Xing Zou
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
| | - Wen-Xuan Wang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Hunan Prima Drug Research Center Co., Ltd, Hunan Research Center for Drug Safety Evaluation, Hunan Key Laboratory of Pharmacodynamics and Safety Evaluation of New Drugs, Changsha, Hunan 410331, People's Republic of China
| |
|
16
|
Pracht P, Grimme S, Bannwarth C, Bohle F, Ehlert S, Feldmann G, Gorges J, Müller M, Neudecker T, Plett C, Spicher S, Steinbach P, Wesołowski PA, Zeller F. CREST-A program for the exploration of low-energy molecular chemical space. J Chem Phys 2024; 160:114110. [PMID: 38511658 DOI: 10.1063/5.0197592] [Citation(s) in RCA: 39] [Impact Index Per Article: 39.0] [Received: 01/13/2024] [Accepted: 02/29/2024] [Indexed: 03/22/2024] Open
Abstract
Conformer-rotamer sampling tool (CREST) is an open-source program for the efficient and automated exploration of molecular chemical space. Originally developed in Pracht et al. [Phys. Chem. Chem. Phys. 22, 7169 (2020)] as an automated driver for calculations at the extended tight-binding level (xTB), it offers a variety of molecular- and metadynamics simulations, geometry optimization, and molecular structure analysis capabilities. Implemented algorithms include automated procedures for conformational sampling, explicit solvation studies, the calculation of absolute molecular entropy, and the identification of molecular protonation and deprotonation sites. Calculations are set up to run concurrently, providing efficient single-node parallelization. CREST is designed to require minimal user input and comes with an implementation of the GFNn-xTB Hamiltonians and the GFN-FF force-field. Furthermore, interfaces to any quantum chemistry and force-field software can easily be created. In this article, we present recent developments in the CREST code and show a selection of applications for the most important features of the program. An important novelty is the refactored calculation backend, which provides significant speed-up for sampling of small or medium-sized drug molecules and allows for more sophisticated setups, for example, quantum mechanics/molecular mechanics and minimum energy crossing point calculations.
Affiliation(s)
- Philipp Pracht
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
| | - Christoph Bannwarth
- Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
| | - Fabian Bohle
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
| | - Sebastian Ehlert
- AI4Science, Microsoft Research, Evert van de Beekstraat 354, 1118 CZ Schiphol, The Netherlands
| | - Gereon Feldmann
- Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
| | - Johannes Gorges
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
| | - Marcel Müller
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
| | - Tim Neudecker
- Institute for Physical and Theoretical Chemistry, University of Bremen, 28359 Bremen, Germany
| | - Christoph Plett
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
| | | | - Pit Steinbach
- Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
| | - Patryk A Wesołowski
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Felix Zeller
- Institute for Physical and Theoretical Chemistry, University of Bremen, 28359 Bremen, Germany
| |
|
17
|
Plett C, Grimme S, Hansen A. Conformational energies of biomolecules in solution: Extending the MPCONF196 benchmark with explicit water molecules. J Comput Chem 2024; 45:419-429. [PMID: 37982322 DOI: 10.1002/jcc.27248] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/23/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 11/21/2023]
Abstract
A prerequisite for the computational prediction of molecular properties like conformational energies of biomolecules is a reliable, robust, and computationally affordable method, usually selected according to its performance for relevant benchmark sets. However, most of these sets comprise molecules in the gas phase and do not cover interactions with a solvent, even though biomolecules typically occur in aqueous solution. To address this issue, we introduce a version, solvated with explicit water molecules, of a gas-phase benchmark set containing 196 conformers of 13 peptides and other relevant macrocycles, namely MPCONF196 [J. Řezáč et al., JCTC 2018, 14, 1254-1266], and provide very accurate PNO-LCCSD(T)-F12b/AVQZ' reference values. The novel solvMPCONF196 benchmark set features two additional challenges beyond the description of conformers in the gas phase: conformer-water and water-water interactions. The overall best performing method for this set is the double hybrid revDSD-PBEP86-D4/def2-QZVPP, yielding conformational energies of almost coupled cluster quality. Furthermore, some (meta-)GGAs and hybrid functionals like B97M-V and ωB97M-D with a large basis set reproduce the coupled cluster reference with an MAD below 1 kcal mol⁻¹. If more efficient methods are required, the composite DFT method r2SCAN-3c (MAD of 1.2 kcal mol⁻¹) is a good alternative, and when conformational energies of polypeptides or macrocycles with more than 500-1000 atoms are in focus, the semi-empirical GFN2-xTB or the MMFF94 force field (for very large systems) are recommended.
Affiliation(s)
- Christoph Plett
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Bonn, Germany
| | - Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Bonn, Germany
| | - Andreas Hansen
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Universität Bonn, Bonn, Germany
| |
|
18
|
Osifová Z, Kalvoda T, Galgonek J, Culka M, Vondrášek J, Bouř P, Bednárová L, Andrushchenko V, Dračínský M, Rulíšek L. What are the minimal folding seeds in proteins? Experimental and theoretical assessment of secondary structure propensities of small peptide fragments. Chem Sci 2024; 15:594-608. [PMID: 38179543 PMCID: PMC10763034 DOI: 10.1039/d3sc04960d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 11/22/2023] [Indexed: 01/06/2024] Open
Abstract
Certain peptide sequences, some of them as short as amino acid triplets, are significantly overpopulated in specific secondary structure motifs in folded protein structures. For example, 74% of the EAM triplet is found in α-helices, and only 3% occurs in the extended parts of proteins (typically β-sheets). In contrast, other triplets (such as VIV and IYI) appear almost exclusively in extended parts (79% and 69%, respectively). In order to determine whether such preferences are structurally encoded in a particular peptide fragment or appear only at the level of a complex protein structure, NMR, VCD, and ECD experiments were carried out on selected tripeptides: EAM (denoted as pro-'α-helical' in proteins), KAM(α), ALA(α), DIC(α), EKF(α), IYI (pro-β-sheet or, more generally, pro-extended), and VIV(β), and the reference α-helical CATWEAMEKCK undecapeptide. The experimental data were in very good agreement with extensive quantum mechanical conformational sampling. Altogether, we clearly showed that the pro-helical vs. pro-extended propensities start to emerge already at the level of tripeptides and can be fully developed at longer sequences. We postulate that certain short peptide sequences can be considered minimal "folding seeds". Admittedly, the inherent secondary structure propensity can be overruled by the large intramolecular interaction energies within the folded and compact protein structures. Still, the correlation of experimental and computational data presented herein suggests that the secondary structure propensity should be considered as one of the key factors that may lead to understanding the underlying physico-chemical principles of protein structure and folding from first principles.
Affiliation(s)
- Zuzana Osifová
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
- Department of Organic Chemistry, Faculty of Science, Charles University Hlavova 2030 Prague 128 00 Czech Republic
| | - Tadeáš Kalvoda
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Jakub Galgonek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Martin Culka
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Petr Bouř
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Lucie Bednárová
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Valery Andrushchenko
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Martin Dračínský
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| | - Lubomír Rulíšek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Flemingovo náměstí 2, 160 00, Praha 6 Czech Republic
| |
|
19
|
Stylianakis I, Zervos N, Lii JH, Pantazis DA, Kolocouris A. Conformational energies of reference organic molecules: benchmarking of common efficient computational methods against coupled cluster theory. J Comput Aided Mol Des 2023; 37:607-656. [PMID: 37597063 PMCID: PMC10618395 DOI: 10.1007/s10822-023-00513-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 06/03/2023] [Indexed: 08/21/2023]
Abstract
We selected 145 reference organic molecules that include model fragments used in computer-aided drug design. We calculated 158 conformational energies and barriers using force fields that are widely available in commercial and free software and extensively applied to the conformational energies of organic molecules, e.g. the UFF and DREIDING force fields, Allinger's force fields MM3-96, MM3-00 and MM4-8, the MM2-91 clones MMX and MM+, the MMFF94 force field and MM4, as well as ab initio Hartree-Fock (HF) theory with different basis sets, the standard density functional theory B3LYP, the second-order post-HF MP2 theory and the Domain-based Local Pair Natural Orbital Coupled Cluster DLPNO-CCSD(T) theory, with the latter used for accurate reference values. The data set of the organic molecules includes hydrocarbons, haloalkanes, conjugated compounds, and oxygen-, nitrogen-, phosphorus- and sulphur-containing compounds. We reviewed in detail the conformational aspects of these model organic molecules, providing the current understanding of the steric and electronic factors that determine the stability of low energy conformers, and the literature, including previous experimental observations and calculated findings. While progress in computer hardware allows the calculation of thousands of conformations for later use in drug design projects, this study updates previous classical studies whose reference values were experimental, obtained with a variety of methods and in different environments. The lowest mean error against the DLPNO-CCSD(T) reference was calculated for MP2 (0.35 kcal mol⁻¹), followed by B3LYP (0.69 kcal mol⁻¹) and the HF theories (0.81-1.0 kcal mol⁻¹). As regards the force fields, the lowest errors were observed for Allinger's force fields MM3-00 (1.28 kcal mol⁻¹) and MM3-96 (1.40 kcal mol⁻¹) and Halgren's MMFF94 force field (1.30 kcal mol⁻¹), and then for the MM2-91 clones MMX (1.77 kcal mol⁻¹) and MM+ (2.01 kcal mol⁻¹) and MM4 (2.05 kcal mol⁻¹). The DREIDING (3.63 kcal mol⁻¹) and UFF (3.77 kcal mol⁻¹) force fields have the lowest performance. The model organic molecules we used are often present as fragments in drug-like molecules. The values calculated using DLPNO-CCSD(T) make up a valuable data set for further comparisons and for improved force field parameterization.
Affiliation(s)
- Ioannis Stylianakis
- Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece
| | - Nikolaos Zervos
- Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece
| | - Jenn-Huei Lii
- Department of Chemistry, National Changhua University of Education, Changhua City, Taiwan
| | - Dimitrios A Pantazis
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Mülheim an der Ruhr, Germany
| | - Antonios Kolocouris
- Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece.
- Laboratory of Medicinal Chemistry, Section of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis-Zografou, 15771, Athens, Greece.
| |
|
20
|
Brown SM, Mayer-Bacon C, Freeland S. Xeno Amino Acids: A Look into Biochemistry as We Do Not Know It. Life (Basel) 2023; 13:2281. [PMID: 38137883 PMCID: PMC10744825 DOI: 10.3390/life13122281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 11/18/2023] [Accepted: 11/20/2023] [Indexed: 12/24/2023] Open
Abstract
Would another origin of life resemble Earth's biochemical use of amino acids? Here, we review current knowledge at three levels: (1) Could other classes of chemical structure serve as building blocks for biopolymer structure and catalysis? Amino acids now seem both readily available to, and a plausible chemical attractor for, life as we do not know it. Amino acids thus remain important and tractable targets for astrobiological research. (2) If amino acids are used, would we expect the same L-alpha-structural subclass used by life? Despite numerous ideas, it is not clear why life favors L-enantiomers. It seems clearer, however, why life on Earth uses the shortest possible (alpha-) amino acid backbone, and why each carries only one side chain. However, assertions that other backbones are physicochemically impossible have relaxed into arguments that they are disadvantageous. (3) Would we expect a similar set of side chains to those within the genetic code? Many plausible alternatives exist. Furthermore, evidence exists for both evolutionary advantage and physicochemical constraint as explanatory factors for those encoded by life. Overall, as focus shifts from amino acids as a chemical class to specific side chains used by post-LUCA biology, the probable role of physicochemical constraint diminishes relative to that of biological evolution. Exciting opportunities now present themselves for laboratory work and computing to explore how changing the amino acid alphabet alters the universe of protein folds. Near-term milestones include: (a) expanding evidence about amino acids as attractors within chemical evolution; (b) extending characterization of other backbones relative to biological proteins; and (c) merging computing and laboratory explorations of structures and functions unlocked by xeno peptides.
|
21
|
Vuong VQ, Aradi B, Niklasson AMN, Cui Q, Irle S. Multipole Expansion of Atomic Electron Density Fluctuation Interactions in the Density-Functional Tight-Binding Method. J Chem Theory Comput 2023; 19:7592-7605. [PMID: 37890454 PMCID: PMC10821749 DOI: 10.1021/acs.jctc.3c00778] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/29/2023]
Abstract
The accuracy of the density-functional tight-binding (DFTB) method in describing noncovalent interactions is limited due to its reliance on monopole-based spherical charge densities. In this study, we present a multipole-extended second-order DFTB (mDFTB2) method that takes into account atomic dipole and quadrupole interactions. Furthermore, we combine the multipole expansion with the monopole-based third-order contribution, resulting in the mDFTB3 method. To assess the accuracy of mDFTB2 and mDFTB3, we evaluate their performance in describing noncovalent interactions, proton transfer barriers, and dipole moments. Our benchmark results show promising improvements even when using the existing electronic parameters optimized for the original DFTB3 model. Both mDFTB2 and mDFTB3 outperform their monopole-based counterparts, DFTB2 and DFTB3, in terms of accuracy. While mDFTB2 and mDFTB3 perform comparably for neutral and positively charged systems, mDFTB3 exhibits superior performance over mDFTB2 when dealing with negatively charged systems and proton transfers. Overall, the incorporation of the multipole expansion significantly enhances the accuracy of the DFTB method in describing noncovalent interactions and proton transfers.
Affiliation(s)
- Van-Quan Vuong
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee 37996, United States
| | - Bálint Aradi
- Bremen Center for Computational Materials Science, Universität Bremen, Bremen 28359, Germany
| | - Anders M N Niklasson
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Qiang Cui
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Department of Physics, Boston University, Boston, Massachusetts 02215, United States
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, United States
| | - Stephan Irle
- Computational Sciences & Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
| |
|
22
|
Boothroyd S, Behara PK, Madin OC, Hahn DF, Jang H, Gapsys V, Wagner JR, Horton JT, Dotson DL, Thompson MW, Maat J, Gokey T, Wang LP, Cole DJ, Gilson MK, Chodera JD, Bayly CI, Shirts MR, Mobley DL. Development and Benchmarking of Open Force Field 2.0.0: The Sage Small Molecule Force Field. J Chem Theory Comput 2023; 19:3251-3275. [PMID: 37167319 PMCID: PMC10269353 DOI: 10.1021/acs.jctc.3c00039] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 01/09/2023] [Indexed: 05/13/2023]
Abstract
We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. 
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.
Affiliation(s)
| | - Pavan Kumar Behara
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
| | - Owen C. Madin
- Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
| | - David F. Hahn
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Hyesu Jang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
- OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
| | - Vytautas Gapsys
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
- Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077, Göttingen, Germany
| | - Jeffrey R. Wagner
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Joshua T. Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
| | - David L. Dotson
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Datryllic LLC, Phoenix, Arizona 85003, United States
| | - Matthew W. Thompson
- Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Jessica Maat
- Department of Chemistry, University of California, Irvine, California 92697, United States
| | - Trevor Gokey
- Department of Chemistry, University of California, Irvine, California 92697, United States
| | - Lee-Ping Wang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| | - Daniel J. Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
| | - Michael K. Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - John D. Chodera
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | | | - Michael R. Shirts
- Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
| | - David L. Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Department of Chemistry, University of California, Irvine, California 92697, United States
|
23
|
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparison of models critically depend on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the field of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams instead use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Lisa Schmidt
- Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
| | - Alfred T. Andersson
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Marie-Madeleine Walz
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - David van der Spoel
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
|
24
|
Müller M, Hansen A, Grimme S. ωB97X-3c: A composite range-separated hybrid DFT method with a molecule-optimized polarized valence double-ζ basis set. J Chem Phys 2023; 158:014103. [PMID: 36610980 DOI: 10.1063/5.0133026] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Indexed: 12/15/2022]
Abstract
A new composite density functional theory (DFT) method is presented. It is based on ωB97X-V as one of the best-performing density functionals for the GMTKN55 thermochemistry database and completes the family of "3c" methods toward range-separated hybrid DFT. This method is consistently available for all elements up to Rn (Z = 1-86). Its further key ingredients are a polarized valence double-ζ (vDZP) Gaussian basis set, which was fully optimized in molecular DFT calculations, in combination with large-core effective core potentials and a specially adapted D4 dispersion correction. Unlike most existing double-ζ atomic orbital sets, vDZP shows only small basis set superposition errors (BSSEs) and can compete with standard sets of triple-ζ quality. Small residual BSSE effects are efficiently absorbed by the D4 damping scheme, which overall eliminates the need for an explicit treatment or empirical corrections for BSSE. Thorough tests on a variety of thermochemistry benchmark sets show that the new composite method, dubbed ωB97X-3c, is on par with or even outperforms standard hybrid DFT methods in a quadruple-zeta basis set at a small fraction of the computational cost. Particular strengths of this method are the description of non-covalent interactions and barrier heights, for which it is among the best-performing density functionals overall.
Affiliation(s)
- Marcel Müller
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Rheinische Friedrich-Wilhelms Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
| | - Andreas Hansen
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Rheinische Friedrich-Wilhelms Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
| | - Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Clausius-Institut für Physikalische und Theoretische Chemie, Rheinische Friedrich-Wilhelms Universität Bonn, Beringstraße 4, 53115 Bonn, Germany
|
25
|
Staś M, Najgebauer P, Siodłak D. Imidazole-amino acids. Conformational switch under tautomer and pH change. Amino Acids 2023; 55:33-49. [PMID: 36319875 PMCID: PMC9877100 DOI: 10.1007/s00726-022-03201-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 08/16/2022] [Indexed: 01/26/2023]
Abstract
Replacement of the main chain peptide bond by an imidazole ring seems to be a promising tool for peptide-based drug design, owing to the ring's specific prototropic tautomeric and amphoteric properties. In this study, we show that both tautomer and pH change can cause a conformational switch of the studied residues of alanine (1-4) and dehydroalanine (5-8) with the C-terminal peptide group replaced by imidazole. DFT methods are applied and an environment of increasing polarity is simulated. The conformational maps (Ramachandran diagrams) are presented and the stability of possible conformations is discussed. The neutral forms, tautomers τ (1) and π (2), adopt the conformations αRτ (φ, ψ = −75°, −114°) and C7eq (φ, ψ = −75°, 66°), respectively. Their torsion angles ψ differ by about 180°, which results in a considerable impact on the peptide chain conformation. The cation form (3) adopts both these conformations, whereas the anion analogue (4) prefers the conformations C5 (φ, ψ = −165°, −178°) and β2 (φ, ψ ~ −165°, −3°). Dehydroamino acid analogues, the tautomers τ (5) and π (6) as well as the anion form (8), have a strong tendency toward the conformations β2 (φ, ψ = −179°, 0°) and C5 (φ, ψ = −180°, 180°). The preferences of the protonated imidazolium form (7) depend on the environment. The imidazole ring, acting as a donor or acceptor of the hydrogen bonds created within the studied residues, has a profound effect on the type of conformation.
Affiliation(s)
- Monika Staś
- Faculty of Chemistry, University of Opole, 45-052, Opole, Poland.
| | - Piotr Najgebauer
- Faculty of Chemistry, University of Opole, 45-052, Opole, Poland
| | - Dawid Siodłak
- Faculty of Chemistry, University of Opole, 45-052, Opole, Poland
|
26
|
D’Amore L, Hahn DF, Dotson DL, Horton JT, Anwar J, Craig I, Fox T, Gobbi A, Lakkaraju SK, Lucas X, Meier K, Mobley DL, Narayanan A, Schindler CE, Swope WC, in ’t Veld PJ, Wagner J, Xue B, Tresadern G. Collaborative Assessment of Molecular Geometries and Energies from the Open Force Field. J Chem Inf Model 2022; 62:6094-6104. [PMID: 36433835 PMCID: PMC9873353 DOI: 10.1021/acs.jcim.2c01185] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/27/2022]
Abstract
Force fields form the basis for classical molecular simulations, and their accuracy is crucial for the quality of, for instance, protein-ligand binding simulations in drug discovery. The huge diversity of small-molecule chemistry makes it a challenge to build and parameterize a suitable force field. The Open Force Field Initiative is a combined industry and academic consortium developing a state-of-the-art small-molecule force field. In this report, industry members of the consortium worked together to objectively evaluate the performance of the force fields (referred to here as OpenFF) produced by the initiative on a combined public and proprietary dataset of 19,653 relevant molecules selected from their internal research and compound collections. This evaluation was important because it was completely blind; at most partners, none of the molecules or data were used in force field development or testing prior to this work. We compare the Open Force Field "Sage" version 2.0.0 and "Parsley" version 1.3.0 with GAFF-2.11-AM1BCC, OPLS4, and SMIRNOFF99Frosst. We analyzed force-field-optimized geometries and conformer energies compared to reference quantum mechanical data. We show that OPLS4 performs best, and the latest Open Force Field release shows a clear improvement compared to its predecessors. The performance of established force fields such as GAFF-2.11 was generally worse. While OpenFF researchers were involved in building the benchmarking infrastructure used in this work, benchmarking was done entirely in-house within industrial organizations and the resulting assessment is reported here. This work assesses the force field performance using separate benchmarking steps, external datasets, and involving external research groups. This effort may also be unique in terms of the number of different industrial partners involved, with 10 different companies participating in the benchmark efforts.
Affiliation(s)
- Lorenzo D’Amore
- Computational Chemistry, Janssen R&D, C/ Jarama 75A, 45007 Toledo, Spain
| | - David F. Hahn
- Computational Chemistry, Janssen R&D, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - David L. Dotson
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, USA
| | - Joshua T. Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
| | - Jamshed Anwar
- Department of Chemistry, Lancaster University, Lancaster LA1 4YW, UK
| | - Ian Craig
- Molecular Modeling & Drug Discovery, BASF SE, 67056 Ludwigshafen, Germany
| | - Thomas Fox
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co KG, 88397 Biberach/Riss, Germany
| | - Alberto Gobbi
- Genentech, Inc., 1 DNA Way, South San Francisco, California, 94080, USA
| | | | - Xavier Lucas
- Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, 4070 Basel, Switzerland
| | - Katharina Meier
- Computational Life Science Technology Functions, Crop Science, R&D, Bayer AG, 40789 Monheim, Germany
| | - David L. Mobley
- Departments of Pharmaceutical Sciences and Chemistry, University of California, Irvine, California 92617, USA
| | - Arjun Narayanan
- Data and Computational Sciences, Vertex Pharmaceuticals, 50 Northern Ave, Boston, MA 02210, USA
| | | | - William C. Swope
- Genentech, Inc., 1 DNA Way, South San Francisco, California, 94080, USA
| | | | - Jeffrey Wagner
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California, 95616, USA; Chemistry Department, The University of California at Irvine, Irvine, California, 92617, USA
| | - Bai Xue
- XtalPi Inc. Floor 3, International Biomedical Innovation Park II, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen, Guangdong, 518040 China
| | - Gary Tresadern
- Computational Chemistry, Janssen R&D, Turnhoutseweg 30, Beerse B-2340, Belgium
|
27
|
Shi X, Tang R, Dong Z, Liu H, Xu F, Zhang Q, Zong W, Cheng J. A neglected pathway for the accretion products formation in the atmosphere. The Science of the Total Environment 2022; 848:157494. [PMID: 35914590 DOI: 10.1016/j.scitotenv.2022.157494] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2022] [Revised: 07/09/2022] [Accepted: 07/15/2022] [Indexed: 06/15/2023]
Abstract
Highly oxygenated organic molecules (HOM) formed by the autoxidation of α-pinene initiated by OH radicals play an important role in new particle formation. It is believed that the accretion products, ROOR´, formed by the self- and cross-reaction of peroxy radicals (RO2 + R'O2 reactions), have extremely low volatility and are more likely to participate in nucleation. However, the mechanism of ROOR´ formation has not been fully demonstrated by experiment or theoretical calculation. Herein, we propose a novel mechanism of RO2 reacting with α-pinene (RO2 + α-pinene reactions) that has much lower potential barriers and larger rate constants than the reaction of RO2 with R'O2, which explains the ROOR´ formation found in the mass spectrometry experiments. The ROOR´ resulting from the reaction of RO2 with α-pinene can produce HOM dimers and trimers with a higher oxygen-to-carbon (O/C) ratio through an autoxidation chain. We also demonstrated that the presence of NOx and HO2 radicals will reduce the RO2 concentration, but cannot completely inhibit the formation of HOM monomers and ROOR´. Even if one or both of the RO2 radicals are acyl peroxy radicals (RC(O)O2), the potential barriers of the reactions between RC(O)O2 and α-pinene (RC(O)O2 + α-pinene reactions) are lower than those of RO2 reacting with RC(O)O2 (RO2 + RC(O)O2 reactions) or RC(O)O2 self-reactions (RC(O)O2 + RC(O)O2 reactions). The current work revealed, for the first time, a mechanism of RO2/RC(O)O2 reacting with α-pinene in the atmosphere, which provides new insight into the atmospheric chemistry of accretion products as SOA precursors.
Affiliation(s)
- Xiangli Shi
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China
| | - Ruoyu Tang
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China
| | - Zuokang Dong
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China
| | - Houfeng Liu
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China
| | - Fei Xu
- Environment Research Institute, Shandong University, Qingdao 266237, PR China
| | - Qingzhu Zhang
- Environment Research Institute, Shandong University, Qingdao 266237, PR China
| | - Wansong Zong
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China.
| | - Jiemin Cheng
- College of Geography and Environment, Shandong Normal University, Jinan 250014, PR China
|
28
|
Kalvoda T, Culka M, Rulíšek L, Andris E. Exhaustive Mapping of the Conformational Space of Natural Dipeptides by the DFT-D3//COSMO-RS Method. J Phys Chem B 2022; 126:5949-5958. [PMID: 35930560 DOI: 10.1021/acs.jpcb.2c02861] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Abstract
We extensively mapped energy landscapes and conformations of 22 (including three His protonation states) proteinogenic α-amino acids in trans configuration and the corresponding 484 (22²) dipeptides. To mimic the environment in a protein chain, the N- and C-termini of the studied systems were capped with acetyl and N-methylamide groups, respectively. We systematically varied the main chain dihedral angles (ϕ, ψ) by 40° steps and all side chain angles by 90° or 120° steps. We optimized the molecular geometries with the GFN2-xTB semiempirical (SQM) method and performed single point density functional theory calculations at the BP86-D3/DGauss-DZVP//COSMO-RS level in water, 1-octanol, N,N-dimethylformamide, and n-hexane. For each restrained (nonequilibrium) structure, we also calculated energy gradients (in water) and natural atomic charges. The exhaustive and unprecedented QM-based sampling enabled us to construct Ramachandran plots of quantum mechanical (QM(BP86-D3)//COSMO-RS) energies calculated on SQM structures, for all 506 (484 dipeptides and 22 amino acids) studied systems. We showed how the character of an amino acid side chain influences the conformational space of single amino acids and dipeptides. With clustering techniques, we were able to identify unique minima of amino acids and dipeptides (i.e., minima on the GFN2-xTB potential energy surfaces) and analyze the distribution of their BP86-D3//COSMO-RS conformational energies in all four solvents. We also derived an empirical formula for the number of unique minima based on the overall number of rotatable bonds within each peptide. The final peptide conformer data set (PeptideCs) comprises over 400 million structures, all of them annotated with QM(BP86-D3)//COSMO-RS energies. Thanks to its completeness and unbiased nature, the PeptideCs can serve, inter alia, as a data set for the validation of new methods for predicting the energy landscapes of protein structures.
This data set may also prove to be useful in the development and reparameterization of biomolecular force fields. The data set is deposited at Figshare (10.25452/figshare.plus.19607172) and can be accessed using a simple web interface at http://peptidecs.uochb.cas.cz.
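The systematic scan described in this abstract (main chain dihedrals in 40° steps, side chain angles in 90° or 120° steps) can be pictured with a short enumeration sketch. It covers a single capped residue only; the function names, the grid origin at −180°, and the choice of a 120° side chain step in the example are assumptions for illustration and are not taken from the authors' workflow.

```python
from itertools import product

def angle_grid(step, start=-180):
    """Evenly spaced dihedral values covering one full turn (degrees)."""
    return list(range(start, start + 360, step))

def residue_scan(n_side_chain_angles, side_step=120, backbone_step=40):
    """Yield (phi, psi, chi1, ...) tuples for one capped residue.

    Assumption: every side chain angle uses the same step size.
    """
    backbone = angle_grid(backbone_step)
    side = angle_grid(side_step)
    return product(backbone, backbone, *([side] * n_side_chain_angles))

# Counting check for the systems named in the abstract.
n_amino_acids = 22
n_dipeptides = n_amino_acids ** 2
n_systems = n_dipeptides + n_amino_acids

backbone_only = len(list(residue_scan(0)))    # 9 x 9 (phi, psi) points
two_chi_angles = len(list(residue_scan(2)))   # 9 x 9 x 3 x 3 points
```

Because the number of grid points multiplies with every additional rotatable bond, the total count grows quickly, which is in line with the very large structure count reported for the full data set.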
Affiliation(s)
- Tadeáš Kalvoda
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
| | - Martin Culka
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
| | - Lubomír Rulíšek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
| | - Erik Andris
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
|
29
|
Conformation and structural features of diuron and irgarol: insights from quantum chemistry calculations. Comput Theor Chem 2022. [DOI: 10.1016/j.comptc.2022.113844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
30
|
Ringrose C, Horton JT, Wang LP, Cole DJ. Exploration and validation of force field design protocols through QM-to-MM mapping. Phys Chem Chem Phys 2022; 24:17014-17027. [PMID: 35792069 DOI: 10.1039/d2cp02864f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
The scale of the parameter optimisation problem in traditional molecular mechanics force field construction means that design of a new force field is a long process, and sub-optimal choices made in the early stages can persist for many generations. We hypothesise that careful use of quantum mechanics to inform molecular mechanics parameter derivation (QM-to-MM mapping) should be used to significantly reduce the number of parameters that require fitting to experiment and increase the pace of force field development. Here, we design and train a collection of 15 new protocols for small, organic molecule force field derivation, and test their accuracy against experimental liquid properties. Our best performing model has only seven fitting parameters, yet achieves mean unsigned errors of just 0.031 g cm⁻³ and 0.69 kcal mol⁻¹ in liquid densities and heats of vaporisation, compared to experiment. The software required to derive the designed force fields is freely available at https://github.com/qubekit/QUBEKit.
Affiliation(s)
- Chris Ringrose
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.
| | - Joshua T Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.
| | - Lee-Ping Wang
- Department of Chemistry, The University of California at Davis, Davis, California 95616, USA
| | - Daniel J Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.
|
31
|
Hu X, Lenz-Himmer MO, Baldauf C. Better force fields start with better data: A data set of cation dipeptide interactions. Sci Data 2022; 9:327. [PMID: 35715420 PMCID: PMC9205945 DOI: 10.1038/s41597-022-01297-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/15/2021] [Accepted: 03/18/2022] [Indexed: 11/08/2022]
Abstract
We present a data set from a first-principles study of amino-methylated and acetylated (capped) dipeptides of the 20 proteinogenic amino acids - including alternative possible side chain protonation states and their interactions with selected divalent cations (Ca²⁺, Mg²⁺ and Ba²⁺). The data covers 21,909 stationary points on the respective potential-energy surfaces in a wide relative energy range of up to 4 eV (390 kJ/mol). Relevant properties of interest, like partial charges, were derived for the conformers. The motivation was to provide a solid data basis for force field parameterization and further applications like machine learning or benchmarking. In particular the process of creating all this data on the same first-principles footing, i.e. density-functional theory calculations employing the generalized gradient approximation with a van der Waals correction, makes this data suitable for first principles data-driven force field development. To make the data accessible across domain borders and to machines, we formalized the metadata in an ontology.
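The two figures quoted for the energy range, 4 eV and 390 kJ/mol, can be cross-checked with a short unit conversion. The sketch below uses the exact SI values of the elementary charge and the Avogadro constant; the function name is hypothetical and chosen only for this illustration.

```python
# Exact SI defining constants (2019 redefinition).
ELEMENTARY_CHARGE_C = 1.602176634e-19   # joules per electronvolt
AVOGADRO_PER_MOL = 6.02214076e23        # particles per mole

def ev_to_kj_per_mol(energy_ev):
    """Convert an energy per particle in eV to kJ/mol."""
    joules_per_mol = energy_ev * ELEMENTARY_CHARGE_C * AVOGADRO_PER_MOL
    return joules_per_mol / 1000.0

energy_range = ev_to_kj_per_mol(4.0)
```

The conversion gives roughly 386 kJ/mol for 4 eV, which agrees with the abstract's 390 kJ/mol when rounded to two significant figures.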
Affiliation(s)
- Xiaojuan Hu
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195, Berlin, Germany.
| | | | - Carsten Baldauf
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195, Berlin, Germany.
|
32
|
Gasevic T, Stückrath JB, Grimme S, Bursch M. Optimization of the r2SCAN-3c Composite Electronic-Structure Method for Use with Slater-Type Orbital Basis Sets. J Phys Chem A 2022; 126:3826-3838. [PMID: 35654439 PMCID: PMC9255700 DOI: 10.1021/acs.jpca.2c02951] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/29/2022]
Abstract
The "Swiss army knife" composite density functional electronic-structure method r2SCAN-3c (J. Chem. Phys. 2021, 154, 064103) is extended and optimized for use with Slater-type orbital basis sets. The meta generalized-gradient approximation (meta-GGA) functional r2SCAN by Furness et al. is combined with a tailor-made polarized triple-ζ Slater-type atomic orbital (STO) basis set (mTZ2P), the semiclassical London dispersion correction (D4), and a geometrical counterpoise (gCP) correction. Relativistic effects are treated explicitly with the scalar-relativistic zeroth-order regular approximation (SR-ZORA). The performance of the new implementation is assessed on eight geometry and 74 energy benchmark sets, including the extensive GMTKN55 database as well as recent sets such as ROST61 and IONPI19. In geometry optimizations, the STO-based r2SCAN-3c is either on par with or more accurate than the hybrid density functional approximation M06-2X-D3(0)/TZP. In energy calculations, the overall accuracy is similar to the original implementation of r2SCAN-3c with Gaussian-type atomic orbitals (GTO), but basic properties, intermolecular noncovalent interactions, and barrier heights are better described with the STO approach, resulting in a lower weighted mean absolute deviation (WTMAD-2(STO) = 7.15 vs 7.50 kcal mol⁻¹ with the original method) for the GMTKN55 database. The STO-optimized r2SCAN-3c outperforms many conventional hybrid/QZ approaches in most common applications at a fraction of their cost. The reliable, robust, and accurate r2SCAN-3c implementation with STOs is a promising alternative to the original implementation with GTOs and can be generally used for a broad field of quantum chemical problems.
Affiliation(s)
- Thomas Gasevic
- Mulliken Center for Theoretical Chemistry, Universität Bonn, Beringstr. 4, D-53115 Bonn, Germany
| | - Julius B Stückrath
- Mulliken Center for Theoretical Chemistry, Universität Bonn, Beringstr. 4, D-53115 Bonn, Germany
| | - Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Universität Bonn, Beringstr. 4, D-53115 Bonn, Germany
| | - Markus Bursch
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, D-45470 Mülheim an der Ruhr, Germany
|
33
|
Chan B, Dawson W, Nakajima T. Searching for a Reliable Density Functional for Molecule-Environment Interactions, Found B97M-V/def2-mTZVP. J Phys Chem A 2022; 126:2397-2406. [PMID: 35390254 DOI: 10.1021/acs.jpca.2c02032] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/28/2022]
Abstract
In the present study, we have examined density functional theory methods for the calculation of the interaction energy between a small molecule and its environment. For simple systems such as a neutral solute in a neutral solvent, good accuracy can be attained using low-cost "3c" methods, in particular r2SCAN-3c. When part(s) of the system is charged, the accurate computation of the interactions is more challenging. In these cases, we find the B97M-V/def2-mTZVP method to agree well with reference values; it also shows good accuracy for the more straightforward neutral systems. Thus, B97M-V/def2-mTZVP provides a means for accurate and low-cost computation of interaction energies, notably the binding between a substrate or a drug molecule and an enzyme, which may facilitate rational drug design.
Affiliation(s)
- Bun Chan
- Graduate School of Engineering, Nagasaki University, Bunkyo 1-14, Nagasaki 852-8521, Japan
| | - William Dawson
- RIKEN Center for Computational Science, 7-1-26, Minatojima-minami-machi, Chuo-ku, Kobe 650-0047, Japan
| | - Takahito Nakajima
- RIKEN Center for Computational Science, 7-1-26, Minatojima-minami-machi, Chuo-ku, Kobe 650-0047, Japan
|
34
|
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Small-Basis Set Density-Functional Theory Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2913-2930. [PMID: 35412817 DOI: 10.1021/acs.jctc.2c00036] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Density functional theory (DFT) is currently the most popular method for modeling noncovalent interactions and thermochemistry. The accurate calculation of noncovalent interaction energies, reaction energies, and barrier heights requires choosing an appropriate functional and, typically, a relatively large basis set. Deficiencies of the density-functional approximation and the use of a limited basis set are the leading sources of error in the calculation of noncovalent and thermochemical properties in molecular systems. In this article, we present three new DFT methods based on the BLYP, M06-2X, and CAM-B3LYP functionals in combination with the 6-31G* basis set and corrected with atom-centered potentials (ACPs). ACPs are one-electron potentials that have the same form as effective-core potentials, except they do not replace any electrons. The ACPs developed in this work are used to generate energy corrections to the underlying DFT/basis-set method such that the errors in predicted chemical properties are minimized while maintaining the low computational cost of the parent methods. ACPs were developed for the elements H, B, C, N, O, F, Si, P, S, and Cl. The ACP parameters were determined using an extensive training set of 118655 data points, mostly of complete basis set coupled-cluster level quality. The target molecular properties for the ACP-corrected methods include noncovalent interaction energies, molecular conformational energies, reaction energies, barrier heights, and bond separation energies. The ACPs were tested first on the training set and then on a validation set of 42567 additional data points. We show that the ACP-corrected methods can predict the target molecular properties with accuracy close to complete basis set wavefunction theory methods, but at a computational cost of double-ζ DFT methods. 
This makes the new BLYP/6-31G*-ACP, M06-2X/6-31G*-ACP, and CAM-B3LYP/6-31G*-ACP methods uniquely suited to the calculation of noncovalent, thermochemical, and kinetic properties in large molecular systems.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
| | - Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, MALTA Consolider Team, Oviedo E-33006, Spain
| | - Gino A DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
|
35
|
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Fast and Accurate Quantum Mechanical Modeling of Large Molecular Systems Using Small Basis Set Hartree-Fock Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2208-2232. [PMID: 35313106 DOI: 10.1021/acs.jctc.1c01128] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022]
Abstract
There has been significant interest in developing fast and accurate quantum mechanical methods for modeling large molecular systems. In this work, by utilizing a machine learning regression technique, we have developed new low-cost quantum mechanical approaches to model large molecular systems. The developed approaches rely on using one-electron Gaussian-type functions called atom-centered potentials (ACPs) to correct for the basis set incompleteness and the lack of correlation effects in the underlying minimal or small basis set Hartree-Fock (HF) methods. In particular, ACPs are proposed for ten elements common in organic and bioorganic chemistry (H, B, C, N, O, F, Si, P, S, and Cl) and four different base methods: two minimal basis sets (MINIs and MINIX) plus a double-ζ basis set (6-31G*) in combination with dispersion-corrected HF (HF-D3/MINIs, HF-D3/MINIX, HF-D3/6-31G*) and the HF-3c method. The new ACPs are trained on a very large set (73 832 data points) of noncovalent properties (interaction and conformational energies) and validated additionally on a set of 32 048 data points. All reference data are of complete basis set coupled-cluster quality, mostly CCSD(T)/CBS. The proposed ACP-corrected methods are shown to give errors in the tenths of a kcal/mol range for noncovalent interaction energies and up to 2 kcal/mol for molecular conformational energies. More importantly, the average errors are similar in the training and validation sets, confirming the robustness and applicability of these methods outside the boundaries of the training set. In addition, the performance of the new ACP-corrected methods is similar to complete basis set density functional theory (DFT) but at a cost that is orders of magnitude lower, and the proposed ACPs can be used in any computational chemistry program that supports effective-core potentials without modification. 
It is also shown that ACPs improve the description of covalent and noncovalent bond geometries of the underlying methods and that the improvement brought about by the application of the ACPs is directly related to the number of atoms to which they are applied, allowing the treatment of systems containing some atoms for which ACPs are not available. Overall, the ACP-corrected methods proposed in this work constitute an alternative accurate, economical, and reliable quantum mechanical approach to describe the geometries, interaction energies, and conformational energies of systems with hundreds to thousands of atoms.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
| | - Alberto Otero-de-la-Roza
- MALTA Consolider Team, Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, E-33006 Oviedo, Spain
| | - Gino A DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
|
36
|
Wang P, Shu C, Ye H, Biczysko M. Structural and Energetic Properties of Amino Acids and Peptides Benchmarked by Accurate Theoretical and Experimental Data. J Phys Chem A 2021; 125:9826-9837. [PMID: 34752094 DOI: 10.1021/acs.jpca.1c06504] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
Structural, energetic, and spectroscopic data derived in this work aim at the setup of an "experimentally validated" database for amino acid and polypeptide conformers. First, the "cheap" composite scheme (ChS, CCSD(T)/(CBS+CV)MP2) is tested for evaluation of conformational energies of all eight stable conformers of glycine, by comparing to the more accurate CCSD(T)/CBS+CV computations (Phys. Chem. Chem. Phys. 2013, 15, 10094-10111 and J. Mol. Model. 2020, 26, 129). The recently proposed jun-ChS (J. Chem. Theory and Comput. 2020, 16, 988-1006), employing the jun-cc-pVnZ basis set family for CCSD(T) computations and CBS extrapolation, yields conformational energies accurate to 0.2 kJ·mol⁻¹, at reduced computational cost with respect to aug-ChS employing aug-cc-pVnZ basis sets. The jun-ChS composite scheme is further applied to derive conformational energies for three dipeptide analogues: Ac-Gly-NH2, Ac-Ala-NH2, and Gly-Gly. Finally, dipeptide conformational energies and semiexperimental equilibrium rotational constants along with the CCSD(T)/(CBS+CV)MP2 structural parameters (J. Phys. Chem. Lett. 2014, 5, 534-540) stand as the reference for benchmarking of selected density functional methodologies. The double-hybrid functionals B2-PLYP-D3(BJ) and DSD-PBEP86 perform best for structural and energetic characterization of all dipeptide analogues. Among the hybrid functionals, CAM-B3LYP-D3(BJ) and ωB97X-D3(BJ) represent promising methods applicable to larger peptide-based systems for which computations with double-hybrid functionals are not feasible.
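The abstract above mentions CBS extrapolation without giving the formula used. As an illustration only, the sketch below implements the widely used two-point inverse-cubic extrapolation of a correlation energy; whether jun-ChS uses exactly this form is an assumption here, and the function name and the synthetic test energies are hypothetical.

```python
def cbs_two_point(e_small, e_large, x_small, x_large):
    """Two-point X**-3 extrapolation of a correlation energy.

    Assumes E(X) = E_CBS + A / X**3, where X is the basis-set cardinal
    number (3 for triple-zeta, 4 for quadruple-zeta, ...).
    """
    xs3 = x_small ** 3
    xl3 = x_large ** 3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


# Synthetic check (not real data): build energies that follow the assumed
# model exactly and confirm that the limit is recovered.
E_CBS_TRUE = -1.0   # hypothetical limit, arbitrary units
A = 0.5             # hypothetical prefactor
e_tz = E_CBS_TRUE + A / 3 ** 3
e_qz = E_CBS_TRUE + A / 4 ** 3
e_cbs = cbs_two_point(e_tz, e_qz, 3, 4)
print(f"extrapolated limit: {e_cbs:.6f}")
```

The extrapolation applies to the correlation energy only; the Hartree-Fock component is normally converged or extrapolated separately.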
Affiliation(s)
- Ping Wang
- International Centre for Quantum and Molecular Structures, Physics Department, College of Science, Shanghai University, 99 Shangda Road, Shanghai 200444, China
| | - Chong Shu
- International Centre for Quantum and Molecular Structures, Physics Department, College of Science, Shanghai University, 99 Shangda Road, Shanghai 200444, China
| | - Hexu Ye
- International Centre for Quantum and Molecular Structures, Physics Department, College of Science, Shanghai University, 99 Shangda Road, Shanghai 200444, China
| | - Malgorzata Biczysko
- International Centre for Quantum and Molecular Structures, Physics Department, College of Science, Shanghai University, 99 Shangda Road, Shanghai 200444, China
|
37
|
Qiu Y, Smith DGA, Boothroyd S, Jang H, Hahn DF, Wagner J, Bannan CC, Gokey T, Lim VT, Stern CD, Rizzi A, Tjanaka B, Tresadern G, Lucas X, Shirts MR, Gilson MK, Chodera JD, Bayly CI, Mobley DL, Wang LP. Development and Benchmarking of Open Force Field v1.0.0-the Parsley Small-Molecule Force Field. J Chem Theory Comput 2021; 17:6262-6280. [PMID: 34551262 PMCID: PMC8511297 DOI: 10.1021/acs.jctc.1c00571] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.
Affiliation(s)
- Yudong Qiu
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| | - Daniel G A Smith
- The Molecular Sciences Software Institute (MolSSI), Blacksburg, Virginia 24060, United States
| | - Simon Boothroyd
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Hyesu Jang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| | - David F Hahn
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Jeffrey Wagner
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Caitlin C Bannan
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - Trevor Gokey
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Victoria T Lim
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Chaya D Stern
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Andrea Rizzi
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York 10065, United States
| | - Bryon Tjanaka
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Xavier Lucas
- F. Hoffmann-La Roche AG, Basel 4070, Switzerland
| | - Michael R Shirts
- Chemical & Biological Engineering Department, The University of Colorado at Boulder, Boulder, Colorado 80309, United States
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - John D Chodera
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | | | - David L Mobley
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Lee-Ping Wang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
|
38
|
Chen J, Kato J, Harper JB, Shao Y, Ho J. On the Accuracy of QM/MM Models: A Systematic Study of Intramolecular Proton Transfer Reactions of Amino Acids in Water. J Phys Chem B 2021; 125:9304-9316. [PMID: 34355564 DOI: 10.1021/acs.jpcb.1c04876] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
This work presents a systematic assessment of QM/QM' and QM/MM models with respect to direct QM calculations for the tautomerization (neutral to zwitterion) reactions of amino acids (glycine, alanine, valine, aspartate, and neutral and protonated histidine) solvated in a 160-water cluster. The effect of varying QM region size and choice of embedding potentials, including fixed-charge and polarizable molecular mechanics force fields (TIP3P and EFP) and various semiempirical QM methods (PM7, GFN2-xTB, DFTBA, DFTB3, HF-3c, and PBEh-3c), on the accuracy of the models was examined. A surprising finding was that molecular mechanics force fields outperformed many of the semiempirical methods. Generally, the errors in the QM/QM' and QM/MM models converge slowly with respect to the QM region size, requiring 50 or more waters to be included in the QM region before the error in the model falls to within 1 kcal mol⁻¹ of the pure QM result. Different QM region selection schemes were also compared, and it was found that selection based on Natural Population Analysis (NPA) atomic charges significantly reduced the error in the QM/QM' and QM/MM models, particularly if a low-quality embedding potential was used. It is envisaged that these results will be useful for the development of future hybrid QM models.
Affiliation(s)
- Junbo Chen
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Jin Kato
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Jason B Harper
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
| | - Junming Ho
- School of Chemistry, The University of New South Wales, Sydney, NSW 2052, Australia
|
39
|
Gaston JJ, Tague AJ, Smyth JE, Butler NM, Willis AC, van Eikema Hommes N, Yu H, Clark T, Keller PA. The Detosylation of Chiral 1,2-Bis(tosylamides). J Org Chem 2021; 86:9163-9180. [PMID: 34153182 DOI: 10.1021/acs.joc.1c00359] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
The deprotection of chiral 1,2-bis(tosylamides) to their corresponding 1,2-diamines is mostly unsuccessful under standard conditions. In a new methodology, the use of Mg/MeOH with sufficient steric additions allows the facile synthesis of 1,2-diamines in 78-98% yields. These results are rationalized using density functional theory and the examination of inner and outer-sphere reduction mechanisms.
Affiliation(s)
- Jayden J Gaston
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
| | - Andrew J Tague
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
| | - Jamie E Smyth
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
| | - Nicholas M Butler
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
| | - Anthony C Willis
- School of Chemistry, The Australian National University, Canberra, ACT 2601, Australia
| | - Nico van Eikema Hommes
- Computer Chemistry Center, Department of Chemistry and Pharmacy, Friedrich-Alexander Universität Erlangen-Nürnberg, Nägelsbachstraße 25, 91052 Erlangen, Germany
| | - Haibo Yu
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
| | - Timothy Clark
- Computer Chemistry Center, Department of Chemistry and Pharmacy, Friedrich-Alexander Universität Erlangen-Nürnberg, Nägelsbachstraße 25, 91052 Erlangen, Germany
| | - Paul A Keller
- School of Chemistry and Molecular Bioscience, Molecular Horizons, University of Wollongong and Illawarra Health and Medical Research Institute, Wollongong, NSW 2522, Australia
|
40
|
Santra G, Semidalas E, Martin JML. Exploring Avenues beyond Revised DSD Functionals: II. Random-Phase Approximation and Scaled MP3 Corrections. J Phys Chem A 2021; 125:4628-4638. [PMID: 34019413 PMCID: PMC8279643 DOI: 10.1021/acs.jpca.1c01295] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
For revDSD double hybrids, the Görling–Levy second-order perturbation theory component is an Achilles’ heel when applied to systems with significant near-degeneracy (“static”) correlation. We have explored its replacement by the direct random phase approximation (dRPA), inspired by the SCS-dRPA75 functional of Kállay and co-workers. The addition to the final energy of both a D4 empirical dispersion correction and of a semilocal correlation component lead to significant improvements, with DSD-PBEdRPA75-D4 approaching the performance of revDSD-PBEP86-D4 and the Berkeley ωB97M(2). This form appears to be fairly insensitive to the choice of the semilocal functional but does exhibit stronger basis set sensitivity than the PT2-based double hybrids (due to much larger prefactors for the nonlocal correlation). As an alternative, we explored adding an MP3-like correction term (in a medium-sized basis set) to a range-separated ωDSD-PBEP86-D4 double hybrid and found it to have significantly lower WTMAD2 (weighted mean absolute deviation) for the large and chemically diverse GMTKN55 benchmark suite; the added computational cost can be mitigated through density fitting techniques.
Affiliation(s)
- Golokesh Santra
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
| | - Emmanouil Semidalas
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
| | - Jan M L Martin
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
|
41
|
Santra G, Cho M, Martin JML. Exploring Avenues beyond Revised DSD Functionals: I. Range Separation, with xDSD as a Special Case. J Phys Chem A 2021; 125:4614-4627. [PMID: 34009986 PMCID: PMC8279641 DOI: 10.1021/acs.jpca.1c01294] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2021] [Revised: 05/06/2021] [Indexed: 01/16/2023]
Abstract
We have explored the use of range separation as a possible avenue for further improvement on our revDSD minimally empirical double hybrid functionals. Such ωDSD functionals encompass the XYG3 type of double hybrid (i.e., xDSD) as a special case for ω → 0. As in our previous studies, the large and chemically diverse GMTKN55 benchmark suite was used for evaluation. Especially when using the D4 rather than D3BJ dispersion model, xDSD has a slight performance advantage in WTMAD2. As in previous studies, PBEP86 is the winning combination for the semilocal parts. xDSDn-PBEP86-D4 marginally outperforms the previous "best in class" ωB97M(2) Berkeley double hybrid but without range separation and using fewer than half the number of empirical parameters. Range separation turns out to offer only marginal further improvements on GMTKN55 itself. While ωB97M(2) still yields better performance for small-molecule thermochemistry, this is compensated in WTMAD2 by the superior performance of the new functionals for conformer equilibria. Results for two external test sets with pronounced static correlation effects may indicate that range-separated double hybrids are more resilient to such effects.
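The WTMAD2 metric named in this abstract is not defined in it. For orientation, the form usually quoted for the GMTKN55 suite is reproduced below; the symbols $N_i$, $\overline{|\Delta E|}_i$ and $\mathrm{MAD}_i$ are labels chosen here rather than taken from the abstract.

```latex
\mathrm{WTMAD2}
  = \frac{1}{\sum_{i=1}^{55} N_i}
    \sum_{i=1}^{55} N_i \,
    \frac{56.84\ \mathrm{kcal\,mol^{-1}}}{\overline{|\Delta E|}_i}\,
    \mathrm{MAD}_i
```

Here $N_i$ is the number of systems in subset $i$, $\overline{|\Delta E|}_i$ is the mean absolute reference energy of that subset, and $\mathrm{MAD}_i$ is the method's mean absolute deviation on it. The weighting raises the influence of subsets with small energy scales, such as conformer equilibria, which is consistent with the abstract's remark that conformer performance compensates for thermochemistry in WTMAD2.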
Affiliation(s)
- Golokesh Santra
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
| | - Minsik Cho
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| | - Jan M. L. Martin
- Department of Organic Chemistry, Weizmann Institute of Science, 7610001 Reḥovot, Israel
|
42
|
Barone V, Alessandrini S, Biczysko M, Cheeseman JR, Clary DC, McCoy AB, DiRisio RJ, Neese F, Melosso M, Puzzarini C. Computational molecular spectroscopy. Nat Rev Methods Primers 2021. [DOI: 10.1038/s43586-021-00034-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
43
|
Smola M, Gutten O, Dejmek M, Kožíšek M, Evangelidis T, Tehrani ZA, Novotná B, Nencka R, Birkuš G, Rulíšek L, Boura E. Ligand Strain and Its Conformational Complexity Is a Major Factor in the Binding of Cyclic Dinucleotides to STING Protein. Angew Chem Int Ed Engl 2021; 60:10172-10178. [PMID: 33616279 PMCID: PMC8251555 DOI: 10.1002/anie.202016805] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2020] [Indexed: 12/19/2022]
Abstract
STING (stimulator of interferon genes) is a key regulator of innate immunity that has recently been recognized as a promising drug target. STING is activated by cyclic dinucleotides (CDNs), which eventually leads to expression of type I interferons and other cytokines. Factors underlying the affinity of various CDN analogues are poorly understood. Herein, we correlate structural biology, isothermal calorimetry (ITC) and computational modeling to elucidate factors contributing to binding of six CDNs: three pairs of natural (ribo) and fluorinated (2'-fluororibo) 3',3'-CDNs. X-ray structural analyses of six {STING:CDN} complexes did not offer any explanation for the different affinities of the studied ligands. ITC showed entropy/enthalpy compensation up to 25 kcal mol⁻¹ for this set of similar ligands. The higher affinities of fluorinated analogues are explained with the help of computational methods by a smaller loss of entropy upon binding and by a smaller strain (free) energy.
Affiliation(s)
- Miroslav Smola
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Ondrej Gutten
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Milan Dejmek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Milan Kožíšek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Thomas Evangelidis
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Zahra Aliakbar Tehrani
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Barbora Novotná
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Radim Nencka
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Gabriel Birkuš
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Lubomír Rulíšek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Evzen Boura
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
|
44
|
Feng M, Gilson MK. Mechanistic analysis of light-driven overcrowded alkene-based molecular motors by multiscale molecular simulations. Phys Chem Chem Phys 2021; 23:8525-8540. [PMID: 33876015 PMCID: PMC8102045 DOI: 10.1039/d0cp06685k] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
We analyze light-driven overcrowded alkene-based molecular motors, an intriguing class of small molecules that have the potential to generate MHz-scale rotation rates. The full rotation process is simulated at multiple scales by combining quantum surface-hopping molecular dynamics (MD) simulations for the photoisomerization step with classical MD simulations for the thermal helix inversion step. A Markov state analysis resolves conformational substates, their interconversion kinetics, and their roles in the motor's rotation process. Furthermore, motor performance metrics, including rotation rate and maximal power output, are computed to validate computations against experimental measurements and to inform future designs. Lastly, we find that to correctly model these motors, the force field must be optimized by fitting selected parameters to reference quantum mechanical energy surfaces. Overall, our simulations yield encouraging agreement with experimental observables such as rotation rates, and provide mechanistic insights that may help future designs.
Affiliation(s)
- Mudong Feng
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, 92093, USA.
|
45
|
Smola M, Gutten O, Dejmek M, Kožíšek M, Evangelidis T, Tehrani ZA, Novotná B, Nencka R, Birkuš G, Rulíšek L, Boura E. Ligand Strain and Its Conformational Complexity Is a Major Factor in the Binding of Cyclic Dinucleotides to STING Protein. Angew Chem Int Ed Engl 2021. [DOI: 10.1002/ange.202016805] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Affiliation(s)
- Miroslav Smola
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Ondrej Gutten
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Milan Dejmek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Milan Kožíšek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Thomas Evangelidis
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Zahra Aliakbar Tehrani
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Barbora Novotná
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Radim Nencka
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Gabriel Birkuš
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Lubomír Rulíšek
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
| | - Evzen Boura
- Gilead Sciences Research Centre at IOCB, Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16610 Prague, Czech Republic
|
46
|
Gutten O, Jurečka P, Aliakbar Tehrani Z, Buděšínský M, Řezáč J, Rulíšek L. Conformational energies and equilibria of cyclic dinucleotides in vacuo and in solution: computational chemistry vs. NMR experiments. Phys Chem Chem Phys 2021; 23:7280-7294. [PMID: 33876088 DOI: 10.1039/d0cp05993e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
Performance of computational methods in modelling cyclic dinucleotides - an important and challenging class of compounds - has been evaluated by two different benchmarks: (1) gas-phase conformational energies and (2) qualitative agreement with NMR observations of the orientation of the χ-dihedral angle in solvent. In gas-phase benchmarks, where CCSD(T) and DLPNO-CCSD(T) methods have been used as the reference, most of the (dispersion corrected) density functional approximations are accurate enough to justify prioritizing computational cost and compatibility with other modelling options as the criterion of choice. NMR experiments of 3'3'-c-di-AMP, 3'3'-c-GAMP, and 3'3'-c-di-GMP show the overall prevalence of the anti-conformation of purine bases, but some population of syn-conformations is observed for guanines. Implicit solvation models combined with quantum-chemical methods struggle to reproduce this behaviour, probably due to a lack of dynamics and explicitly modelled solvent, leading to structures that are too compact. Molecular dynamics simulations overrepresent the syn-conformation of guanine due to the overestimation of an intramolecular hydrogen bond. Our combination of experimental and computational benchmarks provides "error bars" for modelling cyclic dinucleotides in solvent, where such information is generally difficult to obtain, and should help gauge the interpretability of studies dealing with binding of cyclic dinucleotides to important pharmaceutical targets. At the same time, the presented analysis calls for improvement in both implicit solvation models and force-field parameters.
Affiliation(s)
- Ondrej Gutten
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10, Praha 6, Czech Republic.
|
47
|
Ravikumar A, de Brevern AG, Srinivasan N. Conformational Strain Indicated by Ramachandran Angles for the Protein Backbone Is Only Weakly Related to the Flexibility. J Phys Chem B 2021; 125:2597-2606. [PMID: 33666418 DOI: 10.1021/acs.jpcb.1c00168] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Studies on energy associated with free dipeptides have shown that conformers with unfavorable (ϕ,ψ) torsion angles have higher energy compared to conformers with favorable (ϕ,ψ) angles. It is expected that higher energy confers higher dynamics and flexibility to that part of the protein. Here, we explore a potential relationship between conformational strain in a residue due to unfavorable (ϕ,ψ) angles and its flexibility and dynamics in the context of protein structures. We compared flexibility of strained and relaxed residues, which are recognized based on outlier/allowed and favorable (ϕ,ψ) angles respectively, using normal-mode analysis (NMA). We also performed in-depth analysis on flexibility and dynamics at catalytic residues in protein kinases, which exhibit different strain status in different kinase structures using NMA and molecular dynamics simulations. We underline that strain of a residue, as defined by backbone torsion angles, is almost unrelated to the flexibility and dynamics associated with it. Even the overall trend observed among all high-resolution structures in which relaxed residues tend to have slightly higher flexibility than strained residues is counterintuitive. Consequently, we propose that identifying strained residues based on (ϕ,ψ) values is not an effective way to recognize energetic strain in protein structures.
Affiliation(s)
- Ashraya Ravikumar
- Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, India, 560012
| | - Alexandre G de Brevern
- INSERM, U 1134, DSIMB, Paris F-75739, France; University of Paris, Paris F-75739, France; Institut National de la Transfusion Sanguine (INTS), Paris F-75739, France; Laboratoire d'Excellence GR-Ex, Paris F-75739, France
|
48
|
Grimme S, Bohle F, Hansen A, Pracht P, Spicher S, Stahn M. Efficient Quantum Chemical Calculation of Structure Ensembles and Free Energies for Nonrigid Molecules. J Phys Chem A 2021; 125:4039-4054. [PMID: 33688730 DOI: 10.1021/acs.jpca.1c00971] [Citation(s) in RCA: 132] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
The application of quantum chemical, automatic multilevel modeling workflows for the determination of thermodynamic (e.g., conformation equilibria, partition coefficients, pKa values) and spectroscopic properties of relatively large, nonrigid molecules in solution is described. Key points are the computation of rather complete structure (conformer) ensembles with extremely fast but still reasonable GFN2-xTB or GFN-FF semiempirical methods in the CREST searching approach and subsequent refinement at a recently developed, accurate r2SCAN-3c DFT composite level. Solvation effects are included in all steps by accurate continuum solvation models (ALPB, (D)COSMO-RS). Consistent inclusion of thermostatistical contributions in the framework of the modified rigid-rotor-harmonic-oscillator approximation (mRRHO) based on xTB/FF computed PES is also recommended.
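A conformer ensemble of the kind described above is typically turned into equilibrium populations by Boltzmann weighting of relative (free) energies. The sketch below shows that step only, under the assumption that energies are supplied in kcal/mol; the function name and the example energies are hypothetical and do not come from the cited work or from any CREST output.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Return normalized Boltzmann populations for a list of conformer energies."""
    rt = R_KCAL * temperature
    e_min = min(rel_energies_kcal)  # shift so the lowest conformer is at zero
    weights = [math.exp(-(e - e_min) / rt) for e in rel_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies (kcal/mol) for three conformers.
populations = boltzmann_populations([0.0, 0.5, 2.0])
for index, p in enumerate(populations):
    print(f"conformer {index}: {100.0 * p:.1f} %")
```

Shifting by the minimum energy keeps the exponentials well scaled and does not change the normalized populations.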
Affiliation(s)
- Stefan Grimme
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
- Fabian Bohle
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
- Andreas Hansen
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
- Philipp Pracht
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
- Sebastian Spicher
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
- Marcel Stahn
- Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany
|
49
|
Liu Z, Lin L, Jia Q, Cheng Z, Jiang Y, Guo Y, Ma J. Transferable Multilevel Attention Neural Network for Accurate Prediction of Quantum Chemistry Properties via Multitask Learning. J Chem Inf Model 2021; 61:1066-1082. [PMID: 33629839 DOI: 10.1021/acs.jcim.0c01224] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Indexed: 12/17/2022]
Abstract
The development of efficient models for predicting specific properties through machine learning is of great importance for innovation in chemistry and materials science. However, extending the prediction of global electronic structure properties, such as the frontier molecular orbital energy levels (highest occupied molecular orbital, HOMO, and lowest unoccupied molecular orbital, LUMO) and the HOMO-LUMO gap, from small-molecule data to larger molecules remains a challenge. Here, we develop a multilevel attention neural network, named DeepMoleNet, to enable chemically interpretable insights to be fused into multitask learning through (1) weighting contributions from various atoms and (2) taking the atom-centered symmetry functions (ACSFs) as the teacher descriptor. The efficient prediction of 12 properties including dipole moment, HOMO, and Gibbs free energy within chemical accuracy is achieved by using multiple benchmarks, both at the equilibrium and nonequilibrium geometries, including up to 110,000 records of data in QM9, 400,000 records in MD17, and 280,000 records in ANI-1ccx for random split evaluation. The good transferability for predicting larger molecules outside the training set is demonstrated in both equilibrium QM9 and Alchemy data sets at the density functional theory (DFT) level. Additional tests on nonequilibrium molecular conformations from the DFT-based MD17 data set and the ANI-1ccx data set with coupled cluster accuracy, as well as the public test sets of singlet fission molecules, biomolecules, long oligomers, and proteins with up to 140 atoms, show reasonable predictions for thermodynamics and electronic structure properties. The proposed multilevel attention neural network is applicable to high-throughput screening of numerous chemical species in both equilibrium and nonequilibrium molecular spaces to accelerate rational designs of drug-like molecules, material candidates, and chemical reactions.
Affiliation(s)
- Ziteng Liu
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
- Liqiang Lin
- National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, P. R. China
- Qingqing Jia
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
- Zheng Cheng
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
- Yanyan Jiang
- National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, P. R. China
- Yanwen Guo
- National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, P. R. China
- Jing Ma
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China; Jiangsu Key Laboratory of Advanced Organic Materials, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
|
50
|
Sandler I, Chen J, Taylor M, Sharma S, Ho J. Accuracy of DLPNO-CCSD(T): Effect of Basis Set and System Size. J Phys Chem A 2021; 125:1553-1563. [PMID: 33560853 DOI: 10.1021/acs.jpca.0c11270] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Indexed: 12/14/2022]
Abstract
The DLPNO-CCSD(T) method is designed to study large molecular systems at significantly reduced cost relative to its canonical counterpart. However, the error in this approach is also size-extensive and relies on cancellation of errors for the calculation of relative energies. This work provides a direct comparison of canonical CCSD(T) and TightPNO DLPNO-CCSD(T) calculations of reaction energies and barriers of a broad range of chemical reactions. The dataset includes acidities, anion binding affinities, enolization, Diels-Alder, nucleophilic substitution, and atom transfer reactions and complements existing theoretical datasets in terms of system size as well as new reaction types (e.g., anion binding affinities and chlorine atom transfer reactions). The performance of DLPNO-CCSD(T) was further examined with respect to systematic variation of basis set and system size and amounts of nonbonded interaction present in the system. The errors in the DLPNO-CCSD(T) were found to be relatively insensitive to the choice of basis set for small systems but increase monotonically with system size. Additionally, calculations of barriers appear to be more challenging than reaction energies with errors exceeding 5 kJ mol⁻¹ for many Diels-Alder reactions. Further tests on three realistic organic reactions reveal the impact of the DLPNO approximation in calculating absolute and relative barriers that are important for predictions such as stereoselectivity.
Affiliation(s)
- Isolde Sandler
- School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
- Junbo Chen
- School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
- Mackenzie Taylor
- School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
- Shaleen Sharma
- School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
- Junming Ho
- School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia
|