1
Manchev YT, Burn MJ, Popelier PLA. Ichor: A Python library for computational chemistry data management and machine learning force field development. J Comput Chem 2024. [PMID: 39215569 DOI: 10.1002/jcc.27477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2024] [Revised: 07/09/2024] [Accepted: 07/18/2024] [Indexed: 09/04/2024]
Abstract
We present ichor, an open-source Python library that simplifies data management in computational chemistry and streamlines machine learning force field development. Ichor implements many easily extensible file management tools, in addition to a lazy file reading system, allowing efficient management of hundreds of thousands of computational chemistry files. Data from calculations can be readily stored into databases for easy sharing and post-processing. Raw data can be directly processed by ichor to create machine learning-ready datasets. In addition to powerful data-related capabilities, ichor provides interfaces to popular workload management software employed by High Performance Computing clusters, making for effortless submission of thousands of separate calculations with only a single line of Python code. Furthermore, a simple-to-use command line interface has been implemented through a series of menu systems to further increase accessibility and efficiency of common important ichor tasks. Finally, ichor implements general tools for visualization and analysis of datasets and tools for measuring machine-learning model quality both on test set data and in simulations. With the current functionalities, ichor can serve as an end-to-end data procurement, data management, and analysis solution for machine-learning force-field development.
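The lazy file reading idea mentioned in the abstract can be illustrated with a minimal sketch. This is not ichor's API: the class `LazyTextFile`, the helper `make_demo_files`, and the file names are hypothetical and only show the general pattern of deferring disk reads until the contents are first requested.

```python
import tempfile
from pathlib import Path


class LazyTextFile:
    """Hypothetical wrapper that defers reading a file until its contents are needed."""

    def __init__(self, path):
        self.path = Path(path)
        self._lines = None  # nothing is read from disk at construction time

    @property
    def is_loaded(self):
        return self._lines is not None

    @property
    def lines(self):
        # Read from disk on first access only, then reuse the cached result.
        if self._lines is None:
            self._lines = self.path.read_text().splitlines()
        return self._lines


def make_demo_files(directory, count):
    """Write a few small placeholder output files (illustrative content only)."""
    paths = []
    for i in range(count):
        path = Path(directory) / f"calc_{i}.out"
        path.write_text(f"energy {-1.0 * i}\n")
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    files = [LazyTextFile(p) for p in make_demo_files(tmp, 5)]
    loaded_before = sum(f.is_loaded for f in files)  # no file has been read yet
    first_line = files[3].lines[0]                   # triggers a single read
    loaded_after = sum(f.is_loaded for f in files)
```

Wrapping many files this way costs almost nothing up front, which is the property that matters when a directory tree holds a very large number of outputs and only a few are inspected.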
Affiliation(s)
- Yulian T Manchev
  - Department of Chemistry, The University of Manchester, Manchester, UK
- Matthew J Burn
  - Department of Chemistry, The University of Manchester, Manchester, UK
- Paul L A Popelier
  - Department of Chemistry, The University of Manchester, Manchester, UK
2
Lu T. A comprehensive electron wavefunction analysis toolbox for chemists, Multiwfn. J Chem Phys 2024; 161:082503. [PMID: 39189657 DOI: 10.1063/5.0216272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 08/07/2024] [Indexed: 08/28/2024] Open
Abstract
Analysis of electron wavefunction is a key component of quantum chemistry investigations and is indispensable for the practical research of many chemical problems. After more than ten years of active development, the wavefunction analysis program Multiwfn has accumulated very rich functions, and its application scope has covered numerous aspects of theoretical chemical research, including charge distribution, chemical bond, electron localization and delocalization, aromaticity, intramolecular and intermolecular interactions, electronic excitation, and response property. This article systematically introduces the features and functions of the latest version of Multiwfn and provides many representative examples. Through this article, readers will be able to fully understand the characteristics and recognize the unique value of Multiwfn. The source code and precompiled executable files of Multiwfn, as well as the manual containing a detailed introduction to theoretical backgrounds and very rich tutorials, can all be downloaded for free from the Multiwfn website (http://sobereva.com/multiwfn).
Affiliation(s)
- Tian Lu
  - Beijing Kein Research Center for Natural Sciences, Beijing 100024, People's Republic of China
3
Tehrani A, Richer M, Heidar-Zadeh F. CuGBasis: High-performance CUDA/Python library for efficient computation of quantum chemistry density-based descriptors for larger systems. J Chem Phys 2024; 161:072501. [PMID: 39158048 DOI: 10.1063/5.0216781] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/01/2024] [Accepted: 06/17/2024] [Indexed: 08/20/2024] Open
Abstract
CuGBasis is a free and open-source CUDA®/Python library for efficient computation of scalar, vector, and matrix quantities crucial for the post-processing of electronic structure calculations. CuGBasis integrates high-performance Graphical Processing Unit (GPU) computing with the ease and flexibility of Python programming, making it compatible with a vast ecosystem of libraries. We showcase its utility as a Python library and demonstrate its seamless interoperability with existing Python software to gain chemical insight from quantum chemistry calculations. Leveraging GPU-accelerated code, cuGBasis exhibits remarkable performance, making it highly applicable to larger systems or large databases. Our benchmarks reveal a 100-fold performance gain compared to alternative software packages, including serial/multi-threaded Central Processing Unit and GPU implementations. This paper outlines various features and computational strategies that lead to cuGBasis's enhanced performance, guiding developers of GPU-accelerated code.
Affiliation(s)
- Alireza Tehrani
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Michelle Richer
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
4
Berquist E, Dumi A, Upadhyay S, Abarbanel OD, Cho M, Gaur S, Cano Gil VH, Hutchison GR, Lee OS, Rosen AS, Schamnad S, Schneider FSS, Steinmann C, Stolyarchuk M, Vandezande JE, Zak W, Langner KM. cclib 2.0: An updated architecture for interoperable computational chemistry. J Chem Phys 2024; 161:042501. [PMID: 39051837 DOI: 10.1063/5.0216778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 07/01/2024] [Indexed: 07/27/2024] Open
Abstract
Interoperability in computational chemistry is elusive, impeded by the independent development of software packages and the idiosyncratic nature of their output files. The cclib library was introduced in 2006 as an attempt to improve this situation by providing a consistent interface to the results of various quantum chemistry programs. The shared API across programs enabled by cclib has allowed users to focus on results as opposed to output and to combine data from multiple programs or develop generic downstream tools. Initial development, however, did not anticipate the rapid progress of computational capabilities, novel methods, and new programs; nor did it foresee the growing need for customizability. Here, we recount this history and present cclib 2, focused on extensibility and modularity. We also introduce recent design pivots: the formalization of cclib's intermediate data representation as a tree-based structure, a new combinator-based parser organization, and parsed chemical properties as extensible objects.
Affiliation(s)
- Eric Berquist
  - Sandia National Laboratories, Albuquerque, New Mexico 87185, USA
- Amanda Dumi
  - Sandia National Laboratories, Albuquerque, New Mexico 87185, USA
- Shiv Upadhyay
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, USA
- Omri D Abarbanel
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, USA
- Minsik Cho
  - Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, Massachusetts 02139, USA
- Sagar Gaur
  - MarkovML 23, Geary St. Suite 600, San Francisco, California 94108, USA
  - International Institute of Information Technology, Prof. CR Rao Road Gachibowli, Hyderabad 500032, Telangana, India
- Geoffrey R Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, USA
- Oliver S Lee
  - Organic Semiconductor Centre, EaStCHEM School of Chemistry, University of St Andrews, St. Andrews KY16 9ST, United Kingdom
  - Organic Semiconductor Centre, SUPA School of Physics and Astronomy, University of St Andrews, St. Andrews KY16 9SS, United Kingdom
- Andrew S Rosen
  - Sandia National Laboratories, Albuquerque, New Mexico 87185, USA
  - Department of Materials Science and Engineering, University of California, Berkeley, California 94720, USA
- Casper Steinmann
  - Department of Chemistry and Bioscience, Aalborg University, DK-9230 Aalborg, Denmark
- Weronika Zak
  - Department of Computer Science, Loughborough University, Epinal Way, Loughborough, Leicestershire LE11 3TU, United Kingdom
5
Kim TD, Pujal L, Richer M, van Zyl M, Martínez-González M, Tehrani A, Chuiko V, Sánchez-Díaz G, Sanchez W, Adams W, Huang X, Kelly BD, Vöhringer-Martinez E, Verstraelen T, Heidar-Zadeh F, Ayers PW. GBasis: A Python library for evaluating functions, functionals, and integrals expressed with Gaussian basis functions. J Chem Phys 2024; 161:042503. [PMID: 39077908 DOI: 10.1063/5.0216776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 07/05/2024] [Indexed: 07/31/2024] Open
Abstract
GBasis is a free and open-source Python library for molecular property computations based on Gaussian basis functions in quantum chemistry. Specifically, GBasis allows one to evaluate functions expanded in Gaussian basis functions (including molecular orbitals, electron density, and reduced density matrices) and to compute functionals of Gaussian basis functions (overlap integrals, one-electron integrals, and two-electron integrals). Unique features of GBasis include supporting evaluation and analytical integration of arbitrary-order derivatives of the density (matrices), computation of a broad range of (screened) Coulomb interactions, and evaluation of overlap integrals of arbitrary numbers of Gaussians in arbitrarily high dimensions. For circumstances where the flexibility of GBasis is less important than high performance, a seamless Python interface to the Libcint C package is provided. GBasis is designed to be easy to use, maintain, and extend following many standards of sustainable software development, including code-quality assurance through continuous integration protocols, extensive testing, comprehensive documentation, up-to-date package management, and continuous delivery. This article marks the official release of the GBasis library, outlining its features, examples, and development.
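As a small illustration of the simplest integral type named in the abstract, the sketch below evaluates the overlap of two unnormalized s-type Gaussian primitives with the standard closed form from the Gaussian product theorem and checks it against brute-force quadrature. It does not use GBasis; the function names `s_overlap` and `overlap_1d_numeric` and the example exponents and centers are assumptions made for illustration.

```python
import math


def s_overlap(alpha, center_a, beta, center_b):
    """Closed-form overlap of exp(-alpha|r-A|^2) and exp(-beta|r-B|^2) in 3D."""
    p = alpha + beta
    dist2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    return (math.pi / p) ** 1.5 * math.exp(-alpha * beta / p * dist2)


def overlap_1d_numeric(alpha, a, beta, b, half_width=12.0, n=4000):
    """Midpoint-rule integral of the 1D product of two Gaussians."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += math.exp(-alpha * (x - a) ** 2 - beta * (x - b) ** 2)
    return total * h


# Example parameters (arbitrary illustrative values).
alpha, beta = 0.8, 1.3
center_a = (0.0, 0.0, 0.0)
center_b = (0.5, -0.2, 1.0)

analytic = s_overlap(alpha, center_a, beta, center_b)

# The 3D integral factorizes into a product of three 1D integrals.
numeric = 1.0
for a, b in zip(center_a, center_b):
    numeric *= overlap_1d_numeric(alpha, a, beta, b)
```

Higher angular momentum functions, contractions, and normalization are deliberately left out; a production library handles those through recurrence relations rather than quadrature.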
Affiliation(s)
- Taewon David Kim
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Leila Pujal
  - Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L-3N6, Canada
- Michelle Richer
  - Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L-3N6, Canada
- Maximilian van Zyl
  - Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L-3N6, Canada
- Marco Martínez-González
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Alireza Tehrani
  - Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L-3N6, Canada
- Valerii Chuiko
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Gabriela Sánchez-Díaz
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Wesley Sanchez
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- William Adams
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Xiaomin Huang
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Braden D Kelly
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Esteban Vöhringer-Martinez
  - Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, Concepción, Chile
- Toon Verstraelen
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Zwijnaarde, Belgium
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L-3N6, Canada
- Paul W Ayers
  - Department of Chemistry & Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
6
Ryzhkov FV, Ryzhkova YE, Elinson MN. Python tools for structural tasks in chemistry. Mol Divers 2024:10.1007/s11030-024-10889-7. [PMID: 38744790 DOI: 10.1007/s11030-024-10889-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 04/27/2024] [Indexed: 05/16/2024]
Abstract
In recent decades, the use of computational approaches and artificial intelligence in the scientific environment has become more widespread. In this regard, the popular and versatile programming language Python has attracted considerable attention from scientists in the field of chemistry. It is used to solve a variety of chemical and structural problems, including calculating descriptors, molecular fingerprints, graph construction, and computing chemical reaction networks. Python offers high-quality visualization tools for analyzing chemical spaces and compound libraries. This review is a list of tools for the above tasks, including scripts, libraries, ready-made programs, and web interfaces. Inevitably this manuscript does not claim to be an all-encompassing handbook including all the existing Python-based structural chemistry codes. The review serves as a starting point for scientists wishing to apply automatization or optimization to routine chemistry problems.
Affiliation(s)
- Fedor V Ryzhkov
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Yuliya E Ryzhkova
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Michail N Elinson
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
7
Chan M, Verstraelen T, Tehrani A, Richer M, Yang XD, Kim TD, Vöhringer-Martinez E, Heidar-Zadeh F, Ayers PW. The tale of HORTON: Lessons learned in a decade of scientific software development. J Chem Phys 2024; 160:162501. [PMID: 38651814 DOI: 10.1063/5.0196638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 02/28/2024] [Indexed: 04/25/2024] Open
Abstract
HORTON is a free and open-source electronic-structure package written primarily in Python 3 with some underlying C++ components. While HORTON's development has been mainly directed by the research interests of its leading contributing groups, it is designed to be easily modified, extended, and used by other developers of quantum chemistry methods or post-processing techniques. Most importantly, HORTON adheres to modern principles of software development, including modularity, readability, flexibility, comprehensive documentation, automatic testing, version control, and quality-assurance protocols. This article explains how the principles and structure of HORTON have evolved since we started developing it more than a decade ago. We review the features and functionality of the latest HORTON release (version 2.3) and discuss how HORTON is evolving to support electronic structure theory research for the next decade.
Affiliation(s)
- Matthew Chan
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Toon Verstraelen
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Ghent, Belgium
- Alireza Tehrani
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Michelle Richer
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Xiaotian Derrick Yang
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Taewon David Kim
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
- Esteban Vöhringer-Martinez
  - Departamento de Físico Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070371 Concepción, Chile
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Paul W Ayers
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S-4L8, Canada
8
Lehtola S. A call to arms: Making the case for more reusable libraries. J Chem Phys 2023; 159:180901. [PMID: 37947507 DOI: 10.1063/5.0175165] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/05/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023] Open
Abstract
The traditional foundation of science lies on the cornerstones of theory and experiment. Theory is used to explain experiment, which in turn guides the development of theory. Since the advent of computers and the development of computational algorithms, computation has risen as the third cornerstone of science, joining theory and experiment on an equal footing. Computation has become an essential part of modern science, amending experiment by enabling accurate comparison of complicated theories to sophisticated experiments, as well as guiding by triage both the design and targets of experiments and the development of novel theories and computational methods. Like experiment, computation relies on continued investment in infrastructure: it requires both hardware (the physical computer on which the calculation is run) and software (the source code of the programs that perform the wanted simulations). In this Perspective, I discuss present-day challenges on the software side in computational chemistry, which arise from the fast-paced development of algorithms, programming models, and hardware. I argue that many of these challenges could be solved with reusable open source libraries, which are a public good, enhance the reproducibility of science, and accelerate the development and availability of state-of-the-art methods and improved software.
Affiliation(s)
- Susi Lehtola
  - Department of Chemistry, University of Helsinki, P.O. Box 55, FI-00014 Helsinki, Finland
9
Carter-Fenk K, Liu M, Pujal L, Loipersberger M, Tsanai M, Vernon RM, Forman-Kay JD, Head-Gordon M, Heidar-Zadeh F, Head-Gordon T. The Energetic Origins of Pi-Pi Contacts in Proteins. J Am Chem Soc 2023; 145. [PMID: 37917924 PMCID: PMC10655088 DOI: 10.1021/jacs.3c09198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 10/04/2023] [Accepted: 10/05/2023] [Indexed: 11/04/2023]
Abstract
Accurate potential energy models of proteins must describe the many different types of noncovalent interactions that contribute to a protein's stability and structure. Pi-pi contacts are ubiquitous structural motifs in all proteins, occurring between aromatic and nonaromatic residues, and they play a nontrivial role in protein folding and in the formation of biomolecular condensates. Guided by a geometric criterion for isolating pi-pi contacts from classical molecular dynamics simulations of proteins, we use quantum mechanical energy decomposition analysis to determine the molecular interactions that stabilize different pi-pi contact motifs. We find that neutral pi-pi interactions in proteins are dominated by Pauli repulsion and London dispersion rather than repulsive quadrupole electrostatics, which is central to the textbook Hunter-Sanders model. This results in a notable lack of variability in the interaction profiles of neutral pi-pi contacts even with extreme changes in the dielectric medium, explaining the prevalence of pi-stacked arrangements in and between proteins. We also find interactions involving pi-containing anions and cations to be extremely malleable, interacting like neutral pi-pi contacts in polar media and like typical ion-pi interactions in nonpolar environments. Like-charged pairs such as arginine-arginine contacts are particularly sensitive to the polarity of their immediate surroundings and exhibit canonical pi-pi stacking behavior only if the interaction is mediated by environmental effects, such as aqueous solvation.
Affiliation(s)
- Kevin Carter-Fenk
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Meili Liu
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, Beijing Normal University, Beijing 100875, China
- Leila Pujal
  - Department of Chemistry, Queen’s University, Kingston, Ontario K7L 3N6, Canada
- Matthias Loipersberger
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Maria Tsanai
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Robert M. Vernon
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Julie D. Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Martin Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen’s University, Kingston, Ontario K7L 3N6, Canada
  - Center for Molecular Modeling (CMM), Ghent University, 9052 Zwijnaarde, Belgium
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720, United States
  - Department of Bioengineering, University of California, Berkeley, California 94720, United States
10
Tehrani A, Anderson JSM, Chakraborty D, Rodriguez-Hernandez JI, Thompson DC, Verstraelen T, Ayers PW, Heidar-Zadeh F. An information-theoretic approach to basis-set fitting of electron densities and other non-negative functions. J Comput Chem 2023; 44:1998-2015. [PMID: 37526138 DOI: 10.1002/jcc.27170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 05/03/2023] [Accepted: 05/08/2023] [Indexed: 08/02/2023]
Abstract
The numerical ill-conditioning associated with approximating an electron density with a convex sum of Gaussian or Slater-type functions is overcome by using the (extended) Kullback-Leibler divergence to measure the deviation between the target and approximate density. The optimized densities are non-negative and normalized, and they are accurate enough to be used in applications related to molecular similarity, the topology of the electron density, and numerical molecular integration. This robust, efficient, and general approach can be used to fit any non-negative normalized functions (e.g., the kinetic energy density and molecular electron density) to a convex sum of non-negative basis functions. We present a fixed-point iteration method for optimizing the Kullback-Leibler divergence and compare it to conventional gradient-based optimization methods. These algorithms are released through the free and open-source BFit package, which also includes an L2-norm squared optimization routine applicable to any square-integrable scalar function.
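A fixed-point iteration of the kind described can be sketched in one dimension under simplifying assumptions: the exponents are held fixed and only the coefficients are updated, using the multiplicative rule c_i <- c_i * ∫ ρ(x) b_i(x) / ρ_model(x) dx with normalized basis functions b_i. This is an illustrative form of the technique, not the BFit API; all function names, the grid, and the example target are hypothetical.

```python
import math


def gaussian(alpha, x):
    """Normalized 1D Gaussian centered at the origin."""
    return math.sqrt(alpha / math.pi) * math.exp(-alpha * x * x)


def model_values(coeffs, basis):
    """Model density on the grid: sum_i c_i * b_i(x)."""
    return [sum(c * row[k] for c, row in zip(coeffs, basis)) for k in range(len(basis[0]))]


def extended_kl(target, coeffs, basis, weight):
    """Extended Kullback-Leibler divergence, integrated with a rectangle rule."""
    model = model_values(coeffs, basis)
    return sum(weight * (t * math.log(t / m) - t + m) for t, m in zip(target, model))


def fixed_point_step(target, coeffs, basis, weight):
    """One multiplicative update of every coefficient (exponents stay fixed)."""
    model = model_values(coeffs, basis)
    return [
        c * sum(weight * t * b / m for t, b, m in zip(target, row, model))
        for c, row in zip(coeffs, basis)
    ]


# Uniform grid and an example target that integrates to about 2.
n_points, half_width = 801, 8.0
weight = 2.0 * half_width / (n_points - 1)
grid = [-half_width + k * weight for k in range(n_points)]
target = [1.5 * gaussian(0.5, x) + 0.5 * gaussian(4.0, x) for x in grid]
target_integral = sum(weight * t for t in target)

# Fixed trial exponents (assumed) and a uniform initial guess for the coefficients.
exponents = [0.5, 4.0, 20.0]
basis = [[gaussian(a, x) for x in grid] for a in exponents]
coeffs = [target_integral / len(exponents)] * len(exponents)

kl_initial = extended_kl(target, coeffs, basis, weight)
for _ in range(200):
    coeffs = fixed_point_step(target, coeffs, basis, weight)
kl_final = extended_kl(target, coeffs, basis, weight)
```

Because every factor in the update is non-negative, the coefficients stay non-negative without explicit constraints, and their sum matches the integral of the target after each step, which is the normalization property the abstract highlights.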
Affiliation(s)
- Alireza Tehrani
  - Department of Chemistry, Queen's University, Kingston, Ontario, Canada
- James S M Anderson
  - Instituto de Química, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
- Debajit Chakraborty
  - Department of Physics, Wake Forest University, Winston-Salem, North Carolina, USA
  - Center for Functional Materials, Wake Forest University, Winston-Salem, North Carolina, USA
- Toon Verstraelen
  - Center for Molecular Modeling (CMM), Ghent University, Zwijnaarde, Belgium
- Paul W Ayers
  - Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario, Canada
11
Posenitskiy E, Chilkuri VG, Ammar A, Hapka M, Pernal K, Shinde R, Landinez Borda EJ, Filippi C, Nakano K, Kohulák O, Sorella S, de Oliveira Castro P, Jalby W, Ríos PL, Alavi A, Scemama A. TREXIO: A file format and library for quantum chemistry. J Chem Phys 2023; 158:2888846. [PMID: 37144717 DOI: 10.1063/5.0148161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 04/04/2023] [Indexed: 05/06/2023] Open
Abstract
TREXIO is an open-source file format and library developed for the storage and manipulation of data produced by quantum chemistry calculations. It is designed with the goal of providing a reliable and efficient method of storing and exchanging wave function parameters and matrix elements, making it an important tool for researchers in the field of quantum chemistry. In this work, we present an overview of the TREXIO file format and library. The library consists of a front-end implemented in the C programming language and two different back-ends: a text back-end and a binary back-end utilizing the hierarchical data format version 5 library, which enables fast read and write operations. It is compatible with a variety of platforms and has interfaces for the Fortran, Python, and OCaml programming languages. In addition, a suite of tools has been developed to facilitate the use of the TREXIO format and library, including converters for popular quantum chemistry codes and utilities for validating and manipulating data stored in TREXIO files. The simplicity, versatility, and ease of use of TREXIO make it a valuable resource for researchers working with quantum chemistry data.
Affiliation(s)
- Evgeny Posenitskiy
  - Laboratoire de Chimie et Physique Quantiques - UMR5626, CNRS/Université Paul Sabatier, Bat. 3R1b4, 118 route de Narbonne, 31062 Toulouse Cedex 09, France
  - Qubit Pharmaceuticals, Incubateur Paris Biotech Santé, 24 Rue du Faubourg Saint Jacques, 75014 Paris, France
- Vijay Gopal Chilkuri
  - Laboratoire de Chimie et Physique Quantiques - UMR5626, CNRS/Université Paul Sabatier, Bat. 3R1b4, 118 route de Narbonne, 31062 Toulouse Cedex 09, France
  - Institut des Sciences Moléculaires de Marseille, Service 561, Campus Scientifique de St. Jérôme, Aix Marseille Université, Centrale Marseille 13, 397 Marseille Cedex 20, France
- Abdallah Ammar
  - Laboratoire de Chimie et Physique Quantiques - UMR5626, CNRS/Université Paul Sabatier, Bat. 3R1b4, 118 route de Narbonne, 31062 Toulouse Cedex 09, France
- Michał Hapka
  - Faculty of Chemistry, University of Warsaw, ul. L. Pasteura 1, 02-093 Warsaw, Poland
- Katarzyna Pernal
  - Institute of Physics, Lodz University of Technology, ul. Wolczanska 217/221, 93-005 Lodz, Poland
- Ravindra Shinde
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- Edgar Josué Landinez Borda
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- Claudia Filippi
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- Kosuke Nakano
  - Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science (NIMS), Tsukuba, Ibaraki 305-0047, Japan
  - International School for Advanced Studies (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Otto Kohulák
  - Laboratoire de Chimie et Physique Quantiques - UMR5626, CNRS/Université Paul Sabatier, Bat. 3R1b4, 118 route de Narbonne, 31062 Toulouse Cedex 09, France
  - International School for Advanced Studies (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Sandro Sorella
  - International School for Advanced Studies (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- William Jalby
  - Université Paris-Saclay, UVSQ, LI-PaRAD, 9 Boulevard d'Alembert, 78280 Guyancourt, France
- Pablo López Ríos
  - Max Planck Institute for Solid State Research, Heisenbergstrasse 1, 70569 Stuttgart, Germany
- Ali Alavi
  - Max Planck Institute for Solid State Research, Heisenbergstrasse 1, 70569 Stuttgart, Germany
- Anthony Scemama
  - Laboratoire de Chimie et Physique Quantiques - UMR5626, CNRS/Université Paul Sabatier, Bat. 3R1b4, 118 route de Narbonne, 31062 Toulouse Cedex 09, France
12
Kim TD, Richer M, Sánchez-Díaz G, Miranda-Quintana RA, Verstraelen T, Heidar-Zadeh F, Ayers PW. Fanpy: A python library for prototyping multideterminant methods in ab initio quantum chemistry. J Comput Chem 2023; 44:697-709. [PMID: 36440947 DOI: 10.1002/jcc.27034] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/30/2022] [Accepted: 09/05/2022] [Indexed: 11/30/2022]
Abstract
Fanpy is a free and open-source Python library for developing and testing multideterminant wavefunctions and related ab initio methods in electronic structure theory. The main use of Fanpy is to quickly prototype new methods by making it easier to convert the mathematical formulation of a new wavefunction ansatz to a working implementation. Fanpy is designed based on our recently introduced Flexible Ansatz for N-electron Configuration Interaction (FANCI) framework, where multideterminant wavefunctions are represented by their overlaps with Slater determinants of orthonormal spin-orbitals. In the simplest case, a new wavefunction ansatz can be implemented by simply writing a function for evaluating its overlap with an arbitrary Slater determinant. Fanpy is modular in both implementation and theory: the wavefunction model, the system's Hamiltonian, and the choice of objective function are all independent modules. This modular structure makes it easy for users to mix and match different methods and for developers to quickly explore new ideas. Fanpy is written purely in Python with standard dependencies, making it accessible for various operating systems. In addition, it adheres to principles of modern software development, including comprehensive documentation, extensive testing, quality assurance, and continuous integration and delivery protocols. This article is considered to be the official release notes for the Fanpy library.
Affiliation(s)
- Taewon D Kim
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario, Canada; Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida, USA
| | - M Richer
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario, Canada
| | - Gabriela Sánchez-Díaz
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario, Canada
| | | | - Toon Verstraelen
- Center for Molecular Modeling (CMM), Ghent University, Ghent, Belgium
| | | | - Paul W Ayers
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario, Canada
| |
|
13
|
González D, Macaya L, Castillo-Orellana C, Verstraelen T, Vogt-Geisse S, Vöhringer-Martinez E. Nonbonded Force Field Parameters from Minimal Basis Iterative Stockholder Partitioning of the Molecular Electron Density Improve CB7 Host-Guest Affinity Predictions. J Chem Inf Model 2022; 62:4162-4174. [PMID: 35959540 DOI: 10.1021/acs.jcim.2c00316] [Indexed: 11/30/2022]
Abstract
Binding affinity prediction by means of computer simulation has been increasingly incorporated in drug discovery projects. Its wide application, however, is limited by the prediction accuracy of the free energy calculations. The main error sources are force fields used to describe molecular interactions and incomplete sampling of the configurational space. Organic host-guest systems have been used to address force field quality because they share similar interactions found in ligands and receptors, and their rigidity facilitates configurational sampling. Here, we test the binding free energy prediction accuracy for 14 guests with an aromatic or adamantane core and the CB7 host using molecular electron density derived nonbonded force field parameters. We developed a computational workflow written in Python to derive atomic charges and Lennard-Jones parameters with the Minimal Basis Iterative Stockholder method using the polarized electron density of several configurations of each guest in the bound and unbound states. The resulting nonbonded force field parameters improve binding affinity prediction, especially for guests with an adamantane core in which repulsive exchange and dispersion interactions to the host dominate.
Affiliation(s)
- Duván González
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070386 Concepción, Chile
| | - Luis Macaya
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070386 Concepción, Chile
| | - Carlos Castillo-Orellana
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070386 Concepción, Chile
| | - Toon Verstraelen
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Ghent, Belgium
| | - Stefan Vogt-Geisse
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070386 Concepción, Chile
| | - Esteban Vöhringer-Martinez
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, 4070386 Concepción, Chile
| |
|
14
|
Pujal L, van Zyl M, Vöhringer-Martinez E, Verstraelen T, Bultinck P, Ayers PW, Heidar-Zadeh F. Constrained iterative Hirshfeld charges: A variational approach. J Chem Phys 2022; 156:194109. [PMID: 35597660 DOI: 10.1063/5.0089466] [Indexed: 11/14/2022]
Abstract
We develop a variational procedure for the iterative Hirshfeld (HI) partitioning scheme. The main practical advantage of having a variational framework is that it provides a formal and straightforward approach for imposing constraints (e.g., fixed charges on certain atoms or molecular fragments) when computing HI atoms and their properties. Unlike many other variants of the Hirshfeld partitioning scheme, HI charges do not arise naturally from the information-theoretic framework, but only as a reverse-engineered construction of the objective function. However, the procedure we use is quite general and could be applied to other problems as well. We also prove that there is always at least one solution to the HI equations, but we could not prove that its self-consistent equations would always converge for any given initial pro-atom charges. Our numerical assessment of the constrained iterative Hirshfeld method shows that it satisfies many desirable traits of atoms in molecules and has the potential to surpass existing approaches for adding constraints when computing atomic properties.
Affiliation(s)
- Leila Pujal
- Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L 3N6, Canada
| | - Maximilian van Zyl
- Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L 3N6, Canada
| | - Esteban Vöhringer-Martinez
- Departamento de Físico-Química, Facultad de Ciencias Químicas, Universidad de Concepción, Concepción, Chile
| | - Toon Verstraelen
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Zwijnaarde, Belgium
| | - Patrick Bultinck
- Ghent Quantum Chemistry Group, Department of Chemistry, Ghent University, Krijgslaan 281 S3, B-9000 Ghent, Belgium
| | - Paul W Ayers
- Department of Chemistry and Chemical Biology, McMaster University, Hamilton, Ontario L8S 4L8, Canada
| | - Farnaz Heidar-Zadeh
- Department of Chemistry, Queen's University, 90 Bader Lane, Kingston, Ontario K7L 3N6, Canada
| |
|
15
|
Guan X, Das A, Stein CJ, Heidar-Zadeh F, Bertels L, Liu M, Haghighatlari M, Li J, Zhang O, Hao H, Leven I, Head-Gordon M, Head-Gordon T. A benchmark dataset for Hydrogen Combustion. Sci Data 2022; 9:215. [PMID: 35581204 PMCID: PMC9114378 DOI: 10.1038/s41597-022-01330-5] [Received: 09/30/2021] [Accepted: 04/20/2022] [Indexed: 11/21/2022]
Abstract
The generation of reference data for deep learning models is challenging for reactive systems, and more so for combustion reactions due to the extreme conditions that create radical species and alternative spin states during the combustion process. Here, we extend intrinsic reaction coordinate (IRC) calculations with ab initio MD simulations and normal mode displacement calculations to more extensively cover the potential energy surface for 19 reaction channels for hydrogen combustion. A total of ∼290,000 potential energies and ∼1,270,000 nuclear force vectors are evaluated with a high-quality range-separated hybrid density functional, ωB97X-V, to construct the reference data set, including transition state ensembles, for deep learning models that study the hydrogen combustion reaction.
Affiliation(s)
- Xingyi Guan
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Akshaya Das
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
| | - Christopher J Stein
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Theoretical Physics and Center for Nanointegration Duisburg-Essen (CENIDE), University of Duisburg-Essen, 47048, Duisburg, Germany
| | - Farnaz Heidar-Zadeh
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Department of Chemistry, Queen's University, Kingston, Ontario, K7L 3N6, Canada
| | - Luke Bertels
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
| | - Meili Liu
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Department of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Mojtaba Haghighatlari
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
| | - Jie Li
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
| | - Oufan Zhang
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
| | - Hongxia Hao
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Itai Leven
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Martin Head-Gordon
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Teresa Head-Gordon
- Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA.
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Departments of Bioengineering and Chemical and Biomolecular Engineering, University of California, Berkeley, CA, USA.
| |
|
16
|
Shi Y, Chávez VH, Wasserman A. n2v: A density-to-potential inversion suite. A sandbox for creating, testing, and benchmarking density functional theory inversion methods. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1617] [Indexed: 11/12/2022]
Affiliation(s)
- Yuming Shi
- Department of Physics and Astronomy Purdue University West Lafayette Indiana USA
| | - Victor H. Chávez
- Department of Chemistry Purdue University West Lafayette Indiana USA
| | - Adam Wasserman
- Department of Physics and Astronomy Purdue University West Lafayette Indiana USA
- Department of Chemistry Purdue University West Lafayette Indiana USA
| |
|
17
|
Omar ÖH, Nematiaram T, Troisi A, Padula D. Organic materials repurposing, a data set for theoretical predictions of new applications for existing compounds. Sci Data 2022; 9:54. [PMID: 35165288 PMCID: PMC8844419 DOI: 10.1038/s41597-022-01142-7] [Received: 10/08/2021] [Accepted: 12/21/2021] [Indexed: 01/28/2023]
Abstract
We present a data set of 48182 organic semiconductors, consisting of molecules that were prepared with a documented synthetic pathway and are stable in the solid state. We based our search on the Cambridge Structural Database, from which we selected semiconductors with a computational funnel procedure. For each entry we provide a set of electronic properties relevant for organic materials research, and the electronic wavefunction for further calculations and/or analyses. This data set has low bias because it was not built from a set of materials designed for organic electronics, and thus it provides an excellent starting point in the search for new applications for known materials, with great potential for novel physical insight. The data set contains molecules used as benchmarks in many fields of organic materials research, making it possible to test the reliability of computational screenings for the desired application by "rediscovering" well-known molecules. This is demonstrated by a series of different applications in the field of organic materials, confirming the potential for the repurposing of known organic molecules.
Affiliation(s)
- Ömer H Omar
- University of Liverpool, Department of Chemistry, Liverpool, L69 7ZD, UK
| | - Tahereh Nematiaram
- University of Liverpool, Department of Chemistry, Liverpool, L69 7ZD, UK
| | - Alessandro Troisi
- University of Liverpool, Department of Chemistry, Liverpool, L69 7ZD, UK.
| | - Daniele Padula
- Università di Siena, Dipartimento di Biotecnologie, Chimica e Farmacia, Siena, 53100, Italy.
| |
|
18
|
Guan X, Leven I, Heidar-Zadeh F, Head-Gordon T. Protein C-GeM: A Coarse-Grained Electron Model for Fast and Accurate Protein Electrostatics Prediction. J Chem Inf Model 2021; 61:4357-4369. [PMID: 34490776 DOI: 10.1021/acs.jcim.1c00388] [Indexed: 02/08/2023]
Abstract
The electrostatic potential (ESP) is a powerful property for understanding and predicting electrostatic charge distributions that drive interactions between molecules. In this study, we compare various charge partitioning schemes including fitted charges, density-based quantum mechanical (QM) partitioning schemes, charge equilibration methods, and our recently introduced coarse-grained electron model, C-GeM, to describe the ESP for protein systems. When benchmarked against high quality density functional theory calculations of the ESP for tripeptides and the crambin protein, we find that the C-GeM model is of comparable accuracy to ab initio charge partitioning methods, but with orders of magnitude improvement in computational efficiency since it does not require either the electron density or the electrostatic potential as input.
Affiliation(s)
- Xingyi Guan
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States; Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
| | - Itai Leven
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States; Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
| | - Farnaz Heidar-Zadeh
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States; Department of Chemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
| | - Teresa Head-Gordon
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States; Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States; Departments of Bioengineering and Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720, United States
| |
|