1. Tsou PK, Phan HT, Kuo JL. Using building block structures and a cooperative approach with neural networks and random forest to identify reactions: a case study on the dissociation of sodiated disaccharides. Phys Chem Chem Phys 2025; 27:4355-4367. [PMID: 39927432 DOI: 10.1039/d4cp04275a] [Indexed: 02/11/2025]
Abstract
A new search scheme utilizing machine-learning methods has been developed to explore the reactions of di-saccharides. It incorporates structure sampling, neural network potential (NNP) training, and target search methodologies, addressing the challenges of their structural diversity and flexibility. We introduce building block sampling to identify transition state (TS) structures and examine the dissociation mechanism of α-maltose under collision-induced dissociation conditions. With a decent NNP model with a mean absolute error of 5 kJ mol⁻¹ for M06-2X/6-311+G(d,p), the 4 main dissociation channels are explored, and more than 70 000 TSs can be located in an extensive search. To prioritize computational resources for low-energy TSs, a target search using random forest is conducted and low-energy TSs with only 42% of the extensive computational workload are identified. With the NNP-accelerated target search scheme, we demonstrated that di-saccharide reaction exploration can be done within a few days.
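The prioritisation idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: a plain sort on a hypothetical cheap score stands in for the random forest, and `true_energy`, `surrogate_score`, and `target_search` are illustrative names with toy data. Only the top-ranked fraction of candidates receives the expensive evaluation.

```python
def true_energy(c):
    # Hypothetical "expensive" evaluation (stands in for an NNP/DFT TS search).
    return 0.1 * (c - 30) ** 2


def surrogate_score(c):
    # Hypothetical cheap predictor: the true value plus a deterministic error
    # in the range [-5, 5]. A trained classifier or regressor would go here.
    return true_energy(c) + ((c * 37) % 11 - 5)


def target_search(candidates, score, evaluate, budget_fraction):
    """Rank candidates by the cheap score and evaluate only the best fraction."""
    ranked = sorted(candidates, key=score)
    n_keep = max(1, round(budget_fraction * len(ranked)))
    return {c: evaluate(c) for c in ranked[:n_keep]}


# Spend 42% of the full workload (a fraction chosen to echo the abstract).
selected = target_search(range(100), surrogate_score, true_energy, 0.42)
lowest = min(selected, key=selected.get)
```

In this toy setting the lowest-energy candidate is still found, because the error of the cheap score is small compared with the energy spread of the candidates; in practice that depends on the quality of the predictor.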
Affiliation(s)
- Pei-Kang Tsou
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
- Huu Trong Phan
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
- Jer-Lai Kuo
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
  - International Graduate Program of Molecular Science and Technology (NTU-MST), National Taiwan University, Taipei 10617, Taiwan
2. Hsu PJ, Mizuide A, Kuo JL, Fujii A. Hydrogen bond network structures of protonated 2,2,2-trifluoroethanol/ethanol mixed clusters probed by infrared spectroscopy combined with a deep-learning structure sampling approach: the origin of the linear type network preference in protonated fluoroalcohol clusters. Phys Chem Chem Phys 2024; 26:27751-27762. [PMID: 39470069 DOI: 10.1039/d4cp03534h] [Indexed: 10/30/2024]
Abstract
While preferential hydrogen bond network structures of cold protonated alcohol clusters H⁺(ROH)ₙ are generally switched from a linear type to a cyclic one at n = 4-5, those of protonated 2,2,2-trifluoroethanol (TFE) clusters maintain linear type structures at least in the size range of n = 3-7. To explore the origin of the strong linear type network preference of H⁺(TFE)ₙ, infrared spectra of protonated mixed clusters H⁺(TFE)ₘ(ethanol)ₙ (m + n = 5) were measured. An efficient structure sampling technique using parallelized basin-hopping algorithms and deep-learning neural network potentials is developed to search for essential isomers of the mixed clusters. Vibrational simulations based on the harmonic superposition approximation were compared with the observed spectra to identify the major isomer component at each mixing ratio. It was found that the formation of the cyclic structure occurs only in n ≥ 3 of the mixed clusters, in which the proton solvating sites and the double acceptor site are occupied by ethanol. The crucial role of the stability of the double acceptor site in the cyclic structure formation is discussed.
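For readers unfamiliar with basin hopping, the sketch below shows the basic loop (random jump, local minimisation, Metropolis acceptance) on a toy one-dimensional surface. It is a serial illustration under stated assumptions, not the parallelized, neural-network-driven scheme described in the abstract; the `energy` function, step sizes, and temperature are all hypothetical.

```python
import math
import random


def energy(x):
    # Hypothetical 1D surface with several local minima.
    return 0.1 * x * x + math.sin(3.0 * x)


def local_minimize(x, step=0.01, iters=2000):
    # Crude gradient descent using a central-difference derivative.
    for _ in range(iters):
        grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5
        x -= step * grad
    return x


def basin_hopping(x0, n_hops=50, jump=1.5, kT=0.5, seed=0):
    rng = random.Random(seed)
    x = local_minimize(x0)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_hops):
        # Perturb the current minimum, then relax to the nearest minimum.
        trial = local_minimize(x + rng.uniform(-jump, jump))
        e_trial = energy(trial)
        # Metropolis criterion on the energies of the relaxed minima.
        if e_trial < e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


start_e = energy(local_minimize(4.0))
best_x, best_e = basin_hopping(4.0)
```

Because the best minimum seen so far is tracked separately from the current one, the returned energy can never be higher than that of the starting minimum, whichever jumps are accepted.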
Affiliation(s)
- Po-Jen Hsu
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei 10617, Taiwan.
- Atsuya Mizuide
  - Department of Chemistry, Graduate School of Science, Tohoku University, Sendai 980-8578, Japan.
- Jer-Lai Kuo
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei 10617, Taiwan.
- Asuka Fujii
  - Department of Chemistry, Graduate School of Science, Tohoku University, Sendai 980-8578, Japan.
3. Williams CD, Kalayan J, Burton NA, Bryce RA. Stable and accurate atomistic simulations of flexible molecules using conformationally generalisable machine learned potentials. Chem Sci 2024; 15:12780-12795. [PMID: 39148799 PMCID: PMC11323334 DOI: 10.1039/d4sc01109k] [Received: 02/16/2024] [Accepted: 07/07/2024] [Indexed: 08/17/2024] [Open Access]
Abstract
Computational simulation methods based on machine learned potentials (MLPs) promise to revolutionise shape prediction of flexible molecules in solution, but their widespread adoption has been limited by the way in which training data is generated. Here, we present an approach which allows the key conformational degrees of freedom to be properly represented in reference molecular datasets. MLPs trained on these datasets using a global descriptor scheme are generalisable in conformational space, providing quantum chemical accuracy for all conformers. These MLPs are capable of propagating long, stable molecular dynamics trajectories, an attribute that has remained a challenge. We deploy the MLPs in obtaining converged conformational free energy surfaces for flexible molecules via well-tempered metadynamics simulations; this approach provides a hitherto inaccessible route to accurately computing the structural, dynamical and thermodynamical properties of a wide variety of flexible molecular systems. It is further demonstrated that MLPs must be trained on reference datasets with complete coverage of conformational space, including in barrier regions, to achieve stable molecular dynamics trajectories.
Affiliation(s)
- Christopher D Williams
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Jas Kalayan
  - Science and Technology Facilities Council (STFC), Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
- Neil A Burton
  - Department of Chemistry, School of Natural Sciences, Faculty of Science and Engineering, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Richard A Bryce
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
4. Dong HC, Hsu PJ, Kuo JL. Searching low-energy conformers of neutral and protonated di-, tri-, and tetra-glycine using first-principles accuracy assisted by the use of neural network potentials. Phys Chem Chem Phys 2024; 26:11126-11139. [PMID: 38530660 DOI: 10.1039/d3cp05659g] [Indexed: 03/28/2024]
Abstract
In the last ten years, combinations of state-of-the-art gas-phase spectroscopies and quantum chemistry calculations have suggested several intuitive trends in the structure of small polypeptides that may not hold true. For example, the preference for the cis form of the peptide bond and multiple protonated sites was proposed by comparing experimental spectra with low-energy minima obtained from limited structural sampling using various density functional theory methods. For understanding the structures of polypeptides, extensive sampling of their configurational space with high-accuracy computational methods is required. In this work, we demonstrated the use of deep-learning neural network potential (DL-NNP) to assist in exploring the structure and energy landscape of di-, tri-, and tetra-glycine with the accuracy of high-level quantum chemistry methods, and low-energy conformers of small polypeptides can be efficiently located. We hope that the structures of these polypeptides we found and our preliminary analysis will stimulate further experimental investigations.
Affiliation(s)
- Hieu Cao Dong
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - International Graduate Program of Molecular Science and Technology (NTU-MST), National Taiwan University, Taipei 10617, Taiwan
- Po-Jen Hsu
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
- Jer-Lai Kuo
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - International Graduate Program of Molecular Science and Technology (NTU-MST), National Taiwan University, Taipei 10617, Taiwan
  - Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
5. Phan HT, Tsou PK, Hsu PJ, Kuo JL. A first-principles exploration of the conformational space of sodiated di-saccharides assisted by semi-empirical methods and neural network potentials. Phys Chem Chem Phys 2024; 26:9556-9567. [PMID: 38456454 DOI: 10.1039/d3cp05362h] [Indexed: 03/09/2024]
Abstract
Previous exploration of the conformational space of sodiated mono-saccharides using a random search algorithm leads to ∼10³ structurally distinct conformers covering an energy range of ∼150 kJ mol⁻¹. Thus, it is reasonable to expect that the number of distinct conformers for a given disaccharide would be on the order of 10⁶. Efficient identification of distinct conformers at the first-principles level has been demonstrated with the assistance of neural network potential (NNP) with an accuracy of ∼1 kJ mol⁻¹ compared to DFT. Leveraging a local minima database of neutral and sodiated glucose (Glc), we develop algorithms to systematically explore the conformation landscape of 19 Glc-based sodiated disaccharides. To accelerate the exploration, the NNP method is implemented. The NNP achieves an accuracy of ∼2.3 kJ mol⁻¹ compared to DFT, offering a comparable quality to that of DFT. Through a multi-model approach integrating DFTB3, NNP and DFT, we can rapidly locate low-energy disaccharide conformers at the first-principles level. The methodology we show here can be used to efficiently explore the potential energy landscape of any di-saccharides when first-principles accuracy is required.
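The order-of-magnitude estimate in this abstract is consistent with a simple counting argument. The sketch below assumes, which the abstract does not state explicitly, that the two monosaccharide units sample their conformers roughly independently, and it ignores the extra degrees of freedom of the glycosidic linkage as well as sterically excluded combinations. The symbols N_mono and N_di are introduced here for illustration only.

```latex
\[
N_{\mathrm{di}} \sim N_{\mathrm{mono}}^{2} \approx \left(10^{3}\right)^{2} = 10^{6}
\]
```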
Affiliation(s)
- Huu Trong Phan
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
- Pei-Kang Tsou
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
- Po-Jen Hsu
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
- Jer-Lai Kuo
  - Institute of Atomic and Molecular Sciences, Academia Sinica, Taipei, 10617, Taiwan.
  - Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
  - Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
  - International Graduate Program of Molecular Science and Technology (NTU-MST), National Taiwan University, Taipei 10617, Taiwan